# troubleshooting
s
Hi everyone, we have realtime tables and data ingestion is happening from Kafka, but our query performance is very low even though in total we have only around 13 lakhs (~1.3 million) rows. Query time is 17 secs. We have 1 tenant, 1 broker, and 2 servers. Do we need to create indexes separately, or is it done by default on columns? I ask because I saw some indexes already created. Also, is there an option to create segments as per the Kafka topic key? We usually query on timestamp and id, and our Kafka topics have id as the key.
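For reference, Pinot does not create one segment per Kafka key, but segments can be partitioned on the key column so the broker only routes a query to the segments that can contain that key. A minimal table-config sketch, assuming the key column is `id`, the producer uses Kafka's default murmur2 partitioner, and `numPartitions` matches the topic's partition count (8 here is only an example):

```json
{
  "tableIndexConfig": {
    "segmentPartitionConfig": {
      "columnPartitionMap": {
        "id": {
          "functionName": "Murmur",
          "numPartitions": 8
        }
      }
    }
  },
  "routing": {
    "segmentPrunerTypes": ["partition"]
  }
}
```

With this in place, a query filtering on `id` can be pruned to the matching partition's segments instead of touching all segments of the table.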
m
do you have query response metadata you can share?
s
"numServersQueried": 2, "numServersResponded": 2, "numSegmentsQueried": 65, "numSegmentsProcessed": 18, "numSegmentsMatched": 13, "numConsumingSegmentsQueried": 10, "numDocsScanned": 10603, "numEntriesScannedInFilter": 391632, "numEntriesScannedPostFilter": 137839, "numGroupsLimitReached": false, "totalDocs": 1220116, "timeUsedMs": 87, "offlineThreadCpuTimeNs": 0, "realtimeThreadCpuTimeNs": 0, "offlineSystemActivitiesCpuTimeNs": 0, "realtimeSystemActivitiesCpuTimeNs": 0, "offlineResponseSerializationCpuTimeNs": 0, "realtimeResponseSerializationCpuTimeNs": 0, "offlineTotalCpuTimeNs": 0, "realtimeTotalCpuTimeNs": 0, "segmentStatistics": [], "traceInfo": {}, "numRowsResultSet": 5000, "minConsumingFreshnessTimeMs": 1649763344790
m
OK, one thing I can see is that too much data is being scanned (`numEntriesScannedInFilter` is 391632), so you probably need to have some indexing set up.
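For context, Pinot only builds a dictionary-encoded forward index per column by default; inverted, sorted, range, and other indexes have to be declared in the table config. A minimal sketch for the filter columns mentioned above, assuming the key column is `id` and the timestamp column is called `event_time` (placeholder names for the real schema):

```json
{
  "tableIndexConfig": {
    "invertedIndexColumns": ["id"],
    "rangeIndexColumns": ["event_time"]
  }
}
```

For a realtime table, setting `sortedColumn` to the most selective filter column is another common option; the goal is to bring `numEntriesScannedInFilter` down, ideally close to zero once all filter columns are indexed.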
But 17s is too much. So the next questions are: a) what is the query, b) what is the CPU/memory for the servers, c) how many segments are there?
Wait
"timeUsedMs": 87,
this is the time Pinot used to compute the query
Where are you seeing 17s?
If it's on the client side, then my guess is that the response is big and your JSON deserialization is the bottleneck.
s
Yes, I do have JSON fields. Also, can you please tell me whether there is a way we can put all data related to one key in the same segment? In other words, can we create segments as per the Kafka topic key?
m
No, I am not talking about JSON fields. I am saying your query took 87ms, not 17s.
s
Yes Mayank, from the Pinot query console it is taking some ms, but from SQLAlchemy in our Python app it is taking 15 secs, even when we are not doing anything else, just querying the data.
m
That would be an issue on the SQLAlchemy side. My guess is it is spending time deserializing the response.
What is the Pinot client you are using?
s
pinotdb and sqlalchemy
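One quick way to confirm where the 15 seconds go is to time a bare pinotdb query against the same broker, without the SQLAlchemy layer. A sketch; the host, port, table name, and query below are placeholder assumptions:

```python
import time

from pinotdb import connect

# Assumed broker address and query; substitute the real ones.
conn = connect(host="localhost", port=8099, path="/query/sql", scheme="http")
curs = conn.cursor()

start = time.perf_counter()
curs.execute("SELECT * FROM mytable WHERE id = 123 LIMIT 5000")
rows = curs.fetchall()
elapsed_ms = (time.perf_counter() - start) * 1000

# If this is close to the ~87ms that Pinot reports in timeUsedMs, the extra
# time is in the ORM layer or the network path (e.g. a VPN); if it is already
# in the seconds, the bottleneck is response size / JSON deserialization.
print(f"rows={len(rows)} round trip={elapsed_ms:.0f}ms")
```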
Mayank, thank you for your support. It seems my local network got messed up with the VPN; I ran the same code on the server and it is giving results in ms.
👍 1