xtrntr
08/11/2021, 9:49 PM
$ time python3 queries.py
found 6812 userids
$ time python3 queries.py
found 6782 userids
$ time python3 queries.py
found 6895 userids
Mayank
having clause to check if the result becomes consistent?
xtrntr
08/11/2021, 9:52 PM
xtrntr
08/11/2021, 9:57 PM
having clause gives me deterministic results
xtrntr
08/11/2021, 9:57 PM
limit x of the query and do the filtering on the client side though
xtrntr
08/11/2021, 10:06 PM
xtrntr
08/11/2021, 10:06 PM
We can also push certain having clauses to be processed on the server side (instead of on the broker side after merging all the results for each group) to reduce the amount of data sent from server to broker.
is this the reason for nondeterminism?
Mayank
Mayank
xtrntr
08/11/2021, 10:10 PM
cell between 500 and 550 is reduced to a smaller range, i get deterministic results
xtrntr
08/11/2021, 10:10 PM
xtrntr
08/11/2021, 10:16 PM
Mayank
limit in your query
xtrntr
08/11/2021, 10:19 PM
xtrntr
08/11/2021, 10:19 PM
xtrntr
08/11/2021, 10:21 PM
limit is applied after group by + having
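A toy model of one way the limit could still matter here, offered as a sketch rather than a description of Pinot's actual execution path. It assumes (this is not confirmed in the thread) that each server keeps only an arbitrary subset of its groups before the broker merges partial results and applies having. The names `server_partial`, `broker_merge`, `run` and the parameter `trim` are hypothetical.

```python
import random
from collections import Counter


def server_partial(rows, trim, rng):
    """Group rows by user id on one server and keep at most `trim` groups.

    Assumption of this toy model: without an ORDER BY, the surviving
    groups are picked arbitrarily.
    """
    counts = Counter(rows)
    keys = list(counts)
    rng.shuffle(keys)
    return {k: counts[k] for k in keys[:trim]}


def broker_merge(partials, having_min, limit):
    """Merge per-server partial counts, apply HAVING, then apply LIMIT."""
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # adds counts for groups seen on several servers
    kept = [k for k, c in merged.items() if c >= having_min]
    return set(kept[:limit])


def run(trim, seed):
    """Three servers, each holding one row for each of users 0..99."""
    rng = random.Random(seed)
    servers = [list(range(100)) for _ in range(3)]
    partials = [server_partial(rows, trim, rng) for rows in servers]
    # HAVING count >= 3 is only satisfied if all three partial counts survive.
    return broker_merge(partials, having_min=3, limit=1000)


if __name__ == "__main__":
    for seed in (1, 2, 3):
        print("trim=50 :", len(run(50, seed)), "userids")
    for seed in (1, 2, 3):
        print("trim=100:", len(run(100, seed)), "userids")
```

In this model, a trim that is large enough to hold every group always returns all 100 users, while a smaller trim drops partial counts, so some users fall below the having threshold and the result depends on which groups happened to survive.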
xtrntr
08/11/2021, 10:21 PM
Mayank
numDocsScanned
Mayank
1039299 rows
xtrntr
08/11/2021, 10:23 PM
xtrntr
08/11/2021, 10:23 PM
# with the increased limit of 100k
Processed requestId=90,table=events_OFFLINE,segments(queried/processed/matched/consuming)=198/198/198/-1,schedulerWaitMs=0,reqDeserMs=1,totalExecMs=1331,resSerMs=2,totalTimeMs=1334,minConsumingFreshnessMs=-1,broker=Broker_172.21.0.4_8099,numDocsScanned=3116947,scanInFilter=624231605,scanPostFilter=3116947,sched=fcfs
# with the previous limit of 10k
Processed requestId=91,table=events_OFFLINE,segments(queried/processed/matched/consuming)=198/198/198/-1,schedulerWaitMs=0,reqDeserMs=0,totalExecMs=1159,resSerMs=1,totalTimeMs=1160,minConsumingFreshnessMs=-1,broker=Broker_172.21.0.4_8099,numDocsScanned=3116947,scanInFilter=624231605,scanPostFilter=3116947,sched=fcfs
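To compare the two log lines above field by field, a small parser can help. This is a minimal sketch that assumes the comma-separated `key=value` layout seen in these two lines; `parse_processed_line` and `diff_fields` are hypothetical helper names, not part of Pinot.

```python
import re

LINE_100K = "Processed requestId=90,table=events_OFFLINE,segments(queried/processed/matched/consuming)=198/198/198/-1,schedulerWaitMs=0,reqDeserMs=1,totalExecMs=1331,resSerMs=2,totalTimeMs=1334,minConsumingFreshnessMs=-1,broker=Broker_172.21.0.4_8099,numDocsScanned=3116947,scanInFilter=624231605,scanPostFilter=3116947,sched=fcfs"
LINE_10K = "Processed requestId=91,table=events_OFFLINE,segments(queried/processed/matched/consuming)=198/198/198/-1,schedulerWaitMs=0,reqDeserMs=0,totalExecMs=1159,resSerMs=1,totalTimeMs=1160,minConsumingFreshnessMs=-1,broker=Broker_172.21.0.4_8099,numDocsScanned=3116947,scanInFilter=624231605,scanPostFilter=3116947,sched=fcfs"


def parse_processed_line(line):
    """Parse a 'Processed requestId=...' log line into a dict of fields."""
    fields = {}
    # The segments field carries its labels in parentheses:
    # segments(queried/processed/matched/consuming)=198/198/198/-1
    match = re.search(r"segments\(([^)]*)\)=([^,]*)", line)
    if match:
        names = match.group(1).split("/")
        values = [int(v) for v in match.group(2).split("/")]
        fields["segments"] = dict(zip(names, values))
        line = line[:match.start()] + line[match.end():]
    for part in line.split(","):
        if "=" not in part:
            continue
        key, value = part.split("=", 1)
        # Drop the leading "Processed " from the first key.
        fields[key.split()[-1]] = value.strip()
    return fields


def diff_fields(a, b):
    """Return only the fields whose values differ between two parsed lines."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    changed = diff_fields(parse_processed_line(LINE_100K),
                          parse_processed_line(LINE_10K))
    for key, (with_100k, with_10k) in changed.items():
        print(f"{key}: limit 100k -> {with_100k}, limit 10k -> {with_10k}")
```

For these two lines only the request id and the timing fields differ. numDocsScanned, scanInFilter and scanPostFilter are identical, so at least in this pair of requests the larger limit did not change how many documents the server scanned.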