https://pinot.apache.org/ logo
y

Yash Agarwal

09/01/2020, 11:33 AM
Hey Team, I am getting intermittent exceptions in CombinePlanNode.
Copy code
Exception processing requestId 137
java.lang.RuntimeException: Caught exception while running CombinePlanNode.
	at org.apache.pinot.core.plan.CombinePlanNode.run(CombinePlanNode.java:149) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
	at org.apache.pinot.core.plan.InstanceResponsePlanNode.run(InstanceResponsePlanNode.java:33) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
	at org.apache.pinot.core.plan.GlobalPlanImplV0.execute(GlobalPlanImplV0.java:45) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
	at org.apache.pinot.core.query.executor.ServerQueryExecutorV1Impl.processQuery(ServerQueryExecutorV1Impl.java:221) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
	at org.apache.pinot.core.query.scheduler.QueryScheduler.processQueryAndSerialize(QueryScheduler.java:155) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
	at org.apache.pinot.core.query.scheduler.QueryScheduler.lambda$createQueryFutureTask$0(QueryScheduler.java:139) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_265]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_265]
	at shaded.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
	at shaded.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
	at shaded.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_265]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_265]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_265]
Caused by: java.util.concurrent.TimeoutException
	at java.util.concurrent.FutureTask.get(FutureTask.java:205) ~[?:1.8.0_265]
	at org.apache.pinot.core.plan.CombinePlanNode.run(CombinePlanNode.java:139) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
	... 13 more
Processed requestId=137,table=guestslslitm3years_OFFLINE,segments(queried/processed/matched/consuming)=1058/-1/-1/-1,schedulerWaitMs=0,reqDeserMs=4,totalExecMs=10659,resSerMs=0,totalTimeMs=10663,minConsumingFreshnessMs=-1,broker=Broker_10.59.100.47_8099,numDocsScanned=-1,scanInFilter=-1,scanPostFilter=-1,sched=fcfs
Slow query: request handler processing time: 10663, send response latency: 58, total time to handle request: 10721
Is there a reason why this is happening ? Is there a way we can override the timeout of 10s in CombineNodePlan
d

Dileep Reddy

09/01/2020, 2:05 PM
Copy code
select * from guestslslitm3years where mdse_dept_n = 'MUSIC' limit 10
this also gives same exception
k

Kishore G

09/01/2020, 4:30 PM
which version of Pinot
n

Neha Pawar

09/01/2020, 4:30 PM
can you share the whole stach trace for the select * query? this should not happen for select *
y

Yash Agarwal

09/01/2020, 4:34 PM
Version of Pinot.
Copy code
0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343
d

Dileep Reddy

09/01/2020, 4:34 PM
[
{ “errorCode”: 200, “message”: “QueryExecutionError\njava.lang.RuntimeException Caught exception while running CombinePlanNode.\n\tat org.apache.pinot.core.plan.CombinePlanNode.run(CombinePlanNode.java:149)\n\tat org.apache.pinot.core.plan.InstanceResponsePlanNode.run(InstanceResponsePlanNode.java:33)\n\tat org.apache.pinot.core.plan.GlobalPlanImplV0.execute(GlobalPlanImplV0.java:45)\n\tat org.apache.pinot.core.query.executor.ServerQueryExecutorV1Impl.processQuery(ServerQueryExecutorV1Impl.java:221)\n\tat org.apache.pinot.core.query.scheduler.QueryScheduler.processQueryAndSerialize(QueryScheduler.java:155)\n\tat org.apache.pinot.core.query.scheduler.QueryScheduler.lambda$createQueryFutureTask$0(QueryScheduler.java:139)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat shaded.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111)\n\tat shaded.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58)\n\tat shaded.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)” },
n

Neha Pawar

09/01/2020, 4:35 PM
mainly interested in the line similar to this
Copy code
Processed requestId=137,table=guestslslitm3years_OFFLINE,segments(queried/processed/matched/consuming)=1058/-1/-1/-1,schedulerWaitMs=0,reqDeserMs=4,totalExecMs=10659,resSerMs=0,totalTimeMs=10663,minConsumingFreshnessMs=-1,broker=Broker_10.59.100.47_8099,numDocsScanned=-1,scanInFilter=-1,scanPostFilter=-1,sched=fcfs
Slow query: request handler processing time: 10663, send response latency: 58, total time to handle request: 10721
y

Yash Agarwal

09/01/2020, 4:36 PM
Yup. getting those.
Copy code
requestId=137,table=guestslslitm3years_OFFLINE,timeMs=66279,docs=578323590/19911764930,entries=7355523910/2313294360,segments(queried/processed/matched/consuming/unavailable):10780/2798/2798/0/0,consumingFreshnessTimeMs=0,servers=10/10,groupLimitReached=false,exceptions=3,serverStats=(Server=SubmitDelayMs,ResponseDelayMs,ResponseSize,DeserializationTimeMs);10.59.100.32_O=1,12674,1548,0;10.59.100.33_O=1,11540,1548,0;10.59.100.34_O=1,62365,26063,0;10.59.100.35_O=2,61837,26419,0;10.59.100.36_O=2,66272,26305,1;10.59.100.37_O=2,42907,26356,0;10.59.100.38_O=2,49656,26284,1;10.59.100.39_O=3,10722,1548,0;10.59.100.30_O=3,52299,26296,0;10.59.100.31_O=3,57313,26225,1,query=select mdse_dept_n, mdse_clas_n, mdse_sbcl_n, sum(sales) from guestslslitm3years where acct_yr_i = 2019 and mdse_div_n='BEVERAGES' group by mdse_dept_n, mdse_clas_n, mdse_sbcl_n limit 1000
Copy code
requestId=242,table=guestslslitm3years_OFFLINE,timeMs=17366,docs=40/2837233880,entries=66048/2000,segments(queried/processed/matched/consuming/unavailable):10780/1088/4/0/0,consumingFreshnessTimeMs=0,servers=10/10,groupLimitReached=false,exceptions=9,serverStats=(Server=SubmitDelayMs,ResponseDelayMs,ResponseSize,DeserializationTimeMs);10.59.100.32_O=1,16841,1548,0;10.59.100.33_O=1,17363,5301,0;10.59.100.30_O=1,11020,1548,0;10.59.100.31_O=1,11263,1548,0;10.59.100.36_O=1,16569,1548,0;10.59.100.37_O=2,11188,1548,0;10.59.100.34_O=2,12133,1548,0;10.59.100.35_O=2,10993,1548,0;10.59.100.38_O=2,10862,1548,0;10.59.100.39_O=2,11119,1548,0,query=select * from guestslslitm3years where mdse_dept_n = 'MUSIC' limit 10
this the example of select *
FYI. this is scanning 28 Billion records.