# troubleshooting
j
```
druid-live-cluster-brokers-0 druid-live-cluster-brokers 2023-06-27T06:02:30,979 WARN [sql[a46fce00-0f0f-4b1c-b736-9498f1616173]] org.apache.druid.sql.http.SqlResource - Failed to handle query: SqlQuery{query='
druid-live-cluster-brokers-0 druid-live-cluster-brokers         SELECT
druid-live-cluster-brokers-0 druid-live-cluster-brokers               count(distinct card_number) as card_count
druid-live-cluster-brokers-0 druid-live-cluster-brokers         from payment
druid-live-cluster-brokers-0 druid-live-cluster-brokers         where  __time >= '2023-06-26T15:02:20.916134' and  __time < '2023-06-27T15:02:20.916134'
druid-live-cluster-brokers-0 druid-live-cluster-brokers         and payment_method_type='CARD'
druid-live-cluster-brokers-0 druid-live-cluster-brokers         ', resultFormat=object, header=false, typesHeader=false, sqlTypesHeader=false, context={sqlTimeZone=Asia/Seoul, sqlQueryId=a46fce00-0f0f-4b1c-b736-9498f1616173, queryId=a46fce00-0f0f-4b1c-b736-9498f1616173}, parameters=[]}
druid-live-cluster-brokers-0 druid-live-cluster-brokers org.apache.druid.query.QueryInterruptedException: Faulty channel in resource pool
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.apache.druid.client.JsonParserIterator.convertException(JsonParserIterator.java:273) ~[druid-server-25.0.0.jar:25.0.0]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.apache.druid.client.JsonParserIterator.init(JsonParserIterator.java:191) ~[druid-server-25.0.0.jar:25.0.0]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.apache.druid.client.JsonParserIterator.hasNext(JsonParserIterator.java:93) ~[druid-server-25.0.0.jar:25.0.0]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.apache.druid.java.util.common.guava.BaseSequence.toYielder(BaseSequence.java:70) ~[druid-core-25.0.0.jar:25.0.0]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.apache.druid.java.util.common.guava.MappedSequence.toYielder(MappedSequence.java:49) ~[druid-core-25.0.0.jar:25.0.0]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.apache.druid.java.util.common.guava.ParallelMergeCombiningSequence$ResultBatch.fromSequence(ParallelMergeCombiningSequence.java:879) ~[druid-core-25.0.0.jar:25.0.0]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.apache.druid.java.util.common.guava.ParallelMergeCombiningSequence$SequenceBatcher.block(ParallelMergeCombiningSequence.java:929) ~[druid-core-25.0.0.jar:25.0.0]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3118) ~[?:?]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.apache.druid.java.util.common.guava.ParallelMergeCombiningSequence$SequenceBatcher.getBatchYielder(ParallelMergeCombiningSequence.java:918) ~[druid-core-25.0.0.jar:25.0.0]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.apache.druid.java.util.common.guava.ParallelMergeCombiningSequence$YielderBatchedResultsCursor.initialize(ParallelMergeCombiningSequence.java:1025) ~[druid-core-25.0.0.jar:25.0.0]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.apache.druid.java.util.common.guava.ParallelMergeCombiningSequence$PrepareMergeCombineInputsAction.compute(ParallelMergeCombiningSequence.java:732) ~[druid-core-25.0.0.jar:25.0.0]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189) ~[?:?]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) ~[?:?]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) ~[?:?]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) ~[?:?]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) ~[?:?]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183) ~[?:?]
druid-live-cluster-brokers-0 druid-live-cluster-brokers Caused by: org.jboss.netty.channel.ChannelException: Faulty channel in resource pool
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.apache.druid.java.util.http.client.NettyHttpClient.go(NettyHttpClient.java:134) ~[druid-core-25.0.0.jar:25.0.0]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.apache.druid.client.DirectDruidClient.run(DirectDruidClient.java:456) ~[druid-server-25.0.0.jar:25.0.0]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.apache.druid.client.CachingClusteredClient$SpecificQueryRunnable.getSimpleServerResults(CachingClusteredClient.java:707) ~[druid-server-25.0.0.jar:25.0.0]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.apache.druid.client.CachingClusteredClient$SpecificQueryRunnable.lambda$addSequencesFromServer$8(CachingClusteredClient.java:669) ~[druid-server-25.0.0.jar:25.0.0]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at java.util.TreeMap.forEach(TreeMap.java:1002) ~[?:?]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.apache.druid.client.CachingClusteredClient$SpecificQueryRunnable.addSequencesFromServer(CachingClusteredClient.java:653) ~[druid-server-25.0.0.jar:25.0.0]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.apache.druid.client.CachingClusteredClient$SpecificQueryRunnable.lambda$run$1(CachingClusteredClient.java:378) ~[druid-server-25.0.0.jar:25.0.0]
....
....
....
druid-live-cluster-brokers-0 druid-live-cluster-brokers Caused by: org.jboss.netty.channel.ConnectTimeoutException: connection timed out: /historical-node-ip:8083
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.jboss.netty.channel.socket.nio.NioClientBoss.processConnectTimeout(NioClientBoss.java:139) ~[netty-3.10.6.Final.jar:?]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:83) ~[netty-3.10.6.Final.jar:?]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) ~[netty-3.10.6.Final.jar:?]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42) ~[netty-3.10.6.Final.jar:?]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) ~[netty-3.10.6.Final.jar:?]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) ~[netty-3.10.6.Final.jar:?]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
druid-live-cluster-brokers-0 druid-live-cluster-brokers 	at java.lang.Thread.run(Thread.java:829) ~[?:?]
```
m
Are your historical nodes up and healthy? Do you have logs from the historicals?
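For reference, a quick way to confirm that a historical is reachable and responsive from the broker host; the host and port below are taken from the stack trace, and `/status/health` plus the historical readiness endpoint are standard Druid status APIs:

```
# Basic liveness check against the historical's HTTP port
curl -sS --connect-timeout 5 http://historical-node-ip:8083/status/health

# Readiness: returns HTTP 200 only once the historical has loaded all
# segments it has been assigned
curl -sS -o /dev/null -w '%{http_code}\n' \
  --connect-timeout 5 http://historical-node-ip:8083/druid/historical/v1/readiness
```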
j
Yes, I got some debug logs from the historicals, but they were just related to query information.
```
druid-live-cluster-historicals-1 druid-live-cluster-historicals 2023-06-27T06:02:37,264 DEBUG [qtp1703272304-86[timeseries_[payment]_aaebba34-2c53-4b28-8095-2ae4f6f04436]] org.apache.druid.server.QueryResource - Got query [TimeseriesQuery{dataSource='payment', querySegmentSpec=MultipleSpecificSegmentSpec{descriptors=[SegmentDescriptor{interval=2023-06-26T06:02:37.217Z/2023-06-27T00:00:00.000Z, version='2023-06-26T00:00:00.196Z', partitionNumber=2}, SegmentDescriptor{interval=2023-06-26T06:02:37.217Z/2023-06-27T00:00:00.000Z, version='2023-06-26T00:00:00.196Z', partitionNumber=3}, SegmentDescriptor{interval=2023-06-26T06:02:37.217Z/2023-06-27T00:00:00.000Z, version='2023-06-26T00:00:00.196Z', partitionNumber=4},.......................
```
And my two historical nodes were healthy, I would say. That phenomenon happened for only around 15 minutes, and it's okay now.
s
A few thoughts... Were you running a lot of queries concurrently at the time? The exception
Caused by: org.jboss.netty.channel.ConnectTimeoutException: connection timed out: /historical-node-ip:8083
indicates that the broker timed out trying to connect to the historical. Perhaps the Jetty thread pool was exhausted on the historical, or the historical was otherwise saturated (e.g. garbage collection, CPU utilization). It could also be a network hiccup.
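If it keeps happening, the HTTP settings on both sides are worth reviewing as well. A rough sketch of the relevant `runtime.properties` entries (property names as in the Druid configuration reference; the values shown are only illustrative, not recommendations for this cluster):

```
# Historical: Jetty server threads available to serve broker connections
druid.server.http.numThreads=60

# Broker: connection pool to each data server, and how long the broker
# waits on an open connection before timing out the read
druid.broker.http.numConnections=20
druid.broker.http.readTimeout=PT15M
```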
j
Yeah, I agree. But there was nothing unusual in the monitoring, such as GC, CPU, memory, the Jetty thread pool, etc. So I should check the network situation; it will be harder to figure out the reason for those symptoms.
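A minimal connectivity check run from inside the broker pod could help narrow it down the next time it happens. The pod and container names here are taken from the log prefix, and this assumes `curl` is available in the image:

```
# Measure TCP connect time from the broker to the historical's HTTP port
kubectl exec druid-live-cluster-brokers-0 -c druid-live-cluster-brokers -- \
  curl -sS -o /dev/null -w 'connect: %{time_connect}s  http: %{http_code}\n' \
  --connect-timeout 5 http://historical-node-ip:8083/status/health
```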
s
Does this happen repeatedly?