# troubleshooting
  • m

    Mayank

    05/11/2020, 5:18 PM
    ^^ states broker side latency is 185ms
  • s

    Shounak Kulkarni

    05/11/2020, 5:18 PM
    query on prometheus: pinot_broker_queryExecution_75thPercentile
  • m

    Mayank

    05/11/2020, 5:18 PM
    what's your qps?
  • s

    Shounak Kulkarni

    05/11/2020, 5:19 PM
    2k qps
  • m

    Mayank

    05/11/2020, 5:19 PM
    Also is the broker log from the problematic query? Seems like broker for this query was 185ms
  • m

    Mayank

    05/11/2020, 5:19 PM
    Perhaps there's other queries that have issues?
  • s

    Shounak Kulkarni

    05/11/2020, 5:20 PM
    Yeah, as I mentioned above, the Prometheus numbers were different, hence we thought it was a broker issue
  • s

    Shounak Kulkarni

    05/11/2020, 5:20 PM
    server query time is same at prometheus and logs
  • m

    Mayank

    05/11/2020, 5:22 PM
    Broker time is always >= server time. So if you see issue on server side, there should be issue on broker side
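The relationship described above can be turned into a quick sanity check. This is a minimal sketch, assuming the broker waits for the slowest server before responding; the function name and the per-server latencies are hypothetical, and only the 185 ms broker figure comes from the thread.

```python
from typing import List


def broker_overhead_ms(broker_ms: float, server_ms: List[float]) -> float:
    """Return broker time not explained by the slowest server (illustrative only)."""
    if not server_ms:
        raise ValueError("need at least one server latency")
    slowest = max(server_ms)
    if broker_ms < slowest:
        # Broker time should be >= server time; a violation usually means the
        # two numbers were taken from different queries or different percentiles.
        raise ValueError("broker time is below server time; compare the same query")
    return broker_ms - slowest


# Hypothetical server latencies for a query whose broker log shows 185 ms.
print(broker_overhead_ms(185.0, [120.0, 150.0]))
```

If the check raises, the broker and server numbers are probably not from the same query, which matches the mismatch between the Prometheus percentile and the single logged query discussed here.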
  • s

    srisudha

    05/11/2020, 5:23 PM
    Right @Mayank. We overlooked that part. We realized the problem is at the server end.
  • s

    srisudha

    05/11/2020, 5:24 PM
    Server was actually responding well .. We will need to look at what is causing this on server..
  • d

    Dan Hill

    05/17/2020, 6:02 PM
    Are there good resources for running Pinot in production? Semi-related, while running Pinot locally, the broker randomly died and all of the data on it was lost. How do I handle this in production? In production, I'll have replicas. How do I start my servers back up but not serve from them until they are healthy (until data is populated again)?
  • m

    Mayank

    05/17/2020, 6:05 PM
    Those servers will be taken out of the external view, and the broker will stop routing queries to them. When a server restarts, its segments will start coming online, and the broker will be notified via an external view change and update its routing table accordingly.
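To illustrate the idea, here is a simplified sketch of deriving a routing table from an external view. It is not Pinot's actual routing code: the dictionary layout, the server instance names, and the choice of `ONLINE` and `CONSUMING` as the queryable states are assumptions made for the example. The segment name is taken from the server log later in this thread.

```python
from typing import Dict, List

# Assumed shape: segment name -> {server instance -> segment state}
ExternalView = Dict[str, Dict[str, str]]

# Assumption: only replicas in these states receive queries.
QUERYABLE_STATES = {"ONLINE", "CONSUMING"}


def build_routing_table(external_view: ExternalView) -> Dict[str, List[str]]:
    """Map each segment to the servers that can currently serve it."""
    routing: Dict[str, List[str]] = {}
    for segment, replicas in external_view.items():
        routing[segment] = sorted(
            server for server, state in replicas.items() if state in QUERYABLE_STATES
        )
    return routing


# Hypothetical server names; one replica is down, so only the other is routed to.
view = {"events__0__1__20200516T0028Z": {"Server_a": "OFFLINE", "Server_b": "ONLINE"}}
print(build_routing_table(view))

# After the restarted server reloads the segment, the view changes and the
# routing table is rebuilt to include it again.
view["events__0__1__20200516T0028Z"]["Server_a"] = "ONLINE"
print(build_routing_table(view))
```

The point of the sketch is that a restarted server gets no queries for a segment until that segment shows up as serving in the external view.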
  • d

    Dan Hill

    05/17/2020, 6:10 PM
    Locally, I waited 15 minutes and the data didn't show up. The broker came back. I tried populating the local data again and no data is showing.
  • k

    Kishore G

    05/17/2020, 6:10 PM
    There is no data on brokers - they are stateless
  • k

    Kishore G

    05/17/2020, 6:11 PM
    are you talking about brokers or servers
  • d

    Dan Hill

    05/17/2020, 6:13 PM
    My brokers restarted. I'm not sure how to get access to the older logs. The previous log gave me this:
    2020/05/17 17:27:06.694 INFO [HelixBrokerStarter] [main] Starting Pinot broker
    2020/05/17 17:27:06.980 INFO [HelixBrokerStarter] [main] Connecting spectator Helix manager
    2020/05/17 17:27:13.836 INFO [HelixBrokerStarter] [Thread-1] Shutting down Pinot broker
    2020/05/17 17:27:14.266 INFO [HelixBrokerStarter] [Thread-1] Disconnecting participant Helix manager
    Exception in thread "Thread-1" java.lang.NullPointerException
    	at org.apache.pinot.broker.broker.helix.HelixBrokerStarter.shutdown(HelixBrokerStarter.java:300)
    	at org.apache.pinot.tools.admin.command.StartBrokerCommand.cleanup(StartBrokerCommand.java:80)
    	at org.apache.pinot.tools.AbstractBaseCommand$1.run(AbstractBaseCommand.java:34)
  • d

    Dan Hill

    05/17/2020, 6:13 PM
    When I looked at pinot server, it looks like it's having issues talking to zookeeper locally.
  • d

    Dan Hill

    05/17/2020, 6:13 PM
    Zookeeper doesn't show any issues.
  • d

    Dan Hill

    05/17/2020, 6:13 PM
    I've been having random failures with docker-for-desktop.
  • d

    Dan Hill

    05/17/2020, 6:16 PM
    From my pinot-server-0. zookeeper logs are fine.
  • d

    Dan Hill

    05/17/2020, 6:16 PM
    2020/05/17 18:05:30.678 ERROR [ZKHelixManager] [ZkClient-EventThread-14-pinot-zookeeper:2181] fail to connect zkserver: pinot-zookeeper:2181 in 60000ms. expiredSessionId: 1000253abb00017, clusterName: pinot-quickstart
    2020/05/17 18:05:34.917 WARN [ClientCnxn] [main-SendThread(198.105.244.23:2181)] Client session timed out, have not heard from server in 15011ms for sessionid 0x0
    2020/05/17 18:05:51.034 WARN [ClientCnxn] [main-SendThread(198.105.254.23:2181)] Client session timed out, have not heard from server in 16016ms for sessionid 0x0
    2020/05/17 18:06:06.116 WARN [ClientCnxn] [main-SendThread(198.105.244.23:2181)] Client session timed out, have not heard from server in 15016ms for sessionid 0x0
    2020/05/17 18:06:22.234 WARN [ClientCnxn] [main-SendThread(198.105.254.23:2181)] Client session timed out, have not heard from server in 16016ms for sessionid 0x0
    2020/05/17 18:06:30.679 ERROR [ZKHelixManager] [ZkClient-EventThread-14-pinot-zookeeper:2181] fail to connect zkserver: pinot-zookeeper:2181 in 60000ms. expiredSessionId: 1000253abb00017, clusterName: pinot-quickstart
    2020/05/17 18:06:37.314 WARN [ClientCnxn] [main-SendThread(198.105.244.23:2181)] Client session timed out, have not heard from server in 15013ms for sessionid 0x0
    2020/05/17 18:06:41.735 WARN [ClientCnxn] [main-SendThread(198.105.254.23:2181)] Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
    java.net.ConnectException: Connection refused
    	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_252]
    	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:714) ~[?:1.8.0_252]
    	at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) ~[pinot-all-0.4.0-SNAPSHOT-jar-with-dependencies.jar:0.4.0-SNAPSHOT-d7808ac5acaf8a4b933cf7564cc334f09fa2404e]
    	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1144) [pinot-all-0.4.0-SNAPSHOT-jar-with-dependencies.jar:0.4.0-SNAPSHOT-d7808ac5acaf8a4b933cf7564cc334f09fa2404e]
    2020/05/17 18:06:56.957 WARN [ClientCnxn] [main-SendThread(198.105.244.23:2181)] Client session timed out, have not heard from server in 15015ms for sessionid 0x0
    2020/05/17 18:06:58.383 WARN [ClientCnxn] [main-SendThread(198.105.254.23:2181)] Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
    java.net.ConnectException: Connection refused
    	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_252]
    	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:714) ~[?:1.8.0_252]
    	at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) ~[pinot-all-0.4.0-SNAPSHOT-jar-with-dependencies.jar:0.4.0-SNAPSHOT-d7808ac5acaf8a4b933cf7564cc334f09fa2404e]
    	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1144) [pinot-all-0.4.0-SNAPSHOT-jar-with-dependencies.jar:0.4.0-SNAPSHOT-d7808ac5acaf8a4b933cf7564cc334f09fa2404e]
    2020/05/17 18:07:13.515 WARN [ClientCnxn] [main-SendThread(198.105.244.23:2181)] Client session timed out, have not heard from server in 15014ms for sessionid 0x0
    2020/05/17 18:07:29.607 WARN [ClientCnxn] [main-SendThread(198.105.254.23:2181)] Client session timed out, have not heard from server in 16022ms for sessionid 0x0
    2020/05/17 18:07:30.680 ERROR [ZKHelixManager] [ZkClient-EventThread-14-pinot-zookeeper:2181] fail to connect zkserver: pinot-zookeeper:2181 in 60000ms. expiredSessionId: 1000253abb00017, clusterName: pinot-quickstart
    2020/05/17 18:07:44.224 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'stream.kafka.hlc.zk.connect.string' was supplied but isn't a known config.
    2020/05/17 18:07:44.224 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'realtime.segment.flush.threshold.size' was supplied but isn't a known config.
    2020/05/17 18:07:44.224 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'stream.kafka.decoder.class.name' was supplied but isn't a known config.
    2020/05/17 18:07:44.224 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'streamType' was supplied but isn't a known config.
    2020/05/17 18:07:44.224 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'stream.kafka.consumer.type' was supplied but isn't a known config.
    2020/05/17 18:07:44.225 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'stream.kafka.zk.broker.url' was supplied but isn't a known config.
    2020/05/17 18:07:44.225 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'stream.kafka.broker.list' was supplied but isn't a known config.
    2020/05/17 18:07:44.225 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'realtime.segment.flush.threshold.time' was supplied but isn't a known config.
    2020/05/17 18:07:44.225 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'stream.kafka.consumer.prop.auto.offset.reset' was supplied but isn't a known config.
    2020/05/17 18:07:44.225 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'stream.kafka.consumer.factory.class.name' was supplied but isn't a known config.
    2020/05/17 18:07:44.225 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'stream.kafka.topic.name' was supplied but isn't a known config.
    2020/05/17 18:07:44.724 WARN [ClientCnxn] [main-SendThread(198.105.244.23:2181)] Client session timed out, have not heard from server in 15015ms for sessionid 0x0
    2020/05/17 18:08:00.806 WARN [ClientCnxn] [main-SendThread(198.105.254.23:2181)] Client session timed out, have not heard from server in 16014ms for sessionid 0x0
    2020/05/17 18:08:15.923 WARN [ClientCnxn] [main-SendThread(198.105.244.23:2181)] Client session timed out, have not heard from server in 15016ms for sessionid 0x0
    2020/05/17 18:08:30.681 ERROR [ZKHelixManager] [ZkClient-EventThread-14-pinot-zookeeper:2181] fail to connect zkserver: pinot-zookeeper:2181 in 60000ms. expiredSessionId: 1000253abb00017, clusterName: pinot-quickstart
    2020/05/17 18:08:32.008 WARN [ClientCnxn] [main-SendThread(198.105.254.23:2181)] Client session timed out, have not heard from server in 16018ms for sessionid 0x0
    2020/05/17 18:08:47.119 WARN [ClientCnxn] [main-SendThread(198.105.244.23:2181)] Client session timed out, have not heard from server in 15010ms for sessionid 0x0
    2020/05/17 18:09:03.202 WARN [ClientCnxn] [main-SendThread(198.105.254.23:2181)] Client session timed out, have not heard from server in 16016ms for sessionid 0x0
    2020/05/17 18:09:18.320 WARN [ClientCnxn] [main-SendThread(198.105.244.23:2181)] Client session timed out, have not heard from server in 15016ms for sessionid 0x0
    2020/05/17 18:09:30.682 ERROR [ZKHelixManager] [ZkClient-EventThread-14-pinot-zookeeper:2181] fail to connect zkserver: pinot-zookeeper:2181 in 60000ms. expiredSessionId: 1000253abb00017, clusterName: pinot-quickstart
    2020/05/17 18:09:34.400 WARN [ClientCnxn] [main-SendThread(198.105.254.23:2181)] Client session timed out, have not heard from server in 16013ms for sessionid 0x0
    2020/05/17 18:09:49.516 WARN [ClientCnxn] [main-SendThread(198.105.244.23:2181)] Client session timed out, have not heard from server in 15015ms for sessionid 0x0
    2020/05/17 18:10:05.598 WARN [ClientCnxn] [main-SendThread(198.105.254.23:2181)] Client session timed out, have not heard from server in 16015ms for sessionid 0x0
    2020/05/17 18:10:20.710 WARN [ClientCnxn] [main-SendThread(198.105.244.23:2181)] Client session timed out, have not heard from server in 15011ms for sessionid 0x0
    2020/05/17 18:10:30.683 ERROR [ZKHelixManager] [ZkClient-EventThread-14-pinot-zookeeper:2181] fail to connect zkserver: pinot-zookeeper:2181 in 60000ms. expiredSessionId: 1000253abb00017, clusterName: pinot-quickstart
    2020/05/17 18:10:36.793 WARN [ClientCnxn] [main-SendThread(198.105.254.23:2181)] Client session timed out, have not heard from server in 16016ms for sessionid 0x0
    2020/05/17 18:10:47.942 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'stream.kafka.hlc.zk.connect.string' was supplied but isn't a known config.
    2020/05/17 18:10:47.943 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'realtime.segment.flush.threshold.size' was supplied but isn't a known config.
    2020/05/17 18:10:47.943 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'stream.kafka.decoder.class.name' was supplied but isn't a known config.
    2020/05/17 18:10:47.943 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'streamType' was supplied but isn't a known config.
    2020/05/17 18:10:47.943 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'stream.kafka.consumer.type' was supplied but isn't a known config.
    2020/05/17 18:10:47.943 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'stream.kafka.zk.broker.url' was supplied but isn't a known config.
    2020/05/17 18:10:47.943 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'stream.kafka.broker.list' was supplied but isn't a known config.
    2020/05/17 18:10:47.943 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'realtime.segment.flush.threshold.time' was supplied but isn't a known config.
    2020/05/17 18:10:47.943 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'stream.kafka.consumer.prop.auto.offset.reset' was supplied but isn't a known config.
    2020/05/17 18:10:47.943 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'stream.kafka.consumer.factory.class.name' was supplied but isn't a known config.
    2020/05/17 18:10:47.944 WARN [ConsumerConfig] [events__0__1__20200516T0028Z] The configuration 'stream.kafka.topic.name' was supplied but isn't a known config.
    2020/05/17 18:10:51.911 WARN [ClientCnxn] [main-SendThread(198.105.244.23:2181)] Client session timed out, have not heard from server in 15016ms for sessionid 0x0
    2020/05/17 18:11:07.990 WARN [ClientCnxn] [main-SendThread(198.105.254.23:2181)] Client session timed out, have not heard from server in 16013ms for sessionid 0x0
    2020/05/17 18:11:23.107 WARN [ClientCnxn] [main-SendThread(198.105.244.23:2181)] Client session timed out, have not heard from server in 15016ms for sessionid 0x0
    2020/05/17 18:11:30.684 ERROR [ZKHelixManager] [ZkClient-EventThread-14-pinot-zookeeper:2181] fail to connect zkserver: pinot-zookeeper:2181 in 60000ms. expiredSessionId: 1000253abb00017, clusterName: pinot-quickstart
    2020/05/17 18:11:39.190 WARN [ClientCnxn] [main-SendThread(198.105.254.23:2181)] Client session timed out, have not heard from server in 16015ms for sessionid 0x0
    2020/05/17 18:11:54.308 WARN [ClientCnxn] [main-SendThread(198.105.244.23:2181)] Client session timed out, have not heard from server in 15013ms for sessionid 0x0
    2020/05/17 18:12:10.384 WARN [ClientCnxn] [main-SendThread(198.105.254.23:2181)] Client session timed out, have not heard from server in 16014ms for sessionid 0x0
    2020/05/17 18:12:25.500 WARN [ClientCnxn] [main-SendThread(198.105.244.23:2181)] Client session timed out, have not heard from server in 15015ms for sessionid 0x0
    2020/05/17 18:12:30.686 ERROR [ZKHelixManager] [ZkClient-EventThread-14-pinot-zookeeper:2181] fail to connect zkserver: pinot-zookeeper:2181 in 60000ms. expiredSessionId: 1000253abb00017, clusterName: pinot-quickstart
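The log above shows the server's ZooKeeper client trying 198.105.244.23 and 198.105.254.23 for `pinot-zookeeper:2181`, with session timeouts and refused connections. A reasonable first check in that situation is to see what the hostname resolves to and whether the port accepts a TCP connection. The helpers below are a hypothetical sketch using only the Python standard library; they are not part of Pinot or ZooKeeper.

```python
import socket
from typing import List


def resolve(host: str, port: int) -> List[str]:
    """Return the distinct IP addresses the hostname resolves to."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def can_connect(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Running `resolve("pinot-zookeeper", 2181)` and `can_connect("pinot-zookeeper", 2181)` from inside the server container would show whether the name points at the ZooKeeper container and whether the port is reachable. A successful TCP connection only confirms that something is listening; it does not prove ZooKeeper itself is healthy.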
  • k

    Kishore G

    05/17/2020, 6:19 PM
    is this on your desktop?
  • d

    Dan Hill

    05/17/2020, 6:20 PM
    Yea
  • d

    Dan Hill

    05/17/2020, 6:20 PM
    It was working about ~40 minutes ago
  • d

    Dan Hill

    05/17/2020, 6:20 PM
    I'm going to restart docker-for-desktop and see if it fixes it.
  • k

    Kishore G

    05/17/2020, 6:23 PM
    yeah, we have noticed this if the system gets overloaded
  • k

    Kishore G

    05/17/2020, 6:24 PM
    in a distributed setup this happens only when there is a network issue or a full GC
  • k

    Kishore G

    05/17/2020, 6:25 PM
    regarding data, only servers host data, brokers are stateless
  • d

    Dan Hill

    05/17/2020, 6:26 PM
    Gotcha. Cool