troywinter

05/22/2021, 4:01 AM
I got the following exception when trying to consume from Kafka using the high-level consumer. I checked that the configured stream.kafka.hlc.bootstrap.server, stream.kafka.hlc.zk.connect.string, and stream.kafka.zk.broker.url are correct, but it's not consuming.
2021/05/21 13:59:59.025 WARN [PinotRealtimeSegmentManager] [ZkClient-EventThread-104-mse-93c7add0-zk.mse.aliyuncs.com:2181] Caught exception while processing segment fetrace_biz2-pinot_0__0__1621605599020 for instance Server_pinot-server-2.pinot-server-headless.pinot.svc.cluster.local_8098, skipping.
java.lang.NullPointerException: null
        at org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager.assignRealtimeSegmentsToServerInstancesIfNecessary(PinotRealtimeSegmentManager.java:247) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274]
        at org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager.processPropertyStoreChange(PinotRealtimeSegmentManager.java:304) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274]
        at org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager.handleDataChange(PinotRealtimeSegmentManager.java:405) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274]
        at org.apache.helix.manager.zk.zookeeper.ZkClient$7.run(ZkClient.java:1039) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274]
        at org.apache.helix.manager.zk.zookeeper.ZkEventThread.run(ZkEventThread.java:69) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-255202ec4fc7df2283f7c275d8e9025a26cf3274]

Mayank

05/22/2021, 4:11 AM
Curious, why do you want to use the high-level consumer?

troywinter

05/22/2021, 4:13 AM
The low-level consumer creates one segment per partition, which causes many small segments for small data sources, so I think the high-level consumer would be a better fit?

Mayank

05/22/2021, 4:15 AM
HLC is not scalable; it has the reverse problem, where every node has to consume all the data.
For a small data source, why do you have so many partitions?

troywinter

05/22/2021, 4:18 AM
We use a uniform partition count for all topics, because we don't know for sure what the traffic will be before creating a topic.

Mayank

05/22/2021, 4:19 AM
Hmm. You can still use the low-level consumer and keep segments open for longer to avoid too many small segments.
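A rough sketch of what that could look like, assuming the table's streamConfigs accept the keys realtime.segment.flush.threshold.time and realtime.segment.flush.threshold.rows (older releases name the row threshold realtime.segment.flush.threshold.size, so check the docs for your version). The topic name, the 16-partition count, and the threshold values below are made up for illustration, not taken from this thread.

```python
import json


def llc_stream_configs(topic, flush_time="24h", flush_rows=5_000_000):
    """Build a streamConfigs fragment for a low-level Kafka consumer.

    All values are illustrative assumptions; tune them per table.
    """
    return {
        "streamType": "kafka",
        "stream.kafka.consumer.type": "lowlevel",
        "stream.kafka.topic.name": topic,
        # A consuming segment is completed when either threshold is hit,
        # so raising both keeps low-traffic segments open for longer.
        "realtime.segment.flush.threshold.time": flush_time,
        "realtime.segment.flush.threshold.rows": str(flush_rows),
    }


def segments_per_day(partitions, flush_hours):
    """Estimate completed segments per day for a low-traffic topic.

    Assumes one consuming segment per partition and that the time
    threshold, not the row threshold, is what triggers each flush.
    """
    return partitions * 24 / flush_hours


# Hypothetical topic name and partition count.
configs = llc_stream_configs("my_topic", flush_time="24h")
print(json.dumps(configs, indent=2))
print(segments_per_day(16, 6), "segments/day with a 6h flush")
print(segments_per_day(16, 24), "segments/day with a 24h flush")
```

Under these assumptions, going from a 6-hour to a 24-hour time threshold cuts the number of segments produced per day by a factor of four, at the cost of holding each consuming segment in memory longer.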

troywinter

05/22/2021, 4:20 AM
I see, thanks.