# troubleshooting
f
Hey everyone! I’m working on creating a realtime pinot table that is based on messages published on kafka topic. My goal is to see messages pushed to kafka in pinot table as fast as it is possible. Here’s my current table config:
"tableIndexConfig": {
    "loadMode": "MMAP",
    "streamConfigs": {
      "streamType": "kafka",
      "stream.kafka.consumer.type": "simple",
      "stream.kafka.topic.name": "bb8_logs",
      "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
      "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
      "stream.kafka.zk.broker.url": "zookeeper:2181/kafka",
      "stream.kafka.broker.list": "kafka:9092",
      "realtime.segment.flush.threshold.rows": "0",
      "realtime.segment.flush.threshold.time": "1h",
      "realtime.segment.flush.desired.size": "50M",
      "stream.kafka.consumer.prop.auto.offset.reset": "smallest"
    }
  }
Wanted to ask you what should be changed to see kafka topic messages in pinot table as fast as it is possible?
s
use "stream.kafka.consumer.type": "lowlevel",
and give more partitions to the kafka topic
pinot throughput should increase considerably
if your producer app is writing data to all partitions of the kafka topic from where pinot is consuming
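the suggestion above is a one-key change inside the streamConfigs from the table config posted earlier (only the changed key shown; everything else stays as-is):

```json
"streamConfigs": {
    "stream.kafka.consumer.type": "lowlevel"
}
```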
f
How can I increase a number of partitions per topic?
Probably there’s some config I’m missing
s
on kafka
u can use alter partitions command
bin/kafka-topics.sh --zookeeper zk_host:port/chroot --alter --topic my_topic_name 
   --partitions 40
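to check the partition count before and after the change, the same script has a describe flag (same zk_host placeholder as above; note that newer Kafka versions use --bootstrap-server instead of --zookeeper):

```shell
# show partition count, leaders, and replicas for the topic
bin/kafka-topics.sh --zookeeper zk_host:port/chroot --describe --topic my_topic_name
```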
f
But currently I can consume messages from the kafka topic right after they’re published (with kafkacat). I think the lag is on the Pinot table side.
s
yeah
f
So number of partitions will improve this?
(I thought that kafka setting does not influence how pinot consumes data)
s
yep hopefully
otherwise some other issue
f
yep, your change worked like a charm!
Many thanks😊
s
welcome
s
@Filip Gep increasing the number of partitions in kafka should be done carefully. Pinot has per-partition overheads, so it is not wise to just set the number of partitions to (say) 512. Ideally, you want to start with a small number of partitions and increase it
kafka does not allow decrease of partitions
@Filip Gep run the realtime prov tool. Try with different numbers of partitions, and see what the mem usage looks like. Then you can decide on the number of partitions. Data is available instantly after it is in kafka. If you are not seeing that, then increasing kafka partitions is not going to do you any good.
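the "realtime prov tool" here refers to Pinot's RealtimeProvisioningHelper in pinot-admin.sh. A rough sketch of invoking it, assuming placeholder paths and flag names that may differ between Pinot releases (check pinot-admin.sh -help for your version):

```shell
# sketch: estimate per-host memory usage for candidate partition counts.
# /path/to/table-config.json is a placeholder for the table config posted above,
# and the -numHosts / -numHours values are just example candidates to compare.
bin/pinot-admin.sh RealtimeProvisioningHelper \
  -tableConfigFile /path/to/table-config.json \
  -numPartitions 4 \
  -numHosts 2,4 \
  -numHours 6,12,24 \
  -sampleCompletedSegmentDir /path/to/a/completed/segment
```

the output is a matrix of memory/segment-size estimates per (hosts, hours) combination, which is what the advice above means by "see what the mem usage looks like".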