Is there any restriction on using an on demand kin...
# getting-started
r
Is there any restriction on using an on-demand Kinesis data stream for streaming ingestion? I'm asking because the partitions/shards in an on-demand Kinesis stream scale up and down based on load. Since the partitions keep changing, do we have to run a rebalance every now and then? How does Pinot handle this? Or should one use a provisioned Kinesis stream, where the number of shards is set at creation?
k
Pinot detects shard changes in Kinesis automatically and rebalances the consumers. This may not work for upsert, since we require strict partitioning. Having said that, I would not recommend running Kinesis on spot instances.
r
Okay, that makes sense. Also, it's not spot per se. We have been running it on demand to scale the cluster based on traffic, but it's always available. The provisioned type might result in underutilisation of the Kinesis data stream, so we avoided it. Also, there is no upsert. But this raises another question: even if additional servers are added in this scenario, do we need to rebalance?
k
No need to rebalance…
šŸ‘ 1
It detects shard changes periodically (every 20 minutes, configurable) and whenever a segment gets completed.
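To make the idea of periodic shard-change detection concrete, here is a rough sketch in Python. It is not Pinot's actual implementation: the function name `diff_shards`, the constant `CHECK_INTERVAL_SECONDS`, and the shard IDs are all hypothetical, and the 20-minute value is simply the figure quoted above. It only shows the comparison step, i.e. diffing the shard set seen on the previous check against the one seen now.

```python
# Illustrative sketch only; names here are hypothetical, not Pinot APIs.

# Interval quoted in the conversation above (20 minutes, said to be configurable).
CHECK_INTERVAL_SECONDS = 20 * 60


def diff_shards(known_shard_ids, current_shard_ids):
    """Compare the previously seen shard IDs with the current ones.

    Returns two sorted lists: shards that are new since the last check,
    and shards that are no longer listed.
    """
    known = set(known_shard_ids)
    current = set(current_shard_ids)
    added = sorted(current - known)
    removed = sorted(known - current)
    return added, removed


# Example with made-up shard IDs: one shard disappears, two new ones show up.
previous = {"shard-0", "shard-1"}
latest = ["shard-1", "shard-2", "shard-3"]
added, removed = diff_shards(previous, latest)
print("new shards:", added)
print("shards no longer listed:", removed)
```

A check like this would run once per interval and again whenever a segment completes; any new shard found would get a consumer assigned, which is why no manual rebalance should be needed for shard changes.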
r
Could you please point me to the documentation for this? I did check the Kinesis ingestion doc https://docs.pinot.apache.org/basics/data-import/pinot-stream-ingestion/amazon-kinesis but I don't see the parameter you just mentioned.