# troubleshooting
s
Hi everyone, Pinot is showing strange behavior after we added a second broker: it is skipping data 4 times out of 10. This happens even from the Pinot query console. The server logs are not showing any errors, and because we have 6 or 7 realtime tables, the logs fill up very quickly and we are not able to track a particular message in them. Is there any way we can define a groupId for the consumer in the table config?
n
What exactly happens when you say Pinot is skipping data? Are you getting results different from what you expected, or empty results?
s
There is no data in the Pinot table for some messages, even though the topic has those messages.
n
Are all servers ONLINE? And are all segments ONLINE in the table’s external view?
s
Some segments are in BAD state. Two days ago we added one more tenant and faced many issues after adding it. Our queries also stopped running correctly after that, sometimes giving results and sometimes not, so to save time we deleted the tenant and the broker that was added for it. Could that be causing this issue? Also, is it possible that segments go into BAD state if the topic was deleted and created again?
n
Firstly, if segments are in BAD state (indicating that the ideal state and the external view are mismatched, as seen in the ZK browser), then that can explain why you don't see some records in the Pinot table even though they are in the topic. If a segment is in OFFLINE/ERROR state, it will not be queried. Restarting the servers or resetting the segments should help. If there are any issues during the restart/reset, exceptions will be thrown in the server logs. Secondly, if the topic for an existing realtime Pinot table was deleted and then recreated, it is possible that the table goes into a bad state. If you delete the topic, all the offsets change. Pinot maintains its own offset checkpoints in ZooKeeper.
This sounds like a dev environment. Is it possible to start cleanly?
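A rough sketch of the ideal state vs. external view comparison described above, in case it helps with spotting which segments need a reset. It assumes you have already fetched both maps (for example from the ZK browser) and that each one has the shape `segment -> {server -> state}`. The function name, segment names, and server names are hypothetical and only for illustration; this is not Pinot's own implementation of the check.

```python
from typing import Dict, List, Tuple

# Assumed shape for both maps: segment name -> {server instance -> state}
StateMap = Dict[str, Dict[str, str]]


def find_mismatched_segments(
    ideal_state: StateMap, external_view: StateMap
) -> List[Tuple[str, str, str, str]]:
    """Return (segment, server, expected_state, actual_state) for every
    replica whose external view state differs from its ideal state."""
    mismatches = []
    for segment, replicas in sorted(ideal_state.items()):
        actual_replicas = external_view.get(segment, {})
        for server, expected in sorted(replicas.items()):
            # A replica absent from the external view is reported as MISSING.
            actual = actual_replicas.get(server, "MISSING")
            if actual != expected:
                mismatches.append((segment, server, expected, actual))
    return mismatches


# Hypothetical sample data, not taken from a real cluster.
ideal = {
    "events__0__0": {"Server_a": "ONLINE"},
    "events__0__1": {"Server_a": "CONSUMING"},
    "events__1__0": {"Server_b": "ONLINE"},
}
external = {
    "events__0__0": {"Server_a": "ERROR"},
    "events__1__0": {"Server_b": "ONLINE"},
}

for segment, server, expected, actual in find_mismatched_segments(ideal, external):
    print(f"{segment} on {server}: expected {expected}, got {actual}")
```

Any segment this reports would be a candidate for the reset or server restart mentioned above.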
s
Okay, we can start cleanly. I have deleted the BAD segments and the tables are in GOOD state now, though, so should I still start again, or will it be fine? Another thing: if this kind of scenario comes up in stage or PROD, is there a way to fix it with commands or APIs without starting again?
n
If your table indeed went bad because of the topic deletion (that's the top suspicion), then we are actively working on a config to help recover from this. I think having a truncate command would also have helped in your case. In general though, deleting and recreating a topic will produce unpredictable results, since all offsets change, and should be avoided for prod tables.