# getting-started
s
What happens if I create 2 tables (different schema/table def) to consume events from the same Kafka topic? Will both tables get all events, or will the events be consumed once per Pinot deployment, rather than once per table?
m
both tables will get all events
s
great, that is what I had expected to happen
So what I am seeing is that for some reason my table has stopped consuming the events. The other odd thing is that the table shows 0 for reported and estimated size (in the UI), even though there is definitely data.
I don't have any quota set (unless it can be set at the tenant level and then inherited by the tables?)
m
do you see any errors on the segments? Any messages in the server logs?
s
The segments report Status = 'Good' (green); if I click into a segment, some are 'Consuming' and some are 'Offline' (although the server itself is running)
We found some segmentStoppedConsuming errors in the log to investigate
m
I dunno why it would have stopped consuming, but you can try to kickstart it by calling the pause and then resume APIs. https://startree.ai/blog/apache-pinot-0-11-pausing-real-time-ingestion
oh ok
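For reference, a minimal sketch of calling the pause and resume APIs on the controller (the controller address and table name are assumptions, adjust for your deployment):

```python
# Sketch: pause and then resume real-time consumption via the Pinot controller REST API.
# Controller address and table name below are assumptions, not taken from this thread.
import requests

CONTROLLER = "http://localhost:9000"
TABLE = "myTable_REALTIME"  # hypothetical table name

# Pause consumption for the real-time table
resp = requests.post(f"{CONTROLLER}/tables/{TABLE}/pauseConsumption")
print(resp.status_code, resp.text)

# Once the pause has taken effect, resume consumption
resp = requests.post(f"{CONTROLLER}/tables/{TABLE}/resumeConsumption")
print(resp.status_code, resp.text)
```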
s
ok, that's handy to know about, I'll try that next time
s
When segments are OFFLINE it means the server had some difficulty consuming data. Could be a decoding error, could be other exceptions. These will be logged on the server side. Pinot will attempt to restart consumption every once in a while (probably that is why you are seeing GREEN).
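As a rough sketch (assuming a controller at localhost:9000; the response shape may vary across Pinot versions), you can pull the table's external view to see which replicas are CONSUMING, ONLINE, or OFFLINE:

```python
# Sketch: list per-segment replica states from the table's external view.
# Controller address and table name are assumptions for illustration.
import requests

CONTROLLER = "http://localhost:9000"
TABLE = "myTable"  # hypothetical table name

view = requests.get(f"{CONTROLLER}/tables/{TABLE}/externalview").json()
for segment, replicas in (view.get("REALTIME") or {}).items():
    # replicas maps server instance -> state, e.g. CONSUMING, ONLINE, OFFLINE
    print(segment, replicas)
```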
n
@Simon J you can also try the `debug` endpoint to see if there are any consumption failures.
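A minimal sketch of calling that debug endpoint on the controller (the controller address, table name, and verbosity parameter are assumptions; the exact path and response format may vary by Pinot version):

```python
# Sketch: fetch table debug info from the controller and print it to look for
# consumption errors. Controller address and table name are assumptions.
import requests

CONTROLLER = "http://localhost:9000"
TABLE = "myTable"  # hypothetical table name

debug_info = requests.get(f"{CONTROLLER}/debug/tables/{TABLE}", params={"verbosity": 1}).json()
print(debug_info)
```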
s
@Subbu Subramaniam Perhaps it should be orange if it is having some trouble? Green is a bit misleading, as it suggests there is nothing to delve into and investigate
@Navina Thanks, I hadn't used that end-point before, it has been very useful today!
s
@Simon J perhaps the color needs to be fixed, but can you first confirm that this is the case? I think you had mentioned that there were no logs in the servers, so I want to first find out what the problem is
s
@Subbu Subramaniam There were logs on the servers themselves; unfortunately I only have access to the logs in the ELK stack, as I do not run the Pinot installation. In this instance, unfortunately, Filebeat (the log tailer) had crashed, so it was not shipping the logs I needed to ELK. When someone looked on the box we learnt that the segment had stopped consuming. The debug endpoint (which I learnt about since then) has been helpful, as I can see that I have a NumberFormatException for a timestamp string that it is trying to parse into a Long. Strange, as I don't recall anyone changing the data type of a field, however at this stage it looks like that has happened. Is there a way to determine the Kafka offset that causes consumption to stop? That way I could easily find the event to see what the cause is.
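For what it's worth, a tiny sketch of the kind of check that would catch this on sample records, i.e. a timestamp arriving as an ISO string where an epoch long is expected (the field name and sample payloads are hypothetical):

```python
# Sketch: check whether the timestamp field of sample records parses as a long
# (epoch millis), which is what a LONG time column expects. Field name is hypothetical.
import json

samples = [
    '{"eventTime": "1700000000000", "value": 1}',
    '{"eventTime": "2023-11-14T22:13:20Z", "value": 2}',  # would trip a NumberFormatException
]

for raw in samples:
    record = json.loads(raw)
    try:
        int(record["eventTime"])
    except ValueError:
        print("bad timestamp:", record["eventTime"])
```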
s
Hmm... Not sure. I think the log message prints the Kafka offset that causes it to stop, but I could be wrong. Also, you mention that you don't have a way to look at logs. Is that permanent, or a temporary thing (because of the crash you mention)? If temporary, then you can wait until the next time the segment tries to consume, and look for the log there. Alternatively, you may need a Kafka consumer (at LinkedIn we have a command-line tool) that can help you here -- but this will work only if you have some reasonable number of messages. You could be searching forever if your Kafka partitions produce messages at 50k/sec 🙂
A better way may be to check the producer, if you have control over it, to see if there are some bugs there. Get a sample of the rows it outputs. In many cases, it may not be just one row that is the problem - it may be that many (or all) rows produced have a problem.
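If the offset (or a rough neighbourhood of it) is known, a sketch along these lines with kafka-python can scan a partition for records whose timestamp field does not parse as a long (topic, partition, offset, and field name are all assumptions; as noted above, this is only practical at modest message rates):

```python
# Sketch: seek to an offset in one partition and flag records whose timestamp
# field does not parse as a long. Topic, partition, offset and field name are
# assumptions for illustration.
import json
from kafka import KafkaConsumer, TopicPartition

consumer = KafkaConsumer(bootstrap_servers="localhost:9092", enable_auto_commit=False)
tp = TopicPartition("events-topic", 0)
consumer.assign([tp])

START_OFFSET = 123456  # a little before the suspected offset
consumer.seek(tp, START_OFFSET)

for msg in consumer:
    record = json.loads(msg.value)
    try:
        int(record["eventTime"])
    except (ValueError, KeyError):
        print("bad record at offset", msg.offset, ":", record)
    if msg.offset > START_OFFSET + 10_000:  # scan a bounded window, then stop
        break
```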
s
I hope that the Filebeat crash is not a recurring event... Will the segment auto-retry after stopping like this, or could I just force it to retry the same event by disabling/enabling the table? (we produce around 1k events/s)
we do control both producer and consumer so fixing whatever has caused the error should be easy enough once we can identify what the data issue is
s
It will keep retrying, and (afaik) if the offset it tries has expired in Kafka, then it will move forward to a more recent one.
If you think the error is a one-off, you can also try dropping and recreating the table. It will start consuming from the latest offset, of course.
s
Yes, I was wondering how you go about recovering from the situation where Pinot cannot process a particular record on the Kafka stream (maybe due to a bug that has caused it to be corrupt), and whether you can tell Pinot to move on and ignore that record.
s
As of now, we don't have any such mechanism.