# getting-started
m
I have a Pinot cluster running in Kubernetes under Docker Desktop. I managed to load a schema and table definition to ingest from Kafka. It loaded the first 100,000 records and stopped. The UI does not appear to have any way to display errors, logs, etc. I have looked at all the pod logs (controller, broker, minion, and 2 servers), but do not see anything obvious. I could use some assistance debugging this. There do not appear to be separate ingest "tasks" that I can identify (like for Druid).
k
please check the server logs.. most likely the segment generation failed.. 100k is the threshold at which real-time segments get flushed.. my guess is you have something wrong with the time column configuration, or the data contains bad values for the time column
m
java.lang.IllegalArgumentException: Invalid format: "2022-06-03T100720.200844Z" is malformed at "T100720.200844Z"
Good input. I did ultimately fix the format to use 'T' for the time separator. How do I reset the table, or do I need to delete and recreate? Does this mean that real-time tables fail on any row of bad data? Is there a way to skip bad records?
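[Editor's note: for reference, a minimal sketch of what the corrected time-column spec could look like, pushed through the controller's schema endpoint. The controller URL, schema name (scale), column names, and the exact Joda pattern are assumptions inferred from the raw value in the exception above, not the actual schema from this thread; the key point is quoting the literal 'T' (and 'Z') in the SIMPLE_DATE_FORMAT pattern, which may still need tweaking for your data.]

import requests

CONTROLLER = "http://localhost:9000"  # assumed controller address

# Hypothetical schema: only the time column matters here. Quoting the literal
# 'T' and 'Z' lets values like "2022-06-03T100720.200844Z" parse.
schema = {
    "schemaName": "scale",  # assumed; matches the scale_REALTIME table seen later in the thread
    "dimensionFieldSpecs": [{"name": "id", "dataType": "STRING"}],  # hypothetical dimension
    "dateTimeFieldSpecs": [{
        "name": "ts",  # hypothetical time column name
        "dataType": "STRING",
        "format": "1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd'T'HHmmss.SSSSSS'Z'",
        "granularity": "1:MILLISECONDS",
    }],
}

# PUT /schemas/{schemaName} updates an existing schema; POST /schemas creates a new one.
resp = requests.put(f"{CONTROLLER}/schemas/{schema['schemaName']}", json=schema)
resp.raise_for_status()
print(resp.json())

[As in the thread below, the consuming table itself still had to be deleted and recreated after fixing the format.]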
k
it does skip bad data, but if it fails to parse the time column and/or make sense of it, it fails.. a lot of functionality - retention, pruning, routing, etc. - depends on this metadata
m
I deleted the table and created a new one and do not see errors in the log, but no records are being consumed. I did publish a few records in case the offsets were retained. The following is the end of the log
Looks like I cannot paste the log here.
UI populates the table config with:
"nullHandlingEnabled": false, "streamConfigs": { "streamType": "kafka", "stream.kafka.topic.name": "", "stream.kafka.broker.list": "", "stream.kafka.consumer.type": "lowlevel", "stream.kafka.consumer.prop.auto.offset.reset": "smallest", "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory", "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder", "realtime.segment.flush.threshold.rows": "0", "realtime.segment.flush.threshold.time": "24h", "realtime.segment.flush.segment.size": "100M" }
Log has these messages:
The configuration 'realtime.segment.flush.threshold.rows' was supplied but isn't a known config.
The configuration 'stream.kafka.decoder.class.name' was supplied but isn't a known config.
The configuration 'streamType' was supplied but isn't a known config.
The configuration 'realtime.segment.flush.segment.size' was supplied but isn't a known config.
The configuration 'stream.kafka.consumer.type' was supplied but isn't a known config.
The configuration 'stream.kafka.broker.list' was supplied but isn't a known config.
The configuration 'realtime.segment.flush.threshold.time' was supplied but isn't a known config.
The configuration 'stream.kafka.consumer.prop.auto.offset.reset' was supplied but isn't a known config.
The configuration 'stream.kafka.consumer.factory.class.name' was supplied but isn't a known config.
The configuration 'stream.kafka.topic.name' was supplied but isn't a known config.
I wonder if the UI is out of date?
n
when you say no records are being consumed, do you mean you cannot see results in the query console?
are segment states in the UI GOOD? if not, can you try doing a “resetSegments” API from swagger, and then see if anything comes up in the logs?
m
Segment state is BAD in the UI. I see a reload under /segments/table but no resetSegments.
I found /segments/table/reset; it reports no segments.
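[Editor's note: for anyone following along, calling that reset endpoint outside of Swagger looks roughly like this; the controller address is an assumption, and the table name is taken from the error response below.]

import requests

CONTROLLER = "http://localhost:9000"  # assumed controller address

# POST /segments/{tableNameWithType}/reset disables and then re-enables the
# table's segments (the same operation exposed in the Swagger UI).
resp = requests.post(f"{CONTROLLER}/segments/scale_REALTIME/reset")
print(resp.status_code, resp.text)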
It is running now; fixed a typo.
{
  "code": 500,
  "error": "Failed to reset segments in table: scale_REALTIME. Timed out waiting for external view to stabilize after call to disable/reset segments. Disable/reset might complete in the background, but skipping enable of segments of table: scale_REALTIME"
}
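[Editor's note: the timeout above means the external view (what the servers report) did not converge with the ideal state (what Helix wants) in time. A hedged sketch for comparing the two via the controller API, with the host assumed:]

import requests

CONTROLLER = "http://localhost:9000"  # assumed controller address
TABLE = "scale_REALTIME"

# Ideal state = desired segment-to-server assignments; external view = what the
# servers actually report. Segments stuck in ERROR or OFFLINE show up here.
ideal = requests.get(f"{CONTROLLER}/tables/{TABLE}/idealstate").json()
external = requests.get(f"{CONTROLLER}/tables/{TABLE}/externalview").json()
print("ideal state:", ideal)
print("external view:", external)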
I rebooted the server pods and am back to the date-ingest issues. I am going to rebuild the topic to make sure the data is good.
I rebooted the server pods, fixed my time column spec, and have 800k records ingested. So I'm past the first attempt.
@Kishore G Thank you for your help.