I am new to Pinot I am trying to understand if Pinot query i Apache Pinot #general

ನಾಗೇಶ್ ಸುಬ್ರಹ್ಮಣ್ಯ

09/27/2021, 3:43 PM

I am new to Pinot. I am trying to understand whether a Pinot query in, say, a Java client (https://docs.pinot.apache.org/users/clients/java) can be made to work like the KStream example (https://kafka.apache.org/28/documentation/streams/tutorial). The KStream example does not need an explicit "loop" to look for new messages and apply transformations, whereas it is not clear to me whether the Pinot example will keep running until stopped. My scenario is as follows: for every new message that arrives, I want the batch of records in the window [current_timestamp - 1 minute, current_timestamp], where current_timestamp corresponds to the most recently arrived message. Can a Pinot query client be written to run as soon as a new message arrives? Thanks.

Kenny Bastani

09/27/2021, 7:46 PM

Sure. You can subscribe to the topic as a Kafka consumer in your Java application and query Pinot with the Pinot Java Client. You'll need to deal with a potential race condition: Pinot may not yet have ingested the record into the real-time table when your consumer sees it. To handle this, you can issue a query to check whether the record exists in Pinot yet and, if it does not, schedule an async task on a scheduled executor that retries at a periodic interval until the record is ingested. From there you can execute the query that aggregates over your window.
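The retry pattern described above could look roughly like the following. This is only a minimal sketch using the JDK, not Pinot's or Kafka's API: the existence check and the window query are passed in as plain functions (in a real application they would wrap calls to the Pinot Java Client, and `runWhenIngested` would be called from the Kafka consumer's poll loop). The class name, method names, the `events` table, and the `ts` column are all hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

final class WindowAfterIngest {

    // Builds an aggregation over [latestMillis - 1 minute, latestMillis].
    // Table and column names are placeholders, not from the thread.
    static String windowQuery(String table, String timeColumn, long latestMillis) {
        long from = latestMillis - TimeUnit.MINUTES.toMillis(1);
        return "SELECT COUNT(*) FROM " + table
                + " WHERE " + timeColumn + " BETWEEN " + from + " AND " + latestMillis;
    }

    // Checks whether the record is visible; if not, re-schedules itself after
    // retryMillis. Runs the window query once the check succeeds, or fails
    // the future after maxAttempts unsuccessful checks.
    static <T> CompletableFuture<T> runWhenIngested(
            ScheduledExecutorService scheduler,
            BooleanSupplier recordIngested,   // assumed to wrap a Pinot existence query
            Supplier<T> runWindowQuery,       // assumed to wrap the Pinot window query
            long retryMillis,
            int maxAttempts) {
        CompletableFuture<T> result = new CompletableFuture<>();
        AtomicInteger attempts = new AtomicInteger();
        Runnable poll = new Runnable() {
            @Override
            public void run() {
                try {
                    if (recordIngested.getAsBoolean()) {
                        result.complete(runWindowQuery.get());
                    } else if (attempts.incrementAndGet() >= maxAttempts) {
                        result.completeExceptionally(new IllegalStateException(
                                "record not visible after " + maxAttempts + " checks"));
                    } else {
                        scheduler.schedule(this, retryMillis, TimeUnit.MILLISECONDS);
                    }
                } catch (RuntimeException e) {
                    // Covers query failures and a scheduler that was shut down.
                    result.completeExceptionally(e);
                }
            }
        };
        scheduler.execute(poll);
        return result;
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger checks = new AtomicInteger();
        long latest = System.currentTimeMillis(); // stands in for the message timestamp
        try {
            // Simulate a record that becomes visible on the third check.
            String sql = runWhenIngested(
                    scheduler,
                    () -> checks.incrementAndGet() >= 3,
                    () -> windowQuery("events", "ts", latest),
                    50, 10).get(5, TimeUnit.SECONDS);
            System.out.println(sql);
        } finally {
            scheduler.shutdownNow();
        }
    }
}
```

The task re-schedules itself rather than using a fixed-rate schedule, so nothing needs to be cancelled once the record shows up. The attempt cap is an addition of this sketch; without it a record that is never ingested would be polled forever.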

Kenny Bastani

09/27/2021, 7:46 PM

Does that make sense?

ನಾಗೇಶ್ ಸುಬ್ರಹ್ಮಣ್ಯ

09/28/2021, 1:12 PM

Yes, that sounds similar to what I had thought. I guess Kafka Streams is "less work" for this. So, I will use Pinot for longer-term data. Thanks.

👍 1

Ashwin

10/01/2021, 9:55 PM

I have heard that for Java applications, Flink is better than Kafka. Is this true?
