# general
u
I am new to Pinot. I am trying to understand whether a Pinot query in, say, a Java client (https://docs.pinot.apache.org/users/clients/java) can be made to work like the KStream example (https://kafka.apache.org/28/documentation/streams/tutorial). The KStream example does not "loop" to look for new messages to apply transformations, whereas it is not clear to me whether the Pinot example will keep running until stopped. My scenario is as follows: for every new message that arrives, I want a batch of records between [current_timestamp - 1 minute, current_timestamp], where current_timestamp corresponds to the most recently arrived message. Can a Pinot query client be written to run as soon as a new message arrives? Thanks.
k
Sure. In your Java application, you can subscribe to the topic as a Kafka consumer and query Pinot with the Pinot Java Client whenever a message arrives. You'll need to handle a potential race condition: Pinot may not yet have ingested the record into the real-time table. To deal with that, issue a query to check whether the record exists in Pinot; if it does not, schedule a task on a scheduled executor that retries at a periodic interval until the record is ingested. You can then execute the query that aggregates over your window.
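Here is a rough sketch of the retry part in plain Java, in case it helps. It is not tied to the Pinot client: the existence check is passed in as a BooleanSupplier, which in practice would wrap a Pinot query for the record. The class name, method names, table and column names, and the assumption that timestamps are epoch milliseconds are all made up for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper: none of these names come from Pinot or Kafka.
class WindowQueryOnArrival {

    // Builds a query for the window [latest - 1 minute, latest].
    // Assumes the time column stores epoch milliseconds.
    static String windowQuery(String table, String timeColumn, long latestTimestampMillis) {
        long windowStart = latestTimestampMillis - 60_000L;
        return "SELECT * FROM " + table
                + " WHERE " + timeColumn
                + " BETWEEN " + windowStart + " AND " + latestTimestampMillis;
    }

    // Checks isIngested up to maxAttempts times, waiting intervalMillis between
    // attempts. The first check runs on the calling thread; retries run on the
    // scheduler. Completes with true once the check passes, or false if the
    // attempts are exhausted.
    static CompletableFuture<Boolean> awaitIngested(ScheduledExecutorService scheduler,
                                                    BooleanSupplier isIngested,
                                                    long intervalMillis,
                                                    int maxAttempts) {
        CompletableFuture<Boolean> result = new CompletableFuture<>();
        poll(scheduler, isIngested, intervalMillis, maxAttempts, result);
        return result;
    }

    private static void poll(ScheduledExecutorService scheduler,
                             BooleanSupplier isIngested,
                             long intervalMillis,
                             int attemptsLeft,
                             CompletableFuture<Boolean> result) {
        try {
            if (isIngested.getAsBoolean()) {
                result.complete(true);
                return;
            }
        } catch (RuntimeException e) {
            // Surface a failing check to the caller instead of retrying forever.
            result.completeExceptionally(e);
            return;
        }
        if (attemptsLeft <= 1) {
            result.complete(false);
            return;
        }
        scheduler.schedule(
                () -> poll(scheduler, isIngested, intervalMillis, attemptsLeft - 1, result),
                intervalMillis, TimeUnit.MILLISECONDS);
    }
}
```

In the Kafka consumer loop you would call awaitIngested for each new record and, once the future completes with true, run the string from windowQuery through the Pinot client. Capping the attempts keeps a record that never gets ingested from being polled indefinitely.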
Does that make sense?
u
Yes, that sounds similar to what I had in mind. I guess Kafka Streams is "less work" for this, so I will use Pinot for longer-term data. Thanks.
👍 1
a
I have heard that for Java applications, Flink is better than Kafka. Is this true?