Hi team, I am working on a project for realtime speed test calculation. I get the speed test data from devices with kafka ingestion. Once they are in Pinot the following calculations need to be performed:
- peak hour 7pm -11 pm data to be selected.
- data comes in different time stamp, the average speed needs to be calculated every hour between 7 pm to 11 pm everyday E.g: 7-8 pm average , 8-9 average, 9-10 average and 10 -11 pm average. ( the average data for every hour should be available as soon as the 1 hour windows is completed )
- 4 average data needs to be stored into another table where we would have 4 sample data points per day.
- from the second table past 14 days data need to be selected and 3rd worse speed should be reported and stored into another table.
Both these two tables would be my reports.
The question is if pinot is the suitable platform to do these sort of calculations ?
What would be the best way to run ETL jobs or tasks for run the query to do the calculations ?
I have already done this with InfluxDB, however I would like to design/implement this with Pinot.
Note that I also have other use cases with the same data where I need the data to be reported on realtime.
Thank you in advance for your help.