Hey, good morning. I read some articles about Pinot and think it could be a great tool for our real-time analytics platform. We currently use Snowflake and Redshift.
I tried it with a simple use case (63 million records with percentileest and avg aggregations) on a single EC2 instance, and the performance is amazing.
I want to pursue this further and have a few questions about building star tree indexes for the aggregations. Mainly, we want to make sure that building the star tree indexes takes much less time than the full cubing. Hope you can help me out:
1. For our percentile aggregation, we only care about the values at the 10th, 25th, 50th, 75th and 90th percentiles. Is there any way to do the aggregation only for those percentiles?
2. How do I know whether a star tree index has been built? On the UI “Reload Status” screen, I don’t see anything related to the star tree index.
3. Currently we do very intensive monthly cubing to support real-time analytics (percentile on 12 columns, avg on 12 columns, approx_count_distinct on 5 columns). At the end of each month, we batch-load about 70 million records. Is it possible to build the star tree index in a couple of hours? If so, what are the recommended ways to speed up the index build?
Some context for our table:
1. 40 dimension columns, 1 time column and 15 metric columns
2. A monthly feed of about 70 million records
3. We need monthly, quarterly and yearly analytics
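For a rough sense of scale, the numbers above can be put together in a quick back-of-the-envelope sketch. The variable names are made up for illustration, the row counts assume the monthly feed stays at roughly 70 million records, and "function-column pair" here just means one aggregation applied to one column:

```python
# Back-of-the-envelope sizing; all names here are illustrative only.
MONTHLY_ROWS = 70_000_000  # approximate monthly batch feed (assumed constant)

# Number of columns each aggregation is applied to, as listed above
aggregations = {"percentile": 12, "avg": 12, "approx_count_distinct": 5}

# One pre-aggregated value is needed per (function, column) pair
function_column_pairs = sum(aggregations.values())

# Rows that fall inside each reporting window
rows_per_window = {
    "monthly": MONTHLY_ROWS,
    "quarterly": MONTHLY_ROWS * 3,
    "yearly": MONTHLY_ROWS * 12,
}

print(f"function-column pairs: {function_column_pairs}")
for window, rows in rows_per_window.items():
    print(f"{window}: ~{rows:,} rows")
```

So the index would need to cover on the order of 29 aggregation/column combinations, with a yearly window spanning roughly 840 million rows.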
Thanks in advance