# general
d
Query regarding Apache Pinot: what's the typical OLAP cube size one can host in Apache Pinot? We have a cube which is almost 50 TB. It has some dimensions with very high cardinality, but since the raw data is more than 5 petabytes, 50-100 TB is still a reasonable aggregation. We want interactive performance from our OLAP layer, since it will power important dashboards and drill-downs. So we want to know how much data we can push into Apache Pinot.
m
With Pinot, you don't need to precompute all the cubes upfront (at write time). You can pre-aggregate your raw data, and the cubing will happen at read time based on the query.
If you are saying that your single cube (which you will drill down into further) is 50 TB: Pinot scales horizontally, so theoretically there isn't a limit on data size. However, depending on your use case, you may need to enable some optimizations (e.g. partitioning). If you can share a bit more about your use case, we can help figure out how to solve it using Pinot.
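To make the "cubing at read time" idea concrete, here is a minimal pure-Python sketch. It is not Pinot code and does not use any Pinot API; the rows, the column names (`country`, `device`, `day`, `clicks`) and the `rollup` helper are all hypothetical. It only shows that once raw data is pre-aggregated to one row per dimension combination, any coarser grouping can be computed when the query arrives instead of being stored as a separate cube.

```python
from collections import defaultdict

# Hypothetical pre-aggregated rows: one row per (country, device, day).
COLUMNS = ("country", "device", "day", "clicks")
ROWS = [
    ("US", "mobile", "2021-01-01", 120),
    ("US", "desktop", "2021-01-01", 80),
    ("IN", "mobile", "2021-01-01", 200),
    ("US", "mobile", "2021-01-02", 50),
]


def rollup(rows, group_by, metric="clicks", filters=None):
    """Sum `metric` per `group_by` key at query time, after applying filters."""
    filters = filters or {}
    idx = {name: i for i, name in enumerate(COLUMNS)}
    totals = defaultdict(int)
    for row in rows:
        # Skip rows that do not match every equality filter.
        if any(row[idx[col]] != value for col, value in filters.items()):
            continue
        key = tuple(row[idx[col]] for col in group_by)
        totals[key] += row[idx[metric]]
    return dict(totals)


# Two different "cubes" answered from the same stored rows.
by_country = rollup(ROWS, ["country"])
us_by_device = rollup(ROWS, ["device"], filters={"country": "US"})
```

A dashboard total and a drill-down are then just two calls with different `group_by` and `filters` arguments over the same data.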
d
I see. We already have the cube deployed, but it is persisted in S3 with Presto in front.
We expect it to grow to around 150 TB in a year or so, and then it should stabilise.
But we are not getting interactive performance, and for that we are looking at Apache Pinot.
m
Let's move to #troubleshooting
d
Mostly sub-second latency. Our current latency with Presto and S3 is around 10+ seconds given the volume
sure