# general
m
I am new and trying to see if Pinot is suitable for my use case. I have a requirement where I need different indexes for different teams to support different query patterns on the same table. I am planning to use an offline table. Is there a way I can generate the segments once and use them for tables with different indexing patterns?
k
You can upload the same segment to different tables and have a different indexing config for each.
But why not have one table and add all the indexes you need to that same table?
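To make the "same segment, different tables" idea concrete, here is a minimal sketch of two offline table configs that differ only in indexing and retention. The field names (`tableIndexConfig`, `invertedIndexColumns`, `rangeIndexColumns`, `bloomFilterColumns`, `retentionTimeValue`) are standard Pinot table-config keys, but the table names, column names, and the helper `offline_table_config` are hypothetical, and schema handling and segment upload are not shown.

```python
import json


def offline_table_config(table_name, index_config, retention_days):
    """Build a partial Pinot OFFLINE table config as a dict.

    Hypothetical helper: only a subset of the config fields is filled in.
    """
    return {
        "tableName": table_name,
        "tableType": "OFFLINE",
        "segmentsConfig": {
            "replication": "1",
            "retentionTimeUnit": "DAYS",
            "retentionTimeValue": str(retention_days),
        },
        "tenants": {"broker": "DefaultTenant", "server": "DefaultTenant"},
        # Per-team index choices are merged in here.
        "tableIndexConfig": {"loadMode": "MMAP", **index_config},
        "metadata": {},
    }


# Assumed column names, for illustration only.
team_a = offline_table_config(
    "events_team_a",
    {"invertedIndexColumns": ["country"]},
    retention_days=30,
)
team_b = offline_table_config(
    "events_team_b",
    {"rangeIndexColumns": ["eventTime"], "bloomFilterColumns": ["userId"]},
    retention_days=90,
)

print(json.dumps(team_a, indent=2))
print(json.dumps(team_b, indent=2))
```

Each team would own one such config, so retention and indexes can be changed per table without regenerating the shared segments. Whether a given index type can be built when the segment is loaded, rather than at segment generation time, is worth checking against the Pinot version in use.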
m
My knowledge of Pinot is limited, so I am validating. My problem is that different teams have different query patterns, data retention, and concurrency requirements. As a central data team, we want to publish the data for them to consume; at the same time, we want to let them self-serve by adding the right indexes, defining retention, etc. that make sense for their use case. We don’t want to build indexes or tune the tables for them. What is the best approach here?
Nothing has been implemented yet... we are still ideating, so any idea is welcome.😀
k
Okay. Got it. What is the source of the data: stream or batch?
m
It’s batch. Every 6 hours.
We are thinking of running Spark jobs to generate the segments and write them to S3. Each consuming Pinot cluster would read from S3.
The assumption here is that each cluster/tenant might have slightly different indexing on the table.
m
Depending on the data size and read load, you might choose to have a single table with indexes on the superset of columns, or have your clients maintain separate custom variations.
Both approaches have pros and cons. We can advise further if you can provide more details.
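For the single-table option, the "superset" is just the union of every team's index columns per index type. A rough sketch, assuming each team's needs are written as a dict keyed by Pinot index-config field; the helper name `superset_index_config` and the column names are hypothetical:

```python
def superset_index_config(*team_configs):
    """Union the index columns requested by each team, per index type."""
    merged = {}
    for cfg in team_configs:
        for index_type, columns in cfg.items():
            merged.setdefault(index_type, set()).update(columns)
    # Sort for a stable, diff-friendly config.
    return {index_type: sorted(cols) for index_type, cols in merged.items()}


combined = superset_index_config(
    {"invertedIndexColumns": ["country"]},
    {"invertedIndexColumns": ["device", "country"], "rangeIndexColumns": ["eventTime"]},
)
print(combined)
```

The trade-off is that one shared table carries every index (more storage and load time) and a single retention setting, while separate tables duplicate segment storage per cluster but keep each team's tuning independent.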