Hi All, I am new to Pinot and got some basic quest...
# general
m
Hi All, I am new to Pinot and have some basic questions. Compared with Elasticsearch, does Pinot create a similarly large index? Can I configure Pinot to talk to S3 or other data stores directly, without provisioning additional space for local indices? If a local index is mandatory, how big will it be? Thanks.
m
At the moment, a local index is needed. There's a PR in progress for using S3/deep store as primary storage and caching data locally as needed.
m
Thanks @Mayank, great to know that a PR is in progress. Meanwhile, is it possible to restrict the local index to, say, 12 hours of data and rotate it, falling back to S3 for other queries even though there will be added latency?
c
Pinot does support data retention. However, for query federation you'll have to do additional work today. One way to do it is to use Presto + Pinot for realtime data (0-12 hours) and Presto + S3 for everything else. Of course, the federation is not quite seamless (i.e. the same query cannot span both today).
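For the retention part, here is a minimal sketch of what a 12-hour setting might look like. It assumes the `retentionTimeUnit` and `retentionTimeValue` fields under `segmentsConfig` in the Pinot table config; the helper name `retention_config` is made up for illustration, and the rest of the table config is left out.

```python
import json


def retention_config(hours: int) -> dict:
    """Build only the retention fragment of a Pinot table config.

    Assumption: retention lives under segmentsConfig as
    retentionTimeUnit / retentionTimeValue. Everything else a real
    table config needs (table name, schema, replication, ...) is omitted.
    """
    if hours <= 0:
        raise ValueError("hours must be positive")
    return {
        "segmentsConfig": {
            "retentionTimeUnit": "HOURS",
            # Written as a string here, matching how table config
            # examples usually show this value.
            "retentionTimeValue": str(hours),
        }
    }


# Print the fragment so it can be merged into a full table config by hand.
print(json.dumps(retention_config(12), indent=2))
```

Retention only controls what Pinot keeps locally. It does not make older queries fall back to S3, which is why the Presto setup is still needed.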
m
@Chinmay Soman This looks like a possible solution. Do we have to expose two different URLs/UIs for this, or can we somehow use Superset to provide a single interface to the user?
c
With either Presto or Superset, you don't need 2 different URLs. However, internally they'll be treated as different catalogs or databases,
so the queries have to mention the catalog explicitly. I'm not sure if there's a good way to hide it.
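One way to keep the catalog choice out of the user's hands is a small routing helper in front of Presto. This is only a sketch: the catalog, schema, table, and column names (`pinot.default.events`, `hive.default.events`, `event_time_ms`) are hypothetical, and it assumes the 12-hour split discussed above.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical catalog-qualified table names; use whatever your Presto
# catalogs and schemas are actually called.
PINOT_TABLE = "pinot.default.events"
S3_TABLE = "hive.default.events"

# Assumed window of data that Pinot still holds locally.
HOT_WINDOW = timedelta(hours=12)


def pick_table(start: datetime, now: datetime) -> str:
    """Choose the table to query based on how far back the window starts."""
    return PINOT_TABLE if now - start <= HOT_WINDOW else S3_TABLE


def build_query(start: datetime, now: datetime) -> str:
    """Build a count query against the table picked for this time window.

    Simplification: a window that starts before the hot window goes
    entirely to S3, so it may miss rows not yet written there. Spanning
    both stores would need two queries merged by the caller.
    """
    table = pick_table(start, now)
    start_ms = int(start.timestamp() * 1000)  # epoch milliseconds
    return f"SELECT COUNT(*) FROM {table} WHERE event_time_ms >= {start_ms}"


now = datetime.now(timezone.utc)
print(build_query(now - timedelta(hours=1), now))  # served by Pinot
print(build_query(now - timedelta(days=3), now))   # served by S3 via Hive
```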
m
Thanks @Chinmay Soman, will start looking in this direction. I assume I can leverage Superset custom viz to do that. Not sure I can do the same with Presto.
c
With Presto, all you do is add 2 different catalogs: one for Pinot
and the other one for S3
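A rough sketch of what those two catalog files could contain, rendered from Python so the contents are easy to tweak. The host names and ports are placeholders. The Hive connector name differs between Presto distributions (`hive-hadoop2` as used in PrestoDB is assumed here), and S3 credential settings are left out.

```python
def render_properties(props: dict) -> str:
    """Render a dict as Java-style .properties text, one key=value per line."""
    return "".join(f"{key}={value}\n" for key, value in props.items())


# One catalog file per data source; each would go in Presto's catalog
# directory (commonly etc/catalog/). All hosts and ports are placeholders.
CATALOGS = {
    "pinot.properties": {
        "connector.name": "pinot",
        "pinot.controller-urls": "pinot-controller:9000",
    },
    "hive.properties": {
        # Assumption: PrestoDB-style connector name for Hive-backed S3 tables.
        "connector.name": "hive-hadoop2",
        "hive.metastore.uri": "thrift://hive-metastore:9083",
    },
}

for filename, props in CATALOGS.items():
    print(f"# {filename}")
    print(render_properties(props))
```

The file name, minus `.properties`, becomes the catalog name used in queries, e.g. `pinot.default.<table>` versus `hive.<schema>.<table>`.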
m
Thank you. I need Presto for the S3 integration anyway, so using it is definitely a good approach. Does Pinot have any Helm charts?
c
yes
m
Perfect! Seems I have everything I need to get started.
c
Awesome! Do let us know how it goes. This is not new btw; at Uber we ran Presto + Pinot & HDFS in a federated manner
🙌 1
that was insanely useful since data scientists could query both pinot and Hive data in the same query
m
Will do. Also, the chart seems to include Presto and Superset as well. That's good thinking from whoever designed it.
👍 1
c
I think it was @Xiang Fu 🙂
👍 1