# general
p
Hi, not sure if this is the right place; if not, please redirect me. I have a question about how Pinot uses deep store. IIUC, Pinot needs a segment to be loaded onto an offline (or real-time) server before it can serve queries against that segment. Does it pull the segment on demand from deep store if the segment is not already present, i.e. use deep store as another tier of storage? If not, are there any plans to include this feature in a future release?
k
Pinot servers download the segment from the deep store to local disk before serving queries.
k
Real-time and offline servers are logical designations that users define through tenants, which enforce isolation of segments. By default, segments for real-time tables are hosted alongside segments used for batch (offline tables with a job specification). This section of the docs should help: https://docs.pinot.apache.org/basics/components/tenant
k
@Pradeep we do plan to add support for loading segments lazily, either on demand or LRU-based
p
To clarify, I am trying to estimate the disk space needed to store the segments locally. If I have 6 months' worth of data, is it possible to keep 5 months of it only in deep store such as S3, and configure the servers to store only the latest 1 month of data locally?
@Kishore G thanks, that answers my question. Wondering where it is in the feature pipeline?
Is it near-term, in any of the upcoming releases?
k
got it. we have a rough design for tiered storage. we haven't seen anyone specifically ask for lazy loading yet
p
Thanks a lot, let me check it out. Yeah, having tiered storage would help storage and compute scale independently; at least for us it's an interesting option. Will keep an eye out for this.