# getting-started
g
Hi, I am trying to understand the size difference between generated segments in deep storage vs segments on local disk. After the ingestion job, the segments for this table in S3 are close to 700GB, but the table size reported in the Pinot UI is around 4TB (so 2TB for one data copy, as we have replication factor 2). I wonder if this is expected? If it is, what is the main reason for the size difference? Is the data compressed in S3 and uncompressed on local disk?
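(For context, a back-of-envelope check using the numbers in the question above; the ~3x ratio is just what these figures imply, not a general rule:)

```python
# Numbers taken from the question above.
deep_store_gb = 700           # compressed segments in S3
reported_tb = 4               # total table size shown in the Pinot UI
replication = 2               # replication factor

per_copy_gb = reported_tb * 1024 / replication  # size of one data copy, in GB
ratio = per_copy_gb / deep_store_gb             # expansion factor on local disk

print(f"{per_copy_gb:.0f} GB per copy, ~{ratio:.1f}x larger than deep store")
```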
n
Yes, that is one reason (compressed vs uncompressed). The other reason: the segment in the deep store usually doesn't have indexes. Those are built by the server and live only in the server's copy.
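(For context, which indexes the server builds when it loads a segment is driven by the `tableIndexConfig` section of the table config. A minimal sketch; the column names here are hypothetical:)

```json
{
  "tableIndexConfig": {
    "loadMode": "MMAP",
    "invertedIndexColumns": ["userId"],
    "rangeIndexColumns": ["timestampMs"]
  }
}
```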
g
I see, that makes sense. I'm also interested to know when the server builds the indexes / what triggers the server to build them. Are they built while the segment is pushed to the server at the end of the ingestion job?
n
Yes, you're right. When the segment is pushed to the server, before it is marked ONLINE in the external view.
n
@Neha Pawar how will it work for a realtime table? Because the data is already in the realtime server's memory and is then flushed to the deep store, right?