
RK

05/28/2021, 8:00 AM
These are the config file details. I also checked the classpath_prefix value, and it points to the correct jar locations. Kindly help.

Xiang Fu

05/28/2021, 8:02 AM
Have you pushed the segment to an offline table, or is it a realtime table?

RK

05/28/2021, 8:02 AM
It's a realtime table

Xiang Fu

05/28/2021, 8:03 AM
OK, so you can query the data now?

RK

05/28/2021, 8:03 AM
Yes I can query

Xiang Fu

05/28/2021, 8:03 AM
Are there any segments in ONLINE status, or are they all in CONSUMING status?

RK

05/28/2021, 8:04 AM
All in consuming

Xiang Fu

05/28/2021, 8:04 AM
Ah, then you need to wait for a while.
Those segments are held in memory while in CONSUMING status.
Once a segment reaches the flush threshold, Pinot will persist and upload it.
After that, the segment will be in ONLINE status.
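The flush behaviour described above can be sketched as a small model. This is only an illustration under assumptions, not Pinot's actual implementation: `ConsumingSegment`, `should_flush`, and `status` are hypothetical names, and a row threshold of 0 is treated here simply as "no row limit", which glosses over how Pinot sizes segments in that case.

```python
from dataclasses import dataclass


@dataclass
class ConsumingSegment:
    rows: int           # rows consumed so far
    age_seconds: float  # seconds since consumption started


def should_flush(seg: ConsumingSegment, max_rows: int, max_age_seconds: float) -> bool:
    """Simplified rule: flush once either the row or the time threshold is reached."""
    rows_hit = max_rows > 0 and seg.rows >= max_rows  # 0 means "no row limit" in this sketch
    time_hit = seg.age_seconds >= max_age_seconds
    return rows_hit or time_hit


def status(seg: ConsumingSegment, max_rows: int, max_age_seconds: float) -> str:
    """Status the segment would have after the threshold check."""
    return "ONLINE" if should_flush(seg, max_rows, max_age_seconds) else "CONSUMING"


# Illustrative numbers: a segment that has been consuming for half an hour.
half_hour_old = ConsumingSegment(rows=10_000, age_seconds=30 * 60)
print(status(half_hour_old, max_rows=0, max_age_seconds=24 * 3600))  # CONSUMING
print(status(half_hour_old, max_rows=0, max_age_seconds=5 * 60))     # ONLINE
```

With a long time threshold and low data volume, a segment can stay in CONSUMING for hours, which matches what is being observed here.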

RK

05/28/2021, 8:05 AM
OK, is there a time limit we can set? For the last half hour, all segments have been in the consuming state.

Xiang Fu

05/28/2021, 8:06 AM
In the table config, you can set the flush thresholds.
You can modify these:
"realtime.segment.flush.threshold.size": "0",
"realtime.segment.flush.threshold.time": "24h",
"realtime.segment.flush.desired.size": "50M",

RK

05/28/2021, 8:07 AM
This is the segment metadata I can see

Xiang Fu

05/28/2021, 8:07 AM
Set realtime.segment.flush.threshold.time to 5m, then it will persist every 5 minutes.
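As a rough sketch of that change, the snippet below edits the time threshold in a table config held as a Python dict. It assumes the stream settings live under `tableIndexConfig.streamConfigs`, which may differ between Pinot versions; the table name `myTable_REALTIME` and the helper `with_flush_time` are hypothetical.

```python
import copy
import json


def with_flush_time(table_config: dict, flush_time: str) -> dict:
    """Return a copy of the table config with a new time-based flush threshold."""
    updated = copy.deepcopy(table_config)  # leave the caller's config untouched
    # Assumption: stream settings sit under tableIndexConfig.streamConfigs.
    stream_configs = updated.setdefault("tableIndexConfig", {}).setdefault("streamConfigs", {})
    stream_configs["realtime.segment.flush.threshold.time"] = flush_time
    return updated


# Hypothetical table config containing only the keys discussed in this thread.
example_config = {
    "tableName": "myTable_REALTIME",
    "tableType": "REALTIME",
    "tableIndexConfig": {
        "streamConfigs": {
            "realtime.segment.flush.threshold.size": "0",
            "realtime.segment.flush.threshold.time": "24h",
            "realtime.segment.flush.desired.size": "50M",
        }
    },
}

test_config = with_flush_time(example_config, "5m")
print(json.dumps(test_config["tableIndexConfig"]["streamConfigs"], indent=2))
```

The same helper can restore the previous value (`"24h"` here) once the test is done.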

RK

05/28/2021, 8:08 AM
Ohh OK, got it. Thanks a lot @Xiang Fu. Let me update this.

Xiang Fu

05/28/2021, 8:08 AM
I think your data volume is not very high, so the consuming segments have to wait a long time to reach the threshold.
Also, we don’t recommend having too many segments in a table, so I suggest changing back to the default config once you have finished the test.

RK

05/28/2021, 8:09 AM
Oh ok I have only 4 segments
Ok

Xiang Fu

05/28/2021, 8:09 AM
I think you mean 4 Kafka partitions?

RK

05/28/2021, 8:09 AM
Yes

Xiang Fu

05/28/2021, 8:10 AM
Right, one segment every 5 minutes means Pinot will generate 12 * 24 * 4 = 1152 segments per day.
I suggest keeping the number of segments per table below 20k.
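The arithmetic above can be checked with a short sketch. It assumes the time threshold is the only flush trigger and that every partition completes one segment per interval; `segments_per_day` and `days_until_limit` are hypothetical helper names, and retention is ignored.

```python
def segments_per_day(flush_minutes: int, partitions: int) -> int:
    """Segments created per day if each partition flushes once per interval."""
    flushes_per_day = (24 * 60) // flush_minutes
    return flushes_per_day * partitions


def days_until_limit(flush_minutes: int, partitions: int, limit: int = 20_000) -> float:
    """Days until the table reaches `limit` segments, ignoring retention."""
    return limit / segments_per_day(flush_minutes, partitions)


print(segments_per_day(5, 4))            # 12 * 24 * 4 = 1152
print(segments_per_day(24 * 60, 4))      # 4 with a 24h threshold
print(round(days_until_limit(5, 4), 1))  # about 17.4 days to reach 20k
```

At a 5-minute flush interval, 4 partitions would reach the suggested 20k ceiling in under three weeks unless retention removes older segments.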

RK

05/28/2021, 8:13 AM
Ohh OK, so for realtime it is better to keep a high threshold value.
Ok @Xiang Fu

Xiang Fu

05/28/2021, 8:18 AM
Right, try to keep the segment size in the hundreds of MB.

RK

05/28/2021, 8:25 AM
Okay
@Xiang Fu Thanks a lot, deep storage works now.
@Xiang Fu What is the process to load these HDFS segment files into Pinot again, in case I want to recover lost segments? Do I need to follow the same steps as for a batch load?

Xiang Fu

05/30/2021, 1:24 AM
Do you mean the Pinot table is deleted?
You can try to create a Pinot batch table and upload those segments.
Run a batch ingestion job.

RK

05/30/2021, 4:31 AM
@Xiang Fu I mean, let's say I have 20 segments in a Pinot table, and out of those 20 segments 25 are also pushed to the HDFS location. Now I have deleted one of the segments from the Pinot table, i.e. the 5th segment, and I want to restore this 5th (lost) segment from the HDFS location.

Xiang Fu

05/30/2021, 6:53 AM
Hmm, if you explicitly delete the segment, then it will also be deleted from HDFS.

RK

05/30/2021, 8:03 AM
Ohh okay @Xiang Fu, then what is the use of storing those segments on HDFS? I mean, in which cases can we recover or use those stored segments?
What I am planning is a flow in which the Pinot segments stored on HDFS as a backup can be used again in Pinot if there is some issue with the Pinot table, e.g. a lost segment. So what is the process to recover a lost segment from HDFS?

Xiang Fu

05/30/2021, 8:26 AM
Hmm, in what circumstances would you find a segment lost? You can try to back up Pinot segments periodically from the deep store HDFS directory to another HDFS directory that won’t be deleted by Pinot.

RK

05/30/2021, 9:03 AM
Hmm, actually we are implementing this end-to-end pipeline as part of a PoC for one of our use cases. I have completed the pipeline up to deep storage. What I am a bit confused about is how I will use these stored segments from here. What is the use of these stored segments? And if I want to load these segments into the Pinot table again, what is the process to load them back into Pinot, @Xiang Fu?

Xiang Fu

05/30/2021, 9:08 AM
You can treat the deep store as part of the Pinot ecosystem, which holds all the currently serving segments. Pinot will delete segments from the deep store if a user issues a segment deletion request or retention kicks in. If you want to back up the Pinot table, you need to back up the segments yourself, e.g. copy the segments from the deep store to another HDFS directory.
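A periodic backup along those lines could be sketched as below. This is only one possible approach using the standard `hdfs dfs` shell commands; both paths are hypothetical placeholders, and `build_backup_commands` and `run_backup` are illustrative helper names.

```python
import subprocess


def build_backup_commands(deep_store_dir: str, backup_dir: str) -> list:
    """Build the HDFS shell commands that copy a table's segments to a backup dir."""
    return [
        # Create the backup directory (and any missing parents).
        ["hdfs", "dfs", "-mkdir", "-p", backup_dir],
        # Copy the deep store directory into the backup directory, overwriting old copies.
        ["hdfs", "dfs", "-cp", "-f", deep_store_dir, backup_dir],
    ]


def run_backup(deep_store_dir: str, backup_dir: str, dry_run: bool = True) -> None:
    """Print the commands when dry_run is True, otherwise execute them."""
    for cmd in build_backup_commands(deep_store_dir, backup_dir):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # raises if a command fails


# Hypothetical paths: replace with the real deep store and backup locations.
run_backup("/pinot/deepstore/myTable", "/backups/pinot/2021-05-30")
```

Because the backup directory is outside the deep store, Pinot's segment deletion and retention should not touch it. Scheduling the script (for example from cron) is left out of the sketch.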

RK

05/31/2021, 5:40 AM
Okay @Xiang Fu, got it. Thanks a lot.