
Mayank

05/14/2021, 4:54 PM
@Jack Do we have a doc to describe the preprocessing for partition/sort before ingestion? If so could you share? If not, could we add the doc? cc: @Syed Akram
👌 1

Jack

05/14/2021, 5:05 PM
Hey @Syed Akram, yes, we do have a design doc on the preprocessing job, but it's still in a LinkedIn internal directory. Let me put it on the wiki page. In the meantime, you can refer to this file to see how it's used: https://github.com/apache/incubator-pinot/blob/f2e3446e75f1ec1d553805d03f6504f05b3[…]/org/apache/pinot/hadoop/job/HadoopSegmentPreprocessingJob.java

Mayank

05/14/2021, 5:24 PM
@Jack If we can add it to docs.pinot.apache.org, that would be great.

Jack

05/14/2021, 5:25 PM
Yeah, that's where I'm going to add it.

Mayank

05/14/2021, 5:25 PM
thanks

Syed Akram

06/04/2021, 6:35 AM
The one above is for raw Avro input data, but my input data is in ORC.