Hi Team, we need to do some preprocessing of the records before running the ingestion job, which creates Pinot segments outside Pinot and pushes them to the Pinot data store. Is there a way in Pinot to run a Spark job (preprocessing) before running the actual ingestion job for Pinot segments?
09/14/2021, 12:43 PM
What’s the preprocessing you need to do? Currently the Spark ingestion job in Pinot does not support custom preprocessing.
09/14/2021, 12:48 PM
We already have one preprocessing Spark job. Its output goes to HDFS.
We have an ingestion job that reads from HDFS, creates segments, and pushes them to the Pinot data store.
We want to create a pipeline that processes the data and then creates the segments and pushes them to Pinot.
09/14/2021, 2:06 PM
If you already have the input for segment creation, you can just use the ingestion job to create the segments and push them to Pinot, right?
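Since the Spark ingestion job has no preprocessing hook, the chaining would have to live outside Pinot. Here is a minimal sketch of a driver that runs the preprocessing job first and starts the ingestion job only if it succeeds. It assumes both jobs are launched with `spark-submit`; the jar paths, the preprocessing class name, the HDFS path, and the job spec file name are all hypothetical, and a real submission would likely need more flags (master, deploy mode, plugin classpath). The job spec is assumed to already point at the HDFS output directory.

```python
import subprocess
from typing import Callable, List, Sequence

# Pinot's ingestion job launcher class; every path and name passed in below is hypothetical.
PINOT_LAUNCHER = "org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand"


def preprocess_cmd(app_jar: str, main_class: str, output_dir: str) -> List[str]:
    """Build the spark-submit command for the existing preprocessing job."""
    # Assumption: the preprocessing job takes its HDFS output dir as its only argument.
    return ["spark-submit", "--class", main_class, app_jar, output_dir]


def ingestion_cmd(pinot_jar: str, job_spec: str) -> List[str]:
    """Build the spark-submit command for the Pinot segment creation and push job."""
    return ["spark-submit", "--class", PINOT_LAUNCHER, pinot_jar, "-jobSpecFile", job_spec]


def run_pipeline(
    steps: Sequence[List[str]],
    runner: Callable[[List[str]], object] = subprocess.check_call,
) -> None:
    """Run each step in order; check_call raises on a non-zero exit, so later steps are skipped."""
    for cmd in steps:
        runner(cmd)


if __name__ == "__main__":
    run_pipeline([
        preprocess_cmd("preprocess.jar", "com.example.Preprocess", "hdfs:///data/preprocessed"),
        ingestion_cmd("pinot-all-jar-with-dependencies.jar", "sparkIngestionJobSpec.yaml"),
    ])
```

The same ordering can be expressed in any scheduler that supports task dependencies; the script only illustrates that ingestion is gated on the preprocessing job finishing cleanly.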