# general
j
Hello! What is the recommended (prod) way of ingesting batch data without Hadoop? I'm thinking about having a Python component generate Parquet files, copy them to the deep store, and trigger an ingestion: something like the `/ingestFromFile` API endpoint, but prod-compatible (where would segment creation happen in that case? Minion?). Thanks!
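For context, here's roughly the kind of call I mean, as a minimal sketch (assumptions: a local controller on :9000, a table named `myTable_OFFLINE`, and the `requests` library; this endpoint is aimed at small test files rather than prod volumes, which is why I'm asking):

```python
# Minimal sketch of pushing a local Parquet file to the controller's
# /ingestFromFile endpoint. Controller address, table name, and batch
# config values are placeholders for illustration.
import json
import requests

CONTROLLER = "http://localhost:9000"  # assumed controller address

def ingest_from_file(path: str, table: str) -> None:
    params = {
        "tableNameWithType": table,  # e.g. "myTable_OFFLINE"
        # Tell the record reader how to parse the uploaded payload.
        "batchConfigMapStr": json.dumps({"inputFormat": "parquet"}),
    }
    with open(path, "rb") as f:
        resp = requests.post(
            f"{CONTROLLER}/ingestFromFile",
            params=params,
            files={"file": f},
        )
    resp.raise_for_status()
    print(resp.text)

ingest_from_file("batch.parquet", "myTable_OFFLINE")
```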
k
I’m guessing Minion would work well for that use case, but I haven’t tried it. We just run shell scripts to trigger the segment generation job, which uses HDFS for the input (CSV) and output (segment) directories. Then a script executes a “URI push” job, which uses hdfs:// URIs so segments are loaded more efficiently than with a tar push. Note that you need controller and server config files that register HDFS as a valid file system for those URIs.
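Roughly what our scripts do, sketched here as a Python wrapper (all hosts, paths, and the table name below are placeholders; the YAML follows Pinot's batch ingestion job spec, and `SegmentCreationAndUriPush` just combines the generation and URI-push steps into a single job run):

```python
# Hedged sketch of the shell-driven flow: write a batch ingestion job
# spec, then launch it with pinot-admin.sh. Standalone execution is
# assumed; HDFS and controller addresses are placeholders.
import pathlib
import subprocess
import textwrap

JOB_SPEC = textwrap.dedent("""\
    executionFrameworkSpec:
      name: 'standalone'
      segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
      segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
    # Create segments from the input dir, then push them by URI.
    jobType: SegmentCreationAndUriPush
    inputDirURI: 'hdfs://namenode:8020/data/input/'       # placeholder
    includeFileNamePattern: 'glob:**/*.csv'
    outputDirURI: 'hdfs://namenode:8020/pinot/segments/'  # placeholder
    pinotFSSpecs:
      - scheme: hdfs
        className: org.apache.pinot.plugin.filesystem.HadoopPinotFS
    recordReaderSpec:
      dataFormat: 'csv'
      className: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReader'
    tableSpec:
      tableName: 'myTable'                                # placeholder
    pinotClusterSpecs:
      - controllerURI: 'http://controller:9000'           # placeholder
    """)

spec_path = pathlib.Path("job-spec.yaml")
spec_path.write_text(JOB_SPEC)

# Equivalent of our shell script: launch the ingestion job.
subprocess.run(
    ["bin/pinot-admin.sh", "LaunchDataIngestionJob",
     "-jobSpecFile", str(spec_path)],
    check=True,
)
```

And for the hdfs:// URIs to resolve at push time, the controller/server configs need HDFS registered as a PinotFS, with something like `pinot.controller.storage.factory.class.hdfs=org.apache.pinot.plugin.filesystem.HadoopPinotFS`.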
j
I see, thanks for the feedback @User! I'll have a look at using the Minion; otherwise we'll use a shell script as you did 🙂
m
We are working on a solution where Minion can do the ingestion, but it's not ready yet. cc: @User