#general
udk

09/15/2020, 4:24 PM
Hi, I am trying to ingest data into Pinot. The CSV file is about 30 GB. The job has been running for about 5 hours and has not completed yet. Could someone let me know where I can find the logs for this process? I have the Pinot cluster running in Docker containers, with a setup similar to the one described here: https://docs.pinot.apache.org/basics/getting-started/advanced-pinot-setup
Kenny Bastani

09/15/2020, 4:30 PM
I don’t think it should take that long. @Xiang Fu should be able to advise here. As far as the log output goes, you should be able to see the results of:
bin/pinot-admin.sh LaunchDataIngestionJob -jobSpecFile /tmp/pinot-quick-start/batch-job-spec.yml
Mayank

09/15/2020, 4:31 PM
@udk Are you using pinot-admin.sh to create the segments? If so, are you using a single 30 GB input file? You might be better off breaking it into multiple smaller CSV files (to get parallelism).
👍 1
udk

09/15/2020, 5:06 PM
Yes, I am using pinot-admin.sh. It is a single input file. The console does not show much information about the job itself.
@Mayank, thanks for your help. I was able to create the segments by splitting the file.
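For anyone hitting the same issue, here is a minimal sketch of the splitting step discussed above. It is not part of Pinot; `split_csv`, the `part-NNNNN.csv` naming, and the default `rows_per_file` value are hypothetical choices for illustration. It assumes the CSV has a single header row, which is repeated in every part so that each file can be read on its own.

```python
import csv
from pathlib import Path


def split_csv(src, out_dir, rows_per_file=1_000_000):
    """Split src into numbered CSV parts, repeating the header row in each part.

    Hypothetical helper for illustration; returns the list of part paths.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    with open(src, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header is None:
            # Empty input: nothing to split.
            return parts
        out = None
        writer = None
        for i, row in enumerate(reader):
            # Start a new part every rows_per_file data rows.
            if i % rows_per_file == 0:
                if out is not None:
                    out.close()
                path = out_dir / f"part-{len(parts):05d}.csv"
                out = open(path, "w", newline="")
                writer = csv.writer(out)
                writer.writerow(header)
                parts.append(path)
            writer.writerow(row)
        if out is not None:
            out.close()
    return parts
```

The resulting directory can then be used as the input location in the batch job spec (for example via `inputDirURI` with an `includeFileNamePattern` that matches the part files), so the ingestion job builds one segment per file. The right `rows_per_file` depends on your data, so treat the default here as a placeholder.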
Mayank

09/15/2020, 11:49 PM
Thanks @udk for confirming. @Neha Pawar perhaps we can improve the docs to indicate the same?
udk

09/15/2020, 11:51 PM
It would be nice if the software did that. Created bug PINOT-11.
Mayank

09/15/2020, 11:53 PM
I think we have that feature in the pipeline as well.
udk

09/15/2020, 11:59 PM
👍
Neha Pawar

09/16/2020, 4:29 PM
Will update the doc to reflect this.