# troubleshooting
p
Hello, I am looking for general guidance here. We are loading data for an offline table from AWS S3 using a job spec for a Spark job. Each segment is ~400 MB on disk. I am noticing that the servers run into OOM while transitioning segment state after downloading the segment to the server disk. We are using 15 servers, each with 4 CPUs and 32 GB RAM, with 16 GB allocated to heap, and we are also using off-heap memory. The servers have 2 TB of disk each, i.e. 30 TB of disk space in total, and we are loading a total of 2 TB of data. We have also configured inverted indexes on some fields in the data.
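For reference, here is a rough back-of-the-envelope sketch of what those numbers imply per server. It only restates the figures above; the decimal units, the replication factor of 1, and the even spread of segments across servers are assumptions and not something confirmed for this cluster. The function name `rough_sizing` is made up for illustration.

```python
# Rough sizing from the figures in the message above.
# Assumptions (NOT from the thread): decimal units, replication = 1,
# and segments distributed evenly across all servers.

def rough_sizing(total_data_mb, segment_size_mb, num_servers,
                 ram_gb, heap_gb, replication=1):
    """Return approximate per-server load derived from the cluster figures."""
    total_segments = total_data_mb // segment_size_mb
    segments_per_server = total_segments * replication / num_servers
    data_per_server_gb = total_data_mb * replication / num_servers / 1000
    # Memory left outside the JVM heap for off-heap buffers and the OS page cache.
    non_heap_gb = ram_gb - heap_gb
    return {
        "total_segments": total_segments,
        "segments_per_server": segments_per_server,
        "data_per_server_gb": data_per_server_gb,
        "non_heap_gb": non_heap_gb,
    }

result = rough_sizing(
    total_data_mb=2_000_000,  # 2 TB of data in total
    segment_size_mb=400,      # ~400 MB per segment on disk
    num_servers=15,
    ram_gb=32,
    heap_gb=16,
)

for key, value in result.items():
    print(f"{key}: {value:.1f}")
```

Under these assumptions, each server ends up holding far more segment data on disk than it has memory outside the heap, so whether segments are memory-mapped or loaded fully into memory is likely to matter here.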
@suraj sheshadri can you please share the parameters for the ingestion job?
s
These are the parameters used
p
Please note that the S3 bucket is in us-east-1 and the servers are in us-west-2. Both the bucket and the servers will be in the same region in the prod setup.