Slackbot
06/26/2023, 1:34 PM
John Kowtko
06/26/2023, 1:48 PM
Hagen Rother
06/26/2023, 1:51 PM
"indexSpecForIntermediatePersists": {
  "bitmap": {
    "type": "roaring",
    "compressRunOnSerialization": false
  },
  "dimensionCompression": "uncompressed",
  "metricCompression": "none",
  "longEncoding": "longs"
},
This is a legacy snippet that's been in there unquestioned for a long time. Since it's complaining about intermediates, maybe the problem is here? So far maxRowsInMemory had no effect, and neither did targetRowsPerSegment.
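For orientation, a minimal sketch of where the settings mentioned above would typically sit in an index_parallel tuningConfig. The numeric values are placeholders rather than the values from this cluster, and the "hashed" partitionsSpec with forceGuaranteedRollup is an assumption made only so that targetRowsPerSegment has a place to live; JSON has no comment syntax, so these caveats are stated here instead of inside the block.

```json
{
  "tuningConfig": {
    "type": "index_parallel",
    "maxRowsInMemory": 100000,
    "forceGuaranteedRollup": true,
    "partitionsSpec": {
      "type": "hashed",
      "targetRowsPerSegment": 5000000
    },
    "indexSpecForIntermediatePersists": {
      "bitmap": {
        "type": "roaring"
      },
      "dimensionCompression": "uncompressed",
      "metricCompression": "none",
      "longEncoding": "longs"
    }
  }
}
```

Dropping the indexSpecForIntermediatePersists object entirely would fall back to the defaults for intermediate persists.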
Hagen Rother
06/26/2023, 1:54 PM
"indexSpecForIntermediatePersists": {
  "longEncoding": "auto"
},
and see if that helps
Maytas Monsereenusorn
06/26/2023, 4:53 PM
Hagen Rother
06/27/2023, 12:55 PM
 count |     size     |  type
-------+--------------+--------
   486 | 227134537517 | "hdfs"
This is my input set. It has grown like a fungus over the years; a ticket to reduce dimensions is ongoing, but I still need to deal with the current state. According to the logs there are 174 columns. Defaulting indexSpecForIntermediatePersists didn't help. Setting maxColumnsToMerge to 5 gets me a lot further in before the OOMs happen. Will try with 2 now.
Hagen Rother
06/28/2023, 8:21 AM
[148206.690165] Memory cgroup out of memory: Killed process 2191202 (java) total-vm:28684636kB, anon-rss:13644824kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:28456kB oom_score_adj:0
The tasks still get killed... I guess I need to figure out where this limit is enforced.
Hagen Rother
06/28/2023, 8:23 AM
java.lang.OutOfMemoryError: Java heap space
    at it.unimi.dsi.fastutil.longs.LongArrays.forceCapacity(LongArrays.java:108) ~[fastutil-core-8.5.4.jar:?]
    at it.unimi.dsi.fastutil.longs.LongArrayList.grow(LongArrayList.java:281) ~[fastutil-core-8.5.4.jar:?]
    at it.unimi.dsi.fastutil.longs.LongArrayList.add(LongArrayList.java:295) ~[fastutil-core-8.5.4.jar:?]
    at org.apache.druid.segment.data.IntermediateColumnarLongsSerializer.add(IntermediateColumnarLongsSerializer.java:96) ~[druid-processing-26.0.0.jar:26.0.0]
    at org.apache.druid.segment.LongColumnSerializer.serialize(LongColumnSerializer.java:91) ~[druid-processing-26.0.0.jar:26.0.0]
    at org.apache.druid.segment.IndexMergerV9.mergeIndexesAndWriteColumns(IndexMergerV9.java:611) ~[druid-processing-26.0.0.jar:26.0.0]
    at org.apache.druid.segment.IndexMergerV9.makeIndexFiles(IndexMergerV9.java:233) ~[druid-processing-26.0.0.jar:26.0.0]
    at org.apache.druid.segment.IndexMergerV9.merge(IndexMergerV9.java:1155) ~[druid-processing-26.0.0.jar:26.0.0]
    at org.apache.druid.segment.IndexMergerV9.multiphaseMerge(IndexMergerV9.java:1007) ~[druid-processing-26.0.0.jar:26.0.0]
    at org.apache.druid.segment.IndexMergerV9.mergeQueryableIndex(IndexMergerV9.java:914) ~[druid-processing-26.0.0.jar:26.0.0]
    at org.apache.druid.indexing.common.task.batch.parallel.PartialSegmentMergeTask.mergeSegmentsInSamePartition(PartialSegmentMergeTask.java:352) ~[druid-indexing-service-26.0.0.jar:26.0.0]
    at org.apache.druid.indexing.common.task.batch.parallel.PartialSegmentMergeTask.mergeSegmentsInSamePartition(PartialSegmentMergeTask.java:372) ~[druid-indexing-service-26.0.0.jar:26.0.0]
    at org.apache.druid.indexing.common.task.batch.parallel.PartialSegmentMergeTask.mergeAndPushSegments(PartialSegmentMergeTask.java:260) ~[druid-indexing-service-26.0.0.jar:26.0.0]
    at org.apache.druid.indexing.common.task.batch.parallel.PartialSegmentMergeTask.runTask(PartialSegmentMergeTask.java:191) ~[druid-indexing-service-26.0.0.jar:26.0.0]
    at org.apache.druid.indexing.common.task.batch.parallel.PartialGenericSegmentMergeTask.runTask(PartialGenericSegmentMergeTask.java:41) ~[druid-indexing-service-26.0.0.jar:26.0.0]
    at org.apache.druid.indexing.common.task.AbstractTask.run(AbstractTask.java:173) ~[druid-indexing-service-26.0.0.jar:26.0.0]
    at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:477) ~[druid-indexing-service-26.0.0.jar:26.0.0]
    at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:449) ~[druid-indexing-service-26.0.0.jar:26.0.0]
    at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
    at java.lang.Thread.run(Thread.java:829) ~[?:?]
Error!
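The maxColumnsToMerge experiment described earlier in the thread would correspond to a tuningConfig fragment along these lines. The value 2 is the one reported above; the surrounding keys are only an assumed minimal skeleton, and the rest of the real ingestion spec is omitted. As documented for native batch ingestion, this setting caps the total number of columns across the segments merged in a single phase, so a large merge is split into several smaller phases.

```json
{
  "tuningConfig": {
    "type": "index_parallel",
    "maxColumnsToMerge": 2
  }
}
```

At least two segments are merged per phase regardless of this value, so lowering it further cannot shrink a single merge step below that.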
John Kowtko
06/28/2023, 12:47 PM
Hagen Rother
06/28/2023, 2:04 PM
Hagen Rother
06/28/2023, 2:05 PM
Hagen Rother
06/29/2023, 5:57 PM
John Kowtko
06/29/2023, 6:00 PM
Hagen Rother
06/29/2023, 6:01 PM
single_dim - I guess I have to add more
Hagen Rother
06/29/2023, 6:02 PM
John Kowtko
06/29/2023, 6:16 PM
Hagen Rother
06/29/2023, 6:24 PM
Hagen Rother
06/29/2023, 6:26 PM
Hagen Rother
06/29/2023, 6:35 PM
index_parallel will realize the imbalance and just create hashed subshards for the segments that are too big.
Hagen Rother
06/29/2023, 6:36 PM