# troubleshooting
j
Please share your task spec ... there may be task-related parameters that can be adjusted (downwards) to reduce memory usage.
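For reference, these are the kinds of knobs I mean in the tuningConfig of a parallel index task (a minimal sketch with illustrative values, not a recommendation for your workload):
```json
"tuningConfig": {
  "type": "index_parallel",
  "maxRowsInMemory": 100000,
  "maxBytesInMemory": 50000000,
  "maxNumConcurrentSubTasks": 2
}
```
Lowering maxRowsInMemory / maxBytesInMemory makes each sub-task persist to disk sooner, and fewer concurrent sub-tasks means fewer peons holding memory at the same time.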
h
```json
"indexSpecForIntermediatePersists": {
  "bitmap": {
    "type": "roaring",
    "compressRunOnSerialization": false
  },
  "dimensionCompression": "uncompressed",
  "metricCompression": "none",
  "longEncoding": "longs"
},
```
this is a legacy snippet that's been in there for a long time, unquestioned. Since the error is complaining about intermediates, maybe it's here? So far maxRowsInMemory had no effect, and neither did targetRowsPerSegment. I'll try with
```json
"indexSpecForIntermediatePersists": {
  "longEncoding": "auto"
},
```
and see if that helps
m
what's your setting for maxColumnsToMerge? Reducing maxColumnsToMerge may help.
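It sits in the tuningConfig; just a sketch with an arbitrary value:
```json
"tuningConfig": {
  "type": "index_parallel",
  "maxColumnsToMerge": 10
}
```
As far as I understand, it caps the total number of columns handled in a single merge phase; above that the merge is split into multiple phases, which lowers peak memory at the cost of extra passes.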
h
so my input is Druid segments:
```
 count |     size     |  type
-------+--------------+--------
   486 | 227134537517 | "hdfs"
```
this is my input set. It has grown like a fungus over the years; a ticket to reduce dimensions is ongoing, but I still need to deal with the current state. According to the logs there are 174 columns. Defaulting indexSpecForIntermediatePersists didn't help. Setting maxColumnsToMerge to 5 gets me a lot further before the OOMs happen. Will try with 2 now.
```
[148206.690165] Memory cgroup out of memory: Killed process 2191202 (java) total-vm:28684636kB, anon-rss:13644824kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:28456kB oom_score_adj:0
```
the tasks still get killed... I guess I need to figure out where this limit is enforced.
```
java.lang.OutOfMemoryError: Java heap space
        at it.unimi.dsi.fastutil.longs.LongArrays.forceCapacity(LongArrays.java:108) ~[fastutil-core-8.5.4.jar:?]
        at it.unimi.dsi.fastutil.longs.LongArrayList.grow(LongArrayList.java:281) ~[fastutil-core-8.5.4.jar:?]
        at it.unimi.dsi.fastutil.longs.LongArrayList.add(LongArrayList.java:295) ~[fastutil-core-8.5.4.jar:?]
        at org.apache.druid.segment.data.IntermediateColumnarLongsSerializer.add(IntermediateColumnarLongsSerializer.java:96) ~[druid-processing-26.0.0.jar:26.0.0]
        at org.apache.druid.segment.LongColumnSerializer.serialize(LongColumnSerializer.java:91) ~[druid-processing-26.0.0.jar:26.0.0]
        at org.apache.druid.segment.IndexMergerV9.mergeIndexesAndWriteColumns(IndexMergerV9.java:611) ~[druid-processing-26.0.0.jar:26.0.0]
        at org.apache.druid.segment.IndexMergerV9.makeIndexFiles(IndexMergerV9.java:233) ~[druid-processing-26.0.0.jar:26.0.0]
        at org.apache.druid.segment.IndexMergerV9.merge(IndexMergerV9.java:1155) ~[druid-processing-26.0.0.jar:26.0.0]
        at org.apache.druid.segment.IndexMergerV9.multiphaseMerge(IndexMergerV9.java:1007) ~[druid-processing-26.0.0.jar:26.0.0]
        at org.apache.druid.segment.IndexMergerV9.mergeQueryableIndex(IndexMergerV9.java:914) ~[druid-processing-26.0.0.jar:26.0.0]
        at org.apache.druid.indexing.common.task.batch.parallel.PartialSegmentMergeTask.mergeSegmentsInSamePartition(PartialSegmentMergeTask.java:352) ~[druid-indexing-service-26.0.0.jar:26.0.0]
        at org.apache.druid.indexing.common.task.batch.parallel.PartialSegmentMergeTask.mergeSegmentsInSamePartition(PartialSegmentMergeTask.java:372) ~[druid-indexing-service-26.0.0.jar:26.0.0]
        at org.apache.druid.indexing.common.task.batch.parallel.PartialSegmentMergeTask.mergeAndPushSegments(PartialSegmentMergeTask.java:260) ~[druid-indexing-service-26.0.0.jar:26.0.0]
        at org.apache.druid.indexing.common.task.batch.parallel.PartialSegmentMergeTask.runTask(PartialSegmentMergeTask.java:191) ~[druid-indexing-service-26.0.0.jar:26.0.0]
        at org.apache.druid.indexing.common.task.batch.parallel.PartialGenericSegmentMergeTask.runTask(PartialGenericSegmentMergeTask.java:41) ~[druid-indexing-service-26.0.0.jar:26.0.0]
        at org.apache.druid.indexing.common.task.AbstractTask.run(AbstractTask.java:173) ~[druid-indexing-service-26.0.0.jar:26.0.0]
        at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:477) ~[druid-indexing-service-26.0.0.jar:26.0.0]
        at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:449) ~[druid-indexing-service-26.0.0.jar:26.0.0]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
        at java.lang.Thread.run(Thread.java:829) ~[?:?]
Error!
```
j
The error mentions cgroups ... are you deployed on some form of container, e.g. K8s, Docker, cgroups, etc?
h
systemd uses cgroups too, but this is actually Nomad. I didn't configure any memory limits; the only one I am aware of is the 32 GB -Xmx setting for the peons.
I use the very same setup for all Druid instances and their RAM usage is much bigger, so I'm quite certain it's not coming from Nomad.
After setting -Xmx to 64 GB, the tasks finish. Looks like the culprit is an extremely uneven distribution of the sharding dimension: the smallest shard is 113.57 MB, the biggest 5.63 GB.
j
What type of partitioning are you using? It sounds like Range, in which case adding more columns to the range key will increase cardinality of the key and allow the system to create more even partitions. I think the same goes for Hash too.
h
just testing the waters with single_dim - I guess I have to add more.
Is the order important?
j
The order is important for querying purposes ... your query must filter on the first N columns of the range key to allow for query pruning. So if you set your range columns to {colA, colB, colC}, then you will get pruning only if the query filters on colA, or {colA, colB}, etc.
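A multi-dimension range partitionsSpec would look roughly like this (a sketch; colA/colB/colC and the row target are placeholders):
```json
"partitionsSpec": {
  "type": "range",
  "partitionDimensions": ["colA", "colB", "colC"],
  "targetRowsPerSegment": 5000000
}
```
Put the column your users filter on most often first, so the pruning still kicks in.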
h
Makes sense. Just wanted to ask to be sure.
so basically I add what I think will help the most for users and then add some to help the sharding?
It would be nice if index_parallel would recognize the imbalance and just create hashed sub-shards for the segments that are too big.
But I accept the underlying complexity; it's a lot easier to work on a specific use case. In my case the above would help, whereas the current state forces me into a week of increasing -Xmx until the job completes before it becomes obvious what's going wrong.