# troubleshooting
j
Please share your task spec ... there may be task-related parameters that can be adjusted (downwards) to reduce memory usage.
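For reference, these are the kinds of knobs I mean in the tuningConfig of a parallel index task (a minimal sketch with illustrative values, not a recommendation for your workload):
```json
"tuningConfig": {
  "type": "index_parallel",
  "maxRowsInMemory": 100000,
  "maxBytesInMemory": 50000000,
  "maxNumConcurrentSubTasks": 2
}
```
Lowering maxRowsInMemory / maxBytesInMemory makes each sub-task persist to disk sooner, and fewer concurrent sub-tasks means fewer peons holding memory at the same time.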
h
```json
"indexSpecForIntermediatePersists": {
  "bitmap": {
    "type": "roaring",
    "compressRunOnSerialization": false
  },
  "dimensionCompression": "uncompressed",
  "metricCompression": "none",
  "longEncoding": "longs"
},
```
this is a legacy snippet that's been in there for a long time, unquestioned. Since the error is complaining about intermediates, maybe it's here? So far maxRowsInMemory had no effect, and neither did targetRowsPerSegment. I'll try with
```json
"indexSpecForIntermediatePersists": {
  "longEncoding": "auto"
},
```
and see if that helps
m
what's your setting for maxColumnsToMerge? Reducing maxColumnsToMerge may help.
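It sits in the tuningConfig; just a sketch with an arbitrary value:
```json
"tuningConfig": {
  "type": "index_parallel",
  "maxColumnsToMerge": 10
}
```
As far as I understand, it caps the total number of columns handled in a single merge phase; above that the merge is split into multiple phases, which lowers peak memory at the cost of extra passes.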
h
so my input is Druid segments:
```
 count |     size     |  type
-------+--------------+--------
   486 | 227134537517 | "hdfs"
```
this is my input set. It has grown like a fungus over the years; a ticket to reduce dimensions is ongoing, but I still need to deal with the current state. According to the logs there are 174 columns. Defaulting indexSpecForIntermediatePersists didn't help. Setting maxColumnsToMerge to 5 gets me a lot further before the OOMs happen. Will try with 2 now.
```
[148206.690165] Memory cgroup out of memory: Killed process 2191202 (java) total-vm:28684636kB, anon-rss:13644824kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:28456kB oom_score_adj:0
```
the tasks still get killed... I guess I need to figure out where this limit is enforced.
```
java.lang.OutOfMemoryError: Java heap space
        at it.unimi.dsi.fastutil.longs.LongArrays.forceCapacity(LongArrays.java:108) ~[fastutil-core-8.5.4.jar:?]
        at it.unimi.dsi.fastutil.longs.LongArrayList.grow(LongArrayList.java:281) ~[fastutil-core-8.5.4.jar:?]
        at it.unimi.dsi.fastutil.longs.LongArrayList.add(LongArrayList.java:295) ~[fastutil-core-8.5.4.jar:?]
        at org.apache.druid.segment.data.IntermediateColumnarLongsSerializer.add(IntermediateColumnarLongsSerializer.java:96) ~[druid-processing-26.0.0.jar:26.0.0]
        at org.apache.druid.segment.LongColumnSerializer.serialize(LongColumnSerializer.java:91) ~[druid-processing-26.0.0.jar:26.0.0]
        at org.apache.druid.segment.IndexMergerV9.mergeIndexesAndWriteColumns(IndexMergerV9.java:611) ~[druid-processing-26.0.0.jar:26.0.0]
        at org.apache.druid.segment.IndexMergerV9.makeIndexFiles(IndexMergerV9.java:233) ~[druid-processing-26.0.0.jar:26.0.0]
        at org.apache.druid.segment.IndexMergerV9.merge(IndexMergerV9.java:1155) ~[druid-processing-26.0.0.jar:26.0.0]
        at org.apache.druid.segment.IndexMergerV9.multiphaseMerge(IndexMergerV9.java:1007) ~[druid-processing-26.0.0.jar:26.0.0]
        at org.apache.druid.segment.IndexMergerV9.mergeQueryableIndex(IndexMergerV9.java:914) ~[druid-processing-26.0.0.jar:26.0.0]
        at org.apache.druid.indexing.common.task.batch.parallel.PartialSegmentMergeTask.mergeSegmentsInSamePartition(PartialSegmentMergeTask.java:352) ~[druid-indexing-service-26.0.0.jar:26.0.0]
        at org.apache.druid.indexing.common.task.batch.parallel.PartialSegmentMergeTask.mergeSegmentsInSamePartition(PartialSegmentMergeTask.java:372) ~[druid-indexing-service-26.0.0.jar:26.0.0]
        at org.apache.druid.indexing.common.task.batch.parallel.PartialSegmentMergeTask.mergeAndPushSegments(PartialSegmentMergeTask.java:260) ~[druid-indexing-service-26.0.0.jar:26.0.0]
        at org.apache.druid.indexing.common.task.batch.parallel.PartialSegmentMergeTask.runTask(PartialSegmentMergeTask.java:191) ~[druid-indexing-service-26.0.0.jar:26.0.0]
        at org.apache.druid.indexing.common.task.batch.parallel.PartialGenericSegmentMergeTask.runTask(PartialGenericSegmentMergeTask.java:41) ~[druid-indexing-service-26.0.0.jar:26.0.0]
        at org.apache.druid.indexing.common.task.AbstractTask.run(AbstractTask.java:173) ~[druid-indexing-service-26.0.0.jar:26.0.0]
        at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:477) ~[druid-indexing-service-26.0.0.jar:26.0.0]
        at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:449) ~[druid-indexing-service-26.0.0.jar:26.0.0]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
        at java.lang.Thread.run(Thread.java:829) ~[?:?]
Error!
```
j
The error mentions cgroups ... are you deployed on some form of container, e.g. K8s, Docker, cgroups, etc?
h
systemd uses cgroups too, but this is actually Nomad. I didn't configure any memory limits; the only one I am aware of is the 32 GB -Xmx setting for the peons.
I use the very same setup for all Druid instances and their RAM usage is much bigger, so I'm quite certain it's not coming from Nomad.
After setting -Xmx to 64 GB, the tasks finish. Looks like the culprit is an extremely uneven distribution of the sharding dimension: the smallest shard is 113.57 MB, the biggest 5.63 GB.
j
What type of partitioning are you using? It sounds like Range, in which case adding more columns to the range key will increase cardinality of the key and allow the system to create more even partitions. I think the same goes for Hash too.
h
just testing the waters with single_dim - I guess I have to add more.
Is the order important?
j
The order is important for querying purposes ... your query must filter on the first N columns of the range key to allow for query pruning. So if you set your range columns to {colA, colB, colC}, then you will get pruning only if the query filters on colA, or {colA, colB}, etc.
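A multi-dimension range partitionsSpec would look roughly like this (a sketch; colA/colB/colC and the row target are placeholders):
```json
"partitionsSpec": {
  "type": "range",
  "partitionDimensions": ["colA", "colB", "colC"],
  "targetRowsPerSegment": 5000000
}
```
Put the column your users filter on most often first, so the pruning still kicks in.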
h
Makes sense. Just wanted to ask to be sure.
so basically I add what I think will help the most for users and then add some to help the sharding?
It would be nice if index_parallel would recognize the imbalance and just create hashed sub-shards for the segments that are too big.
But I accept the underlying complexity; it's a lot easier to work on a specific use case. In my case the above would help, whereas the current state forces me into a week of increasing -Xmx until the job completes before it becomes obvious what's going wrong.