# community-support
s
Reaching out here to ask if anyone knows about the ins and outs of running Gradle inside a Docker container in build CI. It's a topic that's been covered a dozen times around Stack Overflow and the forums but never seems to have a satisfying answer (either no responses, or "turn off parallel builds" and similar non-solutions, or the threads are super out of date). It seems that something weird happens with memory management when running in a Docker environment.

On local developer machines (a MacBook with 18gb of memory), a Gradle build on a large project with an `-Xmx4g` setting on the Gradle daemon tends to use only about 5gb max between the main daemon process and all child processes. Looking at a build scan, it appears to rarely reach that amount and spends most of the build using around 2-4gb of memory. Running htop on the MacBook while the build runs shows that the total used memory on the machine (including all other apps, browser, Slack, etc.) hovers stably around 5-7gb.

Meanwhile, if I run exactly the same build, with exactly the same Gradle properties (specifying the same number of max workers) inside a Docker image on the same MacBook (Docker configured with 10gb max memory), all of a sudden that build continues to eat more and more memory, eventually blowing past 8-9gb and then using all 10gb available, resulting in the daemon being killed in the background.

I'm not sure what to make of it. It implies there's something going on either in the OS (Ubuntu vs macOS), or in Docker, or the JVM, or in Gradle itself. I've tried dozens of different combinations of JVM options and flags. I'm running Java 21 for the build toolchain on top of a Java 23 installation, so flags like `UseContainerSupport` should be on by default. I've toyed with ensuring that `MaxMetaspaceSize` is set when overriding `org.gradle.jvmargs`, I've set Java options globally for all Java instances to try to constrain the memory of child processes, etc. None of it has seemed to have any effect.
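To make that concrete, the shape of the configuration being described is roughly the following. The numbers are illustrative, not the exact values from this build, and `org.gradle.workers.max=4` stands in for whatever max-worker count the build actually uses:

```properties
# gradle.properties -- illustrative values, not the exact configuration from this build
# Daemon heap and metaspace caps
org.gradle.jvmargs=-Xmx4g -XX:MaxMetaspaceSize=1g
# Same worker cap used locally and in CI
org.gradle.workers.max=4
```

The "Java options globally" part was done via the `JAVA_TOOL_OPTIONS` environment variable, which every child JVM picks up (more on that below).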
For reference, the Docker image used above was one based on `eclipse-temurin` (23 specifically). I just tried another image with a different Linux base (Amazon Corretto `24-jdk` on Amazon Linux), and it appeared a bit more stable. It filled the memory up to 9.5gb (of 9.7gb available), but it appeared to perform more garbage collections to keep it there, and it got significantly further in the build. The main Gradle daemon was still using over 8gb of memory, though, twice the configured 4gb. Eventually something pushed it over the edge and it was also killed due to memory exhaustion.
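For anyone wanting to reproduce the setup, the container side looks roughly like this. The image tags and the `--memory` flag are approximations of what's described above; locally the cap may equally come from Docker Desktop's global memory setting rather than a per-container flag:

```sh
# Roughly how the build was run in a container with a 10gb memory cap (illustrative)
docker run --rm \
  --memory=10g \
  -v "$PWD":/workspace -w /workspace \
  eclipse-temurin:23-jdk \
  ./gradlew build

# The second attempt swapped the base image for an Amazon Corretto 24 one
# on Amazon Linux, e.g. amazoncorretto:24 (tag approximate)
```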
y
I have seen this kind of behaviour in (GitLab) CI with large sets of tests, and the answer was to configure `forkEvery` to a low value when running in CI.
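For reference, that usually means something along these lines in the build script. The `CI` environment-variable check and the specific numbers are just an example, not values from this thread:

```kotlin
// build.gradle.kts -- illustrative; CI detection and numbers are examples
tasks.withType<Test>().configureEach {
    if (System.getenv("CI") != null) {
        // Restart the test JVM every N test classes so a long-lived fork
        // can't keep accumulating memory for the whole run
        forkEvery = 50L
        maxParallelForks = 2
        maxHeapSize = "1g"
    }
}
```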
s
I'll give that a go to see if it helps, but I'm also more concerned with the root cause. It makes no sense to me why Gradle would use so much more memory in a Docker image than on a bare-metal MacBook with the same parameters, nor why the main Gradle build daemon seems to actively ignore its JVM args and consumes more than twice the maximum memory it has been configured with.
Example: PID 120 is the Gradle Daemon, PID 782 is the Kotlin Daemon. The Gradle Daemon is configured to use `MaxRAMPercentage=40`, which it recognises as 3.1gb when it starts up. The Kotlin Daemon is configured to use `MaxRAMPercentage=30`, and inherits a roughly 3.1gb `-Xmx` setting from the Gradle Daemon. Both processes are using more than they should be; 782 peaked at 44% of memory used.
Even ensuring that metaspace is constrained doesn't seem to have an effect. After setting `-Xmx3g` and `-XX:MaxMetaspaceSize=1g`, the Gradle Daemon ended up filling 5.5gb of memory.
It appears to be the Gradle daemon that is most responsible for filling any available memory. The Kotlin daemon does use up to about 2gb typically, but it'll let a lot of that go later as more instances spawn; across multiple Kotlin daemon and other Java instances the combined memory usage doesn't seem to go above 2gb. The Gradle daemon, however, just keeps going up and up on its own.
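One way to sanity-check the "ignoring its JVM args" theory is to ask the daemon JVM directly what it started with and how big its heap really is, using the stock JDK tools. The PID placeholder is whatever `jps` reports for the daemon:

```sh
# List JVMs in the container with their startup arguments (jps ships with the JDK)
jps -v

# Flags the Gradle daemon actually resolved at startup (MaxHeapSize, MaxMetaspaceSize, ...)
jcmd <gradle-daemon-pid> VM.flags

# Current heap capacity and usage, to separate "the heap blew past -Xmx"
# from "the rest of the process (metaspace, threads, native) is what's growing"
jcmd <gradle-daemon-pid> GC.heap_info
```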
j
You could use a tool such as `jconsole` to check what the actual memory usage and limits are from the JVM's point of view. This will also tell you how close to the limit your build gets. Note that you have 3 different JVMs running, each of them with a different place to set memory limits:
• the `JAVA_OPTS` environment variable for the launcher, but you should not need to set any limit there
• the `org.gradle.jvmargs` property for the Gradle daemon
• the `kotlin.daemon.jvmargs` property for the Kotlin daemon
I would recommend setting limits with `-Xmx` and similar rather than a RAM percentage. The Kotlin daemon inherits some arguments from the Gradle daemon, but won't take into account the RAM already used by Gradle. If other JVM processes are launched by the build (e.g. to run tests) they may have their own arguments defined elsewhere in build scripts. Finally, you have to leave some room (up to 50% in constrained environments such as yours) for OS/filesystem/other-process overhead, which will also be accounted as RAM usage by the container.
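Put as a concrete sketch, that advice amounts to something like the following. The numbers are only an illustration of splitting a ~10gb container while leaving headroom, not values recommended in the thread:

```properties
# gradle.properties -- illustrative sizing for a 10gb container, leaving headroom
# for OS/filesystem/container overhead

# Gradle daemon: explicit heap + metaspace caps instead of MaxRAMPercentage
org.gradle.jvmargs=-Xmx3g -XX:MaxMetaspaceSize=512m

# Kotlin daemon: its own explicit cap, so it doesn't just inherit Gradle's -Xmx
kotlin.daemon.jvmargs=-Xmx2g

# Test/worker JVMs get their own limits in the build script
# (e.g. maxHeapSize on Test tasks)
```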
s
I've tried both `-Xmx` and RAM percentage; both appear to be ignored. The Kotlin Daemon is supposed to inherit its settings from the Gradle JVM args, but I've also tried setting a dedicated value on it independently. I've also set global `JAVA_TOOL_OPTIONS` memory limits. All of the above appear to be straight up ignored in the case of the Gradle daemon. There's definitely more to JVM memory footprint than just the heap, but with the heap constrained to 2gb and metaspace constrained to 1gb, I certainly wouldn't expect it to be using a total of 6gb of memory, implying that if the heap and metaspace are properly constrained then there's 3gb of other stuff being allocated.
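If it is native allocation rather than heap or metaspace, the JVM's own Native Memory Tracking can break down where those extra gigabytes are going. A rough sketch: the flag goes into the daemon's jvmargs, and the PID is the daemon's:

```sh
# 1. Enable NMT on the daemon, e.g. in gradle.properties:
#    org.gradle.jvmargs=-Xmx2g -XX:MaxMetaspaceSize=1g -XX:NativeMemoryTracking=summary
#
# 2. While the build is running, ask the daemon what it has reserved/committed
#    per category (heap, metaspace, threads, code cache, GC, internal, ...):
jcmd <gradle-daemon-pid> VM.native_memory summary
```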
I've just run the build in a Docker environment with a full 16gb of memory, with an `-Xmx` of 3g and metaspace of 1g, which should be inherited by the Kotlin daemon. OOMKilled.
j
Depending on how large the build is, I would not be surprised by a +100% (i.e. 50% of total RAM) overhead. I also note that in your screenshots the processes are run through `rosetta`, and that probably adds some significant overhead as well. Your try with 3g+1g means both JVMs could allocate up to 8 GiB together, so one of them ending up OOM-killed in such a situation is not really surprising either. Did you try `jconsole` to check what it looks like from within the JVM, and see if you could further restrict memory allocation?
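In case it helps with attaching: jconsole can usually reach the daemon if the daemon is started with remote JMX enabled. A minimal, insecure, local-debugging-only sketch, with the port number just an example:

```properties
# gradle.properties -- JMX for local debugging only: no auth, no SSL (illustrative)
org.gradle.jvmargs=-Xmx3g \
  -Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=9010 \
  -Dcom.sun.management.jmxremote.rmi.port=9010 \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false \
  -Djava.rmi.server.hostname=127.0.0.1
```

With the container started with `-p 9010:9010`, jconsole on the host should then be able to connect to `localhost:9010`.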
s
I haven't been able to connect jconsole to the Gradle daemon instance specifically. As for the combined usage, what I'm actually seeing is that the Kotlin Daemon and other instances are using pretty limited memory, and the Gradle Daemon itself is the one that is growing to 200% of its limit.
Weirdly, jconsole was able to connect to the Gradle wrapper JVM instance, though.
The memory usage patterns are far more visible at larger total memory footprints. If I set the container to have 12gb or more of memory and open up the memory limits of Gradle and Kotlin to 4gb each, the Kotlin daemon tends to hover at 2gb or lower while the Gradle daemon rockets up to 8gb.
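For what it's worth, that split between the daemons is easy to watch from outside the JVMs as well (assuming `ps` is available in the image; it isn't in every minimal base):

```sh
# Inside the container: resident set size per process, largest first (RSS is in kB)
ps -eo pid,rss,comm --sort=-rss | head

# From the host: the container's total usage against its configured limit
docker stats --no-stream
```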