# community-support
a
tl;dr: Do you have a Gradle build or plugin that would benefit from a fine-grained caching tool? What would you use it for?

I've been tinkering with a Kotlin/Native project that compiles C code, but DevEx was bad because compilation was really slow, for two main reasons:
1. Any change to the buildscripts, even an unrelated one, meant a new build-cache key, which meant a long, slow recompile of everything.
2. The build cache was too coarse: the compile-C task would recompile everything even if only 1 file out of 1000 changed, or if the changes in the sources were not relevant (like only changing a comment).

So, I created a custom caching daemon! It supports caching individual operations (e.g. compiling a single file, or creating an archive). The cache key only considers the actual inputs, so it's not sensitive to buildscript changes. Another benefit: parallelism can be controlled much more tightly, no matter how many subprojects/tasks/files there are.

It's still messy, but as a POC it's working well in my hobby project. I'm considering splitting it out as a separate library, but before I do I really wanted to get some more information: if I released a per-file caching library as a Gradle plugin, how would you use it?
👀 1
Technical details

How it works: I launch a separate JVM process and use RocksDB to individually cache operations. Because the cacher runs in a separate process, the classpath is isolated and can be used as part of the cache key - no more buildscript sensitivity! It uses Unix domain sockets for communication (in theory it could use network sockets as well, for remote caching).

Alternatives

This wasn't the first approach I tried. I tried some alternatives, but they were lacking:
• Incremental tasks are nice, but they can't reload historical results, and are too sensitive to buildscript changes.
• The build cache is too coarse-grained (I wanted per-file, not per-task, caching), and is also too sensitive to unrelated buildscript changes.
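The key design point above (a cache keyed only by the operation's real inputs, so buildscript edits can't invalidate entries) can be sketched independently of RocksDB. This is a minimal in-memory stand-in with hypothetical names, not the plugin's actual API:

```kotlin
import java.security.MessageDigest

// Hypothetical sketch: cache one operation's output, keyed only by its real inputs.
fun sha256Hex(bytes: ByteArray): String =
    MessageDigest.getInstance("SHA-256").digest(bytes)
        .joinToString("") { "%02x".format(it) }

// Key = hash of (tool identity + each input's content).
// Buildscript state is deliberately absent from the key.
fun cacheKey(toolId: String, inputs: List<ByteArray>): String {
    val digest = MessageDigest.getInstance("SHA-256")
    digest.update(toolId.toByteArray())
    inputs.forEach { digest.update(sha256Hex(it).toByteArray()) }
    return digest.digest().joinToString("") { "%02x".format(it) }
}

// In-memory stand-in for the RocksDB store: compute once per key, then reuse.
class OperationCache {
    private val store = HashMap<String, ByteArray>()
    fun getOrCompute(key: String, compute: () -> ByteArray): ByteArray =
        store.getOrPut(key, compute)
}
```

Two lookups with identical inputs hit the cache, while unrelated state (like the buildscript classpath) never enters the key in the first place.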
j
Did you consider integrating an existing caching tool like ccache?
a
yes, ccache was a source of inspiration. I didn't want to use it though, because I don't want manual install prerequisites (i.e. no "to use this plugin you must first install `ccache`"). I wanted something that just worked, i.e. a regular JVM dependency. And I find custom tools management in Gradle requires a lot of manual work for little things that come for free with a JVM dependency (determining the right file to download based on the OS/arch, checking for new versions, caching the download, unpacking, checking the install is valid, deleting old/unused versions). Also, I'm using the run_konan util distributed with Kotlin/Native. I wasn't confident I could set that up correctly to work with ccache (despite [the ccache docs](https://ccache.dev/manual/4.11.2.html#_using_ccache_with_other_compiler_wrappers)). Finally, I liked the challenge :)
m
How does the cache key computation work? Do I have to manually hash the input files, etc..?
Am I right that it looks like this?
• The buildscript classpath changes
• Gradle invalidates the task and reruns it
• The task implementation sees that the actual inputs (excluding the buildscript classpath) haven't changed and fetches the matching outputs
• Gradle sees the task as `SUCCESS` (despite no work actually executing)
This is a compelling proposition, but the friction is high. Basically, everyone would have to learn this mental model, which is probably going to come as a surprise to many. It makes debugging and reading build scans, etc. harder.
Also probably fingerprints the inputs twice? Unless you can get the Gradle values somehow?
a
How does the cache key computation work? Do I have to manually hash the input files, etc..?
In my project I use clang to pre-process each of the C source files, which is ideal for checksumming (changes in unused header files won't be reflected in the pre-processed output, and other changes, like comments, will be ignored).

So, yes, if the files are marked as inputs then they get fingerprinted twice, although slightly differently: Gradle fingerprints the 'plain' file content, while my custom cache can use more specialised fingerprinting.

My custom cacher doesn't have to be used only in tasks, though (it could be used in the configuration phase), and the files don't have to be registered as inputs (so Gradle won't fingerprint them); instead you'd add a custom
outputs.upToDateWhen { cacher.isUpToDate(files) }
check.
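Expanded into a task registration, that bypass might look like the sketch below. This is a hypothetical wiring, not a published API: `cacher`, `isUpToDate`, and `compileChanged` are placeholder names from this thread.

```kotlin
// build.gradle.kts (sketch): sources are deliberately NOT registered as task
// inputs, so Gradle never fingerprints them; the custom cacher decides
// up-to-dateness and does per-file work itself.
val compileC by tasks.registering {
    val sources = layout.projectDirectory.dir("src/c").asFileTree
    outputs.dir(layout.buildDirectory.dir("obj"))
    outputs.upToDateWhen { cacher.isUpToDate(sources.files) } // hypothetical cacher service
    doLast {
        cacher.compileChanged(sources.files) // hypothetical per-file compile/fetch
    }
}
```

The trade-off discussed above applies: Gradle reports the task as up-to-date or `SUCCESS` based on state it cannot see, which is exactly what makes build scans harder to read.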
m
files don't have to be registered as inputs (so Gradle won't fingerprint them)
Just bypass everything 😄
I like this
I would experiment with this. Not sure I'd deploy it, but if it can help advocate for the fact that invalidating all the time is ultra painful, it's worth it.
v
Two points:
1. Yes, more fine-grained cacheability is definitely needed; there is an open feature request by me: https://github.com/gradle/gradle/issues/31482. Feel free to thumbs-up it. I only speak about work-action cacheability there to increase the likelihood it gets attention, and for me it would anyway be appropriate to pack into one work action what should be cached together; that would also bring the benefit of parallel execution of the actions.
2. If your task result gets invalidated by each change to the build script, even an unrelated one, then that is only because it is not actually unrelated: your build script is an input for the task logic. As long as your buildscript is not adding to the task logic, it is not part of the input. If you for example have
```kotlin
val foo by tasks.registering(Help::class) {
    outputs.file("gradlew")
    outputs.cacheIf { true }
}
```
then this task will stay up-to-date or can also be taken from the cache, no matter what you change outside this definition in the build script (unless you add to the logic of that task there). If on the other hand you have
```kotlin
val foo by tasks.registering(Help::class) {
    outputs.file("gradlew")
    outputs.cacheIf { true }
    doLast {
        println("foo")
    }
}
```
then the build script is defining a part of the task logic and thus becomes an input for the task. Gradle cannot (or at least does not yet) differentiate which part of the build script changed, whether it is a part of the logic for that task or whether it is an unrelated change. You can also see what is part of the cache key and thus what is considered an input by using
-Dorg.gradle.caching.debug=true
. If your build script does not contribute to the logic of the task but still is considered an input for up-to-dateness and cache-key, then I'd say that is a bug you should report and that needs to be fixed.
m
I think the typical case is a convention plugin. All the tasks are just in your
build-logic:runtimeClasspath
so any change there invalidates everything
v
He did not complain that any change in the classpath invalidates things, and did not speak about convention plugins. That would be true for 3rd-party tasks (not for built-in tasks) and could be mitigated by splitting things into multiple projects if necessary. He complained that any change in the buildscript invalidates the task, and that is just not correct if the buildscript does not contribute to the build logic; otherwise it is a bug. And even if it is about a precompiled script plugin and not a buildscript, the same applies. The shown example, coming from a convention plugin, behaves just the same, as nothing in the convention plugin build contributes to the task logic. The task's logic comes only from the built-in
Help
task, so the inputs will not change even if you change the convention plugin. Even if you move that task from the build script to the convention plugin, it will stay up-to-date, as the inputs are identical. If of course you are talking about a custom task or a 3rd-party task, so that the runtime classpath of the task changes, then of course it is an input, and that is good, because a change in the runtime classpath can mean a different result, so the output must not be reused.
m
Agreed, the description of the problem could be more accurate. But that shouldn't hide the fact that it is a very widespread problem.
v
Again, I don't see where the problem is. If you change the project where a task is implemented, it must be out-of-date. If you stuff 100 tasks into one project, then of course all tasks are out-of-date if you change only one of them. If you don't want that, split the tasks into multiple projects. The complaint here was that changing the buildscript makes the task out-of-date, and that is simply wrong unless the buildscript influences that task's logic, in which case it is a necessity; and if it did not, it would be a bug. That more fine-grained cacheability would be nice is out of the question. I'd also like, if an incremental task runs non-incrementally for some reason, for the individual actions to be served from the cache, as described in my feature request. 🙂
m
Replace
buildscript
by
buildscript classpath
and consider you have a single one in your root project or settings
This
buildscript classpath
contains all your tasks implementations
Changing one of them invalidates your whole build
This is my problem and I believe OP's problem as well but can't speak to OP obviously
v
Ah, right, splitting into multiple projects will probably only help if the tasks are used in different projects. Having all plugins of the whole build in a single classpath is anyway bad practice in my opinion, as you know. 😄 Here you have another reason for it: needless out-of-date tasks just because one plugin was changed. 🙂
m
Having all plugins of the whole build in a single classpath is anyway bad-practice in my opinion as you know.
Ah, I didn't know that!
But if you don't do that, you end up either with broken conflict resolution and/or malfunctioning BuildServices
So it's like choosing between 2 evils
v
Ah, I didn't know that!
Oh, sorry, thought you were part of the last discussion about this here. My opinion is, just use what you use where you use it and only do things like
apply false
if there is a technical necessity, rather than declaring everything in the root project just for the sake of doing it.
m
I'm part of many discussions 😄 Might have missed that one
Not doing the
apply false
thing is pretty dangerous
The BuildService issue only makes it a nogo for me
And TBH I think 90% of builds have
apply false
somewhere, it's in the Google recommendations, etc..
v
Nah, I remembered right, this is the discussion I referred to: https://gradle-community.slack.com/archives/CAHSN3LDN/p1754935180286129 🙂
Not doing the
apply false
thing is pretty dangerous

As well as doing the
apply false
thing is pretty dangerous. Let's not replicate that other discussion here again; this thread is hijacked enough. 😄 Whatever you do, you do it wrong, and you will sooner or later hit problems, no matter which way you follow, as there is no patent recipe that works for all cases.
m
Exactly, you can choose your illness 😅
In French we have this saying of choosing between "la peste" and "le choléra"
v
Same in German, "Pest oder Cholera"
🇩🇪 1
🤝 1
m
I'll still say that a long-term solution of "single buildscript classpath for wiring (no dependencies)" + "isolated workers" would be really nice
no more illnesses
v
No, just different ones 🙂
m
"single buildscript classpath for wiring (no dependencies)" + "isolated workers" means you have a healthy life. The only illness is that we need to convert all existing plugins to use classloader isolation
But it's not an illness, it's more like exercise to keep you fit
a
"single buildscript classpath for wiring (no dependencies)" + "isolated workers" means you have a healthy life
Except classloader isolation causes OOM https://github.com/gradle/gradle/issues/18313. Process isolation doesn't leak, but it's often quite resource intensive. 😬
m
@Adam you can do classloader isolation without the leaking Worker API
Or the Worker API can be fixed too
a
you can do classloader isolation without the leaking Worker API
Oh, do you mean with a manual classloader? We had to do that recently in Dokka
m
Yea, I'm doing the same everywhere now
Kotlin is also doing the same, let me dig that link
TBH I don't blame them considering all the issues with BuildServices
a
yeah, makes sense. In which case: why bother using the Worker API at all? :) (Although, a word of caution: I was warned off using ClassLoadersCachingBuildService since Dokka heavily uses coroutines, and modifying the classloader apparently affects coroutines mechanisms?)
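For reference, the "manual classloader" approach mentioned above usually means a `URLClassLoader` whose parent is the platform loader, so the isolated code sees only the JDK plus its own classpath. A generic sketch of that idea (this is not Dokka's or Kotlin's actual implementation, and `isolatedLoader`/`cachedLoader` are made-up names):

```kotlin
import java.net.URL
import java.net.URLClassLoader
import java.util.concurrent.ConcurrentHashMap

// Isolated loader: its parent is the platform loader (JDK only), so nothing
// from the buildscript classpath leaks in and dependency versions can't conflict.
fun isolatedLoader(classpath: List<URL>): URLClassLoader =
    URLClassLoader(classpath.toTypedArray(), ClassLoader.getPlatformClassLoader())

// Cache keyed by classpath so repeated task executions reuse one loader; this
// reuse is exactly what makes the shutdown/ThreadLocal questions delicate.
private val loaderCache = ConcurrentHashMap<List<URL>, URLClassLoader>()
fun cachedLoader(classpath: List<URL>): URLClassLoader =
    loaderCache.computeIfAbsent(classpath, ::isolatedLoader)
```

JDK classes resolve through the parent, application-classpath classes (e.g. the Kotlin stdlib of the host) are invisible, and equal classpaths map to the same cached loader.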
m
why bother using the Worker API at all?
parallelism for people without configuration cache
Ultimately I wish the Worker API goes away
It duplicates a lot of the
Task
concepts
without the caching
modifying the classloader apparently affects coroutines mechanisms
I'd say there should be a way to make this work. I'd be willing to investigate this if you have a reproducer
v
Sorry, but that's a bit of nonsense. I don't think the worker API will go anywhere. It is not only about parallelism; it is also useful if you need to execute something with a different JDK, or otherwise need to do something in a different process that can be reused. And the parallelism is also important when using CC, because CC only runs tasks in parallel, while work items from within the same task action can run in parallel using the worker API. The missing cacheability for work items is bad, I agree, hence the issue I linked above. The worker API does not really duplicate the task concepts; it is a sub-unit of the work of a task.
Also once the memory leak is fixed, it can be used for the classloader isolation without the need for manually fiddling around with classloaders.
m
Yes, process isolation is also a use case although not one I am interested in
while work items from within the same task action can run in parallel using the worker api.
you might as well use coroutines for that or an Executor or whatever primitive you want. This problem has been solved without Workers
once the memory leak is fixed, it can be used for the classloader isolation without the need for manually fiddling around with classloaders
Yea, that'd be good. Although I will probably keep doing the manual thing because the Worker classloaders are not completely isolated. They will force the kotlin-stdlib version, for example.
v
Of course, you can also build your software without Gradle, that problem was solved long before Gradle. It just makes it easier and needs less reinventing the wheel. 🙂 And coroutines are useless, because I would never write a public plugin in Kotlin. 🙂
a
I'd say there should be a way to make this work. I'd be willing to investigate this if you have a reproducer
I'll ask. I think it was related to ThreadLocals. It was more of a theoretical concern than something we found.
👍 1
m
Oh yea, ThreadLocals + classloaders can be fun 😅
a
Yes, process isolation is also a use case although not one I am interested in
(more back to my original topic) What about a single, separate process that could run workers, and the process was shared for all subprojects? Would that be interesting for you?
m
TBH not sure. Managing one JVM heap is already some work 😅
v
What about a single, separate process that could run workers, and the process was shared for all subprojects? Would that be interesting for you?
Sounds like a shared build service which you can give work to do. 🙂
m
What about a single, separate process that could run workers, and the process was shared for all subprojects?
Sorry, was in a bit of a rush earlier. To elaborate a bit more: I have no specific interest in process isolation unless it is otherwise technically required. The two things I want:
• use dependencies that potentially conflict with other plugins
• not have my task invalidated if something else in the buildscript classpath changes
I think both of those can be fixed without spawning a separate process. If a separate process helps make this a reality then I'd consider it, but it's not a requirement per se.
👍 1
a
> I'd say there should be a way to make this work. I'd be willing to investigate this if you have a reproducer

Follow-up about classloader caching being delicate: both the Analysis API and Coroutines have 'shutdown' mechanisms.
• https://github.com/JetBrains/kotlin/blob/v2.2.20-RC/analysis/analysis-api-standalone/src/org/jetbrains/kotlin/analysis/api/standalone/StandaloneAnalysisAPISessionBuilder.kt#L251-L268
• https://github.com/Kotlin/kotlinx.coroutines/blob/1.10.2/kotlinx-coroutines-core/jvm/src/Dispatchers.kt#L81-L84
So, care is needed to avoid bad situations (e.g. KT-74931):
• A cached classloader can't be re-used if AA or Coroutines are shut down.
• AA or Coroutines can't be shut down until the classloader is unused.
• A cached classloader mustn't be dropped until the shutdown mechanisms have completed.
👀 1
m
Looks like KT-73438 is JetBrains-internal
fixed 1
thank you 1
The coroutines one is more or less ok-ish, I think? It's your usual static-state problem: if some of your libs rely on static state that can be left in a non-recoverable state, then it's bad
The good news is that if you control the code you can make sure
shutdown
is not called
If you don't well, it's more problematic. 😄
But I'd be surprised if a library pulls the rug out from under my
Dispatchers.IO
a
I think both of those can be fixed without spawning a separate process. If a separate process helps making this a reality then I'd consider it but it's not a requirement per se.
for my use-case (caching C compilations) the decision to use a separate process was mainly influenced by wanting to use an embedded DB. Using a separate process avoids headaches with merging or concurrent access.
👍 1
m
Yup, I hear you. In that case, it's definitely understandable
It's great that we have solutions like this to finally move on that problem of fine-grained caching.
I just wish ultimately we have a no-compromise solution that doesn't require spawning a separate process. I think we can save that
(thanks for updating that link, I can see it now 🙂 )
> to workaround having to call
disposeGlobalStandaloneApplicationServices
> so as not to leak a worker classloader

That sounds like the bigger issue TBH? Static state is dangerous but could be OK (or else we wouldn't have
Dispatchers.IO
), but leaking memory is not OK
a
The good news is that if you control the code you can make sure shutdown is not called
I think shutdown has to be called, otherwise the coroutines will keep running, preventing the classloader from being gc'ed
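This pinning can be illustrated without coroutines at all: any live worker thread strongly references its context classloader, so nothing that loader loaded can be collected until the pool is explicitly shut down. A generic JVM sketch (a plain executor standing in for the dispatcher internals, not the actual coroutines code):

```kotlin
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Stand-in for a coroutine dispatcher's thread pool: while its threads are
// alive, each one strongly references a context classloader, pinning it.
fun shutdownReleasesThreads(): Boolean {
    val pool = Executors.newFixedThreadPool(1)
    pool.execute { Thread.currentThread().contextClassLoader } // worker pins its loader
    pool.shutdown() // the analogue of the explicit coroutines/AA shutdown calls
    return pool.awaitTermination(5, TimeUnit.SECONDS) // only now can the loader be gc'ed
}
```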
m
Yea, that's the concerning issue
OkHttp has shutdown hooks as well except there's also an idle timeout
If coroutines are not doing anything for <insert configurable amount of time here> then stop all the Executors
In the very worst case, I think
disposeGlobalStandaloneApplicationServices
could probably be made so that it is recoverable?
Just store a global static flag
disposed
and restart the machinery next time someone comes in?
"Just" 😅
a
> I just wish ultimately we have a no-compromise solution that doesn't require spawning a separate process. I think we can save that

Yeah, I would like to avoid a separate process. I've been thinking about it. In theory, it's possible to merge RocksDB databases:
• task1 could open main-db in rw mode.
• Then task2 would try to open main-db in rw mode, fail (because it's open in task1), then open it as a read-only viewer.
• task2 would open a 'tmp-db'. If a value can't be read from the main-db, then it caches the result in tmp-db.
• task2 would complete, and leave an instruction for the current owner of main-db: "Hey, merge this db when you're able: /path/to/tmp-db".
• task1 watches for 'merge instructions', and merges tmp-db into main-db.
• task1 completes.
😮 1
but that ^ seems more complicated/brittle than a separate process
👍 1
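The handover protocol in those bullets can be sketched with plain files standing in for RocksDB. All names here (`SketchDb`, `CacheCoordinator`) are hypothetical, and this only models the coordination, not real RocksDB merging:

```kotlin
import java.io.File

// Plain-directory stand-in for an embedded DB: one key = one file.
class SketchDb(private val dir: File, val readOnly: Boolean) {
    fun put(key: String, value: String) {
        require(!readOnly) { "read-only viewer" }
        File(dir, key).writeText(value)
    }
    fun get(key: String): String? = File(dir, key).takeIf { it.exists() }?.readText()
}

class CacheCoordinator(private val root: File) {
    private val lock = File(root, "main-db.lock")
    private val main = File(root, "main-db").apply { mkdirs() }
    private val instructions = File(root, "merge-instructions")

    // Steps 1-2: try to own main-db read-write; if already owned, fall back to a read-only viewer.
    fun open(): SketchDb =
        if (lock.createNewFile()) SketchDb(main, readOnly = false)
        else SketchDb(main, readOnly = true)

    // Steps 3-4: a non-owner caches misses into a tmp-db and leaves a merge instruction.
    fun openTmp(name: String): SketchDb {
        val tmp = File(root, name).apply { mkdirs() }
        instructions.appendText(tmp.absolutePath + "\n")
        return SketchDb(tmp, readOnly = false)
    }

    // Steps 5-6: the owner merges pending tmp-dbs into main-db, then releases ownership.
    fun mergeAndClose() {
        if (instructions.exists()) {
            instructions.readLines().filter { it.isNotBlank() }.forEach { path ->
                File(path).listFiles()?.forEach { f ->
                    f.copyTo(File(main, f.name), overwrite = true)
                }
            }
            instructions.delete()
        }
        lock.delete()
    }
}
```

Even in this toy form, the moving parts (lock ownership, instruction files, merge timing) hint at why a single owning daemon process is the simpler design.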
m
What's the name of the axiom? Ockham's razor? Simple is better than complex
a
grug brained programmer? 😀 https://grugbrain.dev/ "apex predator of grug is complexity"
😃 1
m
Bookmarked this page ^^
> grug hear screams from young grugs at horror of many line of code and pointless variable and grug prepare defend self with club
yes!!
1
I'm curious why grug no like visitor pattern though
🤔 1
a
Yeah, I’m not 100% against the visitor pattern, despite what the grugbrain essay says. (In general, I’m not 100% for or against anything in software development.) However, I often think it’s better to encode operations directly in a tree rather than have another side thingie that performs operations over the tree. Sometimes this mixes concerns a bit, but I don’t mind that in this case.
from an interview
thank you 1
m
Grug smol brain really answer for everything!