# performance
a
I'm trying to learn about our build performance. I have a particular build scan I'm looking to understand better (I don't know if build scans are shareable like that). More specifically I'm trying to understand the task parallelism (or rather the lack of it). Is this a good channel for that?
image.png
I was expecting that timeline to be more densely packed. It almost looks like there are 3-4 batches of tasks that get executed in parallel, with a few slow, long-running tasks holding up the execution of others. I don't expect there to be a lot of dependencies between these tasks. I'm wondering if this has something to do with how Gradle queues tasks – perhaps tasks are queued in batches and only re-queued when all the workers have empty queues? I'm possibly reading too much into that chart.
e
Tasks within the same project do not run in parallel, unless they are specifically using the Worker API, or you have Configuration Cache enabled (and it has not been disabled by an incompatible task).
Without configuration cache, --parallel only enables parallelism between projects. That may be why you are seeing less parallelism than you expect.
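To check which of these settings the build is actually running with, something like the following could be dropped into the settings script temporarily. This is only a sketch, assuming a Kotlin DSL build (settings.gradle.kts); the log wording is made up for illustration.

```kotlin
// settings.gradle.kts (assumes the build uses the Kotlin DSL)
// Prints the parallelism settings Gradle resolved for this invocation.
val startParameter = gradle.startParameter

// true when --parallel was passed or org.gradle.parallel=true is set in gradle.properties
println("Parallel project execution: ${startParameter.isParallelProjectExecutionEnabled}")

// Upper bound on concurrently running workers (--max-workers / org.gradle.workers.max)
println("Max worker count: ${startParameter.maxWorkerCount}")

// Configuration Cache is switched on separately, e.g. with --configuration-cache
// or org.gradle.configuration-cache=true in gradle.properties (Gradle 8.1 and later).
```

If the first line prints false, tasks in different projects will not run concurrently at all, no matter how independent the projects are.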
a
Most of these tasks should be in separate projects. We have 50+ projects in this build, each with a small number of tasks.
i
One way to increase the parallelization of a build is through modularization. For example, in the build scan you shared, we can observe a bottleneck in the execution of the task :oss:airbyte-api:server-api:compileKotlin. Until this task completes, other modules, such as :oss:airbyte-commons-converters or :oss:airbyte-data, cannot start their own Kotlin compilation tasks, so workers sit idle waiting on it. Since this task is the third slowest in the build (47 seconds), I would recommend splitting the :oss:airbyte-api:server-api project into smaller modules to reduce the size of the compilation unit. This would also let you benefit more from build cache avoidance savings: with smaller projects, the likelihood of a cache hit increases, especially when changes stay confined to a few modules. Note that these observations are specific to the build you shared. In a fully cacheable build, Gradle optimizes parallelization automatically. I therefore suggest focusing first on the bottlenecks created by the projects and tasks with the longest durations, and analyzing the project dependency graph to better understand how central each project node is.
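As a rough illustration of what "centrality" means here, the standalone Kotlin sketch below counts how many projects transitively depend on each project. The two edges onto :oss:airbyte-api:server-api are inferred from the bottleneck described above; :example:app and its edge are purely hypothetical, and the real graph would have to come from the build itself.

```kotlin
// Project -> projects it depends on.
// Assumption: illustrative graph only, not the real Airbyte dependency graph.
val dependsOn: Map<String, List<String>> = mapOf(
    ":oss:airbyte-api:server-api" to emptyList(),
    ":oss:airbyte-commons-converters" to listOf(":oss:airbyte-api:server-api"),
    ":oss:airbyte-data" to listOf(":oss:airbyte-api:server-api"),
    ":example:app" to listOf(":oss:airbyte-data"), // hypothetical project
)

// Breadth-first walk over reverse edges: every project that directly or
// indirectly depends on `target`, i.e. everything a slow `target` can hold up.
fun transitiveDependents(target: String, graph: Map<String, List<String>>): Set<String> {
    val dependents = mutableSetOf<String>()
    val queue = ArrayDeque(listOf(target))
    while (queue.isNotEmpty()) {
        val current = queue.removeFirst()
        for ((project, deps) in graph) {
            if (current in deps && dependents.add(project)) {
                queue.addLast(project)
            }
        }
    }
    return dependents
}

fun main() {
    // Projects with the most transitive dependents are the most central ones.
    dependsOn.keys
        .map { it to transitiveDependents(it, dependsOn).size }
        .sortedByDescending { it.second }
        .forEach { (project, count) -> println("$project is upstream of $count project(s)") }
}
```

A project that is both slow to compile and upstream of many others is the first candidate for splitting, since every dependent has to wait for it.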
a
What's the best way to look at the project/module dependency graph from gradle's perspective?
i
a
perfect, thanks. I'll look into that.