# docker-commandbox
d
Just tracked down(ish) an interesting problem - we're running CommandBox in AWS Fargate containers using pipelines to deploy. They recently started failing to deploy with an out-of-memory error. The container has a memory limit of 4GB and CommandBox ACF 2018 is given 3.5GB via server.json jvm->heapSize. This has worked fine, but it seems ortussolutions/commandbox:3.6.4 wants more than 500MB for the kernel to run - if I reduce the heapSize to 3GB it's OK. 3.6.3 didn't seem to have this issue. Any ideas what's changed in 3.6.4 that makes it want more memory for the base OS? It may be that my assumption that 500MB is enough for the OS in a container is wrong, though - anyone have any thoughts on how much headroom a container should have outside of the application running?
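The setup described would read something like this in `server.json` (paraphrasing - the value is assumed from the 3.5GB figure above, expressed per CommandBox's `jvm.heapSize` setting):

```json
{
    "jvm": {
        "heapSize": "3584m"
    }
}
```

That leaves roughly 512MB of headroom inside the 4GB container for everything outside the heap.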
b
@dougcain What does top show inside the container as far as what processes are using how much memory?
d
not easy to see - as the container is deployed in Fargate it spins up, the kernel notices there's not enough memory and kills the process, which stops the container. I can see the stdout logs, which show CommandBox starting etc. and the CF server becoming available, so all looks normal. I "may" be able to connect to the container as it spins up and before it stops, but it's a very short window
b
Can you take the image and run it on docker locally?
d
yes, that all works fine - will put the same constraints AWS has on and see what it does
it's odd as I didn't see much of a difference between 3.6.3 and 3.6.4 when testing - just noticed this odd deploy behaviour after AWS was updated with it. Tracked it down to AWS returning the out-of-memory error. Anyway, will look deeper on a local version running Docker itself - AWS runs its own thing under Fargate so not much chance of replicating that 🙂
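A rough way to approximate the Fargate hard limit locally (a sketch - the image tag matches the one discussed above, but Fargate's actual cgroup setup won't match exactly):

```shell
# Hard-cap the container at 4GB; setting --memory-swap equal to
# --memory disables swap, so the OOM killer fires like it would
# under a Fargate task memory limit
docker run --rm -it \
  --memory=4g \
  --memory-swap=4g \
  ortussolutions/commandbox:3.6.4
```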
try as I might I can’t replicate it in my local docker - will see if I can get further with AWS
b
@jclausen
j
@dougcain Is this a Coldbox app and, if so, what version?
d
@jclausen our app is in two halves currently - one is ColdBox-based and the other isn't yet (migrating to it). In both cases, running as separate containers, they seem to do the same thing. What's frustrating is I don't get an easy way to view the container, as AWS deploys it before the kernel kills it. The logs from CommandBox look normal, which come out in CloudWatch. I seem to have it working by setting server.json with a minHeapSize of 1G and a 3G heapSize, which in turn is currently 1G less than the container hard memory limit of 4G. I was also forcing a box update in my container build, which I have taken out - so there may have been something experimental in there. My main question, though, is if a container has a hard memory limit, what kind of headroom should it have from the JVM heap size
j
The big difference in the 3.6.4 version of the image is the version of Lucee used. How much non-heap memory your application uses depends on a lot of factors. The number of threads in the container makes a significant difference to the non-heap memory required. For example, if you are on ColdBox 6.5 and above, LogBox and the ColdBox schedule listeners add 20 threads on application startup. Plus you have the Lucee threads. Then if you have any kind of APM Java agent, or use one of AWS's, that is going to add threads and take up non-heap memory. How much heap is immediately committed is determined by whether you supply an `-Xms` argument. If you use just `jvm.heapSize` in your `server.json` then that is the max, but no min is set - so the heap starts low and then grows. The min heap also has an impact on every thread, if you set that argument. Here's an example of a Forgebox container, which has a max heap size of 1.5G, no min, for a lifecycle of 1 hour. You can see the non-heap space grows over time, as do the number of threads. These are coming from multiple sources - Lucee, ColdBox, custom scheduler threads, Elastic APM, etc. Based on these metrics, I wouldn't feel comfortable running this application with a memory limit under 2GB. We set the container to a 2.5GB limit in production, to be safe.
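To make that concrete, a `server.json` sketch that sets both a min and max heap (the values mirror the ones @dougcain mentioned above; CommandBox translates these into `-Xms1g -Xmx3g` for the JVM):

```json
{
    "jvm": {
        "heapSize": "3g",
        "minHeapSize": "1g"
    }
}
```

Setting the min means that 1G is committed immediately at startup rather than grown into, which makes the container's memory footprint more predictable against a hard limit.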
d
Thanks @jclausen very helpful
j
ColdBox versions greater than 6.5 also have to have an `onApplicationEnd` method in the `Application.cfc` (see the ColdBox templates) to make sure those scheduler threads are not orphaned when the application restarts. Either that, or you should set a `CFCONFIG_APPLICATIONTIMEOUT=365,0,0,0` environment variable to ensure the application never restarts/orphans threads