Can FR tell me any details about ColdFusion's temp...
# fusion-reactor
d
Can FR tell me any details about ColdFusion's template caching? We're running a fairly large 24x7 app on CF2018. Every so often, seemingly nonsense crashes start happening on a page that hasn't changed in a while, errors like
Element EXECUTIONMODE is undefined in THISTAG
and
Could not find the included template
on pages that have run many times for weeks without problems and haven't been modified since. Clearing the template cache from CF Admin fixes it, instantly, haven't run into a case where it didn't. Trusted cache and Save class files are both unchecked, and have been for a long time, and I cleared out the cache directory manually a while ago too. We're not dynamically writing pages to disk, or using the in-memory file system. It's mostly old- or semi-old-school code, cfms that call CFCs for data, business logic, and sometimes rendering, some common includes, the usual, straight-up ninja-free stuff. Clearing the template cache doesn't change the code but fixes the problem (until next time), which on the face of it seems to say the problem isn't our code, but some kind of foobar in the CFML engine itself. I've raised a ticket with Adobe, but without steps to reproduce it's hard for them to get any traction. How can I get FR to give me maximum details about the state of CF's template cache? I looked at every FR metric I could think of around the time of the last crash, didn't see any spikes or obviously bent stuff, but (hopefully) I just don't know what to look for. One theoretical possibility that matches the symptoms is that something is interrupting the caching process while it's writing, but I don't know how to get a handle on what might cause that. I don't even know why template caching is in play at all, when it seems to me shut off in cf admin. Any thoughts are most welcome. (The crashes I know about were caught by our error handling, so they don't show up in FR, but the crash dumps we capture just show specific errors like the ones above, nothing directly related to the template cache.)
b
To be clear, the settings you disabled do NOT disable the in-memory cache of class files. You wouldn't want to disable that; it would be terrible.
You just disabled the cache on disk
I just started up an ACF 2021 server in CommandBox and FR's graphs are all empty on that page after loading a few pages so no idea how much you can trust the graphs 🙂
@Dave Merrill
When I start an ACF 2018 server, I get values in the graphs.
@mflewittintergral Does FusionReactor not support CF 2021 for those metrics?
@Dave Merrill I'll also add here that the default limit to the in memory class file cache is 1024 which isn't really that big
Basically, if you have more than 1000 files in your app, you're probably filling that up. I would recommend increasing it just in case your issue is related to templates getting re-compiled on the fly because they were evicted from the cache in the first place.
Honestly, if you have a heap of several Gigs, you could probably increase that cache size by several fold with no issues
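As a rough sketch of the suggestion above, the cache size could be read and raised through the Admin API rather than the CF Admin UI. This assumes the default `CFIDE.adminapi` mapping is reachable, that `adminPassword` (a placeholder) holds the CF Administrator password, and that `TemplateCacheSize` is the property name accepted by `getCacheProperty`/`setCacheProperty` on your version; check the Admin API reference for `runtime.cfc` before relying on it. The value 8192 is purely illustrative.

```cfml
<cfscript>
// Placeholder: load the CF Administrator password from secure config in practice.
adminPassword = "changeme";

// Log in to the Admin API (assumes the default CFIDE.adminapi mapping).
admin = createObject("component", "CFIDE.adminapi.administrator");
if (!admin.login(adminPassword)) {
    throw(message="Admin API login failed");
}

runtime = createObject("component", "CFIDE.adminapi.runtime");

// Assumption: "TemplateCacheSize" is the property name for the
// "Maximum number of cached templates" setting.
currentSize = runtime.getCacheProperty("TemplateCacheSize");
writeOutput("Current template cache size: #currentSize#<br>");

// Illustrative value only; size it to the number of templates actually in use.
runtime.setCacheProperty("TemplateCacheSize", 8192);

admin.logout();
</cfscript>
```

The same setting is exposed in CF Admin under Caching, which may be the simpler route for a one-off change.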
d
Thanks Brad. Really really roughly, there are 30k files, but that includes some non-cf and some old obsolete stuff that's never called. I looked at the template cache details before, and like I said didn't see anything unusual around the time things went sideways -- no huge peaks or drops, not sure what else to look for. It wasn't even a super busy time of day. Symptoms and functional cure imply something is messing up the template caching process, but how do I identify that more specifically, and more to the point, prevent it? In some ideal world I'd be able to identify when a specific template was last cached, but I'm not sure how that'd even help.
(Copying from the other channel, with an FR-specific addendum.) To be clear about the symptoms:
• This happens on multiple different pages, but it's quite a small subset of the pages in the site.
• Once it starts happening on a particular page, that page will throw whatever error it's throwing a lot, probably every time it's called, though I can't be certain if it's actually EVERY time.
• The error thrown on a given page in a given episode is the same. That matches the hypothesis that the template cache for that page has gotten corrupted. So does the fact that clearing the template cache fixes it, every time to the best of my knowledge.
• Other pages will still be ok.
I appreciate the brainstorming, but it seems curious to me that port exhaustion would manifest like that. It seems much more like corruption of the in-memory template cache, and I still don't get how port exhaustion could cause that.
@mflewittintergral Can FR give us any insight into the in-memory template cache? In particular, can it tell us anything that could help with any of the following:
a) verifying that the template cache is in fact corrupted
b) showing us something about the nature of the corruption, for example, whether it's truncated, overwritten with unexpected data, etc.
c) helping track down why it's happening, and ideally, how to prevent it
I very much appreciate any insights here.
b
@Dave Merrill Your questions about looking deeper into the template cache to see what's there and how long it's been there are really great questions, but honestly a dead end IMO. You'll never get those answers. If you spent enough hours fiddling with Java reflection, you may find a way to poke at the template cache and monitor it, but there's no promise you'll find anything helpful. That's why my first suggestion still remains my current suggestion, which is to increase the size of the template cache so it never has to evict any items from it and see if that helps.
Mikey is free to answer as well, but your specific questions about "can FR do a, b, c..." are pretty much "no, no, and no" lol
All I'm aware that FR grabs is the size of the cache and that's it. If the bytecode of a given class is somehow messed up, there's prob no way to tell
This does bring up something interesting though, which is the remote possibility of FR's byte code instrumentation messing up the class on the fly. 🤔 @mflewittintergral is there a way to simply disable ALL instrumentation of bytecode in FR to test?
@Dave Merrill If this happens regularly enough that you know for certain it will happen on a given server within X days, you could uninstall FR from a server for X days just as another test.
m
The cache size metrics from CF are currently not available as they were removed from the Adobe metrics API, but we have manually instrumented them in the next FR version to get them back. You can disable some instrumentation in the settings under the enable / disable menu, but not all. As far as I am aware, FR has no involvement in the template caches of CF or Lucee.
If you know it happens within a set window of time, it's a dirty fix, but I guess technically you can clear the template cache programmatically via the Admin API object at a shorter interval than the issue occurs.
I can only really see us having byte code manipulation in an actual CF template if you have the line tracking feature on, otherwise I don't see what in those templates we would cut into
👍 1
d
@bdw429s Good observations, thanks. Really really roughly, we're seeing episodes of this maybe twice a month, not that regularly I think, a bit hard to cross-tabulate them in the no-time I have right now. I suppose <cfcache action="flush"> in a scheduled task overnight is worth a shot. @mflewittintergral Is that the "API" you were referring to, or something more specific?
b
The adminAPI supports it
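A minimal sketch of what that could look like in a scheduled task, assuming the default `CFIDE.adminapi` mapping is reachable from the task and that `adminPassword` (a placeholder) holds the CF Administrator password. Note that this goes through `runtime.cfc`'s `clearTrustedCache()`, which is the Admin API counterpart of the "Clear Template Cache Now" button, not `<cfcache action="flush">`.

```cfml
<cfscript>
// Placeholder: load the CF Administrator password from secure config in practice.
adminPassword = "changeme";

// Log in to the Admin API (assumes the default CFIDE.adminapi mapping).
admin = createObject("component", "CFIDE.adminapi.administrator");
if (!admin.login(adminPassword)) {
    throw(message="Admin API login failed");
}

// Called with no arguments, clearTrustedCache() clears the whole template cache.
// It also accepts an optional list of template paths to clear selectively.
runtime = createObject("component", "CFIDE.adminapi.runtime");
runtime.clearTrustedCache();

admin.logout();
</cfscript>
```

Every template is recompiled on its next request after a clear, so a quiet overnight slot is probably the least disruptive time to run it.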
But I'd still recommend just trying to increase the cache size first
You're dealing with a LOT of unnecessary compilation with the default setting! I'd recommend that even if you weren't having issues.