# cfml-general
e
I want to learn about how the `cfml` engines (Lucee or ACF) work under the hood. How they process things, and how the servlet container facilitates them. I have googled but found nothing specifically on this topic. Please help me with guidance.
m
Have you checked out the Lucee source code? https://github.com/lucee/Lucee
e
yes but I am new to coldfusion and looking for documentation or books where I can learn how the whole process takes place. Sorry, but I am not ready to understand the source code yet.
m
Someone correct me if I'm wrong, but I don't think those books (or even docs) exist. You could try on the Lucee dev forum (https://dev.lucee.org).
e
okay understandable and thank you. Btw, can you please tell me what actually happens when a request comes to lucee. How the cfm or cfc files get executed, are they compiled into byte code before the server starts?
g
they are compiled on the fly with the first execution
a
JIT compilation is how CF apps are typically deployed; however, if you really need to, you can pre-compile and then ship as a WAR, etc. You may want to do that so you have sourceless distributions, for example.
b
This is a pretty broad question. Do you already have an understanding of how a Java servlet/WAR works?
e
@bdw429s yes, I have some understanding of how Tomcat (Coyote, Catalina, Jasper) operates, but I'm interested in understanding it in more detail in terms of CFML
@aliaspooryorik So you are saying that when a request for a specific cfm or cfc reaches the Lucee servlet, it compiles only that file and responds?
a
Yes, it compiles on the fly if it hasn't already been compiled.
b
There used to be a nice blog post many years ago that gave a run down of this from one of the CF engineers. Like prolly back in the 2008 days-- I doubt the blog still exists
Lucee, BoxLang, and Adobe CF have their own peculiarities, but the servlet match is turned into a physical path to a template on disk. There is a disk cache of class files and an in-memory cache as well. Templates which have not been compiled yet are JIT parsed and compiled into bytecode and loaded on the fly via a custom URL class loader, where they are then invoked. The plumbing of the servlet is abstracted via scopes, BIFs, and tags. Or, in the case of BoxLang, the servlet is completely optional and only present in specific web runtimes.
The servlet wrapper is quite simple, at least in BoxLang. I wrote it in an afternoon as it was just a matter of stubbing out the web.xml template that maps .cfm and .cfc files to the CF servlet https://github.com/ortus-boxlang/boxlang-servlet/blob/development/src/main/resources/boxlang-servlet/web.xml And then the servlet class (which is class loaded and invoked by the servlet container) simply maps the incoming URI to the physical template and sets up the request. https://github.com/ortus-boxlang/boxlang-servlet/blob/development/src/main/java/ortus/boxlang/servlet/BoxLangServlet.java#L109-L124 The logic in `init()` runs once when the servlet context is created, and the `service()` method is run for every incoming request
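To make the `init()` / `service()` split concrete, here is a minimal Java sketch. It deliberately avoids the real Servlet API so it runs on a bare JDK. The class name `TemplateServletSketch`, the web root `/var/www/app`, and the path-escape check are assumptions for illustration and are not taken from the BoxLang source.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical stand-in for a CFML servlet; it does not use the real Servlet API.
class TemplateServletSketch {
    private Path webRoot;          // set once in init()
    private int requestCount = 0;  // incremented on every service() call

    // Plays the role of Servlet.init(): one-time setup.
    void init(String webRootDir) {
        this.webRoot = Paths.get(webRootDir).toAbsolutePath().normalize();
    }

    // Plays the role of Servlet.service(): runs for every incoming request
    // and maps the request URI to a physical template path.
    Path service(String requestUri) {
        requestCount++;
        // Strip the leading slash so the URI resolves relative to the web root.
        String relative = requestUri.startsWith("/") ? requestUri.substring(1) : requestUri;
        Path template = webRoot.resolve(relative).normalize();
        // Assumed safety check: refuse paths that escape the web root.
        if (!template.startsWith(webRoot)) {
            throw new IllegalArgumentException("Path escapes web root: " + requestUri);
        }
        return template;
    }

    int getRequestCount() {
        return requestCount;
    }

    public static void main(String[] args) {
        TemplateServletSketch servlet = new TemplateServletSketch();
        servlet.init("/var/www/app");                             // once
        System.out.println(servlet.service("/admin/index.cfm")); // per request
        System.out.println(servlet.service("/Application.cfc")); // per request
    }
}
```

In a real deployment the container calls these methods itself, and the web.xml mapping decides which URIs reach the servlet in the first place.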
Once the path to the template is resolved and we can calculate what the Java class name will/would be for the compiled template, then it's pretty easy to check the memory and disk caches for class files (which are loaded via the class loader's `defineClass()` method). Lucee, BoxLang, and ACF vary quite a bit in their approach to generating bytecode, but generally speaking, a single CFM file translates to a single Java class where the logic in the template is represented in the bytecode of a method which is invoked via a shared interface or superclass. Lucee will compile UDFs (be it from CFMs or CFCs) into a single class. ACF and BoxLang will use inner classes for each UDF instance, but that's generally an implementation detail hidden from the developer.
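The lookup order described above (memory cache, then disk cache, then compile) can be sketched roughly as follows. This is an illustration under stated assumptions, not any engine's actual code: the naming scheme in `classNameFor`, the `fakeCompile` placeholder, and the temp-directory class cache are all hypothetical, and the final `defineClass()` step is only marked with a comment because the "bytecode" here is fake.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical two-level template cache; not taken from Lucee, ACF, or BoxLang.
class TemplateCacheSketch {
    private final Map<String, byte[]> memoryCache = new ConcurrentHashMap<>();
    private final Path classDir;  // stand-in for the on-disk class file cache
    int compileCount = 0;         // how many times the fake compiler ran

    TemplateCacheSketch(Path classDir) {
        this.classDir = classDir;
    }

    static Path newTempClassDir() {
        try {
            return Files.createTempDirectory("cfclasses");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Assumed naming scheme: "/admin/index.cfm" becomes "admin_index_cfm".
    static String classNameFor(String templatePath) {
        String name = templatePath.replaceAll("^/+", "").replaceAll("[^A-Za-z0-9]", "_");
        // Java identifiers cannot be empty or start with a digit.
        return (name.isEmpty() || Character.isDigit(name.charAt(0))) ? "_" + name : name;
    }

    // Returns the "bytecode" for a template: memory first, then disk, then compile.
    byte[] load(String templatePath) {
        String className = classNameFor(templatePath);
        byte[] cached = memoryCache.get(className);
        if (cached != null) {
            return cached;                              // 1. in-memory hit
        }
        Path classFile = classDir.resolve(className + ".class");
        try {
            byte[] bytes;
            if (Files.exists(classFile)) {
                bytes = Files.readAllBytes(classFile);  // 2. disk hit
            } else {
                bytes = fakeCompile(templatePath);      // 3. compile on first use
                Files.write(classFile, bytes);
            }
            // A real engine would hand these bytes to ClassLoader.defineClass() here.
            memoryCache.put(className, bytes);
            return bytes;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Placeholder for the real parser and bytecode generator.
    private byte[] fakeCompile(String templatePath) {
        compileCount++;
        return ("bytecode for " + templatePath).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Path dir = newTempClassDir();
        TemplateCacheSketch cache = new TemplateCacheSketch(dir);
        cache.load("/admin/index.cfm");  // compiles and writes to disk
        cache.load("/admin/index.cfm");  // served from memory
        // A fresh instance simulates a restart: memory is empty, disk is not.
        TemplateCacheSketch restarted = new TemplateCacheSketch(dir);
        restarted.load("/admin/index.cfm");
        System.out.println("compiles before restart: " + cache.compileCount);
        System.out.println("compiles after restart: " + restarted.compileCount);
    }
}
```

A real engine also has to notice when the source template is newer than the cached class file; that check is left out of this sketch.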
All 3 CF engines use a "context" object that represents/wraps the underlying HTTP request details and contains the scopes of variables for that request (URL, form, CGI, etc) as well as references to the session and/or application contexts. BoxLang uses a series of contexts in a parent/child relationship to contain the scopes, making the language core small but modular and extensible whereas Lucee/Adobe mostly know all possible scopes ahead of time. Internally, this context is passed pretty much everywhere (and usually registered in the thread local) so the runtime always has access to the variables on the page.
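A rough Java sketch of such a context object, assuming a simple parent/child chain and a thread local registration. The class `RequestContextSketch` and its methods are made up for illustration. Real engines use much richer classes, and the scope lookup order here is simply insertion order.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical request context; not the actual class of any CF engine.
class RequestContextSketch {
    // Registered per thread so runtime code can always find the current context.
    private static final ThreadLocal<RequestContextSketch> CURRENT = new ThreadLocal<>();

    private final RequestContextSketch parent;  // e.g. the application context
    private final Map<String, Map<String, Object>> scopes = new LinkedHashMap<>();

    RequestContextSketch(RequestContextSketch parent) {
        this.parent = parent;
    }

    // CFML variable names are case-insensitive, so keys are normalized.
    private static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT);
    }

    void set(String scope, String key, Object value) {
        scopes.computeIfAbsent(normalize(scope), k -> new LinkedHashMap<>())
              .put(normalize(key), value);
    }

    // Search this context's scopes in order, then fall back to the parent.
    Object getVariable(String key) {
        String k = normalize(key);
        for (Map<String, Object> scope : scopes.values()) {
            if (scope.containsKey(k)) {
                return scope.get(k);
            }
        }
        return parent == null ? null : parent.getVariable(key);
    }

    static void bind(RequestContextSketch ctx) { CURRENT.set(ctx); }
    static RequestContextSketch current() { return CURRENT.get(); }
    static void unbind() { CURRENT.remove(); }

    public static void main(String[] args) {
        RequestContextSketch app = new RequestContextSketch(null);
        app.set("application", "appName", "demo");

        RequestContextSketch request = new RequestContextSketch(app);
        request.set("url", "name", "brad");
        request.set("form", "foo", true);

        bind(request);
        try {
            System.out.println(current().getVariable("NAME"));     // found in the url scope
            System.out.println(current().getVariable("appName"));  // found via the parent
        } finally {
            unbind();  // avoid leaking the context on pooled threads
        }
    }
}
```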
Instead of compiling straight to the Java constructs of the same name, all 3 CF engines provide their own implementation of
• variable definition (via scopes)
• classes and type checking
• operators (string, math, binary, logical, etc)
• flow control constructs (if, do, dowhile, for loop, index loop, switch, etc)
So CFML code such as
// CFML code
if( foo && name == "brad" ) {}
will compile down to the bytecode equiv of a Java if statement, but the actual operators will be replaced with classes that implement the "and" and "equals" operator. e.g.
// Java code equivalent represented in bytecode
if( AndOperator.invoke( BooleanCaster.cast( foo ), EqualsOperator.invoke( context.getVariable( "name" ), "brad" ) ) ) {}
☝️ Pseudo code-- the CF engines differ a good deal here, but accomplish the same thing at the end of the day. This allows them to enforce what can be cast to a Boolean, implicit type coercion, variable access, and logical operator behavior.
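For anyone who wants to run the idea, here is a small self-contained Java version of that pseudo code. The class names mirror the pseudo code but are hypothetical, and the casting rules are a simplified assumption ("yes"/"no", booleans, and numbers only). A plain `Map` stands in for the context, and `invoke` evaluates both operands eagerly, whereas a real engine has to preserve short-circuit evaluation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical operator classes mirroring the pseudo code; not real engine APIs.
class OperatorSketch {

    static final class BooleanCaster {
        // Simplified CFML-style casting: booleans, numbers, "yes"/"no", "true"/"false".
        static boolean cast(Object value) {
            if (value instanceof Boolean) {
                return (Boolean) value;
            }
            if (value instanceof Number) {
                return ((Number) value).doubleValue() != 0;
            }
            if (value instanceof String) {
                String s = ((String) value).trim().toLowerCase();
                if (s.equals("true") || s.equals("yes")) return true;
                if (s.equals("false") || s.equals("no")) return false;
                try {
                    return Double.parseDouble(s) != 0;  // numeric strings
                } catch (NumberFormatException e) {
                    // not numeric either; fall through to the error below
                }
            }
            throw new IllegalArgumentException("Cannot cast to boolean: " + value);
        }
    }

    static final class EqualsOperator {
        // CFML compares strings case-insensitively with ==.
        static boolean invoke(Object left, Object right) {
            if (left instanceof String && right instanceof String) {
                return ((String) left).equalsIgnoreCase((String) right);
            }
            return Objects.equals(left, right);
        }
    }

    static final class AndOperator {
        // Both operands are cast through the engine's own rules before the "and".
        static boolean invoke(Object left, Object right) {
            return BooleanCaster.cast(left) && BooleanCaster.cast(right);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the variables held by the request context.
        Map<String, Object> variables = new HashMap<>();
        variables.put("foo", "yes");
        variables.put("name", "Brad");

        // Equivalent of: if( foo && name == "brad" ) {}
        boolean result = AndOperator.invoke(
                variables.get("foo"),
                EqualsOperator.invoke(variables.get("name"), "brad"));
        System.out.println(result);
    }
}
```

Routing every operator through a class like this is what lets the engine accept `"yes"` as a Boolean or match `"Brad"` against `"brad"`, neither of which a raw Java `if` would do.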
@eniac00 Let me know if you have further questions
d
You're a(n inter)national treasure Brad!
l
@bdw429s that was great - thanks! This should be converted to a blog post before it becomes lost ephemera...
e
@bdw429s Thank you so much Brad for your thorough explanation, it actually means a lot to me. Can you please check the flow if I have understood it correctly?
1. the incoming request -> servlet match -> physical path  (happens through servlet class)
2. if it is the first run then servlet class 'init()' method is called else 'service()'
3. searches if templates are in the cache
4. if not in cache then JIT parse the template into java class and then compiled into byte code (each cfm or cfc converts into single class and all UDFs are in one class)
5. byte code loaded on the fly via a custom URL class loader and gets invoked()
Please let me know if there are mistakes or there are more explanations to be added. Thanks again and God bless you.