So I've recently discovered the Apache POI Library...
# cfml-general
h
So I've recently discovered the Apache POI Library. I've a project where I'm trying to modify text in word.docx documents. I went to chatGPT to see if it had suggestions, which it did. The suggestions don't seem to be working. I've downloaded the following jars and added them to the lib/ folder in my Lucee server/WEB-INF/blah/blah/lib folder. • poi-5.2.3.jar • poi-examples-5.2.3.jar • poi-excelant-5.2.3.jar • poi-javadoc-5.2.3.jar • poi-ooxml-5.2.3.jar • poi-xml-full-5.2.3.jar • poi-ooxml-lite-5.2.3.jar • poi-scratchpad-5.2.3.jar I restarted Lucee and tested. The only library that hasn't thrown a fit is the poi jar. I load the jars like so:
Copy code
poi_library = createObject("java","java.io.FileOutputStream")
  //WPF_document = createObject("java", "org.apache.poi.xwpf.usermodel.XWPFDocument")

  dump( poi_library )
The dump of the poi_library is the only thing that works at this point. When I uncomment the WPF_document variable and try to dump that I get an error. The error message reads: org/apache/commons/io/output/UnsynchronizedByteArrayOutputStream Interestingly, if I don't try and dump the variable, no error happens. I've been at this for about a day and the list of errors grows as I try to add more java classes. My hope is that someone out there has worked with the POI library successfully and can steer me in the right direction, or can post a link where someone explains the proper procedure to use the library with Lucee.
1
d
The cfspreadsheet project and I think some of the open-source cfdocument-ish stuff uses POI, and there was a cfpoi lib at some point, but I'm not sure what the most active one is beyond whatever is on github: https://github.com/search?q=cfspreadsheet&type=repositories (one of those is the current on I see on forgebox.io I believe) It almost always comes down to a conflict with the XML libraries historically, but all those projects should somehow work around that, if it's indeed even still an issue. You can post what you're running into (actual errors and whatnot) and we can try to help you figure out what is happening or you could also try installing the lucee extension for cfspreadsheet and see if you can piggyback on it somehow. FWIW you might not need all those libs, so to start with you might want to figure out what the bare minimum is versus adding them all out of the gate, perhaps looking at those other projects and seeing what all they include.
b
Is there more information about the error message? My first impression is that your set-up needs an additional dependency. Apache Commons IO perhaps? For example, unzip the binary download, copy commons-io-2.16.1.jar to your lib directory, and see what happens.
c
The spreadsheet-cfml library uses the latest POI and provides a method allowing you to easily call any of the bundled java classes: https://github.com/cfsimplicity/spreadsheet-cfml/wiki/createJavaObject
❤️ 1
As you are using Lucee the jars will be loaded via OSGi which means you shouldn't get any conflicts with other versions you may have on the class path.
h
Notice the message org/apache/commons/io/output/UnsynchronizedByteArrayOutputStream As I was looking through all the jars that came with the POI-5.2.3 binaries I saw the very jar you referenced all be it the version number was 2.11.0. I did add that to my Lucee/lib folder. With no change. Question for anyone. The poi-bin-5.2.3 binaries that I downloaded has three sub-folders: Auxillary lib oosml-lib Do I go into the individual folders and copy the jars into Lucee/lib folder, or do I copy the folders themselves into the Lucee/lib folder?
typo: Not oosmls-lib but ooxml-lib
Alos, I'm not an experienced Slack user. Should I keep this conversation limited to this thread, or is it something I should share in cfml-general?
d
Did you add the commons\.io jar? Often what I'll do is a little sanity check on the libs, using something like this:
Copy code
#!/bin/sh 
find "$1" -name "*.jar" -exec sh -c 'unzip -l {} 2>/dev/null |grep -H --label {} '$2'' \;
so if you saved that as
findclass
you would use it like
./findclass ./libs UnsynchronizedByteArrayOutputStream
(assuming the libs are in ./libs from where you are calling it, otherwise you'd put in the path) this should work on unix-like systems (mac, linux)
we like to keep things organized in threads, so here is the place to discuss it!
It does make it kind of hard to reply to other replies (and while I'm moaning— how is the code insertion still so bad in Slack? but I digress. 😁)
h
I found the commons-io-2.11.0.jar in the poi-binary/lib folder and copied it to the Lucee/lib folder. I did nothing to add it to my cfml code, howerver.
and yes I agree. The code insertion is frustrating. . . .
d
You shouldn't need to add it to the cfml code since it's in the main classpath (tho this is also a recipe for class version conflicts, regardless of OSGi)
I'm guessing adding it didn't solve the problem?
h
The error that I posted above comes from the following code snippet:
Copy code
FOS = createObject("java","java.io.FileOutputStream")
  WPF_document = createObject("java", "org.apache.poi.xwpf.usermodel.XWPFDocument")
  // WPF_paragraph = createObject("java", "org.apache.poi.xwpf.usermodel.XWPFParagraph")
  // WPF_run = createObject("java", "org.apache.poi.xwpf.usermodel.XWPFRun

  dump( WPF_document )
No it did not
d
One solution is to follow the instructions for installing this: https://github.com/cfsimplicity/spreadsheet-cfml and then once you can verify that it is working, we should be able to get the WPF working
I would remove all the libs you added to the main libs dir first tho
With the spreadsheet-cfml code you get to keep the libs for POI separate, which is really the way to go.
h
I'm assuming that the cfsimpilicity-spreadsheet-cfml also works with Word documents?
d
Yep, theoretically, and worst case you would add some libs to the spreadsheet lib dir (versus the main lucee ones)
1
h
Okay, I'll work on getting tht done.
🍀 1
d
You could try it on a fresh instance pretty easy using commandbox, which is an excellent tool for doing any manner of CFML work
I'd set up a separate little project to test the stuff on (and use some tests) for easy isolation
c
Denny's quite right, although with Lucee (as opposed to ACF) the lib jars are loaded from an OSGi bundle (
lib-osgi.jar
) rather than from the
/lib
dir, which makes it harder to add jars. But you can easily configure the library to use JavaLoader which does load from
/lib
and also provides good isolation (this is what happens by default with ACF). I've only worked with the spreadsheet side of POI, but I suspect much of the other functionality should work with what's already bundled, which includes
commons-io
. Do make sure you use the
CreateJavaObject()
method I mentioned though. That will ensure the classes are invoked from the right place.
Here's a very simple example of creating a Word doc. No additional jars were needed.
Copy code
//Instantiate the library
spreadsheet = new spreadsheet.Spreadsheet() //adapt to where you have put the library
// Create a Word docx object
document = spreadsheet.createJavaObject( "org.apache.poi.xwpf.usermodel.XWPFDocument" )
// Add some text
para = document.createParagraph()
textRun = para.createRun()
textRun.setText( "A programmatically written Word doc!" )
// Save to a Word file
filepath = "c:/temp/word.docx" //adapt to where you want to save the file
try{
	outputStream = CreateObject( "java", "java.io.FileOutputStream" ).init( filepath )
	document.write( outputStream )
}
finally{
	outputStream.close()
	document.close()
}
❤️ 1
word.docx
d
Awesome!
Teamwork makes the dreams work! =D