CF 2018 on Win 2016 I wrote a app that loads .xls ...
# cfml-beginners
e
CF 2018 on Win 2016 I wrote a app that loads .xls files into Oracle database (and doing something else in the process). I have to convert .xls file into .csv in order to do so. My code breaks when user deletes lots os entries from the original .xls file, which results in the following:
Copy code
CUSTOMER_ID,DIVISION_ID,STYLE_ID,QUANTITY
AMAZON,YMS,Y15CXWE,31
AMAZON,YMS,Y15CXWE,85
,,,
,,,
,,,
,,,
Question: how can I remove lines with commas ONLY in them?
a
Question, just for clarification: would it be better to remove them, or would it be viable to not put them in there in the first place?
Also: is the context that you have that data you show in a string? Or is it in a .csv file?
d
Depending on your strategy to convert xls to csv there should be a setting to ignore empty rows, if there isn't maybe look at a different library. Without actual information about how you are doing the conversion there isn't much people can do.
a
Yeah I wanna make sure the right question is being asked before giving an answer...
d
I have seen the exact issue he mentions when xls files say there are 100 records when there are only 20. Excel is good that way. So the question makes sense, but still not enough detail.
It is a very skin the cat kind of problem. Off the top of my head solutions are 1. find the setting in your library to ignore empty rows. Hope your library includes rows after an empty row if there are some. 2. Simply run a regex over the csv string that looks for new line, only commas or only commas and whitespace, end of line and replace it with nothing. 3. As you process each row if first column is empty and that column should never be empty skip it. What you choose really depends on the requirements of the document and wether users may remove data from rows in the middle and if that is still considered a good doc and what the library you are using does.
e
I am reading in .xls file into a query object and writing it out as .csv
Copy code
<cfspreadsheet action="read" src="#xlsFile#" query="qryExcel">
Copy code
<cfoutput query="qryExcel">
                <cfset line = "#col_1#,#col_2#,#col_3#,#col_4#,#col_5#>
                <cffile action="append" file="#csvFile#" output="#line#">
            </cfoutput>
m
not the best method of doing this but if you are doing it this method <cfif trim(col_1 & col_2 & col_3 & col_4 & col_5) neq ""> do your line appending....
e
I am not sure what would be the better approach as far as not writing empty lines OR removing them afterwards
Thanks Michael, I will try that
m
So first question would what is importing the Excel File into Oracle if it is ColdFusion why are you going from XLS -> CSV -> Oracle why not just XLS -> Oracle. Does Oracle have a special import CSV function? I have had customers send me 50MB excel files that are only four rows long because how excel likes to save the empty rows. Manually converting the data like this without any sanitization can be really error prone. Doing double work on a file can be really processor and error prone.
d
<cfspreadsheet>
has a
format="csv"
option if you're only using it to do the conversion, e.g.:
Copy code
<cfspreadsheet action="read" src="#xlsFile#" format="csv" name="csvdata">
<pre><cfset writeOutput(csvdata)></pre>
That will exclude any rows that don't have data in them.
e
Yes, it's possible to load .xls file into Oracle, but it's more complicated than using external table to load .csv
David, your suggestion does not work. It displays all commas in .csv output
Michael, it looks like approach you suggested worked.
Thank you all for your help
for those who interested, I've done it this way:
Copy code
<cfoutput query="qryExcel">             
                <cfset line = "#col_1#,#col_2#,#col_3#,#col_4#">
                <cfif TRIM(REReplace(line, ",", "","ALL")) NEQ "">
                    <cffile action="append" file="#csvFile#" output="#line#">
                </cfif>
            </cfoutput>
d
When you say my method "displays all commas", do you mean it's displaying empty rows as commas, like in your first example? If so, you can get rid of those empty rows with
reReplace
. You can replace all of the code in your above snippets with this:
Copy code
<cfspreadsheet action="read" src="#xlsFile#" format="csv" name="csvdata">
<cfset filewrite(csvFile, rereplace(csvdata, "((^|[\r\n]+),+)+(([\r\n]+)|$)", "\4", "all"))>
That's a bit more elegant, imo. And, as an added bonus, it automatically puts quotes around values with commas so they won't be interpreted as additional columns.
👍 1
e
Thank you David
👍 1