Does anyone have a good argument for not allowing ...
# cfml-general
d
Does anyone have a good argument for not allowing apostrophes (single quotes) in file upload names? I feel that apostrophes should not be used in filenames, but can’t find any definitive rule that tells me whether this is valid or not. I have a system that restricts file upload names to \w characters, spaces, and hyphens, but some of my clients who use the system would like to be able to upload files with apostrophes in the filename. Is denying apostrophes just a bias that I have, or can someone point me to a valid reason to not allow them?
a
I would say "don't make up your own rules, there are clearly codified rules in the industry already to answer this question. Stick to those" would be the answer to this 99% of the time (the other 1% reserved for "it depends"). Summary: https://en.wikipedia.org/wiki/Filename#Reserved_characters_and_words For this sort of thing I "never" second-guess ppl in the industry who are clevererer than I am. I find the relevant RFCs (or equiv), and I stick to those.
Find the lowest common denominator of compat issues for the file systems you need to support, and support that. Also: file systems will already have an approach to dealing with characters they don't like / can't use, so leave that to the file system to deal with.
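As a rough illustration of that lowest-common-denominator idea, here is a minimal sketch that assumes Windows is the strictest file system that needs supporting. It is Python rather than CFML, purely for illustration, and the function name `filename_problems` is hypothetical. The character and device-name lists follow the reserved set described in the Wikipedia summary linked above.

```python
import re

# Characters Windows rejects in file names, plus ASCII control characters.
WINDOWS_RESERVED_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

# Device names Windows reserves, with or without an extension.
WINDOWS_RESERVED_NAMES = {"CON", "PRN", "AUX", "NUL"} | {
    f"{prefix}{n}" for prefix in ("COM", "LPT") for n in range(1, 10)
}


def filename_problems(name: str) -> list[str]:
    """Return the reasons `name` would be rejected on Windows (empty if none)."""
    if not name:
        return ["empty name"]
    problems = []
    bad = sorted(set(WINDOWS_RESERVED_CHARS.findall(name)))
    if bad:
        problems.append("reserved characters: " + " ".join(repr(c) for c in bad))
    # The part before the first dot is what Windows compares to device names.
    stem = name.split(".", 1)[0].strip().upper()
    if stem in WINDOWS_RESERVED_NAMES:
        problems.append("reserved device name: " + stem)
    if name.endswith((" ", ".")):
        problems.append("ends with a space or period")
    return problems
```

Under these assumptions an apostrophe is not a problem at all: `filename_problems("O'Brien's report.pdf")` returns an empty list, while a name containing a colon is flagged.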
p
There is a code injection risk, but that partially depends on your backend code being built properly to prevent it. Adam has the sturdier explanation, though, and one that a user might follow.
d
Thanks, Cameron and Patrick. I guess that it is just my old DOS days conventions showing themselves. I appreciate your input.
m
I wouldn't restrict anything in what the end user names the file for uploading, but we do actively rename the files when the uploaded file is being retained in our file system. Our only super important rules are that we upload into quarantine, which has no web accessibility, then we 'do stuff' to the files to ensure they behave as expected before moving them, and the move process renames them. The 'do stuff' part is things like reading the file to check the data, or running it through ImageMagick and auto-orienting/stripping EXIF/scaling.
d
I do upload outside the web root and rename the file as a UUID plus the original file extension prior to saving the file to the system (I save the original filename in a database for renaming the file on download with cfcontent). However, I have run into issues in the past with ColdFusion not being able to read the original tmp upload due to a filename containing a colon. Because of this, I couldn’t rename it and the upload would fail.
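The UUID-plus-extension rename could look roughly like the sketch below. It is Python rather than CFML, for illustration only; `storage_name` is a hypothetical helper, and the rule for which extensions to keep (short, ASCII, alphanumeric) is an assumption, not something stated in the thread. The original name would be stored separately, for example in the database, and never used on disk.

```python
import uuid
from pathlib import PurePosixPath


def storage_name(original_name: str) -> str:
    """Return a UUID-based name that keeps only the original extension."""
    # Normalise backslashes so a client-supplied Windows path is handled too.
    ext = PurePosixPath(original_name.replace("\\", "/")).suffix.lower()
    # Assumption: keep the extension only if it is short, ASCII and alphanumeric.
    if not (1 < len(ext) <= 11 and ext.isascii() and ext[1:].isalnum()):
        ext = ""
    return f"{uuid.uuid4()}{ext}"
```

Because the name on disk never contains user-supplied characters apart from a vetted extension, apostrophes, colons, and commas in the original name stop mattering once the file has reached the server.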
r
Blame the operating systems for allowing people to name their files with insane names. @Dean Lawrence brought up an important point. If you allow people to do anything they want, the server will reject the file to start with. It won't even get uploaded to the temp folder. I would love to show you some of the insane file names people create. Unfortunately, because this chat is being publicly exposed now, I don't want identifying information in the file names for users being indexed by Google. Colons are definitely a main one to stop on Windows servers.
a
@Patrick def should not be using a file name in a way that might expose any sort of injection. So like always pass as a param to the DB; always encode when outputting in mark-up, etc. But one should not monkey with the base value, one should just make sure it's not dangerous when being used.
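A minimal sketch of "leave the value alone, make each use safe", written in Python with the standard `sqlite3` and `html` modules purely for illustration. The table name, column names, and the hostile file name are all hypothetical. In CFML the same two roles are typically filled by `cfqueryparam` and `encodeForHTML()`.

```python
import html
import sqlite3


def save_original_name(conn: sqlite3.Connection, stored_as: str, original_name: str) -> None:
    # The ? placeholders bind values as data, so quotes in the name are never parsed as SQL.
    conn.execute(
        "INSERT INTO uploads (stored_as, original_name) VALUES (?, ?)",
        (stored_as, original_name),
    )


def render_list_item(original_name: str) -> str:
    # Encode at output time; the stored value itself is left untouched.
    return f"<li>{html.escape(original_name)}</li>"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE uploads (stored_as TEXT, original_name TEXT)")

# Hypothetical hostile-looking file name containing SQL and markup.
hostile = "x'); DROP TABLE uploads; -- <script>hijacked()</script>.txt"
save_original_name(conn, "hypothetical-uuid.txt", hostile)

# The name round-trips unchanged and the table is still there.
stored = conn.execute("SELECT original_name FROM uploads").fetchone()[0]
```

The stored value is byte-for-byte the name the user chose; only the SQL binding and the HTML rendering treat it specially.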
p
Yea that's what I was saying haha
d
@Adam Cameron But again, there are some characters (primarily the colon) that will cause you to not even be able to read the file upon upload, making it impossible to process. So it makes sense to limit to some extent what can be uploaded, even if it is minimal. And yes, once uploaded, additional verification and procedures should be followed to prevent malicious activity.
a
I was just making sure you didn't mean it should be illegal to have a file called `INSERT INTO someTable (someColumn) VALUES ('0x003Cscript0x003Ehijacked()0x003C0x002Fscript0x003E').txt`. That's a legal file name, and no intrinsic reason to not allow it. (that's not the right way to encode `<`, `>` and `/` in this situation, but you get the point I'm trying to make)
d
No, I understand what you are saying and the extremely valid points that you are making. I am just saying that doing no validation prior to upload can also lead to issues that have no malicious intent. If I can save myself from getting a call at 8:30 PM on a Friday night from a client that is trying to upload a file with an invalid character in the name, I will. Again though, thanks for all your input, it is very much appreciated.
a
I would try to write the file... that is the proof in the pudding that it's not a problem. If it fails, bounce the error message (or some sanitized proxy thereof) back to the client. I'd be surprised if the web server writes the file with its original name in the upload dir... this would only come into play once your CFML code is rehoming it at its final destination.
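The "just try the write and report a sanitized error" approach might be sketched like this. It is Python rather than CFML, for illustration only; `try_store` and the wording of the message are hypothetical, and the path-separator guard is an added assumption so that a name cannot escape the target directory.

```python
import tempfile
from pathlib import Path

GENERIC_ERROR = "The file could not be saved under that name."


def try_store(directory: Path, name: str, data: bytes) -> tuple[bool, str]:
    """Attempt the write; on failure return a generic message, not the raw error."""
    # Assumption: refuse anything that is not a bare file name.
    if Path(name).name != name:
        return False, GENERIC_ERROR
    try:
        (directory / name).write_bytes(data)
    except (OSError, ValueError):
        # OSError covers names the file system rejects; ValueError covers NUL bytes.
        return False, GENERIC_ERROR
    return True, "ok"


with tempfile.TemporaryDirectory() as tmp:
    saved, message = try_store(Path(tmp), "O'Brien's report.pdf", b"example")
```

The file system stays the authority on what is a valid name, and the client only ever sees the generic message.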
r
@Dean Lawrence I just did a try catch around the upload and the error message you get specifically mentions invalid file name or something close to that.
d
@risto Can you share the filename that you tried using to throw the error?
r
@Dean Lawrence This errored for bad characters (don't know if it was both the colon and the comma or just the colon):
name:morename, Outtake (2021-2022), %22Swivel%22 + %22Munch Blur%22_xj1acgimtiga.pdf

This errored for being too long, believe it or not:
Terrono-Performance Political Discourse and the Problematics of the Confederate Flag in Contemporary Art-1-1 (1).pdf

I had way crazier but I'd have to go back and do some deeper digging to find some:)
@Dean Lawrence I'm a big fan of renaming files as long as they can make it to the server in the first place:) I think I mentioned I'm using the latest Lucee and Windows Server 2019 on that machine. https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file?redirectedfrom=MSDN#maxpath
I put an alert above the upload saying don't use the following reserved characters (the ones in the article above) in your filenames, and nobody cares 😅
d
@risto Thanks! I appreciate the extra information.