Question regarding working with untrusted files - ...
# cfml-general
m
Question regarding working with untrusted files - as I understand it, the basic flow would be to:
1. Upload to temp directory and rename with guid or other unique identifier
2. Use fileGetMimeType() to verify the file matches an approved list of file types
3. Move to final location (s3 in this case) if valid.
4. Delete if invalid
Am I missing anything important?
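For anyone reading along, the four steps above could be sketched roughly like this in CFML. The form field name `upload`, the allow-list, the size cap, and the `s3://` path are all made-up placeholders, and it assumes S3 credentials are already configured so that `s3://` paths resolve in file functions. Treat it as a starting point under those assumptions, not a vetted implementation.

```cfml
<cfscript>
// Hypothetical values for illustration only; adjust to your own rules.
allowedTypes = "image/jpeg,image/png,video/mp4,application/pdf";
maxBytes     = 50 * 1024 * 1024;

// 1. Upload to the temp directory, then rename to a generated identifier.
result   = fileUpload( getTempDirectory(), "upload", "", "makeunique" );
original = result.serverDirectory & "/" & result.serverFile;
newName  = createUUID();
tempPath = result.serverDirectory & "/" & newName;
fileMove( original, tempPath );

// 2. Check the size and the detected MIME type against the allow-list.
mimeType = fileGetMimeType( tempPath );
isValid  = getFileInfo( tempPath ).size <= maxBytes
           && listFindNoCase( allowedTypes, mimeType ) > 0;

// 3 and 4. Move to the final location if valid, otherwise delete.
if ( isValid ) {
    // Assumption: "my-bucket/incoming" is a placeholder S3 location.
    fileMove( tempPath, "s3://my-bucket/incoming/" & newName );
} else {
    fileDelete( tempPath );
}
</cfscript>
```

The size check sits next to the MIME check so that an oversized file is rejected before anything is sent to S3.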
d
Fred Flintstone here. Is file type/MIME type your only criteria for "trusted"? Not the person who uploaded it? Not that it's not 768 Gigs? I can see you wouldn't want to get involved in looking for evil stuff inside, say, spreadsheets, but at the same time, what exactly does it mean to trust an "untrusted" file?
m
Not sure if this is still a thing, but there used to be a service where you could move a file and it would basically be scanned and approved, and then you could pull it back. UC Davis had this set up with their Office 365 email. ALL attachments went through the wringer. It was able to look into zips & rars.
m
@Dave Merrill good point - yes, a size check would make sense as well - I was primarily asking about potentially malicious files
d
Might the files contain personal info? Sending them out for scanning could be a no in that case.
m
@Mark Takata (Adobe) I was starting to look into security apis for file checking - I wasn't sure the degree to which that was considered a best practice
d
Practically speaking, you can't scan for encoded bad stuff effectively yourself, that's really tricky. Outsourcing that like Mark suggested makes sense, if you can afford whatever that costs, and if there's nothing potentially confidential.
m
in this case the files would be limited to images, videos, and pdfs, so zips and more obfuscated files wouldn't be allowed
@Dave Merrill Thanks!
Sorry for the snide tone of that link, not my intention 🙂
🤣 1
m
all good!
b
Also worth noting that fileGetMimeType() can take a file object or a file path (which can be a URL) as its argument. So if you just do something like
fileGetMimeType(form.file)
this can lead to Server Side Request Forgery vulnerabilities and other potential attacks. So you'll probably want to validate that the user is passing in a file object and not a file path
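One way to avoid that, as a hedged sketch: never hand the raw form value to `fileGetMimeType()`. Let `fileUpload()` process the multipart field first and inspect the server-side path it returns instead. The field name `file` comes from the example above. That `fileUpload()` throws when the field did not carry an actual file is my understanding of its behavior, so confirm it on your engine.

```cfml
<cfscript>
// Risky: form.file is whatever the client sent, which may be a path or a URL.
// mimeType = fileGetMimeType( form.file );

// Safer sketch: fileUpload() only accepts a real multipart file part,
// so a plain text value such as a URL should raise an error here.
result     = fileUpload( getTempDirectory(), "file", "", "makeunique" );
serverPath = result.serverDirectory & "/" & result.serverFile;

// Inspect the file that actually landed on the server.
mimeType = fileGetMimeType( serverPath );
</cfscript>
```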
👍 1
a
Don't forget to include something like a CSRF token in your
POST
request as well, so you can stop cross-site request forgery attacks etc.
z
FileGetMimeType isn't very, *coughs*, thorough. Lucee only bundles core Tika, so it's not doing that much
IsImage etc are better
m
@zackster thanks for that - will run that for the image files
@aliaspooryorik thanks for that tip - these files will actually be coming in via SMS or Email - so they'll be getting retrieved from a third party (Dialpad for SMS, and Postmark for email)
👍 1
@zackster other than isimagefile, what other verification tags are there for Lucee?
looks like ACF has ispdffile and isspreadsheetfile
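A minimal sketch of layering those content checks on top of the MIME type, assuming the file is already on local disk. `looksLikeAllowedFile` is a hypothetical helper name, and whether `isPDFFile()` is available on Lucee (it may depend on the PDF extension) is an assumption to verify.

```cfml
<cfscript>
// Hypothetical helper: confirm the content matches the claimed type.
function looksLikeAllowedFile( required string path, required string mimeType ) {
    if ( listFirst( arguments.mimeType, "/" ) == "image" ) {
        return isImageFile( arguments.path );
    }
    if ( arguments.mimeType == "application/pdf" ) {
        return isPDFFile( arguments.path );
    }
    // No built-in content check is used for videos here,
    // so anything else is treated as unverified.
    return false;
}
</cfscript>
```

It could be called as `looksLikeAllowedFile( tempPath, fileGetMimeType( tempPath ) )`, where `tempPath` is a placeholder for wherever the retrieved file was saved.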
b
I'm not sure what the purpose of renaming the file is, so long as it's in a web-inaccessible folder.
@foundeo has some attachment scanning capabilities built into fuseguard if I recall too
f
also something to consider… if you are uploading to s3, you can have the client directly upload to s3 using a signed url
👍 1
then the file never touches your CF server
you can even create a policy that only allows certain file extensions and sizes
m
@foundeo better to bypass the CF server entirely? Didn't think I'd be able to check the contents of the file as easily if it were on S3
and while S3 could check the extension, I didn't think that would be reliable enough
b
The s3sdk supports that, by the way: you can create signed PUT URLs.
f
depends what risk you are worried about
are you worried about end users downloading the file, or are you worried about someone uploading something executable on the server?
m
both 😀
f
for the first risk, you still would want to do some scanning I would think… for the second risk, uploading directly to s3 completely eliminates it
yeah s3sdk would be a good way to go, if you are curious how it works this is some old code that probably doesn’t work anymore (due to updated signature versions) but the general idea is the same: https://www.petefreitag.com/item/833.cfm
m
true - so maybe I work out a flow where it goes to a temp directory in s3, scanning happens there, and then it gets moved to its final s3 location if all good
f
yup, or your app just doesn’t let anyone download the file until it has been scanned
m
thanks!
breaking it down to the two risk types was helpful/clarifying
👍 1
a
Also worth disconnecting your servers from the internet to be sure
💡 2
🤣 1
m
@aliaspooryorik hahahaha - I should have thought of that - 🤯 brilliant
z
Wrap your server in industrial grade magnets
😂 1
m
and bubble wrap, of course
z
But we're all fine with downloading from forgebox aren't we....
I just sold my admin plugins to Avast!
e
Trust nothing, all things are bad, designed by hackers to take your things. For real security, you should go back to stone tablets, armed guards, and pure old-school encryption using a decoder ring found inside a Cracker Jack box. Once someone submits the proper credentials at gunpoint with a 2-ton stone block, you can move it to a facility that will decode it using the blessed decoder ring. The person in question will be detained until such time as the message can be decrypted, its source verified, and proper background checks performed for the last 20 years on all the person's movements, contacts, and expenditures. If everything checks out, you can then accept the request for the cookie for the browser session using a password-protected, time-sensitive, armed, and escorted courier to the interested third party. If they fail to respond properly or in time, then carpet bomb a 10 km area around the requestor and detain any survivors for questioning. It's a tough world out there, and it's time to get tough on bad browser requests and potential hackers.