Anyone know of a solid reference on "dangerous", forbidden, or misleading filenames in various operating systems? E.g. names starting with a dot in UNIX-type systems, or COM1.txt on Windows.
I'm trying to sanitize file names for an application that downloads files for the user. (Not worried about actually filtering by file type.)
For example, if the file is presented as "index.html", then I can just save it that way, but I also want to be able to munge "index.html .exe" into "index.html_.exe".
(Alternatively, a library that already does this, preferably in a JVM language...)
Here's what I have so far, but it's really only Linux-focused: https://git.sr.ht/~timmc/cavern/tree/54348ca7411ba4412b845f31b216f17ce50058d1/spelunk/src/main/kotlin/org/timmc/spelunk/Fetching.kt#L444 for the generic case and https://git.sr.ht/~timmc/cavern/tree/54348ca7411ba4412b845f31b216f17ce50058d1/spelunk/src/main/kotlin/org/timmc/spelunk/OS.kt#L112 for the Linux-specific part.
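To make the question concrete, here is a rough cross-platform sketch in Java. It is not the linked Kotlin code, and the class name `FilenameSanitizer` and the choice of `_` as the replacement character are assumptions for illustration. It covers only the cases mentioned above plus the well-known Windows restrictions (reserved device names, forbidden characters, trailing dots and spaces), so treat it as a starting point rather than a complete list.

```java
import java.util.regex.Pattern;

public final class FilenameSanitizer {
    // Characters Windows rejects in file names, plus ASCII control characters.
    // This set also covers '/' and NUL, the two bytes Unix-like systems forbid.
    private static final Pattern FORBIDDEN =
            Pattern.compile("[<>:\"/\\\\|?*\\p{Cntrl}]");

    // Windows does not allow names that end in a dot or a space.
    private static final Pattern TRAILING = Pattern.compile("[. ]+$");

    // Remaining whitespace runs, e.g. the gap in "index.html .exe".
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Windows device names, reserved with or without an extension (COM1.txt).
    private static final Pattern RESERVED = Pattern.compile(
            "(?i)^(CON|PRN|AUX|NUL|COM[0-9]|LPT[0-9])(\\..*)?$");

    private FilenameSanitizer() {}

    public static String sanitize(String name) {
        String s = FORBIDDEN.matcher(name).replaceAll("_");
        s = TRAILING.matcher(s).replaceAll("");
        s = WHITESPACE.matcher(s).replaceAll("_");
        // Prefix names that are empty (this includes "." and ".." after the
        // trailing-dot strip), hidden on Unix-like systems, or reserved.
        if (s.isEmpty() || s.startsWith(".") || RESERVED.matcher(s).matches()) {
            s = "_" + s;
        }
        return s;
    }
}
```

With these rules, `sanitize("index.html .exe")` gives `index.html_.exe`, `sanitize("COM1.txt")` gives `_COM1.txt`, and `sanitize(".bashrc")` gives `_.bashrc`. Prefixing rather than stripping keeps the original name recognizable to the user.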
I usually store the file under its hash value (no extension) and keep the file info in a JSON document.
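Roughly like this, as a minimal Java sketch. The class name `HashNamedStore`, the choice of SHA-256, and the JSON field names are assumptions for illustration. The on-disk name is derived only from the content, so the untrusted file name never reaches the file system and lives only inside the metadata.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public final class HashNamedStore {
    private HashNamedStore() {}

    /** Lower-case hex SHA-256 of the content, used as the on-disk name. */
    public static String storageName(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Metadata document that keeps the original (untrusted) name. */
    public static String metadataJson(String originalName, byte[] content) {
        return "{\"sha256\":\"" + storageName(content)
                + "\",\"originalName\":\"" + escape(originalName) + "\"}";
    }

    // Minimal JSON string escaping: quotes, backslashes, control characters.
    private static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);
            } else if (c < 0x20) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

The file itself would be written to a path built from `storageName(content)`, with the JSON saved alongside it.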
@pym Yes, but that makes for a poor user experience if I want them to be able to open it directly, copy it somewhere, etc.