Platform-agnostic "valid file name"?
annevk opened this issue · 7 comments
https://fs.spec.whatwg.org/#valid-file-name seems rather bad, especially for just OPFS.
cc @whatwg/fs
I assume the issue here is the "path separator [] on the underlying filesystem". It's problematic for this allowed filenames to depend on the platform; if we want to block \ on windows, we should block it everywhere - for OPFS.
Of course, you can have real files with \ in a Linux/Mac FS IIRC... which shouldn't matter for OPFS. It does matter for non-OPFS uses, which I imagine is why it's written this way. However, I think for OPFS it should be unified and block .
I don't know how easy it is for there to be differences between OPFS and non-OPFS uses (which we do/will not support in any case).
Yeah, I would prefer avoiding platform-specific behavior if we can, and it seems doable here.
Yeah I agree the we should be more specific about the restrictions for the OPFS, and they should probably be more lax than the restrictions we have for local files. This is one of the big reasons I'm proposing not to allow move()
to support moving files from the OPFS to the local disk (see a-sully#2).
It seems to me we have two options:
- Support the "most-restrictive-denominator" (thanks for the phrase, @jesup) to guarantee that the file names are compatible with all platforms
- Assume (or explore making explicit?) that user agents won't store OPFS files "as-is" on disk, but will have a mapping of "web-exposed name" -> internal name. In which case we can be much more lenient in what files we allow to be stored (something like web-platform-tests/wpt@40090b3)
I'm okay with either option, though if all user agents are storing this mapping anyways then I think option 2 makes more sense (and the OPFS doesn't have to change if the underlying platforms do)
We'll need some way to add further restrictions for non-OPFS files. Perhaps we could do something like what we do for permissions, i.e. "run this check" is an algorithm that can be set depending on how the handle is acquired?
See WICG/file-system-access#399 (and please let me know if there's a different approach you recommend in that case or this one)
Looking at this again, there's two distinct conversations about allowed characters:
- Should
\
be blocked on just Windows, or everywhere? - What other characters, if any, should be blocked?
For the former I think you're probably right that we should block both /
and \
the OPFS on all platforms
I was mostly talking about the latter in my comment above. We could adopt something like https://www.w3.org/TR/international-specs/#file_naming to ensure files could be stored as-is on the underlying platform (option 1 above), or not (option 2 above)... I know I'd suggested option 2 above, but looking at this again, it seems like it would be nice to be consistent with other parts of the platform and just use https://www.w3.org/TR/international-specs/#file_naming. Thoughts?
Do we actually know of a web platform feature that enforces https://www.w3.org/TR/international-specs/#file_naming? That document is not a normative document so is not suitable for reference.
E.g., https://w3c.github.io/FileAPI/#dfn-file only prohibits lone surrogates. \
works: https://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!DOCTYPE%20html%3E%0A...%3Cscript%3E%0Aw(new%20File(%5B%5D%2C%20%22%5C%5C%22).name)%0A%3C%2Fscript%3E
According to https://wicg.github.io/entries-api/#names-paths:
A name is a string which:
does not contain '/' (U+002F SOLIDUS)
does not contain NUL (U+0000)
does not contain '' (U+005C REVERSE SOLIDUS)
is not '.' (U+002E FULL STOP)
is not '..' (U+002E FULL STOP, U+002E FULL STOP)
A path segment is a name, '.' (U+002E FULL STOP) or '..' (U+002E FULL STOP, U+002E FULL STOP).