Rootless file URI does not round trip
RFC 3986 describes rootless-path URIs.
I interpret this to mean that file:relative/path.ext is a perfectly reasonable URI indicating a relative path. Unfortunately, it does not round-trip:
>>> u = uri.URI("file:relative/path.ext")
>>> u
URI('file://relative/path.ext')
>>> u.path
PurePosixPath('relative/path.ext')
>>> u.uri
'file://relative/path.ext'
>>> u2 = uri.URI(u.uri)
>>> u2.path
PurePosixPath('/path.ext')
>>> u.hostname
>>> u2.hostname
'relative'
So it turns a relative URI into an absolute URI with a hostname. Am I misinterpreting RFC-3986? (I noticed this because I have some relative paths that I need to represent as URIs).
This may very well partially be a duplicate of #9. Please reference this comment, specifically. I also note that your MCVE (test case) never actually defines the variable u2, making it incomplete.
Edited to add relevant ABNF:
URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
hier-part     = "//" authority path-abempty
              / path-absolute
              / path-rootless
              / path-empty
URI-reference = URI / relative-ref
absolute-URI  = scheme ":" hier-part [ "?" query ]
relative-ref  = relative-part [ "?" query ] [ "#" fragment ]
relative-part = "//" authority path-abempty
              / path-absolute
              / path-noscheme
              / path-empty
Sorry about the lack of u2; I've fixed the original message (copy-and-paste error).
Reading the comment referenced above in #9, I agree with the last line of the first paragraph (it doesn't round-trip properly). The final comment on #9 seems to be that file:relative/path.ext is forbidden by the RFC and is not in fact described by the path-rootless section of the documentation. The odd thing is that this package parses such URIs correctly but does not put them back together consistently.
Thank you for the updated test case; I'll get this integrated into the test suite proper and make sure round trips behave sanely. Specifically, examining the BNF closely, it appears that without the // marker there is no authority part, whereas URI's current behaviour is to treat everything up to the first / (after the first :, ignoring //) as authority.
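As a point of comparison, the authority rule can be seen in the standard library's urllib.parse. This is only a sketch using a different parser, not this package's code: without "//" the whole remainder is the path, and once "//" is present the first segment is read as the host.

```python
from urllib.parse import urlsplit

# No "//" marker: there is no authority, so everything after "file:" is path.
rootless = urlsplit("file:relative/path.ext")
print(repr(rootless.netloc), rootless.path)

# With "//": the first segment is parsed as the authority (host) and only
# the remainder is kept as the path.
with_authority = urlsplit("file://relative/path.ext")
print(repr(with_authority.netloc), with_authority.path)
```

This is the same split seen in the u and u2 values from the original report.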
As per the notes (and "final weigh-in") from this comment on #9, and the RFC specification for the file: URI scheme (BNF copied below), "relative" paths are not permitted.
file-URI       = file-scheme ":" file-hier-part
file-scheme    = "file"
file-hier-part = ( "//" auth-path )
               / local-path
auth-path      = [ file-auth ] path-absolute
local-path     = path-absolute
file-auth      = "localhost"
               / host
Identity-transform round trips for structurally invalid values cannot be assured; I will be adding the capability for Scheme implementations to perform validation, so that a warning can be issued if an attempt is made to use a path-relative component. (This will live on a custom FileScheme, not the base URLScheme, given that it's scheme-specific behaviour.)
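A rough sketch of the kind of check such validation might perform, based only on the file-hier-part grammar quoted above. The function name is hypothetical and not part of this package; the check is simplified (it does not validate the host, segments, query, or fragment) and only asks whether a rooted path is present.

```python
def has_rooted_file_path(uri: str) -> bool:
    """Hypothetical helper: True if a file: URI carries a rooted path.

    Simplified reading of the grammar:
      file-hier-part = ( "//" auth-path ) / local-path
    where both auth-path and local-path end in path-absolute.
    """
    scheme, colon, rest = uri.partition(":")
    if not colon or scheme.lower() != "file":
        raise ValueError(f"not a file: URI: {uri!r}")
    if rest.startswith("//"):
        # "//" [ file-auth ] path-absolute: skip the optional authority,
        # then require the "/" that starts path-absolute.
        _authority, slash, _path = rest[2:].partition("/")
        return bool(slash)
    # local-path = path-absolute: must start with "/".
    return rest.startswith("/")


for candidate in ("file:/a/b", "file:///a/b", "file://host/a/b",
                  "file:relative/path.ext"):
    print(candidate, has_rooted_file_path(candidate))
```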
Thank you for pointing me at RFC 8089. I agree that it says relative paths are not supported in the file scheme. RFC 3986 doesn't care, but RFC 8089 overrides that.
Would it be possible to issue a warning if a URI string is being created from a file relative path so that the round trip failure is more obvious?
> Would it be possible to issue a warning if a URI string is being created from a file relative path so that the round trip failure is more obvious?
Absolutely; that is what my last comment was trying to state is my plan. It shouldn't outright fail (since there does seem to be a general sense that //-omitting file: is "acceptable" to humans), but it certainly should issue a warning via warnings.warn.
Great. Just to confirm what I think you are saying: file:relative/path.ext should be accepted as now (as also happens with furl and urllib), but if someone asks for the URI string to be reconstructed, a warning should be issued because it has taken a relative path and converted it to an absolute path.
More eager than that: any attempt to construct a file: URI without a rooted path will emit a warning at instantiation time unless that warning is explicitly silenced. I like to notify people of incoming foot-shots before they pull the trigger. ;)
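To make the "warn at instantiation, unless silenced" behaviour concrete, here is a minimal sketch. The FileURI class and the RootlessFileURIWarning category are both invented for illustration and are not this package's API; parsing is delegated to the standard library's urlsplit purely for brevity.

```python
import warnings
from urllib.parse import urlsplit


class RootlessFileURIWarning(UserWarning):
    """Hypothetical warning category, invented for this sketch."""


class FileURI:
    """Illustrative stand-in; not the uri package's URI class."""

    def __init__(self, value: str) -> None:
        parts = urlsplit(value)
        # Warn eagerly, at construction time, rather than waiting until
        # the URI string is rebuilt.
        if parts.scheme == "file" and not parts.path.startswith("/"):
            warnings.warn(
                f"{value!r} has no rooted path and may not round-trip",
                RootlessFileURIWarning,
                stacklevel=2,
            )
        self.value = value


# Explicitly silencing the warning for a block of code:
with warnings.catch_warnings():
    warnings.simplefilter("ignore", RootlessFileURIWarning)
    quiet = FileURI("file:relative/path.ext")
```

Using a dedicated warning category, as assumed here, would let callers silence just this warning without hiding others.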