ada-url/ada

Non-special URIs with hosts

the-moisrex opened this issue · 5 comments

Ada gives

urn://this/is/a/path
    | |   |         
    | |   `--------- pathname_start 10
    | |   `--------- host_end 10
    | `------------- host_start 6
    | `------------- username_end 6
    `--------------- protocol_end 4

whereas I would expect //this/is/a/path to be a path, not a host.


Also, this example, taken straight from the RFC, confused me:

ldap://[2001:db8::7]/c=GB?objectClass?one
     | |            |    |               
     | |            |    `--------------- search_start 25
     | |            `-------------------- pathname_start 20
     | |            `-------------------- host_end 20
     | `--------------------------------- host_start 7
     | `--------------------------------- username_end 7
     `----------------------------------- protocol_end 5
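
The offsets in both diagrams can be reproduced by hand if the text after "//" is treated as a host. The following is a minimal Python sketch of that reading, not ada's actual implementation; `component_offsets` is a hypothetical helper and it assumes an input with no credentials, port, or fragment.

```python
def component_offsets(url: str) -> dict:
    # Hypothetical helper, not part of ada: assumes scheme://host/path?query
    # with no credentials, port, or fragment.
    protocol_end = url.index(":") + 1  # offset just past the scheme's ':'
    offsets = {"protocol_end": protocol_end}
    if not url.startswith("//", protocol_end):
        return offsets  # no "//", so no host under this reading
    host_start = protocol_end + 2  # skip the "//"
    # The host runs up to the first '/' or '?' (or the end of the input).
    host_end = next(
        (i for i in range(host_start, len(url)) if url[i] in "/?"),
        len(url),
    )
    offsets.update(
        username_end=host_start,
        host_start=host_start,
        host_end=host_end,
        pathname_start=host_end,
    )
    query = url.find("?", host_end)
    if query != -1:
        offsets["search_start"] = query
    return offsets


print(component_offsets("urn://this/is/a/path"))
print(component_offsets("ldap://[2001:db8::7]/c=GB?objectClass?one"))
```

Under these assumptions the computed values agree with the numbers in the two diagrams, which suggests the diagrams come from the "host after //" interpretation rather than from treating everything after the scheme as a path.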


It confused me because the WHATWG parser says:

7 - Otherwise, if url is special, set state to special authority slashes state.
8 - Otherwise, if remaining starts with an U+002F (/), set state to path or authority state and increase pointer by 1.
9 - Otherwise, set url’s path to the empty string and set state to opaque path state.
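
The three quoted steps can be sketched as a small dispatch function. This is only an illustration in Python: `state_after_scheme` is a hypothetical name, the earlier steps of the scheme state (for example the handling of file URLs and base URLs) are deliberately left out, and `remaining` is taken to be the input after the scheme's ":".

```python
# The schemes the WHATWG URL Standard lists as special.
SPECIAL_SCHEMES = {"ftp", "file", "http", "https", "ws", "wss"}


def state_after_scheme(scheme: str, remaining: str) -> str:
    # Hypothetical sketch covering only the three quoted steps (7, 8, 9).
    if scheme in SPECIAL_SCHEMES:
        return "special authority slashes state"   # step 7
    if remaining.startswith("/"):
        return "path or authority state"           # step 8
    return "opaque path state"                     # step 9


print(state_after_scheme("urn", "//this/is/a/path"))
print(state_after_scheme("ldap", "//[2001:db8::7]/c=GB?objectClass?one"))
print(state_after_scheme("mailto", "user@example.com"))
```

In the specification, the path or authority state then moves to the authority state when the next code point is another "/", which is how a non-special input beginning with "//" ends up with a host.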

So when the URL is not special, the parser enters the path or authority state. That is what WHATWG specifies, but it is not what browser implementations do, and I can't find anything in the RFC that says whether "//" after a non-special scheme marks the start of an authority.

lemire commented

Here is what my browser (WebKit) does:

Screenshot, 2023-11-09 at 09 02 16 Screenshot, 2023-11-09 at 09 13 51

... which matches what ada does, doesn't it?

Our reference is WHATWG URL. Can you describe the issue in ada with reference to WHATWG URL?

If you think that the WHATWG URL specification is incorrect, then you need to take this up with them.

We will only change ada if we find that we are not following the specification.

Please elaborate on what the issue is.

WHATWG says what ada already does; I guess I expected Chromium's and Firefox's implementations to be WHATWG compliant, but there seems to be a difference here.


@the-moisrex Somewhat similar blog post of mine: https://www.yagiz.co/url-parsing-and-browser-differences

Oh, thanks. That explains it.