WICG/scroll-to-text-fragment

XPointer Framework syntax *style* fragments

BigBlueHat opened this issue · 8 comments

Thinking through the "mixing with SPAs" issue in #15, I wonder if the XPointer Framework syntax's style might be useful to avoid collisions and enable "mixing."

We used this approach in the Web Annotation WG's Selectors and States note. The same syntax style can be seen in SVG's fragment identifiers like #svgView()

This style is less frequently used (other than the standardized ones like svgView()) by JS developers building SPAs. Additionally, these can more easily accommodate mixing and more clearly be "functional"--which avoids confusion about which client the fragment is intended for (the browser or the Web app).

For example, using find() rather than targetText= avoids collision with site key/value pairs while also feeling more functional/active.

http://example.com/spa#page_id=5&find("on sale")

Structuring the URL this way has the interesting side effect that, when location.hash is parsed (sans-#) with URLSearchParams, the results shake out into two groups:

  • the key/value pairs (i.e. page_id=5)
  • the non-key/value pairs (i.e. find("on sale"))

Current JS demo below....

let u = new URL('http://example.com/#page_id=5&find("on sale")');
// Drop the leading "#" before parsing the fragment as a query string.
let usp = new URLSearchParams(u.hash.slice(1));
for (let p of usp.entries()) console.log(p);
// Array [ "page_id", "5" ]
// Array [ "find(\"on sale\")", "" ]

This avoids "magic strings" (like targetText) polluting the key space, while still making these easily shimmable via JS.
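As a rough illustration of what such a shim might look like, here is a minimal sketch that sorts the entries of a hash into the two groups described above. The helper name splitFragment and the name(...) matching pattern are assumptions made for this example; nothing here is part of any proposal or spec.

```javascript
// Hypothetical helper (not part of any spec): split a hash into ordinary
// key/value pairs and "functional" fragments of the form name(...).
function splitFragment(hash) {
  const params = {};
  const functional = [];
  const body = hash.startsWith('#') ? hash.slice(1) : hash;
  for (const [key, value] of new URLSearchParams(body)) {
    // A functional fragment shows up as a key with an empty value.
    const match = /^([A-Za-z_][\w-]*)\((.*)\)$/.exec(key);
    if (value === '' && match) {
      functional.push({ name: match[1], argument: match[2] });
    } else {
      params[key] = value;
    }
  }
  return { params, functional };
}

const result = splitFragment('#page_id=5&find("on sale")');
console.log(result.params);     // key/value pairs for the Web app
console.log(result.functional); // functional fragments such as find(...)
```

One limitation of this sketch: an argument containing a raw & or = would be split by URLSearchParams before the pattern is applied, so such characters would need to be percent-encoded.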

Additionally, once there is a find() (or targetText() or selector() or whatever), it could be gleaned by the browser engine (possibly removing it from the hash) and kept for the browser's own use--while still following existing URL design and extensibility patterns.
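To make the "glean and remove" idea concrete, here is a hedged sketch of how a client could pull functional fragments out of the raw hash and hand the remainder to the page. The function name extractFunctionalFragments is hypothetical, and this is not how any browser engine actually implements it; it works on the raw string so that the page's remaining pairs keep their original encoding.

```javascript
// Hypothetical sketch: separate name(...) fragments from the rest of the hash.
function extractFunctionalFragments(hash) {
  const body = hash.startsWith('#') ? hash.slice(1) : hash;
  const kept = [];
  const extracted = [];
  for (const part of body.split('&')) {
    if (/^[A-Za-z_][\w-]*\(.*\)$/.test(part)) {
      extracted.push(part); // kept by the client for its own use
    } else if (part !== '') {
      kept.push(part);      // left in place for the page
    }
  }
  return { hash: kept.length ? '#' + kept.join('&') : '', extracted };
}

const gleaned = extractFunctionalFragments('#page_id=5&find("on sale")');
console.log(gleaned.hash);      // what the page would see
console.log(gleaned.extracted); // what the client would keep
```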

It would also open the door to a more extensible space and patterns for future development, by suggesting that # values may be & delimited and may contain "functional fragments" which may be ignored or implemented by the client (whether browser or Web app).

Thoughts?

I think the functional style may make sense, but it doesn't solve the issue of bad interaction with SPAs. For example, they might not handle a & delimiter. Going back to the example, https://webmd.com/skin-problems-and-treatments/lice-treatment breaks with any hash fragment that it doesn't expect.

If this were just a handful of pages, we could maybe say the cost-benefit doesn't favor letting it influence the design. However, this example was found with just some basic testing, and WebMD is a prominent domain. I expect this may not be rare.

Let's just avoid changing the language to match the phrasebook (see also Monty Python's cautionary tale). 😄

If we have fragments that are meant "for the browser" we might have to avoid exposing them to the page. That would be a nice privacy bonus, as well, but is a deeper implementation.

So it looks like the "[fragments meant] for the browser" is becoming a reality per #15 (comment)

It would still be great to see the fragment style designed to be extensible--as other Web clients could benefit from this use also.

I'd add a point of caution: we're currently just experimenting with this, and there's no guarantee it will ship in this state.

> It would still be great to see the fragment style designed to be extensible--as other Web clients could benefit from this use also.

What do you mean by this? If ## were to ship in its current form, it could be used to implement other similar features but maybe that's not what you're getting at?

> What do you mean by this? If ## were to ship in its current form, it could be used to implement other similar features but maybe that's not what you're getting at?

I've no real issue with ## and think it opens a wide range of interesting/valuable architectural opportunities. It's more the shape of targetText= as a key/value pair which keeps me up at night. 😉

Given the current README example of...

https://en.wikipedia.org/wiki/Cat#targetText=Claws-,Like%20almost,the%20Felidae%2C,-cats

...I'd find this approach to be more intuitive:

https://en.wikipedia.org/wiki/Cat#targetText(Claws,Like%20almost,the%20Felidae%2C,cats)

...as each part reads as a parameter to a function vs. a single value made of a unique, comma-separated micro-syntax (i.e. "If provided, the prefix must end (and suffix must begin) with a dash (-) character.").

> ...I'd find this approach to be more intuitive:

It's certainly a matter of taste but I think the two are functionally equivalent. The dash micro-syntax or something like it would still be needed since all but the second parameter are optional so you need some way to disambiguate the different possibilities.
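To show how the dashes do that disambiguation, here is a minimal sketch that parses the targetText= value from the README example quoted above. The function name parseTargetText and the returned field names (prefix, textStart, textEnd, suffix) are assumptions for this example; the only rule taken from the thread is that a prefix ends with a dash and a suffix begins with one.

```javascript
// Hypothetical parser for the comma-separated targetText value.
// Splitting happens before decoding, so an encoded comma (%2C) stays in its term.
function parseTargetText(value) {
  const parts = value.split(',');
  let prefix;
  let suffix;
  // A leading term ending in "-" is the prefix.
  if (parts.length > 1 && parts[0].endsWith('-')) {
    prefix = parts.shift().slice(0, -1);
  }
  // A trailing term starting with "-" is the suffix.
  if (parts.length > 1 && parts[parts.length - 1].startsWith('-')) {
    suffix = parts.pop().slice(1);
  }
  const [textStart, textEnd] = parts;
  const decode = (s) => (s === undefined ? undefined : decodeURIComponent(s));
  return {
    prefix: decode(prefix),
    textStart: decode(textStart),
    textEnd: decode(textEnd),
    suffix: decode(suffix),
  };
}

console.log(parseTargetText('Claws-,Like%20almost,the%20Felidae%2C,-cats'));
```

Without the dashes, a two-term value such as a,b could be read either as a prefix plus the text start or as a text start plus a text end, which is the ambiguity the micro-syntax resolves.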

We've stuck with the key-value syntax; one of the main benefits being its similarity to the media-fragments syntax.

Closing this out as I think it is effectively obsolete. I think most of the goals stated here are met (the spec leaves open the possibility of extension via & and new directive types).

Feel free to reply if there's still something unresolved.