Locator 'fragment', allow multiple "types"?
danielweck opened this issue · 4 comments
See:
f90b440#commitcomment-33122485
Is there a use case for recording multiple types of roughly equivalent "fragments" in this data structure (hence requiring a different arity for the fragment property)? For example, CFI is precise down to the character level, whereas CSS selectors and IDs are less so (element level), etc.
{
  "href": "http://example.com/book/chapter.html",
  "type": "text/html",
  "locations": {
    "fragments": [
      "#para1",
      "#cfi(/2/4/6)",
      "#css(.body#main .section:nth-child(4) #para1)"
    ]
  }
}
There's definitely a use case for that, but we're suggesting a different syntax for now:
{
  "href": "http://example.com/book/chapter.html",
  "type": "text/html",
  "locations": {
    "fragment": "para1&cfi(/2/4/6)&css(.body#main .section:nth-child(4) #para1)"
  }
}
This syntax is hard for humans to read and error-prone for machines to parse. In addition to the character escaping rules inside the double-quoted JSON fragment value, there must also be some escaping inside the cfi() scheme (http://www.idpf.org/epub/linking/cfi/epub-cfi.html#sec-epubcfi-escaping), the css() scheme (albeit undefined at this stage), etc.
A plural fragments array property would be much easier to decipher.
On a related note, we probably need to refine the ingestion model for such multiple "fragments" (regardless of whether they are array-exploded or string-linearised). A reading system / processing agent will want to pick the "fragment" that best matches its capabilities. For example: use CFI if supported, otherwise fall back to a CSS selector, otherwise to a unique ID, etc.
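To make that ingestion model concrete, here is a minimal Python sketch of how a processing agent might choose among the entries of a plural fragments array. Everything in it is an assumption for illustration only: the names classify, pick_fragment and PREFERENCE are hypothetical, the preference order (CFI, then CSS, then ID) simply mirrors the example above, and schemes are detected by a "name(" prefix right after the hash. It ignores the escaping rules of each scheme, so it is not a complete parser.

```python
import re

# Hypothetical preference order, most precise scheme first.
PREFERENCE = ["cfi", "css", "id"]

# A scheme-based fragment looks like "name(data)"; anything else is a plain ID.
_SCHEME = re.compile(r"^([A-Za-z][\w-]*)\((.*)\)$", re.DOTALL)


def classify(fragment):
    """Return (scheme, data) for one fragment string such as "#cfi(/2/4/6)"."""
    body = fragment[1:] if fragment.startswith("#") else fragment
    match = _SCHEME.match(body)
    if match:
        return match.group(1).lower(), match.group(2)
    return "id", body


def pick_fragment(fragments, supported):
    """Pick the most preferred fragment whose scheme is in `supported`.

    Returns a (scheme, data) tuple, or None if nothing usable is present.
    """
    by_scheme = {}
    for fragment in fragments:
        scheme, data = classify(fragment)
        # Keep the first fragment seen for each scheme.
        by_scheme.setdefault(scheme, data)
    for scheme in PREFERENCE:
        if scheme in supported and scheme in by_scheme:
            return scheme, by_scheme[scheme]
    return None


fragments = [
    "#para1",
    "#cfi(/2/4/6)",
    "#css(.body#main .section:nth-child(4) #para1)",
]
```

With these fragments, pick_fragment(fragments, {"css", "id"}) returns ("css", ".body#main .section:nth-child(4) #para1"), and an agent that only understands IDs gets ("id", "para1").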
The standard URI fragment parsing rules would apply, e.g. the XPointer xpointer() scheme is "discovered" by matching the prefix string of characters "xpointer(" immediately after the hash # character. So please note that I am not advocating for strongly-typed "fragments" such as:
{
  "href": "http://example.com/book/chapter.html",
  "type": "text/html",
  "locations": {
    "fragmentID": "para1",
    "fragmentCFI": "/2/4/6",
    "fragmentCSS": ".body#main .section:nth-child(4) #para1"
  }
}
...but maybe I should! ;)