Add section about ID/IDREF values and our conventions
Closed this issue · 6 comments
Situation
IDs are everywhere: we use them to refer to figures, sections, and other structures. We even use them as file names for HTML output, and they show up in URLs on the documentation server.
However, our style guide doesn't have a section on what's allowed inside an xml:id
attribute. It seems we never really described what IDs should look like or what's allowed.
Proposed solution
I suggest creating a section that describes these two aspects:
- What's allowed inside xml:id and linkend attributes.
  This is a technical description. For GeekoDoc, you can mostly refer to the regex pattern [\-0-9a-zA-Z]+. For details, see the reference.
- Our conventions.
  This goes a bit beyond the pure technical description and documents our conventions. What should the IDs look like? What are our recommendations?
  For example, for the SLE documents we start a section ID with sec, a chapter ID with cha, and so on.
  In light of SmartDocs, the convention is probably different.
References
- Identifiers
- Definition of the xml:id and linkend attributes in the GeekoDoc schema.
FYR, the info on identifiers is located here in DSG.
In contrast to what @tomschr says, xml:ids are limited to [a-zA-Z]. Numbers are allowed by definition, but will break. I came across this recently:
Fatal error:
/home/cwickert/git/github.com/suse/doc-public-cloud/xml/support.xml:
ERROR: xml:id : attribute value 3rd-level-support is not an NCName (line 17 column 45)
We should clarify this in the Style Guide. As for the actual content, we should optimize for SEO where possible, that is, use human-readable terms that people are likely to search for.
Just an addition:
"xml:ids are limited to [a-zA-Z]. Numbers are allowed by definition, but will break."
The basic definition of NCName (non-colonized name) comes from the Namespaces in XML specification:
[4] NCName ::= (Letter | '_') (NCNameChar)*
[5] NCNameChar ::= Letter | Digit | '.' | '-' | '_' | CombiningChar | Extender
And that's why you get the error message when you use an ID like 3rd-level-support
(it starts with a digit, but the first character must be a letter or an underscore).
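To make the rule concrete, here is a minimal Python sketch of the two productions quoted above. It is an ASCII-only approximation: the real NCName definition also admits non-ASCII letters, combining characters, and extenders, which this sketch ignores. The names NCNAME_ASCII and is_ncname_ascii are made up for this illustration.

```python
import re

# ASCII-only approximation of the NCName productions quoted above:
#   first character: a letter or '_'
#   following characters: letters, digits, '.', '-', '_'
# Non-ASCII letters, combining characters, and extenders are ignored here.
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_ncname_ascii(value: str) -> bool:
    """Return True if value matches the ASCII subset of NCName."""
    return NCNAME_ASCII.fullmatch(value) is not None


for candidate in ("3rd-level-support", "third-level-support", "_draft.v2"):
    print(candidate, is_ncname_ascii(candidate))
```

Only the first candidate is rejected, and only because of its leading digit. The digits later in _draft.v2 are fine.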
What GeekoDoc does is limit the definition of XML IDs to a subset. For example, underscores are not allowed. But GeekoDoc cannot alter the basic traits of NCName.
Maybe I didn't make myself clear. The regex for GeekoDoc works as expected. It doesn't allow a number as first character. Everything that follows can be any lowercase letter or number, plus a dash (-). It doesn't allow underscores or dots. This matches the description in the styleguide.
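As a rough illustration of the rule as described in this comment, a checker could look like the following Python sketch. It is an assumption based on the wording here, not the pattern from the GeekoDoc schema. In particular, requiring a lowercase letter as the first character is my reading. The function name id_problems and the sample IDs are hypothetical.

```python
import re

# Hypothetical rule reconstructed from the comment above:
#   first character: a lowercase letter (assumption)
#   following characters: lowercase letters, digits, or '-'
# The authoritative pattern is the one in the GeekoDoc schema.
ALLOWED_FOLLOWING = re.compile(r"[a-z0-9-]")


def id_problems(value: str) -> list[str]:
    """Return a list of human-readable problems; empty if the ID passes."""
    if not value:
        return ["empty ID"]
    problems = []
    if not ("a" <= value[0] <= "z"):
        problems.append("must start with a lowercase letter")
    bad = sorted({ch for ch in value[1:] if not ALLOWED_FOLLOWING.fullmatch(ch)})
    if bad:
        problems.append("disallowed characters: " + " ".join(bad))
    return problems


for candidate in ("sec-install", "3rd-level-support", "sec_install.v2"):
    print(candidate, id_problems(candidate) or "ok")
```

A check like this could run before validation to give writers a friendlier message than the NCName error shown earlier in the thread.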
If I recall correctly, when I created this issue the description was not so elaborate. But I will look tomorrow again.
Regarding SEO, maybe we should get rid of prefixes. However, we can't change them for our legacy docs. That would break all existing links and would be a linking nightmare. That's probably something for future docs.
I think the section about identifiers is fine now.
Closing as this seems to be sorted.