lang tags: using BCP47 instead of ISO639-1 codes
Hello, and first of all thank you very much for your work on hOCR! I'm part of an organization that gets hOCR from Google Books, and I'm quite new to the specification. Something that caught my eye is the reference to ISO639-1 for language codes. Since ISO639-1 doesn't cover all languages, I think referring to BCP47 would be more general and future-proof. What do you think? It's a backward-compatible change, since ISO639-1 codes are valid BCP47 tags (at least to a first approximation).
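To illustrate the compatibility point, here is a rough Python sketch that checks whether a string is shaped like a BCP47 language tag. It is only an approximation: the regex is a simplified reading of the `langtag` grammar in RFC 5646, it ignores grandfathered and private-use-only tags, and it does not look anything up in the IANA subtag registry. The name `is_well_formed_langtag` is made up for this example and is not part of hOCR or any library.

```python
import re

# Simplified sketch of the RFC 5646 "langtag" production.
# Assumptions: no grandfathered tags, no private-use-only tags,
# and no validation against the IANA language subtag registry.
_LANGTAG = re.compile(
    r"""
    (?:[a-z]{2,3}(?:-[a-z]{3}){0,3}|[a-z]{4,8})   # primary language (+ extlang)
    (?:-[a-z]{4})?                                  # script, e.g. Hant
    (?:-(?:[a-z]{2}|[0-9]{3}))?                     # region, e.g. TW or 419
    (?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*        # variants, e.g. 1996
    (?:-[0-9a-wy-z](?:-[a-z0-9]{2,8})+)*            # extensions
    (?:-x(?:-[a-z0-9]{1,8})+)?                      # private-use suffix
    """,
    re.IGNORECASE | re.VERBOSE,
)


def is_well_formed_langtag(tag: str) -> bool:
    """Return True if `tag` matches the simplified langtag pattern above."""
    return _LANGTAG.fullmatch(tag) is not None


# Two-letter ISO639-1 codes pass unchanged:
print(is_well_formed_langtag("en"), is_well_formed_langtag("fr"))
# Tags that ISO639-1 cannot express also pass:
print(is_well_formed_langtag("gsw"), is_well_formed_langtag("zh-Hant-TW"))
# Locale-style identifiers with an underscore do not:
print(is_well_formed_langtag("en_US"))
```

Any two-letter ISO639-1 code matches the first branch of the pattern, which is why existing documents would keep working, while three-letter codes and script or region subtags become available as well.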
Thanks for your answer!
My understanding is that the latest XSD spec requires BCP47 lang tags, while the 1.0 spec indeed refers to RFC1766. I can't think of any reason why RFC1766 should be recommended over BCP47, but perhaps there is one?