w3c/alreq

Terminology

TitusNemeth opened this issue · 6 comments

Typographic terminology should be followed where established.

Section 2.7 Font and typographical considerations
A font is not a typeface and vice versa. A font is an instantiation of a typeface in a particular technology, encompassing all character descriptions (sorts, glyphs) of a particular style (historically also size), whereas a typeface is the image that is created by these configurations on a support, irrespective of technology. So there is a hot-metal font Monotype Series 549 and a photocomposition font Monotype Series 549, but only one typeface. There is an Adobe Arabic typeface, but it contains four fonts (Regular, Bold, Italic, Bold Italic).

Section 2.6 is called Diacritics. I am not sure if this is the best term. Diacritical signs are usually considered to be a part of the letter, as in the dot of an i or the dot below the rasm of a beh ب. What is discussed in 2.6 are tashkīl, in Western type terminology usually simplified as 'vocalisation'. I think we should distinguish between diacritics and vowels and other combining marks, as they have different functions, different semantics, and work very differently in Unicode and (most) current font technologies. Given the prominence of vowels amongst such combining marks, I think it may be admissible to call them 'vowels', with a note explaining that there are also other orthographic and grammatical signs that behave similarly.
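
To illustrate the point about Unicode, here is a minimal sketch using Python's standard unicodedata module. The helper name describe and the choice of sample characters are assumptions made for this example, not anything taken from the document under discussion. It shows that the dotted letters are encoded as atomic letters with no decomposition, while a vowel sign such as fatha is a separate nonspacing combining mark.

```python
import unicodedata

# Sample characters (chosen for illustration only).
LATIN_I = "\u0069"   # LATIN SMALL LETTER I: the dot is part of the letter
BEH = "\u0628"       # ARABIC LETTER BEH: the dot below the rasm is part of the letter
FATHA = "\u064E"     # ARABIC FATHA: a vowel sign (tashkil)
SHADDA = "\u0651"    # ARABIC SHADDA: a non-vowel tashkil sign


def describe(ch):
    """Return (name, general category, combining class, decomposition) for one character."""
    return (
        unicodedata.name(ch),
        unicodedata.category(ch),       # 'Lo'/'Ll' for letters, 'Mn' for nonspacing marks
        unicodedata.combining(ch),      # 0 for base characters, non-zero for these marks
        unicodedata.decomposition(ch),  # empty string when the character is atomic
    )


for ch in (LATIN_I, BEH, FATHA, SHADDA):
    print(f"U+{ord(ch):04X}", describe(ch))
```

In this sketch the dots never appear as separate code points, whereas fatha and shadda do, which is one concrete sense in which the two groups work differently.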

r12a commented

I suspect that actually the Diacritics section is currently referring to all combining characters, be they tashkīl or other, since the first sentence talks in terms of code point positioning.

r12a commented

Btw, the glossary refers to the dots as ijam, though we don't appear to use that term in the body of the document. There is also a definition of tashkil.

Makes sense to me to begin that section with a sentence that defines its scope and relevant terminology.

r12a commented

To make sure i'm clear, is the issue about font vs typeface related specifically to section 2.7.4 https://w3c.github.io/alreq/#h_fonts or more widespread in section 2.7? Also, presumably, there is no issue with the use of the term 'font style' in the earlier part of 2.7(?)

Re. ijam – this opens a much wider question: should Arabic terms be used, or should English terms be used, or a combination thereof? This can become quite tricky and controversial, not least because we're not talking about Arabic and Persian only. Urdu, Sindhi, etc. may have different terms again. It's a bit like the case of everybody talking about 'Farsi' rather than 'Persian' in an English language context. It's received a bit of criticism and mockery.

I would think that it makes sense to use 'generally known' terms such as Naskh, and those without equivalent in English (say rasm) but I wonder where the line should be drawn for those that have an English equivalent. Based on the glossary definition ijam are diacritical signs.

I guess the font – typeface distinction is lost in most conversations about typography on the web. People tend to use it synonymously, but it would be preferable to maintain a distinction. What is a 'font style' compared to a 'type style' and a 'writing style'? I'm sceptical of using 'font style' to describe a writing style, as happens in 2.7.

r12a commented

"Re. ijam – this opens a much wider question: should Arabic terms be used, or should English terms be used, or a combination thereof?"

Indeed. See a similar comment i just left on #204 (comment)

Note that we normally try to align our terminology and definitions with the Unicode glossary. According to that definition a diacritic can include accent marks (which in Unicode may or may not be represented using combining characters). It's a definition which i always took to be applicable to all sorts of short visual marks applied to base characters, including the tashkil, but it's definitely fairly vague as to what exactly is included. I suspect that if we use the word diacritic we should use it in a vague way, but we should be more rigorous in many places where we refer to ijam and tashkil. Note, btw, that we are indeed only talking about Arabic and Persian here, so i'm not sure we need to worry about implications of Urdu etc.(?)

I've found the passage in the Standard that you may refer to (the link pointed to the other issue):

In this section and the following section, the terms nonspacing mark and combining character are used interchangeably. The terms diacritic, accent, stress mark, Hebrew point, Arabic vowel, and others are sometimes used instead of nonspacing mark. (They refer to particular types of nonspacing marks.) Properly speaking, a nonspacing mark is any combining character that does not add space along the writing direction. For a formal definition of nonspacing mark, see Section 3.6, Combination.

where the definition of combining marks says:

They include such characters as accents, diacritics, Hebrew points, Arabic vowel signs, and Indic matras.

So my reading is this: the Standard recognises that these are different things, but uses the terms interchangeably in the body of the text (which it shouldn't; maybe this is just to make it less repetitive to read). In other words, whilst they may all be combining marks, diacritical signs are not vowels, nor accents, nor Hebrew points etc.
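
A small sketch of why the character properties alone do not settle this, again using Python's standard unicodedata module. The SAMPLES table and the function name general_categories are hypothetical and exist only for this illustration. It looks up the general category of one representative of each kind of mark listed in the definition above.

```python
import unicodedata

# One representative per kind of mark named in the definition (illustrative choice).
SAMPLES = {
    "accent": "\u0301",             # COMBINING ACUTE ACCENT
    "Hebrew point": "\u05B7",       # HEBREW POINT PATAH
    "Arabic vowel sign": "\u064E",  # ARABIC FATHA
    "Indic matra": "\u093E",        # DEVANAGARI VOWEL SIGN AA
}


def general_categories(samples):
    """Map each label to the Unicode general category of its sample character."""
    return {label: unicodedata.category(ch) for label, ch in samples.items()}


cats = general_categories(SAMPLES)
for label, cat in cats.items():
    print(f"{label}: {cat}")
```

The accent, the Hebrew point and the Arabic vowel sign all come back as Mn (nonspacing mark), while the matra chosen here is Mc (spacing combining mark). The general category therefore records how a mark behaves in layout, not whether it is a diacritic, a vowel or an accent, so that distinction has to come from the terminology we choose.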

Re. other languages using the Arabic script: I see that this document limits its scope to Arabic and Persian. Realistically, though, it will be used either as a model or without changes for all languages using the Arabic script (this document has been years in the making; by whom and when will the same be done for other contexts?). And this is probably reasonable, as most aspects that should be covered here will also apply to other languages. But I think we should be as inclusive in our outlook as possible, and make the document applicable as far as possible to the Arabic script as a whole, rather than to a couple of languages.