daisy/pipeline

Conversion of Page Breaks

clapierre opened this issue · 5 comments

We noticed at Bookshare that page numbers are being converted incorrectly: the output does not take advantage of the new DPUB-ARIA accessibility semantics. Also, the previous approach of using title="###" is no longer recommended. We found this in NIMAS books converted at Bookshare with the latest DAISY Pipeline.

For example:
[Input]

<pagenum id="p72" page="normal">72</pagenum>

[Current Output]

<span epub:type="pagebreak" title="72" id="p72" />

[Expected Output]

<span role="doc-pagebreak" aria-label=" 72. " id="p72" />

[Alternative Acceptable Output]

<span epub:type="pagebreak" role="doc-pagebreak" aria-label=" 72. " id="p72" />

epub:type has no AT semantics and is not even needed. It is fine to keep, but the important point is that role="doc-pagebreak" must be included.

'title' should not be used, as it is usually skipped by AT; 'aria-label' is the best option here for getting the page number spoken correctly. Including both 'title' and 'aria-label' is also not recommended.

Notice the extra spaces around the number and the period after it. They help AT speak the page number correctly instead of running it into the surrounding text.

Ideally it would be aria-label=" Page 72. ", but I doubt you want to get into language conversion issues.
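To make the requested mapping concrete, here is a minimal sketch in Python. It is not the Pipeline's actual implementation (the real conversion lives in the dtbook-to-epub3 script); the function name `pagenum_to_span` is hypothetical, and the sketch assumes the optional epub:type attribute is simply dropped.

```python
import xml.etree.ElementTree as ET


def pagenum_to_span(pagenum_xml):
    """Illustrative only: map a DTBook <pagenum> to the expected <span>.

    Hypothetical helper, not part of the DAISY Pipeline.
    """
    src = ET.fromstring(pagenum_xml)
    number = (src.text or "").strip()

    span = ET.Element("span")
    # The DPUB-ARIA role is the part that carries semantics for AT.
    span.set("role", "doc-pagebreak")
    # Leading space, trailing period and space, as requested above.
    span.set("aria-label", " " + number + ". ")
    # Preserve the id so existing page links keep working.
    if src.get("id") is not None:
        span.set("id", src.get("id"))
    # No 'title' attribute is written.
    return ET.tostring(span, encoding="unicode")


print(pagenum_to_span('<pagenum id="p72" page="normal">72</pagenum>'))
```

The label is built from the element text alone, so no language-specific word such as "Page" is needed.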

Thanks.

Note that the conversion script being used here is "dtbook-to-epub3", and not the NIMAS script (we convert the NIMAS to DAISY first).

Thanks! I will improve the script according to your recommendations.

Just one question: what would be harmful in including both 'title' and 'aria-label'?

aria-label maps to the accessible name on most popular platforms. title maps inconsistently, or at least it used to (to the accessible description on some platforms and the accessible name on others). Having both can also be a source of double speaking.
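For anyone who wants to check converted output for this, a rough sketch of a scan that lists elements carrying both attributes. The function name `find_double_labelled` is hypothetical, and it assumes the content document is well-formed XML.

```python
import xml.etree.ElementTree as ET


def find_double_labelled(xhtml):
    """Return the ids of elements that have both 'title' and 'aria-label'.

    Hypothetical helper for illustration; elements without an id are
    reported as an empty string.
    """
    root = ET.fromstring(xhtml)
    return [
        el.get("id", "")
        for el in root.iter()
        if el.get("title") is not None and el.get("aria-label") is not None
    ]


sample = (
    '<div>'
    '<span role="doc-pagebreak" aria-label=" 72. " title="72" id="p72" />'
    '<span role="doc-pagebreak" aria-label=" 73. " id="p73" />'
    '</div>'
)
print(find_double_labelled(sample))
```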

Thanks for the explanation @sinabahram. Note that for now we haven't dropped title yet. (Because I didn't get a response in time, and now the release has been made.)