XML data from https://tipitaka.org/
From README in https://github.com/VipassanaTech/tipitaka-xml
These files are made freely available for non-commercial use. Please attribute Vipassana Research Institute when incorporating these files into your projects.
These files are updated from time to time to correct errors that come to light. Please set up your project so that it can sync with this repo, to ensure that you are working with the latest versions.
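One way to keep a local copy in sync is sketched below. It is only an illustration, not part of the repository: the helper names `sync_command` and `sync` are hypothetical, it assumes `git` is installed and on the PATH, and it assumes the GitHub URL quoted above is the repo you want to track.

```python
import subprocess
from pathlib import Path

# Repository URL as quoted in the README text above.
REPO_URL = "https://github.com/VipassanaTech/tipitaka-xml"


def sync_command(target, repo_url=REPO_URL):
    """Return the git command (as an argument list) that brings `target` up to date.

    Hypothetical helper: clones when no checkout exists, otherwise pulls.
    """
    target = Path(target)
    if (target / ".git").is_dir():
        # Existing checkout: fast-forward only, so local edits are never merged silently.
        return ["git", "-C", str(target), "pull", "--ff-only"]
    # No checkout yet: make a fresh clone into `target`.
    return ["git", "clone", repo_url, str(target)]


def sync(target):
    """Run the command chosen by sync_command; raises CalledProcessError on failure."""
    subprocess.run(sync_command(target), check=True)
```

Calling `sync("tipitaka-xml")` from a scheduled job or a build step would then clone on the first run and pull on later runs. The `--ff-only` flag is a deliberate choice here: if the local copy has diverged, the pull fails loudly rather than creating a merge.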
Script
อักษรไทย (Thai): Pending issues
Known issues with Thai font
- Two Thai characters need to be modified: their small subscript additions should be removed so that they look the way these characters usually look in Thai.
සිංහල (Sinhala): Pending issues
Known issues with Sinhala font
- The Unicode font used, Kaputaunicode, has some limitations and does not render some characters properly. In case of confusion, please check the word in the Roman script version.
বাংলা (Bengali): Pending issues
Known issues with Bengali font
- tta appears in place of tva.
- For both ru and ruu, the matra of the short and long u appears next to 'r', as in Hindi.
- Half nya is rendered as full nya: अरञ्ञगतो appears as अरञञगतो.
- The abbreviation sign '॰', as in सी॰ सया॰ पी॰...प॰..., is rendered as a block in every case.
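When a character renders unexpectedly, as in the items above, listing the underlying code points can help tell a font limitation apart from a problem in the data. The sketch below is a diagnostic aid only and is not part of the repository. The helper name `describe` is hypothetical, and it assumes that "half nya" means the letter NYA followed by the virama sign.

```python
import unicodedata


def describe(text):
    """Return a (code point, Unicode name) pair for every character in `text`."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# Assumption: half nya is NYA + VIRAMA, full nya is the bare letter.
half_nya = "\u099E\u09CD"  # Bengali NYA followed by VIRAMA
full_nya = "\u099E"        # Bengali NYA alone
abbreviation = "\u0970"    # the sign used in abbreviations such as सी॰

for sample in (half_nya, full_nya, abbreviation):
    for code, name in describe(sample):
        print(code, name)
    print()
```

If the virama is present in the XML text but the conjunct still displays as a full letter, the font is the more likely cause. The abbreviation sign is a Devanagari code point, so a font that covers only the Bengali block may have no glyph for it.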
ਗੁਰਮੁਖੀ (Gurmukhi): Pending issues
Known issues with Gurmukhi font
- None of the vowel matras or the bindi combine correctly with "va".
དབུ་ཅན་ (Tibetan): Pending issues
Known issues with Tibetan font
- The letters ca and ja are being displayed with a curve on the top called “rafar”. Some experts are of the view that this curve is not required. You are free to reach your own conclusion.
- The traditional system of writing Pali in Tibetan requires dots between letters; however, the fonts are not able to render them.
See also