Open Islamicate Texts Initiative (OpenITI)
To cite the latest release properly, see: http://doi.org/10.5281/zenodo.4075046
Pinned Repositories
acdc_train
Automatic Collation for Diversifying Corpora
Annotation
Description of the project. If you have any suggestions for the project as a whole, please add them as an issue to this repository!
arabic_print_data
mARkdown_scheme
OpenITI mARkdown scheme
OCR_GS_Data
Double-checked Gold Standard Data for Training and Testing OCR Engines
ocr_with_kraken_public
OCR texts with Kraken for OpenITI on your machine
openiti
Python library
RELEASE
OpenITI releases
tei_openiti
TEI customization and schema for representing OpenITI mARkdown documents in TEI XML
TrainingData
OpenITI Training Data
Open Islamicate Texts Initiative (OpenITI)'s Repositories
OpenITI/RAW_Zaydiyya
OpenITI/8
OpenITI/arabic_ms_data
OpenITI/mARkdown_highlighting_Kate
OpenITI mARkdown highlighting schema for Kate editor
OpenITI/documentation
Cumulative documentation for the project.
OpenITI/kraken
OCR engine for all the languages
OpenITI/openiti
Python library
OpenITI/0025AH
OpenITI/i.mech-20
OpenITI/i.mech-v5
OpenITI/i.mech
OpenITI/RELEASE
OpenITI releases
OpenITI/8001AH
OpenITI/0100AH
OpenITI/0075AH
OpenITI/0050AH
OpenITI/mARkdownMSS
Developing mARkdown for MANUSCRIPT transcriptions.
OpenITI/1500AH
OpenITI/aocp_ms_eval
AOCP Manuscript Evaluation Data
OpenITI/oitei
OpenITI TEI Converter
OpenITI/PER1025AH
Texts written in Persian by authors who died in the 25 years up to 1025 AH
OpenITI/PER0850AH
Texts written in Persian by authors who died in the 25 years up to 0850 AH
OpenITI/PER1350AH
Texts written in Persian by authors who died in the 25 years up to 1350 AH
OpenITI/PER0650AH
Texts written in Persian by authors who died in the 25 years up to 0650 AH
OpenITI/PER0375AH
Texts written in Persian by authors who died in the 25 years up to 0375 AH
OpenITI/oimdp
OpenITI mARkdown Parser
OpenITI/RAW_Hindawi
Unconverted files from Hindawi library
OpenITI/OpenITI_CL
This version of the corpus contains CLEAN texts, with all PARAEDITORIALS removed. PARAEDITORIALS include editorial introductions, footnotes, and indices. These texts are meant for language-modeling tasks.
OpenITI/pubs
A repository for OpenITI's digital publications
OpenITI/arabic_script_ocr_models
Repository of current Arabic script OCR models available for download
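
Several repository names in the listing above encode a period, e.g. PER1025AH is described as holding texts by authors who died in the 25 years up to 1025 AH. The following is a minimal sketch of how such names could be decoded. The function name, the regular expression, and the reading of "the 25 years up to 1025" as the inclusive range 1001 to 1025 are assumptions made for illustration, and it is also an assumption that unprefixed names such as 0025AH follow the same scheme. None of this is part of the openiti library.

```python
import re
from typing import NamedTuple, Optional


class RepoPeriod(NamedTuple):
    prefix: Optional[str]  # e.g. "PER"; None when the name carries no prefix
    start_ah: int          # first death year covered (assumed inclusive)
    end_ah: int            # last death year covered (inclusive)


# Hypothetical pattern: optional uppercase prefix, four digits, then "AH".
_NAME = re.compile(r"^(?P<prefix>[A-Z]+)?(?P<year>\d{4})AH$")


def parse_repo_period(name: str, span: int = 25) -> Optional[RepoPeriod]:
    """Decode a period-style repository name, or return None if it does not match."""
    repo = name.split("/")[-1]  # drop the "OpenITI/" owner part if present
    match = _NAME.match(repo)
    if match is None:
        return None
    end = int(match.group("year"))
    # Assumption: the bucket covers the `span` years ending at `end`, inclusive.
    return RepoPeriod(match.group("prefix"), end - span + 1, end)


print(parse_repo_period("OpenITI/PER1025AH"))
# RepoPeriod(prefix='PER', start_ah=1001, end_ah=1025)
```

Names that do not fit the pattern, such as OpenITI/kraken, yield None, so the helper can be used to filter a mixed listing like this one.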