dsfsi/vukuzenzele-nlp
The dataset contains editions from the South African government magazine Vuk'uzenzele. Data was scraped from PDFs that have been placed in the data/raw folder. The PDFs were obtained from the Vuk'uzenzele website.
Jupyter Notebook · MIT
Issues
Extract 2020 Stories
#14 opened by vukosim - 1
Download Latest Vukuzenzele in all languages
#1 opened by vukosim - 1
Investigate the use of Marker
#38 opened by vukosim - 0
Investigate the use of Marker
#37 opened by vukosim - 0
Download Vukuzenzele 2023 PDFs
#39 opened by lastrucci01 - 3
Update Sentence Alignment
#34 opened by lastrucci01 - 1
Convert /data/processed txts into a JSON doc
#32 opened by lastrucci01 - 6
Lang Code Mismatch: nso vs sot vs sep
#29 opened by lastrucci01 - 2
Update 2022 PDFs
#15 opened by vukosim - 1
Update citation
#28 opened by lastrucci01 - 1
Make comprehensive Annotation Instructions
#25 opened by lastrucci01 - 1
Sentence Align Action Failure
#23 opened by lastrucci01 - 0
Make Data statement & README
#17 opened by lastrucci01 - 3
Sentence Alignment
#16 opened by lastrucci01 - 1
Updated DATASHEET.md with relevant information
#10 opened by vukosim - 8
Auto vs Manual Extraction
#9 opened by vukosim - 0
Download All Vukuzenzele PDFs
#3 opened by vukosim