This repository is not active
hezam2022/text-categorization
We used a subset of the Single-labeled Arabic News Articles Dataset (SANAD), a large collection of textual data gathered from three news portals. The full dataset consists of almost 200k articles distributed across seven categories, and we offer it to the research community working on Arabic computational linguistics.
Jupyter Notebook