Quran

Overview
- Structure of the Quran
- Directory Tree
Instructions
Disclaimer
Other Related Work

Overview

The Quran is the official religious text of Islam. The one and only book for Muslims which has been read and memorized by thousands across the globe in the Arabic language.

The goal for this repository is to have the Quran in a readeable data format to apply a data science question.

In this repository, some words in this README.md and variable names in the respective .ipynb/.py files will be in Arabic though written in English. Here's a glossary of those terms:

Structure of the Quran

Surah: Chapter (e.g. The Quran has a total of 114 chapters)
Ayah: Verse
Juz: Part (i.e. The Quran is divided into 30 parts)

Mock data. See data folder for updated dataset

Directory Tree

quran_dataset
├── data
│   ├── holy_quran.json 
│   ├── surahs # CSV file for all 114 surahs
├── img
│   ├── quran_data_ch_1-2.png # for README.md
├── README.md
├── quran_df.ipynb # in-progress
├── quran_df.py # in-progress
├── quran_json_download.py 
├── surah_df.ipynb
└── surah_df.py

Instructions

Create the following folders/subfolders locally
- data
- data/surahs
Install the following libraries: json, numpy, pandas, requests, scipy
Run quran_df.py (Quran CSV file) and/or surah_df.py (Surah CSV file. If you want data for multiple surahs, simply run the file again)

Disclaimer

The data compiled here (holy_quran.json) is derived from Al Quran Cloud API.

The purpose and intent of this repository is to aid in the understanding of the Quran.

What do I mean by understanding? There are many data science tools/methods one can explore by using this data, NOT understanding the text itself, as the data compiled here has no related features.

If there's an error in the ayahs column (e.g., missing a word, تشكيل/Tashkil (i.e. Arabic vowelization), ayahs being out of order, etc.), one should consult the Quran as that's the only correct source.

The data presented does not and will never precede the Quran.

bzekeria/quran_dataset