- Arabic and Urdu Unicode Redundancy Problem
- Normalization
    - Urdu Single Character Normalization
    - Urdu Combined Characters Normalization
- Urdu Data Pre-Processing
    - Urdu Diacritics Removal
    - Urdu Spaces Before & After Digits
    - Urdu Spaces After Punctuation
    - Urdu Joined Words Fix
- Tokenization
    - Sentence Tokenization
    - Word Tokenization
- Classification
    - Sentiment Analysis
    - Sentence Classification
    - Document Classification
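
To make the Unicode redundancy problem concrete, here is a minimal sketch in plain Python. It does not use Urduhack and is not its implementation: the function names and the three-entry mapping are assumptions chosen for illustration. Several Urdu letters have visually similar Arabic code points, so text typed on different keyboards can look identical yet compare as unequal until it is normalized.

```python
import re

# Assumed, deliberately partial mapping from Arabic code points to the
# code points usually preferred in Urdu text. Illustrative only.
ARABIC_TO_URDU = {
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0647": "\u06C1",  # ARABIC LETTER HEH -> ARABIC LETTER HEH GOAL
}
_TRANSLATION = str.maketrans(ARABIC_TO_URDU)

# Combining marks from FATHATAN (U+064B) through SUKUN (U+0652).
_DIACRITICS = re.compile("[\u064B-\u0652]")


def normalize_characters(text: str) -> str:
    """Replace each mapped Arabic code point with its Urdu counterpart."""
    return text.translate(_TRANSLATION)


def remove_diacritics(text: str) -> str:
    """Strip the combining marks in the range defined above."""
    return _DIACRITICS.sub("", text)


if __name__ == "__main__":
    arabic_kaf = "\u0643"
    urdu_keheh = "\u06A9"
    # The two letters render alike but are different code points.
    print(arabic_kaf == urdu_keheh)
    print(normalize_characters(arabic_kaf) == urdu_keheh)
```

A real normalizer needs a far larger mapping and must also handle combined characters; the point here is only that a lookup table plus `str.translate` is enough to see why the step matters before tokenization or classification.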
Urduhack officially supports Python 3.6–3.7, and runs great on PyPy.
To install Urduhack, simply use pip:
$ pip install urduhack
Fantastic documentation is available at https://urduhack.readthedocs.io/
- Check for open issues or open a fresh issue to start a discussion around a feature idea or a bug. There is a Contributor Friendly tag for issues that should be ideal for people who are not very familiar with the codebase yet.
- Write a test which shows that the bug was fixed or that the feature works as expected.
- Send a pull request and bug the maintainer until it gets merged and published. :)