Pinned Repositories
A-Monolingual-Arabic-Parallel-Corpus-
ANLP_dataset
Arabic-Corpus-for-Error-Detection
Arabic-Humor
The Arabic humor dataset consists of 10k tweets collected using Twint and Sketch Engine.
Arabic-Paraphrased-Dataset
The Arabic paraphrased parallel dataset is sourced from diverse origins and expanded through data augmentation. It targets NLP applications in education, search engines, content creation, social media, and domain-specific settings.
Arabic-Patents
Arabic-Topic-Modeling
BERT for Arabic Topic Modeling: An Experimental Study on BERTopic Technique
ArabicSurvey
A repository for survey and review papers in Arabic Natural Language Processing (ANLP), named أسبر in Arabic.
Saudi-Bank-Sentiment-Dataset
This dataset contains customers’ sentiments on Twitter toward four Saudi banks. Of roughly 12k tweets in total, 8,669 are labeled "Negative", 2,143 are labeled "Positive", and 1,236 are labeled "Neutral".
Saudi-Dialect-Irony-Dataset
The Saudi irony dataset was collected using the Twitter API and consists of 19,810 tweets, 8,089 of which are labeled as ironic.
iwan-rg's Repositories
iwan-rg/ArabicSurvey
A repository for survey and review papers in Arabic Natural Language Processing (ANLP), named أسبر in Arabic.
iwan-rg/Arabic-Topic-Modeling
BERT for Arabic Topic Modeling: An Experimental Study on BERTopic Technique
iwan-rg/A-Monolingual-Arabic-Parallel-Corpus-
iwan-rg/Saudi-Dialect-Irony-Dataset
The Saudi irony dataset was collected using the Twitter API and consists of 19,810 tweets, 8,089 of which are labeled as ironic.
iwan-rg/Saudi-Bank-Sentiment-Dataset
This dataset contains customers’ sentiments on Twitter toward four Saudi banks. Of roughly 12k tweets in total, 8,669 are labeled "Negative", 2,143 are labeled "Positive", and 1,236 are labeled "Neutral".
iwan-rg/Arabic-Humor
The Arabic humor dataset consists of 10k tweets collected using Twint and Sketch Engine.
iwan-rg/ANLP_dataset
iwan-rg/Arabic-Corpus-for-Error-Detection
iwan-rg/Arabic-Paraphrased-Dataset
The Arabic paraphrased parallel dataset is sourced from diverse origins and expanded through data augmentation. It targets NLP applications in education, search engines, content creation, social media, and domain-specific settings.
iwan-rg/Arabic-Patents
iwan-rg/ARC-WMI
Baseline results toward constructing the readability corpus ARC-WMI, a new Arabic collection of written medicine information annotated with readability levels.
iwan-rg/NLP-Patents
A repository for patents in the field of Natural Language Processing (NLP).
iwan-rg/CLEANANERCorp
CLEANANERCorp, a corrected version of the classic Arabic NER benchmark ANERcorp with updated and more consistent NER labels.
iwan-rg/OpenTriviaQA
A Creative Commons dataset of trivia questions and answers.
iwan-rg/Arabic-Spell-Checker
iwan-rg/iwan-rg.github.io
iwan-rg/iwan-website
iwan-rg/MADAD
iwan-rg/NLP-Exercises
iwan-rg/Saudi_Privacy_policy
Saudi Arabic Privacy Policy Dataset
iwan-rg/starter-hugo-research-group