Reviews_NLP

Background

In this project, we use a dataset from Kaggle. It is one of several Arabic datasets available for classification benchmarking and other NLP tasks. The dataset is mainly a compilation of several existing datasets, sampled down to 100k rows. It consists of reviews, and each review is one of three types: Negative, Positive, or Mixed.

Prediction

Build a model that predicts whether a review text is Negative, Positive, or Mixed.
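As a rough illustration of what such a model could look like, here is a minimal baseline sketch: TF-IDF features fed into a logistic regression classifier with scikit-learn. This pairing is an assumption based on `TfidfVectorizer` and `sklearn.linear_model` appearing in the library list, not necessarily the final model of this project. The six example reviews are made up for illustration and are not rows from the dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy, made-up reviews for illustration only (not taken from the dataset).
texts = [
    "فندق رائع وخدمة ممتازة",        # great hotel and excellent service
    "كتاب جميل ومفيد جدا",            # a lovely and very useful book
    "تجربة سيئة وخدمة رديئة",         # bad experience and poor service
    "فيلم ممل وسيئ جدا",              # a boring and very bad movie
    "المنتج جيد لكن السعر مرتفع",     # the product is good but the price is high
    "الموقع ممتاز لكن الغرف قديمة",   # the location is excellent but the rooms are old
]
labels = ["Positive", "Positive", "Negative", "Negative", "Mixed", "Mixed"]

# Vectorize the text and fit the classifier in a single pipeline.
model = make_pipeline(
    TfidfVectorizer(),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)

# Predict the class of an unseen review ("excellent service").
predicted = model.predict(["خدمة ممتازة"])
print(predicted[0])
```

On the real data, the same pipeline would be fit on a training split and scored on a held-out split, for example with `sklearn.model_selection.train_test_split` and `sklearn.metrics.classification_report`.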

Data Description

The dataset combines reviews of hotels, books, movies, products, and a few airlines. It has three classes (Mixed, Negative, and Positive). Most labels were mapped from reviewers' ratings: a rating of 3 is Mixed, above 3 is Positive, and below 3 is Negative. Each row holds a label and the review text, separated by a tab (TSV). The review text has been cleaned by removing Arabic diacritics and non-Arabic characters. The dataset contains no duplicate reviews.
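The rating-to-label mapping and the text cleaning described above can be sketched as follows. This is only an approximation under stated assumptions: the function names `rating_to_label` and `clean_review` are hypothetical, and the exact character ranges used by the dataset authors are not documented here, so the Unicode ranges below (diacritics U+064B to U+0652, tatweel U+0640, basic Arabic letters U+0621 to U+064A) are a reasonable guess rather than the original procedure.

```python
import re

def rating_to_label(rating):
    """Map a numeric rating to a class: 3 is Mixed, above 3 Positive, below 3 Negative."""
    if rating > 3:
        return "Positive"
    if rating < 3:
        return "Negative"
    return "Mixed"

# Assumed character sets (not confirmed by the dataset description):
# Arabic diacritics (tashkeel) plus the tatweel elongation character.
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")
# Anything that is neither a basic Arabic letter nor whitespace.
_NON_ARABIC = re.compile(r"[^\u0621-\u064A\s]")

def clean_review(text):
    """Strip diacritics, replace non-Arabic characters with spaces, collapse whitespace."""
    text = _DIACRITICS.sub("", text)
    text = _NON_ARABIC.sub(" ", text)
    return " ".join(text.split())

print(rating_to_label(5), rating_to_label(3), rating_to_label(1))
```

The `strip_tatweel` and `strip_shadda` helpers listed under Libraries cover part of the same ground; the regular expressions here just keep the sketch free of extra dependencies.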

Field Name	Description
Label	User 'sentiment': Mixed, Negative, Positive
Text	Review text

Number of rows = 100,000

Number of columns = 2
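A quick way to check the row and column counts is to load the TSV with pandas and inspect its shape. The sketch below uses a tiny in-memory stand-in with two made-up rows instead of the real file, and it assumes a header row with the column names `label` and `text`; if the actual file has no header, `header=None` and `names=[...]` would be needed in `read_csv`.

```python
import io

import pandas as pd

# Stand-in for the real file: two made-up rows in label<TAB>text layout.
# The header names "label" and "text" are an assumption.
sample_tsv = "label\ttext\nPositive\tفندق رائع\nNegative\tخدمة سيئة\n"

# For the real dataset, pass the path of the TSV file instead of the StringIO object.
df = pd.read_csv(io.StringIO(sample_tsv), sep="\t")

n_rows, n_cols = df.shape                    # the full dataset should report 100000 and 2
label_counts = df["label"].value_counts()    # class balance across the labels
print(n_rows, n_cols)
print(label_counts)
```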

Tools

Technologies

Python
Jupyter Notebook
PowerPoint for presentation
web

Libraries

ArabicLightStemmer
libqutrub.conjugator
naftawayh.wordtag
tashaphyne.stemming
plotly.graph_objs
TruncatedSVD
TfidfVectorizer
CountVectorizer
NMF
strip_tatweel
strip_shadda
FarasaPOSTagger
FarasaNamedEntityRecognizer
FarasaDiacritizer
FarasaSegmenter
FarasaStemmer
qalsadi.lemmatizer
pandas
numpy
sklearn.linear_model
sklearn.model_selection
sklearn.preprocessing
sklearn.metrics
matplotlib.pyplot
seaborn
string
nltk
warnings