Reviews_NLP

Background

In this project, we use a dataset from Kaggle. Several Arabic datasets are available for classification comparison and other NLP tasks; this dataset is mainly a compilation of many of them, sampled down to 100k rows. It consists of reviews, and each review is one of three types: Negative, Positive, or Mixed.



Prediction

Create a model to predict whether a review's text is Negative, Positive, or Mixed.
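As a rough illustration of the prediction task, the sketch below fits a TF-IDF plus logistic regression pipeline on a handful of made-up reviews. The toy sentences, the choice of LogisticRegression, and the default vectorizer settings are assumptions for illustration only; the notebook's actual model and preprocessing may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy reviews; the real dataset has 100k rows.
texts = [
    "فندق رائع وخدمة ممتازة",
    "كتاب جميل ومفيد جدا",
    "خدمة سيئة وتجربة مخيبة",
    "منتج رديء لا انصح به",
    "الفيلم جيد لكن النهاية ضعيفة",
    "الغرفة نظيفة لكن السعر مرتفع",
]
labels = ["Positive", "Positive", "Negative", "Negative", "Mixed", "Mixed"]

# Turn each review into TF-IDF features, then fit a three-class classifier.
model = make_pipeline(
    TfidfVectorizer(),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)

# Predict the class of two unseen (also made-up) reviews.
predictions = model.predict(["خدمة ممتازة", "تجربة سيئة"])
print(list(predictions))
```

With only six training sentences the predictions are not meaningful; the point is the shape of the workflow: vectorize the text, fit a classifier, and predict one of the three labels.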

Data Description

The dataset combines reviews of hotels, books, movies, products, and some airlines. It has three classes (Mixed, Negative, and Positive). Most labels were mapped from reviewer ratings: a rating of 3 is Mixed, above 3 is Positive, and below 3 is Negative. Each line holds a label and a text separated by a tab (TSV). The review text has been cleaned by removing Arabic diacritics and non-Arabic characters. The dataset contains no duplicate reviews.
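The rating-to-label mapping and the text cleanup described above can be sketched with the standard library alone. This is an approximation under stated assumptions: the 1 to 5 rating scale, the function names, and the exact Unicode ranges are illustrative, not the dataset authors' actual procedure.

```python
import re

# Arabic diacritics (tashkeel): U+064B to U+0652, plus the superscript alef U+0670.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")
# Anything outside the basic Arabic letter range U+0621 to U+064A or whitespace.
# Note that this range also keeps the tatweel character (U+0640).
NON_ARABIC = re.compile(r"[^\u0621-\u064A\s]")


def rating_to_label(rating: int) -> str:
    """Map a numeric rating to a class (a 1-5 scale is assumed)."""
    if rating > 3:
        return "Positive"
    if rating < 3:
        return "Negative"
    return "Mixed"


def clean_review(text: str) -> str:
    """Drop diacritics and non-Arabic characters, then collapse whitespace."""
    text = DIACRITICS.sub("", text)
    text = NON_ARABIC.sub(" ", text)
    return " ".join(text.split())


print(rating_to_label(4), rating_to_label(3), rating_to_label(2))
```

Removing diacritics before the non-Arabic filter matters here, because the second pattern would otherwise replace each diacritic with a space and split words apart.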

Field Name    Description
Label         User 'sentiment': Mixed, Negative, Positive
Text          Review text


  • Number of rows = 100,000
  • Number of columns = 2
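To make the two-column layout concrete, here is a minimal sketch that reads label and text pairs from tab-separated input and counts the classes. The sample rows are invented and the helper name `load_reviews` is hypothetical; whether the real file has a header row is not stated here, so the sketch assumes it does not.

```python
import csv
import io
from collections import Counter

# A tiny in-memory stand-in for the real file (hypothetical rows).
SAMPLE_TSV = (
    "Positive\tفندق ممتاز\n"
    "Negative\tخدمة سيئة\n"
    "Mixed\tالكتاب مقبول\n"
)


def load_reviews(handle):
    """Read (label, text) pairs from a tab-separated stream."""
    # QUOTE_NONE keeps quote characters inside reviews as plain text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [(row[0], row[1]) for row in reader if len(row) == 2]


rows = load_reviews(io.StringIO(SAMPLE_TSV))
label_counts = Counter(label for label, _ in rows)
print(len(rows), dict(label_counts))
```

For the real data, the same function could be given a file opened with `encoding="utf-8"` and `newline=""`; `pandas.read_csv` with `sep="\t"` is an equivalent route if a DataFrame is preferred.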

Tools

Technologies

  • Python
  • Jupyter Notebook
  • PowerPoint for presentation
  • web

Libraries

  • ArabicLightStemmer
  • libqutrub.conjugator
  • naftawayh.wordtag
  • tashaphyne.stemming
  • plotly.graph_objs
  • TruncatedSVD
  • TfidfVectorizer
  • CountVectorizer
  • NMF
  • strip_tatweel
  • strip_shadda
  • FarasaPOSTagger
  • FarasaNamedEntityRecognizer
  • FarasaDiacritizer
  • FarasaSegmenter
  • FarasaStemmer
  • qalsadi.lemmatizer
  • pandas
  • numpy
  • sklearn.linear_model
  • sklearn.model_selection
  • sklearn.preprocessing
  • sklearn.metrics
  • matplotlib.pyplot
  • seaborn
  • string
  • nltk
  • warnings