Comment verification on the Digikala website dataset

Primary language: Jupyter Notebook. License: MIT.

digikala-comment-verification-project

Comment verification using the Digikala dataset



Description

In this project, I solve a common supervised learning problem (comment verification) using Python.
I follow the same pipeline in each of my projects, and this NLP project is no exception. A diagram of the pipeline is available in the images folder.

  • First, I try several data cleaning techniques using regular expressions and built-in Python methods.
  • Second, I build an initial document-term matrix to feed to my machine learning models.
  • During the project, I ran into the class-imbalance problem, which is common in machine learning, so I apply several methods to overcome it.
  • I implemented Multilingual BERT because I could not find a Persian BERT model that would be useful.
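
The first three steps above can be sketched in plain Python. This is only an illustrative sketch, not the code from the notebooks: the function names `clean_comment`, `document_term_matrix`, and `random_oversample` are hypothetical, the cleaning rules are assumptions, and random oversampling stands in for whichever class-imbalance methods the project actually uses.

```python
import random
import re
from collections import Counter


def clean_comment(text):
    """Normalize a raw comment (assumed rules: drop URLs, digits, punctuation)."""
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    # Remove punctuation, digits, and underscores. The zero-width non-joiner
    # (\u200c) is kept because it appears inside Persian words.
    text = re.sub(r"[^\w\s\u200c]|\d|_", " ", text)
    return re.sub(r"\s+", " ", text).strip().lower()


def document_term_matrix(docs):
    """Return (vocabulary, rows); rows[i][j] counts vocabulary[j] in docs[i]."""
    vocabulary = sorted({token for doc in docs for token in doc.split()})
    rows = []
    for doc in docs:
        counts = Counter(doc.split())
        rows.append([counts[term] for term in vocabulary])
    return vocabulary, rows


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until all classes are equal in size."""
    rng = random.Random(seed)
    by_label = {}
    for row, label in zip(rows, labels):
        by_label.setdefault(label, []).append(row)
    target = max(len(group) for group in by_label.values())
    out_rows, out_labels = [], []
    for label, group in by_label.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


if __name__ == "__main__":
    # Hypothetical comments and labels (1 = verified, 0 = rejected).
    raw = ["Great phone!!! 10/10", "Good price, good phone", "Nice", "spam http://x.y/z"]
    labels = [1, 1, 1, 0]
    cleaned = [clean_comment(comment) for comment in raw]
    vocabulary, rows = document_term_matrix(cleaned)
    balanced_rows, balanced_labels = random_oversample(rows, labels)
    print(vocabulary)
    print(Counter(balanced_labels))
```

In practice, oversampling should be applied to the training split only, so that duplicated rows do not leak into the evaluation data.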

Techniques

  • Natural Language Processing (NLP)
    • Data Cleaning
    • Exploratory Data Analysis
    • Machine Learning
    • Deep Learning
    • Text Classification


License

MIT License

Copyright (c) 2020 Shakib Yazdani


Author Info
