Goodreads-app-Heroku

Predicts multiple tags for quotes, using quotes collected from Goodreads.

Primary language: Jupyter Notebook

Predicts English quote tags from quotes written in Franco-Arabic or English.

What I did in the project:

  • Scraped all quotes from Goodreads: 82,460 quotes with 27 labels, each label having 2,945 quotes.
  • Built a full preprocessing pipeline for cleaning the data.
  • Performed EDA (Exploratory Data Analysis) on the words that appear with each tag, with word clouds for visualization, and engineered features for the length of each quote and the number of words in it.
  • Showed the most frequent n-grams (one-word and two-word sequences) that appear in each tag.
  • Counted how often each tag appears in the data and restricted the label set to the top 20 tags.
  • Trained ML models for multi-class classification, and also a DL model based on RoBERTa.
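
The feature engineering, n-gram counting, and top-tag filtering steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the notebook's actual code: the sample rows and the helper names (`tokenize`, `length_features`, `ngrams`, `top_ngrams_per_tag`, `keep_top_tags`) are hypothetical, and the real project may well use pandas or scikit-learn for the same steps.

```python
import re
from collections import Counter

# Hypothetical sample rows (quote text, list of tags); not taken from the real dataset.
SAMPLE = [
    ("Love is all you need", ["love", "life"]),
    ("Life is what you make it", ["life"]),
    ("All you need is hope", ["hope", "love"]),
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def length_features(text):
    """Return the character length and the word count of one quote."""
    return {"n_chars": len(text), "n_words": len(tokenize(text))}


def ngrams(tokens, n):
    """Return all contiguous n-word sequences as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams_per_tag(rows, n, k):
    """For each tag, count n-grams over its quotes and keep the k most common."""
    counts = {}
    for text, tags in rows:
        grams = ngrams(tokenize(text), n)
        for tag in tags:
            counts.setdefault(tag, Counter()).update(grams)
    return {tag: counter.most_common(k) for tag, counter in counts.items()}


def keep_top_tags(rows, k):
    """Keep only the k most frequent tags; drop quotes left with no tags."""
    tag_counts = Counter(tag for _, tags in rows for tag in tags)
    keep = {tag for tag, _ in tag_counts.most_common(k)}
    filtered = [(text, [t for t in tags if t in keep]) for text, tags in rows]
    return [(text, tags) for text, tags in filtered if tags]


if __name__ == "__main__":
    print(length_features(SAMPLE[0][0]))
    print(top_ngrams_per_tag(SAMPLE, n=2, k=3))
    print(keep_top_tags(SAMPLE, k=2))
```

On the real data, calling `keep_top_tags(rows, k=20)` would correspond to the "top 20 tags" step, and `top_ngrams_per_tag` with `n=1` and `n=2` to the one-word and two-word n-gram views.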

Publishing data.


Check out the YouTube videos showing the whole project here

The deployed web app is live here