Amazon-Machine-Learning-Challenge

Dataset link: https://s3-ap-southeast-1.amazonaws.com/he-public-data/dataset52a7b21.zip

Task

Classify 3 million products into around 10,000 categories.

Team Members:

  • Sandeep Rajakrishnan
  • Sudhay Senthilkumar

The dataset consists of the following columns:

  • Product Name
  • Description
  • Bullets
  • Brand
  • Product Node ID (Target)
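
As a rough illustration of this schema, one row could be modelled as below. The attribute names, the integer type of the node ID, and the sample values are all assumptions made for the sketch; the actual column headers in the dataset file may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Product:
    """One dataset row. Attribute names are hypothetical stand-ins for the columns."""
    product_name: str
    description: str
    bullets: str
    brand: str
    product_node_id: int  # target category; integer type is an assumption

    def text_fields(self) -> List[str]:
        """Return the four input columns, skipping empty ones."""
        fields = [self.product_name, self.description, self.bullets, self.brand]
        return [f for f in fields if f]


# Made-up example row, not taken from the dataset.
example = Product(
    product_name="Stainless Steel Water Bottle",
    description="Keeps drinks cold for 24 hours.",
    bullets="Leak-proof lid; BPA free",
    brand="Acme",
    product_node_id=1234,
)
```

Only the first four fields are model inputs; `product_node_id` is the label to predict.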

Steps Followed

  • Preprocessed the dataset
  • Removed stop words
  • Extracted keywords
  • Combined columns and prepared the final dataset
  • Applied Count Vectorizer
  • Calculated TF-IDF
  • Fed the data to ML algorithms
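
The feature-building steps above can be sketched in plain Python as follows. This is a minimal illustration and not the notebook's actual code: the stop-word list, the row keys, and the sample rows are invented for the example, and the idf formula used here is the plain `ln(N / df)` variant. Keyword extraction and the final model are left out because this README does not say which methods were used.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list; a real run would use a fuller one.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def preprocess(text):
    """Lowercase, split into alphanumeric tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def combine(row):
    """Join the text columns of one row (a dict) into a single token list."""
    columns = ("name", "description", "bullets", "brand")  # hypothetical keys
    return preprocess(" ".join(row.get(c, "") for c in columns))


def count_vectors(docs):
    """Build a sorted vocabulary and one term-count vector per document."""
    vocab = sorted({t for doc in docs for t in doc})
    index = {t: i for i, t in enumerate(vocab)}
    vectors = []
    for doc in docs:
        vec = [0] * len(vocab)
        for term, n in Counter(doc).items():
            vec[index[term]] = n
        vectors.append(vec)
    return vocab, vectors


def tfidf(vectors):
    """Weight each count by idf = ln(N / df), df = documents containing the term."""
    n_docs = len(vectors)
    n_terms = len(vectors[0])
    df = [sum(1 for v in vectors if v[j] > 0) for j in range(n_terms)]
    return [[v[j] * math.log(n_docs / df[j]) for j in range(n_terms)] for v in vectors]


# Made-up rows, not taken from the dataset.
rows = [
    {"name": "Steel Water Bottle", "description": "A bottle for the gym", "brand": "Acme"},
    {"name": "Cotton T-Shirt", "description": "Shirt with a round neck", "brand": "Acme"},
]
docs = [combine(r) for r in rows]
vocab, counts = count_vectors(docs)
weights = tfidf(counts)
```

In this toy example the brand token `acme` appears in both rows, so its weight drops to zero, while `bottle` keeps a positive weight in the first row. Library implementations such as scikit-learn's `CountVectorizer` and `TfidfTransformer` apply a smoothed idf by default, so their numbers would differ from this sketch.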

Date of Update:

August 1, 2020