Language Classification with Naive Bayes

Design a model that can classify sentences into one of Slovak, Czech, and English.

Introduction

This is a practical part of the Coursera Guided Project created by the Coursera Project Network.

Project Objectives

  • Explore and visualize dataset to identify key issues in training data
  • Preprocess and clean dataset using relevant methods
  • Explore theory of Naive Bayes Modeling, and train first model
  • Develop your approach using learned theory to improve performance
  • Design new approach using subwords to mitigate class imbalance problems