
This project compares deep learning techniques for multilingual text classification, focusing on language detection with FastText and Sentence Transformer embeddings. The repository includes a dataset and a list of requirements, and it highlights why training on a large multilingual corpus matters for performance.

Primary language: Jupyter Notebook. License: GNU General Public License v3.0 (GPL-3.0).

Language Prediction

Requirements

The project requires the following packages:

  • Pandas
  • NumPy
  • Scikit-learn
  • LangDetect
  • LangId
  • FastText
  • Sentence Transformers
  • TensorFlow
  • Matplotlib
  • Seaborn

To install these packages, run:

> pip install pandas numpy scikit-learn langdetect langid fasttext sentence-transformers tensorflow matplotlib seaborn
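
After installing, it can help to confirm that every package is importable before opening the notebooks. The snippet below is a minimal sketch using only the standard library; the `REQUIRED` mapping and the `missing_packages` helper are illustrative names, not part of this repository. Note that some pip names differ from the module name you import (for example, scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# pip package name -> module name used in `import ...`
# (hypothetical helper table, not part of this repository)
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "langdetect": "langdetect",
    "langid": "langid",
    "fasttext": "fasttext",
    "sentence-transformers": "sentence_transformers",
    "tensorflow": "tensorflow",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose module cannot be found in this environment."""
    return sorted(pip for pip, mod in required.items() if find_spec(mod) is None)


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

If anything is reported missing, rerun the pip command above for those packages.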

To run the project, follow these steps:

  • Clone the repository to your local machine:
> git clone https://github.com/jaywyawhare/Language-Prediction.git
  • Navigate to the project directory:
> cd Language-Prediction
  • Install the required packages:
> pip install -r requirements.txt
  • Launch the app with Streamlit (this needs the streamlit package, which the package list above does not mention):
> streamlit run main.py

The script loads the dataset, preprocesses the data, trains and evaluates the models, and displays the results.
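
The load, preprocess, train, and evaluate flow described above can be sketched roughly as follows. This is not the code in `main.py`: it assumes a tiny in-memory toy dataset with hypothetical `text` and `language` columns, and it swaps in a character n-gram TF-IDF plus logistic regression baseline from scikit-learn in place of the FastText and Sentence Transformer embeddings, purely to keep the example small and fast. All function names here are illustrative.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline


def load_dataset():
    """Toy stand-in for the project's dataset (column names are assumptions)."""
    rows = [
        ("Good morning, how are you?", "en"),
        ("The weather is nice today.", "en"),
        ("I would like a cup of tea.", "en"),
        ("Where is the train station?", "en"),
        ("Bonjour, comment allez-vous ?", "fr"),
        ("Il fait beau aujourd'hui.", "fr"),
        ("Je voudrais une tasse de thé.", "fr"),
        ("Où est la gare ?", "fr"),
        ("Guten Morgen, wie geht es Ihnen?", "de"),
        ("Das Wetter ist heute schön.", "de"),
        ("Ich möchte eine Tasse Tee.", "de"),
        ("Wo ist der Bahnhof?", "de"),
    ]
    return pd.DataFrame(rows, columns=["text", "language"])


def preprocess(df):
    """Lowercase, trim whitespace, and drop duplicate texts."""
    df = df.copy()
    df["text"] = df["text"].str.lower().str.strip()
    return df.drop_duplicates(subset="text")


def train_and_evaluate(df, seed=0):
    """Fit a character n-gram baseline and return (model, held-out accuracy)."""
    X_train, X_test, y_train, y_test = train_test_split(
        df["text"], df["language"],
        test_size=0.25, stratify=df["language"], random_state=seed,
    )
    model = make_pipeline(
        TfidfVectorizer(analyzer="char_wb", ngram_range=(1, 3)),
        LogisticRegression(max_iter=1000),
    )
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    return model, accuracy


if __name__ == "__main__":
    model, accuracy = train_and_evaluate(preprocess(load_dataset()))
    print(f"Held-out accuracy: {accuracy:.2f}")
    print("Prediction:", model.predict(["merci beaucoup"])[0])
```

With only a dozen sentences, the accuracy figure is not meaningful; the point is the shape of the pipeline. Replacing the TF-IDF step with embedding features would follow the same fit and predict structure.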