/manglish_lyrics_generator

TinkerHub Build From Home || ML track

Primary Language: Python

BFH Banner

Sithara Song Generator

A Manglish lyrics generator that gives you the lyrics of a Malayalam song that doesn't exist, written in our favourite language, Manglish! When a user requests lyrics, an API call is sent, a pretrained model generates the lyrics, and they are returned in the response and displayed using JavaScript.

You can try the manglish_lyrics_generator by visiting the website here.

Team members

  1. Nanda Kishor M Pai
  2. Aswin Jayaji
  3. Hari Krishnan U

Team Id

BFH/recEHiCGthePHSlQQ/2021

Link to product walkthrough

Watch the video by clicking image below

Product Walkthrough

How it Works?

Framework and Model Architecture

A deep learning model consisting of an Embedding layer, LSTM layers and a fully connected layer was built and trained using PyTorch.
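
As a rough illustration of the architecture described above, here is a minimal PyTorch sketch. The class name `LyricsLSTM`, the layer sizes, the vocabulary size and the word-level tokenization are all assumptions made for this example; the actual values are in the training notebook and may differ.

```python
import torch
import torch.nn as nn


class LyricsLSTM(nn.Module):
    """Embedding -> stacked LSTM -> fully connected layer over the vocabulary.

    All sizes below are illustrative assumptions, not the project's settings.
    """

    def __init__(self, vocab_size, embedding_dim=128, hidden_dim=256, num_layers=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, state=None):
        # tokens: (batch, seq_len) tensor of integer token indices
        embedded = self.embedding(tokens)           # (batch, seq_len, embedding_dim)
        output, state = self.lstm(embedded, state)  # (batch, seq_len, hidden_dim)
        logits = self.fc(output)                    # (batch, seq_len, vocab_size)
        return logits, state


# Toy usage with random token ids (hypothetical vocabulary of 1000 tokens).
vocab_size = 1000
model = LyricsLSTM(vocab_size)
sequences = torch.randint(0, vocab_size, (4, 12))

# Next-token prediction: inputs are all tokens but the last,
# targets are the same sequences shifted left by one position.
inputs, targets = sequences[:, :-1], sequences[:, 1:]
logits, _ = model(inputs)
loss = nn.CrossEntropyLoss()(logits.reshape(-1, vocab_size), targets.reshape(-1))
```

The final layer produces one score per vocabulary entry at every position, which is the shape `CrossEntropyLoss` expects once the batch and sequence dimensions are flattened.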

  1. The project website can be found here. Manglish lyrics are generated when the user enters one or more Manglish keywords and clicks the Generate Lyrics button, and the lyrics are cleared when the user clicks the Clear button. If a keyword entered by the user is not found in our dataset, an alert is shown and the user is given sample keywords to choose from. The website was built using HTML, CSS and JS, with the axios module handling API requests and responses. Pressing the Generate Lyrics button sends an API GET request, which returns the lyrics. The API is built on the micro web framework Flask. The ML model, along with the prediction script, is incorporated into Flask and was deployed to Heroku after local testing. The API can be found here. The source code for the API and the website can be found in the api and frontend folders respectively.

    The collected dataset was split into training and validation sets in the ratio 85:15. Only minimal data preprocessing was done, as more would have resulted in the loss of the song lyrics' structure. We obtained a validation CrossEntropyLoss of 3.325. Since the training was RAM intensive, it was carried out in Google Colab, and the Jupyter notebook used for training and validation can be found here. The validation loss was computed and plotted against the number of epochs using matplotlib. Hyperparameters were tuned by varying the batch size, number of epochs and learning rate. The source code for the model can be found in the model folder.

    To collect the dataset, we scraped the lyrics (in Manglish) of Sithara songs from the internet. We used this website to scrape them and organized the lyrics into separate text files for training and validation. We used Beautiful Soup for web scraping. The source code for the scraper can be found in the scrapper folder.
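
The 85:15 train/validation split mentioned above could be done along the following lines. This is only a sketch: the helper `split_songs`, the choice to split per song rather than per line, and the fixed shuffle seed are assumptions for illustration, not the project's actual procedure.

```python
import random


def split_songs(songs, train_fraction=0.85, seed=0):
    """Shuffle a list of songs and split it into train and validation lists.

    Splitting per song and shuffling with a fixed seed are assumptions.
    """
    shuffled = list(songs)                 # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder song names stand in for the scraped lyrics files.
songs = [f"song_{i}" for i in range(100)]
train_songs, val_songs = split_songs(songs)
print(len(train_songs), len(val_songs))
```

Splitting whole songs, rather than individual lines, keeps every song entirely on one side of the split, so the validation loss is measured on lyrics the model has not seen during training.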

Live Demo

Watch the video by clicking image below

Live Demo

Libraries used

  • pandas - 1.2.4

  • numpy - 1.20.3

  • torch - 1.8.0+cpu

  • matplotlib - 3.4.2

  • flask - 2.0.0

  • gunicorn - 20.1.0

  • beautifulsoup4 - 4.9.3

How to configure

In order to train the model, open the Jupyter notebook found here in Google Colab and make a copy of it for your own use.

How to Run

After configuring the project as described above, run the cells of the notebook in order from top to bottom and download the model after training.

To see the Live demo, try this website.