Fluency

A desktop application to assist in learning languages. Uses a deep learning model to generate translations.

  • A desktop notification app that uses a Seq2Seq model to let users create notifications in various languages.


English-to-French Translation

Data Preparation

In this section, we prepare the dataset for training by performing the following tasks (a sketch follows the list):

  • Clean the text data by removing punctuation and numbers and converting all characters to lowercase.
  • Replace accented Unicode characters with their ASCII equivalents.
  • Determine the maximum sequence length of the English and French phrases to establish the input and output sequence lengths for the model.
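
A minimal sketch of these cleaning steps, assuming the raw sentence pairs are held in two lists of strings, `english_sentences` and `french_sentences` (illustrative names, not the notebook's exact code):

```python
import re
import unicodedata

def unicode_to_ascii(text: str) -> str:
    # Replace accented characters with their closest ASCII equivalents.
    return "".join(
        c for c in unicodedata.normalize("NFD", text)
        if unicodedata.category(c) != "Mn"
    )

def clean_text(text: str) -> str:
    # Lowercase, strip accents, and drop punctuation and digits.
    text = unicode_to_ascii(text.lower().strip())
    text = re.sub(r"[^a-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()

# Maximum sequence lengths (in tokens) fix the model's input/output lengths.
max_english_len = max(len(clean_text(s).split()) for s in english_sentences)
max_french_len = max(len(clean_text(s).split()) for s in french_sentences)
```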

Handling Language Data Formatting

|   | english_text        | french_text                                    |
|---|----------------------|------------------------------------------------|
| 0 | youre very clever    | [start] vous etes fort ingenieuse [end]        |
| 1 | are there kids       | [start] y atil des enfants [end]               |
| 2 | come in              | [start] entrez [end]                           |
| 3 | wheres boston        | [start] ou est boston [end]                    |
| 4 | you see what i mean  | [start] vous voyez ce que je veux dire [end]   |
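
The [start] and [end] markers delimit each target sequence for the decoder. A sketch of how the frame above might be assembled, reusing the illustrative names and the clean_text helper from the previous sketch:

```python
import pandas as pd

# Wrap each cleaned French sentence with the decoder's start/end markers.
df = pd.DataFrame({
    "english_text": [clean_text(s) for s in english_sentences],
    "french_text": ["[start] " + clean_text(s) + " [end]" for s in french_sentences],
})
print(df.head())
```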

Language Tokenization

⚒️ We will tokenize the English and French phrases using separate Tokenizer instances and generate padded sequences for model training. The steps are as follows (see the sketch after this list):

  1. Fit a Tokenizer to the English phrases and another Tokenizer to their French equivalents.
  2. Compute the vocabulary sizes based on the Tokenizer instances.
  3. Create padded sequences for all phrases.
  4. Prepare features and labels for training:
  • The features consist of the padded English sequences and the padded French sequences excluding the [end] tokens.
  • The labels consist of the padded French sequences excluding the [start] tokens.
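
A sketch of these steps with the Keras Tokenizer, assuming the df frame from the previous sketch. Passing filters="" keeps the [start]/[end] markers intact (the default filter set would strip the brackets); the variable names and the shift-by-one slicing are illustrative:

```python
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# 1. One Tokenizer per language; filters="" preserves the [start]/[end] markers.
eng_tokenizer = Tokenizer(filters="")
eng_tokenizer.fit_on_texts(df["english_text"])
fra_tokenizer = Tokenizer(filters="")
fra_tokenizer.fit_on_texts(df["french_text"])

# 2. Vocabulary sizes (+1 for the padding index 0).
eng_vocab_size = len(eng_tokenizer.word_index) + 1
fra_vocab_size = len(fra_tokenizer.word_index) + 1

# 3. Padded integer sequences (post-padding with zeros).
encoder_input = pad_sequences(
    eng_tokenizer.texts_to_sequences(df["english_text"]),
    maxlen=max_english_len, padding="post")
fra_padded = pad_sequences(
    fra_tokenizer.texts_to_sequences(df["french_text"]),
    maxlen=max_french_len + 2, padding="post")  # +2 for the [start]/[end] markers

# 4. Teacher forcing: the decoder input drops the final token ([end] for the
#    longest sequences), and the labels drop [start], i.e. shift left by one.
decoder_input = fra_padded[:, :-1]
decoder_target = fra_padded[:, 1:]
```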

Model Training and Evaluation

We train 🚂 the model and evaluate its performance on a held-out validation set. The metrics from the most recent run are shown below.
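
A hedged sketch of a Keras Seq2Seq model consistent with the description above; the embedding size, LSTM width, optimizer, batch size, epoch count, and split are assumptions rather than the notebook's exact configuration:

```python
from sklearn.model_selection import train_test_split
from tensorflow.keras import layers, Model

embed_dim, latent_dim = 256, 512  # assumed sizes

# Encoder: embed the English sequence and keep the final LSTM states.
enc_inputs = layers.Input(shape=(None,), name="encoder_input")
enc_emb = layers.Embedding(eng_vocab_size, embed_dim, mask_zero=True)(enc_inputs)
_, state_h, state_c = layers.LSTM(latent_dim, return_state=True)(enc_emb)

# Decoder: embed the shifted French sequence, initialised with the encoder states.
dec_inputs = layers.Input(shape=(None,), name="decoder_input")
dec_emb = layers.Embedding(fra_vocab_size, embed_dim, mask_zero=True)(dec_inputs)
dec_seq = layers.LSTM(latent_dim, return_sequences=True)(
    dec_emb, initial_state=[state_h, state_c])
outputs = layers.Dense(fra_vocab_size, activation="softmax")(dec_seq)

model = Model([enc_inputs, dec_inputs], outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Hold out a validation split, train with teacher forcing, then evaluate.
enc_tr, enc_va, dec_in_tr, dec_in_va, dec_out_tr, dec_out_va = train_test_split(
    encoder_input, decoder_input, decoder_target, test_size=0.2, random_state=42)

model.fit([enc_tr, dec_in_tr], dec_out_tr,
          batch_size=64, epochs=30,
          validation_data=([enc_va, dec_in_va], dec_out_va))
model.evaluate([enc_va, dec_in_va], dec_out_va)
```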

Evaluate the model's performance

1563/1563 [==============================] - 14s 9ms/step - loss: 0.2290 - accuracy: 0.8512
Test Loss: 0.22895030677318573
Validation Accuracy: 0.8511516451835632

Assess the model's learning accuracy

(Accuracy plot)

Translation Testing

Translate new English phrases with the trained model and decode its predictions into French.
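
A greedy-decoding sketch that reuses the model and tokenizers from the earlier sketches (the translate helper and its max_len cap are illustrative); the sample outputs below it are taken from the notebook:

```python
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

fra_index_word = {i: w for w, i in fra_tokenizer.word_index.items()}
start_id = fra_tokenizer.word_index["[start]"]
end_id = fra_tokenizer.word_index["[end]"]

def translate(sentence: str, max_len: int = 20) -> str:
    # Encode the cleaned English sentence.
    seq = eng_tokenizer.texts_to_sequences([clean_text(sentence)])
    enc = pad_sequences(seq, maxlen=max_english_len, padding="post")

    # Greedy decoding: start from [start] and repeatedly append the most
    # likely next word until [end] (or the length cap) is reached.
    dec = np.zeros((1, max_len), dtype="int32")
    dec[0, 0] = start_id
    words = []
    for t in range(1, max_len):
        probs = model.predict([enc, dec], verbose=0)
        next_id = int(np.argmax(probs[0, t - 1]))
        if next_id in (0, end_id):
            break
        words.append(fra_index_word[next_id])
        dec[0, t] = next_id
    return " ".join(words)

print("English: it could be fun => French:", translate("it could be fun"))
```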

English: let us out of here => French: laissenous sortir dici
English: it could be fun => French: ca pourrait etre marrant
English: this is my new video => French: cest ma nouvelle video
English: do you like fish => French: aimestu le poisson
English: you were in a coma => French: vous etiez dans le coma
English: dont be upset => French: ne soyez pas fache
English: didnt you know that => French: le saviezvous
English: im not exactly sure => French: je nen suis pas a la tete
English: i put it on your desk => French: je lai mise sur votre bureau
English: somehow tom knew => French: pourtant tom savait

Translation Comparison

We compare our model against a baseline: LibreTranslate, which uses an NMT (neural machine translation) architecture.
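
The baseline translations below were produced with LibreTranslate. A minimal sketch of querying its /translate endpoint, assuming a locally running instance at http://localhost:5000 (the hosted API additionally requires an api_key):

```python
import requests

def baseline_translate(text: str,
                       url: str = "http://localhost:5000/translate") -> str:
    # POST the phrase to LibreTranslate and return the French translation.
    resp = requests.post(url, json={
        "q": text,
        "source": "en",
        "target": "fr",
        "format": "text",
    })
    resp.raise_for_status()
    return resp.json()["translatedText"]

print("English: it could be fun => French:", baseline_translate("it could be fun"))
```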

English: let us out of here => French: laissez-nous sortir d'ici
English: it could be fun => French: ça pourrait être amusant
English: this is my new video => French: c'est ma nouvelle vidéo
English: do you like fish => French: vous aimez le poisson
English: you were in a coma => French: tu étais dans le coma
English: dont be upset => French: ne soyez pas contrarié
English: didnt you know that => French: tu ne savais pas que
English: im not exactly sure => French: im pas exactement sûr
English: i put it on your desk => French: je l'ai mis sur ton bureau
English: somehow tom knew => French: tom le savait