Deep-Learning-Project

Project for the Deep Learning course - University of Bologna, Cesena Campus

Authors:

  • Veronika Folin
  • Anna Vitali
  • Elena Yan

Description of the project


Objective

The aim of this project is to evaluate the performance of several deep learning models on the Grammatical Error Correction (GEC) task, which consists to transform a potentially wrong input sentence to a corrected version.

There are different types of errors within the input text, such as spelling, punctuation, grammatical, or words.

Dataset

The dataset used for experiments is C4 200M Synthetic Dataset, a collection of 185M sentence pairs with grammatical errors generated from C4 (Colossal Clean Crawled Corpus).

Models

We tested the following types of model:

  • Encoder-Decoder with Recurrent Neural Networks
  • Transformer

Optimizer

  • Adam
  • RMSprop

Loss Functions

  • categorical_crossentropy
  • sparse_categorical_crossentropy

Evaluation Metrics

We used the following evaluation metrics to assess model performance:

Implementation and results


You can find the work done in the final notebook, which is the notebook.ipynb file.

A summary of experiments' configuration and results are in the experiment_results.xlsx file.