parallel-mt-inference

This repository contains the final project for COMS 6998 Practical Deep Learning System Performance, a course I took at Columbia (https://www.cs.columbia.edu/education/ms/fall-2020-topics-courses/#e6998010). In this project, my teammate and I investigated parallelism in NLP: we experimented with how parallelism (e.g., using multi-head attention instead of recurrent connections, and splitting the input for inference) affects model performance, in terms of both accuracy and speed. Read more in the final report: http://bit.ly/pract-dl-final-report.
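The input-splitting idea above can be sketched as follows. This is an illustrative sketch only, not the repository's actual code: `translate` is a hypothetical stand-in for a single-batch model inference call, and the worker count is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor

def translate(batch):
    # Placeholder for a real model inference call on one chunk of inputs.
    return [s.upper() for s in batch]

def chunked(items, n_chunks):
    """Split `items` into up to `n_chunks` roughly equal contiguous chunks."""
    k, rem = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + k + (1 if i < rem else 0)
        if end > start:
            chunks.append(items[start:end])
        start = end
    return chunks

def parallel_inference(sentences, n_workers=4):
    """Run inference on input chunks concurrently, then re-join the outputs in order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(translate, chunked(sentences, n_workers))
    return [out for batch in results for out in batch]
```

Because `ThreadPoolExecutor.map` preserves input order, the concatenated outputs line up with the original sentences even though chunks finish at different times.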

Primary language: Python. License: MIT.