# Neural Machine Translation (attention and zero-shot)

This project was developed during Prof. Andrew Ng's deep learning boot camp, where we implemented and explored two state-of-the-art machine translation models: attention-based NMT and GNMT. Details of both models can be found in the original papers listed below or in my review in the project reports.

The repository architecture (not the model architecture) borrows heavily from the MemN2N implementation: we separate the model from the data, logging, and preprocessing modules, and use a single interface for ML models of similar tasks.

  • main.py: Configuration interface. It parses the parameter configuration, builds the model, and runs training/testing/sampling experiments (a hypothetical sketch follows this list).
  • data/: Training data and the wrapping data iterator.
  • checkpoints/: Checkpoints of trained weights.
  • logs/: Logs of loss, accuracy, etc.
  • model/: Deep learning models. Here we have attention.py and zero.py, both of which inherit from the Model class (see the interface sketch after this list):
    • class Model:
      • build_variables(): Prepare the variable placeholder
      • build_model(): Prepare the model
      • train(): Train the model
      • test(): Evaluate the test error
      • sample(): Sample/translate certain sentences
      • countParameters(): Count the parameters in the model
      • save(): Save the model to checkpoints
      • load(): Load the model from checkpoints
    • attention.py: The attention-based model. See Luong et al., 2015.
    • zero.py: Google's GNMT model. See Wu et al., 2016.
  • bleu/: Scripts to calculate BLEU scores.
  • subword-nmt/: Token generator for the word-piece model, used in zero-shot translation (see the preprocessing sketch after this list).
  • experiments/: Training experiments (parameter configurations).
  • report/: Summaries/notes of the training experiments.
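
To illustrate the configuration interface, here is a minimal sketch of how main.py might parse a parameter config and dispatch to an experiment. The flag names, defaults, and dispatch logic are hypothetical assumptions for illustration, not the repo's actual options.

```python
# Hypothetical sketch of main.py's configuration interface; flag names and
# the dispatch below are illustrative assumptions, not the repo's actual code.
import argparse


def parse_config():
    parser = argparse.ArgumentParser(description="NMT experiments")
    parser.add_argument("--model", choices=["attention", "zero"], default="attention",
                        help="which model in model/ to build")
    parser.add_argument("--mode", choices=["train", "test", "sample"], default="train",
                        help="which experiment to run")
    parser.add_argument("--data_dir", default="data/")
    parser.add_argument("--checkpoint_dir", default="checkpoints/")
    parser.add_argument("--log_dir", default="logs/")
    return parser.parse_args()


if __name__ == "__main__":
    config = parse_config()
    # Build the chosen model (attention.py or zero.py), then call
    # train()/test()/sample() depending on config.mode.
```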
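
To make the shared interface concrete, here is a minimal sketch of the Model base class. The method names come from the list above; the TensorFlow 1.x session handling, signatures, and method bodies are assumptions for illustration only, not the repository's actual code.

```python
# Minimal sketch of the shared Model interface listed above; signatures and
# TF 1.x session handling are assumptions, only the method names are from
# the README.
import numpy as np
import tensorflow as tf


class Model(object):
    """Base class shared by attention.py and zero.py."""

    def __init__(self, config, session):
        self.config = config      # parsed parameter configuration from main.py
        self.session = session    # tf.Session owned by the caller
        self.build_variables()
        self.build_model()

    def build_variables(self):
        """Prepare the input placeholders (source, target, lengths, ...)."""
        raise NotImplementedError

    def build_model(self):
        """Assemble the computation graph, loss, and training op."""
        raise NotImplementedError

    def train(self, data_iterator):
        """Train the model, logging loss/accuracy to logs/."""
        raise NotImplementedError

    def test(self, data_iterator):
        """Evaluate the test error."""
        raise NotImplementedError

    def sample(self, sentences):
        """Sample/translate the given sentences."""
        raise NotImplementedError

    def countParameters(self):
        """Count the trainable parameters in the graph."""
        return sum(int(np.prod(v.get_shape().as_list()))
                   for v in tf.trainable_variables())

    def save(self, path="checkpoints/model.ckpt"):
        """Save the model weights to checkpoints/."""
        tf.train.Saver().save(self.session, path)

    def load(self, path="checkpoints/model.ckpt"):
        """Load the model weights from checkpoints/."""
        tf.train.Saver().restore(self.session, path)
```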
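
For the word-piece tokenization used in zero-shot translation, the sketch below shows how sub-word segmentation is typically done with subword-nmt. It assumes the pip-installable subword_nmt package (the vendored subword-nmt/ directory ships equivalent learn_bpe.py / apply_bpe.py command-line scripts); the file names and merge count are placeholders.

```python
# Sketch of word-piece style preprocessing with subword-nmt (here via the
# pip-installable `subword_nmt` package). File names and the merge count
# are placeholders, not paths from this repo.
#
# Merge operations ("codes") are first learned from the training corpus, e.g.
#   python subword-nmt/learn_bpe.py -s 10000 < data/train.en > data/codes.bpe
import codecs

from subword_nmt.apply_bpe import BPE

# Apply the learned merges to segment sentences into sub-word units.
with codecs.open("data/codes.bpe", encoding="utf-8") as codes:
    bpe = BPE(codes)

print(bpe.process_line("the quick brown fox jumps over the lazy dog"))
# -> words split into sub-word units, continuation pieces marked with '@@'
```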

## Acknowledgments

  • Our mentors: Ziang, Awni and Anand
  • Our terrific cohort: James, Jeremy, Joseph and Dillon