
Training scripts for Argos Translate


Argos Train

Argos Translate | Video tutorial

Trains an OpenNMT PyTorch model and SentencePiece tokenizer. Designed for use with Argos Translate and LibreTranslate.

Pre-trained Argos Translate packages are available for download. If you have trained models you're willing to share, please reach out so they can be published on the package index.

Training example

$ su argosopentech
$ source ~/argos-train-init

...


$ argos-train
From code (ISO 639): en
To code (ISO 639): es
From name: English
To name: Spanish
Version: 1.0

...

Package saved to /home/argosopentech/argos-train/run/en_es.argosmodel

Data

Uses data from the OPUS project in the Moses format, stored in the data index.
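
For readers unfamiliar with the Moses format: a parallel corpus is a pair of plain-text files, one sentence per line, where line N of the source file is the translation of line N of the target file. The sketch below shows one way to read such a pair of files. The function name and file names are hypothetical and are not part of argos-train.

```python
from pathlib import Path
from typing import Iterator, Tuple, Union

PathLike = Union[str, Path]


def read_moses_pairs(source_path: PathLike, target_path: PathLike) -> Iterator[Tuple[str, str]]:
    """Yield (source, target) sentence pairs from two line-aligned text files.

    Hypothetical helper for illustration only; not part of argos-train.
    """
    with open(source_path, encoding="utf-8") as src, open(target_path, encoding="utf-8") as tgt:
        # zip() stops at the shorter file, so files of unequal length
        # are silently truncated to the common prefix.
        for src_line, tgt_line in zip(src, tgt):
            src_line, tgt_line = src_line.strip(), tgt_line.strip()
            # Skip pairs where either side is empty.
            if src_line and tgt_line:
                yield src_line, tgt_line
```

Calling `list(read_moses_pairs("corpus.en", "corpus.es"))` on two such files (names assumed here) would give a list of aligned sentence tuples.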

Environment

A CUDA-capable GPU is required. Tested on vast.ai.

Docker

A Docker image is available at argosopentech/argostrain.

docker run -it argosopentech/argostrain /bin/bash

Run training

argos-train

Troubleshooting

  • If you're running out of GPU memory, reduce batch_size and valid_batch_size in config.yml.
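
Editing config.yml by hand is usually enough. If you want to script the change, the sketch below rewrites only the batch_size and valid_batch_size lines of a YAML text and leaves everything else untouched. It assumes both keys appear as plain `key: integer` lines; the function name and the sample values are hypothetical and not taken from the actual config.yml.

```python
import re

# Matches lines such as "batch_size: 4096" or "valid_batch_size: 64".
# Keys like "batch_size_multiple" are not matched because a colon must
# follow the key name directly (optionally after whitespace).
_BATCH_KEY = re.compile(r"^(\s*)(batch_size|valid_batch_size)(\s*:\s*)(\d+)(.*)$")


def scale_batch_sizes(config_text: str, factor: float = 0.5) -> str:
    """Return config_text with both batch sizes multiplied by factor.

    Hypothetical helper for illustration only; not part of argos-train.
    """
    out = []
    for line in config_text.splitlines():
        match = _BATCH_KEY.match(line)
        if match:
            indent, key, sep, value, rest = match.groups()
            # Never go below 1, and keep the value an integer.
            new_value = max(1, int(int(value) * factor))
            line = f"{indent}{key}{sep}{new_value}{rest}"
        out.append(line)
    # Preserve a trailing newline if the input had one.
    trailing = "\n" if config_text.endswith("\n") else ""
    return "\n".join(out) + trailing
```

Halving is only a starting point; if training still runs out of memory, apply the function again or pick a smaller factor.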

License

Licensed under either the MIT or CC0 License (same as Argos Translate).