facebookresearch/LASER

Python interface for downloading and choosing models

avidale opened this issue · 1 comment

Right now, choosing and downloading the models for a given language is handled by the command-line script https://github.com/facebookresearch/LASER/blob/main/tasks/embed/embed.sh.

We may want to re-implement this logic in Python (with an option to change the model directory), similarly to how models are downloaded from hubs in the Fairseq or transformers packages.
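For reference, the transformers pattern mentioned above looks roughly like this (the model name is only an example; cache_dir is the knob that lets the user change the model directory):

from transformers import AutoModel, AutoTokenizer

# the first call downloads the files into cache_dir, later calls reuse the cached copies
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased', cache_dir='./models')
model = AutoModel.from_pretrained('bert-base-uncased', cache_dir='./models')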

The interface may look like this:

from laser import TextEncoder
# load the correct tokenizer and model checkpoint from the cache directory or from the S3 bucket
encoder = TextEncoder.from_pretrained('LASER3', language='ban_Latn')
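A minimal sketch of how such a from_pretrained could work under the hood, assuming a flat layout on the S3 bucket and file names like laser3-ban_Latn.pt / laser3-ban_Latn.spm (the base URL and naming scheme below are placeholders, not the actual layout):

from pathlib import Path
from typing import Optional
from urllib.request import urlretrieve


class TextEncoder:
    # placeholder base URL; the real bucket and file layout may differ
    S3_BASE = 'https://dl.fbaipublicfiles.com/laser'

    def __init__(self, checkpoint_path: Path, spm_path: Path):
        self.checkpoint_path = checkpoint_path
        self.spm_path = spm_path

    @classmethod
    def from_pretrained(cls, model_name: str, language: Optional[str] = None,
                        cache_dir: str = '~/.cache/laser') -> 'TextEncoder':
        """Resolve the checkpoint and tokenizer for (model_name, language),
        download them into cache_dir if missing, and return an encoder."""
        cache = Path(cache_dir).expanduser()
        cache.mkdir(parents=True, exist_ok=True)

        # hypothetical naming: LASER3 ships one encoder per language,
        # while a language-agnostic model would need no language suffix
        suffix = f'-{language}' if language else ''
        checkpoint = f'{model_name.lower()}{suffix}.pt'
        spm = f'{model_name.lower()}{suffix}.spm'

        for name in (checkpoint, spm):
            target = cache / name
            if not target.exists():
                urlretrieve(f'{cls.S3_BASE}/{name}', target)

        return cls(cache / checkpoint, cache / spm)

The cache_dir argument covers the "option to change the model directory" mentioned above, and the classmethod keeps the call site identical to the hub-style interface proposed in the snippet.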

Done in #249