LaBSE

Project

This project converts Google's LaBSE model from TensorFlow to PyTorch. It also supports converting the smaller-LaBSE model and the LEALLA family of models from TensorFlow to PyTorch.

The converted models are uploaded to the HuggingFace Model Hub in HF-compatible PyTorch (original and safetensors), TensorFlow, and Flax formats, along with a compatible tokenizer.

Export

To convert and export the models:

poetry install
poetry run convert_labse --output_path /path/to/models

To update the models on the HuggingFace Model Hub:

# Clone the already uploaded models.
cd /path/to/models
git clone https://huggingface.co/setu4993/LaBSE.git

# Export models anew and update.
cd /path/to/repo
poetry install
poetry run convert_labse --output_path /path/to/models/LaBSE --huggingface_path

Export Commands by Model

LaBSE: poetry run convert_labse --output_path /path/to/models/setu4993/LaBSE --huggingface_path
smaller-LaBSE: poetry run convert_labse --output_path /path/to/models/setu4993/smaller-LaBSE --smaller --huggingface_path
LEALLA-base: poetry run convert_lealla --size base --output_path /path/to/models/setu4993/LEALLA-base --huggingface_path
LEALLA-small: poetry run convert_lealla --size small --output_path /path/to/models/setu4993/LEALLA-small --huggingface_path
LEALLA-large: poetry run convert_lealla --size large --output_path /path/to/models/setu4993/LEALLA-large --huggingface_path
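
To export every model in one pass, the per-model commands above can be wrapped in a small script. This is only a sketch: the `MODELS_ROOT` and `DRY_RUN` variables and the `run` and `export_all` helpers are hypothetical names introduced for illustration and are not part of this repository. The commands themselves are copied from the list above. By default the script only prints the commands so they can be reviewed first.

```shell
#!/usr/bin/env bash
# Hypothetical helper script, not shipped with this repository.

# Assumed layout: one directory per model under MODELS_ROOT.
MODELS_ROOT="${MODELS_ROOT:-/path/to/models/setu4993}"

# Print the command when DRY_RUN is 1 (the default); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Run the five export commands in order, stopping at the first failure.
export_all() {
  run poetry run convert_labse --output_path "$MODELS_ROOT/LaBSE" --huggingface_path &&
  run poetry run convert_labse --output_path "$MODELS_ROOT/smaller-LaBSE" --smaller --huggingface_path &&
  for size in base small large; do
    run poetry run convert_lealla --size "$size" --output_path "$MODELS_ROOT/LEALLA-$size" --huggingface_path || return 1
  done
}

export_all
```

Run it from the repository root after `poetry install`. Setting `DRY_RUN=0` executes the exports instead of printing them, and `MODELS_ROOT` can point at wherever the model repositories were cloned.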

Model Cards

See the model-cards directory for a copy of the model cards.

License