MultiVae

Unifying Multimodal Variational Autoencoders (VAEs) in PyTorch

This library implements some of the most common multimodal Variational Autoencoder (VAE) methods in a unifying framework for effective benchmarking and development. You can find the list of implemented models below. It includes ready-to-use datasets such as MnistSvhn 🔢, CelebA 😎 and PolyMNIST, as well as the most commonly used metrics: coherences, likelihoods and FID. It integrates model monitoring with Wandb and provides a quick way to save and load models from the HuggingFace Hub 🤗.
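For instance, once a model has been trained, sharing it through the Hub takes a couple of lines. A minimal sketch, assuming a pythae-style Hub API: the push_to_hf_hub and AutoModel.load_from_hf_hub names, as well as the repository path, are assumptions, so check the documentation for the exact interface.

from multivae.models import AutoModel

# Hedged sketch: the Hub methods below follow pythae-style conventions and
# are assumptions; 'your_hf_username/your_model' is a placeholder repository.
model.push_to_hf_hub('your_hf_username/your_model')                # upload a trained model
model = AutoModel.load_from_hf_hub('your_hf_username/your_model')  # reload it anywhere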

Implemented models

Model | Paper | Official Implementation
JMVAE | Joint Multimodal Learning with Deep Generative Models | link
MVAE | Multimodal Generative Models for Scalable Weakly-Supervised Learning | link
MMVAE | Variational Mixture-of-Experts Autoencoders for Multi-Modal Deep Generative Models | link
MoPoE | Generalized Multimodal ELBO | link
MVTCAE | Multi-View Representation Learning via Total Correlation Objective | link
JNF, JNF-DCCA | Improving Multimodal Joint Variational Autoencoders through Normalizing Flows and Correlation Analysis | link
MMVAE+ | MMVAE+: Enhancing the Generative Quality of Multimodal VAEs without Compromises | link

Quickstart

Install the library by running:

git clone https://github.com/AgatheSenellart/multimodal_vaes.git
cd multimodal_vaes
pip install .

Load a dataset easily:

from multivae.data.datasets import MnistSvhn

# Download (if needed) and load the training split of MNIST-SVHN
train_set = MnistSvhn(data_path='your_data_path', split='train', download=True)
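To check that everything was loaded correctly, you can inspect a sample directly. A minimal sketch, assuming each item exposes a data dictionary with one tensor per modality (this interface is an assumption about the library's multimodal datasets):

# Hedged sketch: the 'data' dictionary keyed by modality name is an
# assumption about MultiVae's dataset interface.
sample = train_set[0]
print(sample.data['mnist'].shape)  # expected: torch.Size([1, 28, 28])
print(sample.data['svhn'].shape)   # expected: torch.Size([3, 32, 32])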

Instantiate your favorite model:

from multivae.models import MVTCAE, MVTCAEConfig

# Configure the model with the latent dimension and the shape of each modality
model_config = MVTCAEConfig(
    latent_dim=20,
    input_dims={'mnist': (1, 28, 28), 'svhn': (3, 32, 32)},
)
model = MVTCAE(model_config)

Define a trainer and train the model!

from multivae.trainers import BaseTrainer, BaseTrainerConfig

training_config = BaseTrainerConfig(
    learning_rate=1e-3,
    num_epochs=30,
)

trainer = BaseTrainer(
    model=model,
    train_dataset=train_set,
    training_config=training_config,
)
trainer.train()
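After training, the metrics mentioned in the introduction can be computed with dedicated evaluator classes. A hedged sketch, assuming a CoherenceEvaluator class in multivae.metrics with the signature below (the class name, its arguments, and the user-provided classifiers are all assumptions; see the documentation for the exact API):

from multivae.metrics import CoherenceEvaluator

# Hedged sketch: the evaluator name and signature are assumptions.
# Coherence requires a test split and pretrained classifiers, one per
# modality, to check that generated samples carry the right label.
test_set = MnistSvhn(data_path='your_data_path', split='test', download=True)
CoherenceEvaluator(
    model=model,
    test_dataset=test_set,
    classifiers=classifiers,  # hypothetical: {'mnist': clf1, 'svhn': clf2}
    output='./metrics',       # hypothetical: where to write the results
).eval()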

Table of Contents

- Installation
- Usage
- Contribute
- Reproducibility statement
- License

Installation

git clone https://github.com/AgatheSenellart/multimodal_vaes.git
cd multimodal_vaes
pip install .

Usage

Our library lets you use any of the models with custom configurations, encoder and decoder architectures, and datasets. See the tutorial notebook at examples/tutorial_notebooks/getting_started.ipynb for a tour of the main features; a sketch of a custom encoder is given below.
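For illustration, here is what a custom MNIST encoder might look like. A minimal sketch, assuming MultiVae reuses pythae's encoder interface, where a BaseEncoder subclass returns a ModelOutput carrying the embedding and log_covariance of the posterior; the class names and keyword arguments below are assumptions, so refer to the notebook for the exact contract.

import torch.nn as nn
from pythae.models.base.base_utils import ModelOutput
from pythae.models.nn import BaseEncoder

# Hedged sketch: assumes MultiVae reuses pythae's encoder interface.
class MnistEncoder(BaseEncoder):
    def __init__(self, latent_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)
        self.log_var = nn.Linear(256, latent_dim)

    def forward(self, x):
        h = self.net(x)
        # pythae convention: return the mean and log-covariance of q(z|x)
        return ModelOutput(embedding=self.mu(h), log_covariance=self.log_var(h))

# Custom networks would then be passed per modality (hypothetical keywords):
# model = MVTCAE(model_config,
#                encoders={'mnist': MnistEncoder(20), 'svhn': SvhnEncoder(20)},
#                decoders={'mnist': MnistDecoder(20), 'svhn': SvhnDecoder(20)})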

Contribute

If you want to contribute by adding models to the library, clone the repository and install it in editable mode using the -e option:

pip install -e .

Reproducibility statement

All implemented models are validated by reproducing a key result from their original paper.

License

MultiVae is released under the Apache-2.0 license.