facebookresearch/multimodal

TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.

Python · BSD-3-Clause

Issues

Support Python 3.10 at least
#528 opened 2 months ago by mjspeck
0
Cannot fine-tune CLIP model in GPU
#525 opened 7 months ago by LioYao
5
Add support for LLaVA model
#482 opened a year ago by youssefadr
7
CoCa model implementation
#517 opened 9 months ago by seungkyuK
1
mini-imageNet
#515 opened a year ago by HeYiyang2
1
Support for CoCa Model
#414 opened 10 months ago by mert-kurttutan
4
Train diffusion on MNIST
#505 opened a year ago by sudhir2016
2
How to perform multimodal multitask instance segmentation in torchmultimodal?
#444 opened a year ago by Nandtiw
4
Training log file for flava full
#421 opened a year ago by rishabhm12
3
OOM while finetuning flava
#420 opened a year ago by rishabhm12
1
[DOCUMENTATION] Fix FLAVA example's link in the page Introducing TorchMultimodal
#407 opened 2 years ago by alcazar90
3
ALBEF: Train from scratch
#386 opened 2 years ago by XinhaoMei
2
training flava with ddp and activation checkpointing gives runtime error
#404 opened 2 years ago by rxqy
1
Use CLIP models with pretrained weights
#384 opened 2 years ago by konradkalita
1
[FLAVA]Can't Access ImageNet
#394 opened 2 years ago by anyduoshuo
2
Tutorial/reference to finetune FLAVA on custom dataset
#392 opened 2 years ago by rishabhm12
1
Incremental addition of the new modality
#390 opened 2 years ago by averkij
2
Clip model sample training code
#383 opened 2 years ago by ShahabMokari
3
Linear probing on vision tasks
#378 opened 2 years ago by KMnP
6
Albef model dataset & caption file
#376 opened 2 years ago by techthiyanes
3
Fine-tuning and scaling up blog post?
#381 opened 2 years ago by sayakpaul
2
Image transform results between HF and our version does not line up
#374 opened 2 years ago by ankitade
1
Deprecate PretrainedMixin
#153 opened 2 years ago by YosuaMichael
1
Add iteration strategy to multidatamodule
#76 opened 2 years ago by ankitade
0
`to_2tuple` is defined in multiple locations
#313 opened 2 years ago by joecummings
0
Would it be possible to post the flava results just so we know we reproduced it right?
#78 opened 2 years ago by slerman12
3
GPU Tests Failed with AttributeError: Can't pickle local object 'ArgumentParser.__init__.<locals>.identity'
#17 opened 2 years ago by langong347
0
'FlavaModelOutput' object has no attribute 'contrastive_logits_per_image'
#183 opened 2 years ago by ans92
1
Fine-tuning flava model
#159 opened 2 years ago by ans92
3
Unimodal evaluation results using models pretrained on ImageNet and CCNews
#157 opened 2 years ago by cuichenx
0
README links are broken
#165 opened 2 years ago by mrahtz
1
FLAVA pretraining docs run into IMAGENET_TAR env variable issue
#160 opened 2 years ago by rohan-varma
1
Can this model be used for duplicate detection from both image and text?
#114 opened 2 years ago by smith-co
9
How to use ImageNet in FLAVA?
#120 opened 2 years ago by Phoebe-ovo
2
AttributeError: 'MultiDataLoader' object has no attribute '__code__'
#27 opened 2 years ago by gihanpanapitiya
1
Latency in picking up upstream changes from TorchText TransformerEncoder temporarily broke CLIPTextEncoder
#35 opened 2 years ago by langong347
1
Potential bug in CLIP transform implementation
#32 opened 2 years ago by parmeet
2
ModuleNotFoundError: No module named 'datasets'
#11 opened 3 years ago by gihanpanapitiya
2
TorchMultimodal requires Python >=3.8
#7 opened 3 years ago by laurencer
1