facebookresearch/multimodal
TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.
Python, BSD-3-Clause
Issues
- 0
Support Python 3.10 at least
#528 opened by mjspeck - 5
Cannot fine-tune CLIP model in GPU
#525 opened by LioYao - 7
Add support for LLaVA model
#482 opened by youssefadr - 1
CoCa model implementation
#517 opened by seungkyuK - 1
mini-imageNet
#515 opened by HeYiyang2 - 4
Support for CoCa Model
#414 opened by mert-kurttutan - 2
Train diffusion on MNIST
#505 opened by sudhir2016 - 4
- 3
Training log file for flava full
#421 opened by rishabhm12 - 1
OOM while finetuning flava
#420 opened by rishabhm12 - 3
[DOCUMENTATION] Fix FLAVA example's link in the page Introducing TorchMultimodal
#407 opened by alcazar90 - 2
ALBEF: Train from scratch
#386 opened by XinhaoMei - 1
- 1
Use CLIP models with pretrained weights
#384 opened by konradkalita - 2
[FLAVA]Can't Access ImageNet
#394 opened by anyduoshuo - 1
- 2
Incremental addition of the new modality
#390 opened by averkij - 3
Clip model sample training code
#383 opened by ShahabMokari - 6
Linear probing on vision tasks
#378 opened by KMnP - 3
Albef model dataset & caption file
#376 opened by techthiyanes - 2
Fine-tuning and scaling up blog post?
#381 opened by sayakpaul - 1
- 1
Deprecate PretrainedMixin
#153 opened by YosuaMichael - 0
Add iteration strategy to multidatamodule
#76 opened by ankitade - 0
`to_2tuple` is defined in multiple locations
#313 opened by joecummings - 3
Would it be possible to post the flava results just so we know we reproduced it right?
#78 opened by slerman12 - 0
GPU Tests Failed with AttributeError: Can't pickle local object 'ArgumentParser.__init__.<locals>.identity'
#17 opened by langong347 - 1
- 3
Fine-tuning flava model
#159 opened by ans92 - 0
- 1
README links are broken
#165 opened by mrahtz - 1
- 9
- 2
How to use ImageNet in FLAVA?
#120 opened by Phoebe-ovo - 1
- 1
Latency in picking up upstream changes from TorchText TransformerEncoder temporarily broke CLIPTextEncoder
#35 opened by langong347 - 2
Potential bug in CLIP transform implementation
#32 opened by parmeet - 2
- 1
TorchMultimodal requires Python >=3.8
#7 opened by laurencer