Issues
❓ [Question] Can't reproduce ImageNet results of RN50 model trained on `pixparse/cc3m-wds`
#930 opened by clownrat6 - 1
RuntimeError: Mask shape should match input shape
#954 opened by MINXIANWEN - 1
Intermediate checkpoints with ResNet backbone
#1009 opened by ndrsn0208 - 3
[AssertionError] when trying to run inference with the model
#1011 opened by bright-arparwut - 12
Training stuck on first epoch
#1016 opened by alexisdrakopoulos - 1
Error when trying to use int8 operations with OpenCLIP
#1012 opened by xiaohoua - 1
AttributeError: module 'triton.language' has no attribute 'libdevice', when using method "convert_int8_model_to_inference_mode()"
#1010 opened by bright-arparwut - 1
Dataset's mean and std
#1007 opened by xiaohoua - 9
Cannot train again on pretrained checkpoint due to change in default `weights_only=True`
#998 opened by ishaaq - 0
Zero-shot classification on SUN397
#1005 opened by xiaohoua - 1
Where can I get the LAION-80M dataset?
#1003 opened by XxFChen - 1
Is there a bug in the contrastive loss computation?
#1002 opened by gaofei - 1
ViT-L-14-336 fine-tuning failed
#999 opened by zhaozhipeng1997 - 5
Fine-tuning arguments to learn new knowledge without forgetting previous knowledge
#973 opened by SutirthaChakraborty - 0
How to extract 768-dim patch/local features from CoCa for downstream tasks? Should I use the attn_pool (for captioning) to get (256, 768)?
#991 opened by Arsiuuu - 1
Advice on how to read a directory containing many tar files?
#992 opened by leo23ui - 6
Clarification on using --train-num-samples (lower value) without --dataset-resampled
#993 opened by fadamsyah - 0
Divided YFCC15M into 16 parts, but the dataset download produces too many files and compression fails; how can this be solved?
#988 opened by leo23ui - 1
Fine-tune ViT Models on Higher Resolution Images
#985 opened by D0miH - 3
Issue with all_gather
#980 opened by scopello - 2
The model and pretraining parameters do not match.
#983 opened by q664171689 - 1
Inference speed
#981 opened by tppqt - 1
SigLIP attention mask
#984 opened by chs20 - 5
RuntimeError: The shape of the 2D attn_mask is torch.Size([77, 77]), but should be (4, 4)
#937 opened by JaspinXu - 12
Error loading ViT-L-14-quickgelu (metaclip_fullcc) model with version v2.27.0+
#966 opened by aivarasbaranauskas - 2
Segmentation fault
#931 opened by vadim0x60 - 2
Fix torch.load weights_only FutureWarning
#928 opened by johnbradley - 8
Does open_clip support add_tokens?
#961 opened by HyelinNAM - 2
How to subdivide within the same category, e.g. distinguishing Persian cats, coffee cats, and jingle cats, which are all cats?
#962 opened by watertianyi - 1
Apply loss scaling when using accum_freq
#957 opened by AshStuff - 1
Could not find MobileCLIP-S0
#958 opened by SutirthaChakraborty - 1
Question about fine-tuning
#941 opened by jimmyparadm - 0
When fine-tuning the CLIP_ViT_L_14 model, the logit scale decreases from 100.0 to 95 and keeps dropping; is this correct?
#948 opened by Johnson-yue - 0
Separately Optimizing CLIP Image and Text Encoders with Different Loss Functions
#946 opened by omrisuissabrown - 1
Fine-tune for emotion
#945 opened by SutirthaChakraborty - 2
Question about the SigLipTokenizer
#940 opened by LuFan31 - 0
Question about the SigLipTokenizer
#939 opened by LuFan31 - 2
Error when loading model
#934 opened by ROC-Star - 1
Inconsistent performance on pretrained checkpoints of the same architecture from different sources
#936 opened by bdevnani3 - 2
Any Plan for direct support of parquet dataset?
#933 opened by cxxgtxy - 1
How to fine-tune open_clip?
#922 opened by capricixhk - 0
How to persist an int8 model?
#921 opened by EdenChen233