huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Python · Apache-2.0
Issues
Mono-Electra model type not recognised
#30807 opened by PrithivirajDamodaran - 3
OverflowError: can't convert negative int to unsigned[finetuning XLNet]
#30817 opened by ZHAOFEGNSHUN - 1
Add data2vec 2.0
#30805 opened by formiel - 1
tracker: `generate` composability refactor
#30810 opened by gante - 0
Resuming from checkpoint runs into OOM
#30822 opened by PKlumpp - 0
torchrun breaks with load_model_at_end and with metric_for_best_model=eval_f1 on question_answering example
#30819 opened by godspeed5 - 0
Cannot restore FSDP checkpoint with LOCAL_STATE_DICT
#30811 opened by helloworld1 - 3
[Llava] Phi text model produces `ValueError: Attention mask should be of size (1, 1, 1, 230), but is torch.Size([1, 1, 1, 8])` when using `past_key_values` in generate
#30809 opened by xenova - 1
ValueError: Error 503: {'error': 'Service Unavailable'}
#30816 opened by jeremierostan - 0
`SPMConverter` does not always add the user defined symbol -> slow fast is thus not equivalent
#30824 opened by ArthurZucker - 1
Enabling timestamps changes text/reduces accuracy
#30815 opened by jaggzh - 2
Serialization error when tokenizer_config key matches function name in PreTrainedTokenizerBase
#30796 opened by avnermay - 6
recent version of Transformers seems to mess with forward/__call__. Breaks patching loss function
#30753 opened by grahamannett - 3
GemmaForCausalLM Causal Masking Not Working
#30813 opened by cmathw - 4
ValueError: You should supply an encoding or a list of encodings to this method that includes input_ids, but you provided []
#30769 opened by gtanya89 - 2
Getting Loss : nan while fine-tuning blip2-opt-2.7b
#30789 opened by tan7vir - 4
BART generate with min_new_tokens exceeds maximum length
#30759 opened by vsocrates - 1
TFSequenceClassificationLoss for MultiLabel classification
#30792 opened by ds-mike - 2
CLAP Fine-tuning has run into a problem
#30795 opened by ScottishFold007 - 1
Update deprecated method in generic.py for compatibility with newer versions of PyTorch
#30798 opened by Haleshot - 3
use_reentrant=False can't be set properly
#30749 opened by getao - 0
new model request: DeepSeek-V2
#30791 opened by Atry - 5
no_speech_probablity
#30777 opened by rizwanishaq - 1
Failed to Download GPT2-large Model from Hub
#30715 opened by daskol - 3
BitsNBytes 4 bit quantization error message typo and logical errors in error message handling
#30751 opened by jkterry1 - 3
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained
#30762 opened by yingqianch - 7
Bug: InformerModel, decoder_input torch.cat size of tensor mismatch error otherwise
#30750 opened by jhzsquared - 2
Grounding DINO missing custom kernels
#30765 opened by sam-ulrich1 - 1
Failed to import transformers.models.vit.feature_extraction_vit because of the following error (look up to see its traceback): No module named 'ml_dtypes._custom_floats'
#30756 opened by JJLee2910 - 2
Can the BNB quantization process be on GPU?
#30770 opened by mxjmtxrm - 1
Implement kv cache sparsity like H2O with attention score
#30758 opened by HarryWu99 - 0
Convert Helsinki-NLP model to huggingface
#30761 opened by nichellehouston - 0
train_new_from_iterator does not properly modify the tokenizer's postprocessor's ids when using a Sequence postprocessor
#30752 opened by dmcinerney - 2
Support for Multiple Datasets and Domain-Specific Loss Calculation in Trainer
#30725 opened by Ajmalshamsudheen - 12
Meet problems when I use the file src/transformers/models/llama/convert_llama_weights_to_hf.py to transfer LlaMa-7B
#30734 opened by wwxxyy1996 - 8
AttributeError: 'HQQLinear' object has no attribute 'weight'
#30727 opened by mxjmtxrm - 1
[DOCS] - Model outputs of RecurrentGemmaCausalLM doesn't align with the documentation
#30736 opened by godjw - 1
[Batched Whisper] ValueError on input mel features
#30740 opened by kerem0comert - 5
error when convert llama1 ckpts to hf formath
#30723 opened by a157801 - 1
Disable Progress Bar?
#30733 opened by haok1402 - 3
Add TableTransformerImageProcessor
#30718 opened by NielsRogge - 2
Assisted model doesn't seem to be working for Meta-Llama-3-8B
#30728 opened by jivanph - 0