huggingface/transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python · Apache-2.0

Issues

Mono-Electra model type not recognised
#30807 opened a day ago by PrithivirajDamodaran
3
Integrate IndicTrans2 models and tokenizer into HF Transformers
#30818 opened 6 hours ago by VarunGumma
3
OverflowError: can't convert negative int to unsigned [finetuning XLNet]
#30817 opened 6 hours ago by ZHAOFEGNSHUN
1
Add data2vec 2.0
#30805 opened a day ago by formiel
1
tracker: `generate` composability refactor
#30810 opened a day ago by gante
1
Resuming from checkpoint runs into OOM
#30822 opened 6 hours ago by PKlumpp
0
torchrun breaks with load_model_at_end and with metric_for_best_model=eval_f1 on question_answering example
#30819 opened 6 hours ago by godspeed5
0
Cannot restore FSDP checkpoint with LOCAL_STATE_DICT
#30811 opened 6 hours ago by helloworld1
0
[Llava] Phi text model produces `ValueError: Attention mask should be of size (1, 1, 1, 230), but is torch.Size([1, 1, 1, 8])` when using `past_key_values` in generate
#30809 opened a day ago by xenova
3
ValueError: Error 503: {'error': 'Service Unavailable'}
#30816 opened 6 hours ago by jeremierostan
1
`SPMConverter` does not always add the user defined symbol -> slow fast is thus not equivalent
#30824 opened 6 hours ago by ArthurZucker
0
Mixtral past_key_values and output_router_logits incompatible
#30731 opened 6 days ago by sorgfresser
1
Enabling timestamps changes text/reduces accuracy
#30815 opened 6 hours ago by jaggzh
1
Serialization error when tokenizer_config key matches function name in PreTrainedTokenizerBase
#30796 opened a day ago by avnermay
2
recent version of Transformers seems to mess with forward/__call__. Breaks patching loss function
#30753 opened 4 days ago by grahamannett
6
GemmaForCausalLM Causal Masking Not Working
#30813 opened 6 hours ago by cmathw
3
ValueError: You should supply an encoding or a list of encodings to this method that includes input_ids, but you provided []
#30769 opened 2 days ago by gtanya89
4
Getting Loss : nan while fine-tuning blip2-opt-2.7b
#30789 opened a day ago by tan7vir
2
BART generate with min_new_tokens exceeds maximum length
#30759 opened a day ago by vsocrates
4
TFSequenceClassificationLoss for MultiLabel classification
#30792 opened a day ago by ds-mike
1
CLAP Fine-tuning has run into a problem
#30795 opened a day ago by ScottishFold007
2
Update deprecated method in generic.py for compatibility with newer versions of PyTorch
#30798 opened a day ago by Haleshot
1
TypeError: 'list' object is not callable || Resume from checkpoint
#30754 opened a day ago by satpalsr
3
use_reentrant=False can't be set properly
#30749 opened 4 days ago by getao
6
new model request: DeepSeek-V2
#30791 opened a day ago by Atry
0
no_speech_probablity
#30777 opened a day ago by rizwanishaq
5
Failed to Download GPT2-large Model from Hub
#30715 opened 7 days ago by daskol
1
BitsNBytes 4 bit quantization error message typo and logical errors in error message handling
#30751 opened 4 days ago by jkterry1
3
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained
#30762 opened 4 days ago by yingqianch
3
Bug: InformerModel, decoder_input torch.cat size of tensor mismatch error otherwise
#30750 opened 2 days ago by jhzsquared
7
TokenClassificationPipeline support is_split_into_words tokeniser parameter
#30757 opened 4 days ago by swtb3
2
Grounding DINO missing custom kernels
#30765 opened 2 days ago by sam-ulrich1
2
Failed to import transformers.models.vit.feature_extraction_vit because of the following error (look up to see its traceback): No module named 'ml_dtypes._custom_floats'
#30756 opened 4 days ago by JJLee2910
1
Can the BNB quantization process be on GPU?
#30770 opened 2 days ago by mxjmtxrm
2
Implement kv cache sparsity like H2O with attention score
#30758 opened 4 days ago by HarryWu99
1
For multiple GPUs: torch.cuda.empty_cache() stuck forever
#30766 opened 3 days ago by animeshkumarpaul
0
Issues occurring during parallel evaluation (using Trainer.evaluate)
#30767 opened 3 days ago by psychocosine
0
Convert Helsinki-NLP model to huggingface
#30761 opened 2 days ago by nichellehouston
0
train_new_from_iterator does not properly modify the tokenizer's postprocessor's ids when using a Sequence postprocessor
#30752 opened 4 days ago by dmcinerney
0
Support for Multiple Datasets and Domain-Specific Loss Calculation in Trainer
#30725 opened 6 days ago by Ajmalshamsudheen
2
CLIPProcessor is not loading the saved Processor of the same version
#30714 opened 4 days ago by humanely
12
Meet problems when I use the file src/transformers/models/llama/convert_llama_weights_to_hf.py to transfer LlaMa-7B
#30734 opened 5 days ago by wwxxyy1996
2
AttributeError: 'HQQLinear' object has no attribute 'weight'
#30727 opened 4 days ago by mxjmtxrm
8
[DOCS] - Model outputs of RecurrentGemmaCausalLM doesn't align with the documentation
#30736 opened 5 days ago by godjw
1
[Batched Whisper] ValueError on input mel features
#30740 opened 5 days ago by kerem0comert
1
Error when converting llama1 ckpts to hf format
#30723 opened 6 days ago by a157801
5
Disable Progress Bar?
#30733 opened 5 days ago by haok1402
1
Add TableTransformerImageProcessor
#30718 opened 6 days ago by NielsRogge
3
Assisted model doesn't seem to be working for Meta-Llama-3-8B
#30728 opened 6 days ago by jivanph
2
`hub_strategy="every_save"` won't push the model to the Hub if large
#30724 opened 6 days ago by alvarobartt
0