huggingface/transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python · Apache-2.0

Issues

`PreTrainedTokenizerFast._batch_encode_plus()` got an unexpected keyword argument `'split_special_tokens'`
#30685 opened 5 days ago by fahadh4ilyas
3
KV cache with CPU offloading
#30704 opened 21 days ago by n17s
6
LLM inference with static kv-cache example gives different generations depending on the batch examples
#30670 opened 22 days ago by jpiabrantes
9
Starcoder2 has +10% inference latency when flash attention 2 is enabled
#30677 opened 6 days ago by lidingsnyk
4
Whisper assistant decoding not working with pipeline
#30611 opened 6 days ago by kamilakesbi
0
Error while running T5 trainer: TypeError: argument 'ids': 'list' object cannot be interpreted as an integer
#30712 opened 21 days ago by Aml-Hassan-Abd-El-hamid
3
Error converting from PyTorch to HuggingFace - Mistral / Mixtral
#30641 opened 25 days ago by efenocchi
2
Issue related to dtype with F.conv1d in Whisper evaluation
#30673 opened 23 days ago by moncefbenaicha
4
error when using PPO in Gemma
#30605 opened a month ago by mostafamdy
11
Add Wav2Vec2BertProcessorWithLM
#30671 opened 9 days ago by FredHaa
3
Error while moving model to GPU `NotImplementedError: Cannot copy out of meta tensor; no data!`
#30703 opened 11 days ago by goelayu
6
Evaluate trainer on Code-Switched Speech fails with "ValueError: Multiple languages detected when trying to predict the most likely target language for transcription."
#30654 opened 12 days ago by sproocht
7
Is `model.generate` supported during the training process?
#30713 opened 21 days ago by sunxiaojie99
3
`from_pretrained` `torch_dtype` does not affect model buffers
#30709 opened 14 days ago by Chandler-Bing
4
(Have PR) Speed up `BeamScorer` to make GPT-2 generation 2-3x faster
#30647 opened 25 days ago by fzyzcjy
3
default max value of max_new_token
#30666 opened 23 days ago by Navanit-git
6
Error with tf-keras when trying to generate random seeds
#30711 opened 15 days ago by fabiancpl
2
Setting compute_metrics in Trainer with Idefics2ForConditionalGeneration leads to AttributeError: 'DynamicCache' object has no attribute 'detach'
#30631 opened 15 days ago by EloiEynard
14
Mismatched tensor size error when generating text with beam_search on mps
#30662 opened 23 days ago by zoryzhang
1
Some functional problems in the implementation of Speculative Decoding
#30608 opened a month ago by transcend-0
5
Add static cache support for Whisper
#30707 opened 21 days ago by mobicham
8
TypeError: WhisperForConditionalGeneration.forward() got an unexpected keyword argument 'model'
#30616 opened 20 days ago by kadirnar
5
Question about quantized model with zero3
#30663 opened 23 days ago by mxjmtxrm
2
LLama-3 8B - can't match MMLU performance
#30694 opened 22 days ago by gioaca00
2
Refusal rejection removal as a feature
#30705 opened 21 days ago by KnutJaegersberg
0
Pure Python `PreTrainedTokenizer` is Broken
#30696 opened 22 days ago by daskol
1
Add Prismatic VLMs to Transformers
#30638 opened a month ago by siddk
5
CLIP Training Example Bug - Overfitting
#30682 opened 21 days ago by humanely
1
DDP error with load_best_model_at_end enabled
#30702 opened 21 days ago by zhiyuanhhh
0
Construct a Marian tokenizer. Based on huggingface tokenizers
#30700 opened 21 days ago by RRaphaell
2
`model.safetensors` missing in model file not found error in default case
#30601 opened 21 days ago by davidgxue
0
Cannot save HQQ quantized model.
#30689 opened 22 days ago by mxjmtxrm
6
More memory consumption than litgpt
#30629 opened a month ago by getao
0
Bug with train class method for MobileViTForSemanticSegmentation
#30676 opened 23 days ago by travisddavies
2
Cannot copy out of meta tensor; no data! for SwinV2ForImageClassification
#30661 opened 22 days ago by ethvedbitdesjan
3
FutureWarning about resume_download is raised after huggingface-hub 0.23.0 release
#30618 opened 22 days ago by albertvillanova
0
model_max_length default parameters are missing in transformers>=4.40.0
#30643 opened 23 days ago by helpmefindaname
2
[Phi-3-mini-128k-instruct] Difference in slow and fast tokenization after adding new tokens
#30660 opened 23 days ago by jpmann
2
can't import phi3config etc.
#30659 opened 24 days ago by tsw123678
2
[i18n-<languageCode>] Translating docs to <languageName>
#30665 opened 23 days ago by Ggjkfkg
1
Error During Training with PatchTSMixerForTimeSeriesClassification for Time Series Classification
#30614 opened a month ago by tdg2088
1
Urdu Encoding Issue in Hugging Face Tokenizer
#30636 opened 25 days ago by El-chapo-007
1
DPT implementation contains unused parameters
#30633 opened a month ago by ducha-aiki
4
Wav2Vec2ForCTC weight mismatch
#30628 opened a month ago by MahmoudAshraf97
1
Cannot convert llama 3 model to hf
#30604 opened a month ago by Bedoshady
2
Remove pipelines, chatformatters, templates etc --> Replace with simple generator function / manual string interpolation ---> Just have one standardized way for building datasets and running inference
#30625 opened a month ago by bdytx5
2
HTML Files Keep on Loading
#30626 opened a month ago by IsaacZachary
1
Error During Training with PatchTSMixerForTimeSeriesClassification for Time Series Classification
#30609 opened a month ago by tdg2088
1
Llama3 models causing `TypeError: not a string` error in LlamaTokenizer
#30607 opened a month ago by KeitaW
4
AutoModel: how to enable TP for extremely large models?
#30596 opened a month ago by MonolithFoundation
2