huggingface/optimum-habana
Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)
Python · Apache-2.0 license
Issues
Error when running chatglm3_6b: NotImplementedError: Unknown device for graph fuser
#1477 opened by BaideBear - 9
Pretrain with LLama Model - num_samples=0 Error
#1396 opened by saisuryateja1436 - 2
RuntimeError: [Rank:0] FATAL ERROR :: MODULE:PT_DEVMEM Allocation failed for size::234881024 (224)MB #2
#1469 opened by James-Lu-none - 1
Error when running llama2_fine_tuning_inference & Intel_Gaudi_Fine_Tuning examples
#1467 opened by epage480 - 1
Could not find a version that satisfies the requirement networkx==3.3 in Python3.9
#1458 opened by James-Lu-none - 2
Qwen1.5-14B finetune error
#1336 opened by Zjq9409 - 4
Performance for summarization task on BART is low after latest Transformer 4.40 upgrade
#1144 opened by astachowiczhabana - 1
Heavy IO in multi-node example
#1152 opened by rofinn - 4
Tokenizer error: eos_token_id not found error - Incorrect assignment of variables
#1326 opened by premmotgi - 6
Llama 3.1 Support -- Rope_scaling issue
#1154 opened by AzeezIsh - 1
Not able to get good performance for diffusion models when doing single image inference with batch size 1
#1195 opened by basantaxpatra - 1
ValidateSyncInputTensors tensor_data is empty
#1241 opened by xinsu626 - 1
quantization FP8 error
#1438 opened by aitss2017 - 2
bigscience / bloomz-7b1 finetune error
#1426 opened by 11989890 - 7
transformers_future: contrastive search failing with Incompatible input shapes, broadcast not possible.
#1385 opened by vidyasiv - 6
Data parallelism or tensor parallelism? How can i know that and is there a chance i can shift in between these too?
#1242 opened by venkycreator - 2
Info needed about stable diffusion 3 support
#1127 opened by dkiran1 - 2
does the optimum-habana support sdxl controlnet image to image pipeline?
#1103 opened by basantaxpatra - 1
Is there example of FP8 train LLM, pre-train or fine-tune
#1073 opened by harborn - 2
AttributeError: module 'transformers.generation.stopping_criteria' has no attribute 'MaxNewTokensCriteria'
#1372 opened by hemanthkotaprolu - 7
Dataset v3.0.0 deprecates tasks and cause CI failures
#1341 opened by vidyasiv - 5
Flash attention not supported in run_clm.py
#1318 opened by aitss2017 - 6
RuntimeError: shape '[-1, 0]' is invalid for input of size 134152192 for Mistral-7B finetune
#1311 opened by aitss2017 - 3
runwayml/stable-diffusion-v1-5 no longer exists
#1305 opened by vidyasiv - 4
CodeGen inference error "synNodeCreateWithId failed for node: batch_gemm with synStatus 26"
#1314 opened by caijimin - 3
stable-diffusion-2-1-base BF16 on Gaudi2D get RuntimeError: synNodeCreateWithId failed for node: spatial_convolution with synStatus 26 [Generic failure].
#1293 opened by KiwiHana - 12
CLIP contrastive image-text inference can't run on hpu with gaudi-docker/1.17.0
#1216 opened by caijimin - 4
AWQ is not working
#1240 opened by endomorphosis - 6
Qwen2-72B inference on 8x Gaudi2 gets OOM issue due to missing meta-device support on model loading
#1112 opened by LeoZhao-Intel - 6
Quantization failed
#1237 opened by endomorphosis - 1
CLIP contrastive image-text inference error
#1179 opened by caijimin - 2
Batch size beyond 16 is throwing an error
#1177 opened by venkycreator - 2
llava inference works incorrectly if adapt_transformers_to_gaudi called after transformers import
#1176 opened by mattolson93 - 2 (see the import-order sketch after this list)
The name of the "is_greedy_or_beam_and_bucket" variable does not seem to match the logic
#1166 opened by Jing1Ling - 3
TypeError: GaudiPhiForCausalLM.forward() got an unexpected keyword argument 'reuse_cache'
#1036 opened by eduand-alvarez - 1
Unable to run protein-folding example with the latest release: optimum-habana-1.11.1
#1014 opened by mgujral - 1
--report_to tensorboard not working on multiple HPUs
#1021 opened by 12010486 - 4
Add support on Whisper inference on HPU
#1047 opened by Spycsh
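
Several of the issues above concern how optimum-habana patches 🤗 Transformers for HPU; #1176 in particular reports that llava inference misbehaves when adapt_transformers_to_gaudi is called after transformers has already been imported. A minimal sketch of the ordering the issue title implies, assuming the public adapt_transformers_to_gaudi entry point and an illustrative llava checkpoint (neither taken from the issue thread itself):

```python
# Sketch only: apply the Gaudi patches before touching any transformers model
# classes, which is the ordering issue #1176 suggests is required for llava.
from optimum.habana.transformers.modeling_utils import adapt_transformers_to_gaudi

adapt_transformers_to_gaudi()  # patch transformers for HPU first

# Import and build the model only after patching, so the Gaudi-adapted
# implementations are the ones actually used at inference time.
from transformers import AutoProcessor, LlavaForConditionalGeneration

checkpoint = "llava-hf/llava-1.5-7b-hf"  # illustrative checkpoint, not from the issue
model = LlavaForConditionalGeneration.from_pretrained(checkpoint)
processor = AutoProcessor.from_pretrained(checkpoint)
```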