huggingface/optimum-benchmark
A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of Optimum's hardware optimizations & quantization schemes.
Python · Apache-2.0
Issues
Onnxruntime Seq2Seq doesn't work
#180 opened by Knzaytsev - 0
More tests
#95 opened by IlyasMoutawwakil - 5
CUDA_VISIBLE_DEVICES isn't working
#176 opened by sashavor - 15
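The two CUDA_VISIBLE_DEVICES reports in this list (#176 above and #27 further down) typically come from the same CUDA behavior: the driver reads the variable once, when the process first creates a CUDA context, so setting it after torch has initialized CUDA is silently ignored. A minimal sketch of the safe ordering (`select_gpus` is a hypothetical helper, not part of optimum-benchmark):

```python
import os
import sys

def select_gpus(devices: str) -> None:
    """Pin CUDA_VISIBLE_DEVICES before any CUDA context exists.

    The CUDA driver reads this variable only once, at context creation,
    so changing it after torch has initialized CUDA has no effect --
    a likely cause of "CUDA_VISIBLE_DEVICES isn't working" reports.
    """
    if "torch" in sys.modules:
        # By this point a CUDA context may already exist and the
        # setting would be silently ignored.
        raise RuntimeError("set CUDA_VISIBLE_DEVICES before importing torch")
    os.environ["CUDA_VISIBLE_DEVICES"] = devices

select_gpus("0")
print(os.environ["CUDA_VISIBLE_DEVICES"])
```

Launching the benchmark process with the variable already in its environment (e.g. `CUDA_VISIBLE_DEVICES=0 optimum-benchmark ...` from the shell) avoids the ordering problem entirely.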
bnb.4bits error: "ValueError: Blockwise quantization only supports 16/32-bit floats, but got torch.uint8"
#175 opened by lifelongeeek - 0
Regression testing API
#166 opened by IlyasMoutawwakil - 2
TensorRT-LLM - how to add support for new model?
#164 opened by pfk-beta - 9
Warning on loading quantized model
#158 opened by andxalex - 2
CLI tests of the CPU training benchmark with PyTorch use the GPU if it's available
#159 opened by aliabdelkader - 2
Is the `test` data generated from random tokens?
#153 opened by rui-ren - 5
Moving model to one device
#148 opened by pfk-beta - 9
TensorRT-LLM support question
#133 opened by lemon-little - 1
(Question) When I use the memory-tracking feature on the GPU, my VRAM is reported as 0. Is this normal, and what might be causing it?
#136 opened by WCSY-YG - 1
How to set TensorRT-LLM backend parameters
#138 opened by Yuchen-Cao - 4
How can I test my local model?
#119 opened by smile2game - 3
Testing Qwen-7B. >>> AttributeError: 'NoneType' object has no attribute 'to_dict'
#120 opened by smile2game - 1
Remove `cuda` synchronizations
#121 opened by IlyasMoutawwakil - 0
Is there any way to load a GGUF-format model and benchmark it? Thanks!
#115 opened by Confetti-lxy - 1
Question about your latency graph
#112 opened by dzenilee - 4
What can I do about a ConnectionError when I want to use my local Llama weights?
#104 opened by cason0126 - 0
Timm support
#52 opened by IlyasMoutawwakil - 1
CPU core isolation/targeting checks
#40 opened by IlyasMoutawwakil - 4
How to evaluate a model that already exists locally and hasn't been uploaded yet: what should `model=` be?
#102 opened by WCSY-YG - 3
Need a detailed definition of forward latency
#101 opened by leocnj - 5
Evaluators for specific tasks
#34 opened by IlyasMoutawwakil - 4
py3nvml measures reserved and not used memory
#31 opened by fxmarty - 1
TP and DP support for inference
#86 opened by IlyasMoutawwakil - 1
RuntimeError: microsoft/deberta-large
#65 opened by karthickai - 4
CUDA_VISIBLE_DEVICES is not captured by torch
#27 opened by fxmarty - 0
TGI support
#49 opened by IlyasMoutawwakil - 0
Simulate GPTQ quantization
#44 opened by IlyasMoutawwakil - 3
DDP throughput
#29 opened by IlyasMoutawwakil