Additional benchmark metrics to give an overall picture when choosing a backend / framework.
Anindyadeep opened this issue · 2 comments
Anindyadeep commented
Along with tokens/sec and memory consumption (mentioned in issue #106), it would be great if we could include at least one popular benchmark result (for example, MMLU), so that users can understand the tradeoff between:
- performance degradation (loss in model quality)
- speed
- memory consumption
Including all of these key metrics would give users an overall picture and help them understand the potential tradeoffs.
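To make the idea concrete, here is a minimal sketch of how the three metrics could be reported side by side against a baseline. Everything in it is an assumption for illustration: the `BenchmarkResult` class, the `compare_to_baseline` helper, the backend labels, and all the numbers are hypothetical placeholders, not measurements from this repository.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    """One row of a benchmark table (hypothetical structure)."""
    name: str              # backend / precision label
    tokens_per_sec: float  # throughput
    memory_gb: float       # peak memory consumption
    mmlu_accuracy: float   # benchmark score as a fraction in [0, 1]


def compare_to_baseline(baseline: BenchmarkResult, candidate: BenchmarkResult) -> dict:
    """Express the candidate along the three tradeoff axes, relative to the baseline."""
    return {
        "name": candidate.name,
        # > 1.0 means the candidate is faster than the baseline
        "speedup": candidate.tokens_per_sec / baseline.tokens_per_sec,
        # fraction of baseline memory that the candidate saves
        "memory_saving": 1.0 - candidate.memory_gb / baseline.memory_gb,
        # absolute drop in benchmark accuracy (positive means worse quality)
        "accuracy_drop": baseline.mmlu_accuracy - candidate.mmlu_accuracy,
    }


# Placeholder values for illustration only, NOT real measurements.
baseline = BenchmarkResult("fp16", tokens_per_sec=40.0, memory_gb=16.0, mmlu_accuracy=0.50)
candidate = BenchmarkResult("int4", tokens_per_sec=60.0, memory_gb=4.0, mmlu_accuracy=0.48)

row = compare_to_baseline(baseline, candidate)
print(
    f"{row['name']}: {row['speedup']:.2f}x speed, "
    f"{row['memory_saving']:.0%} less memory, "
    f"{row['accuracy_drop']:.3f} accuracy drop"
)
```

Reporting each candidate relative to one baseline keeps the three axes comparable in a single table row, which is the kind of overview this issue is asking for.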
Anindyadeep commented
Here is a good reference for the kinds of metrics required:
https://github.com/huggingface/optimum/tree/main/tests/benchmark#gptq-benchmark
Anindyadeep commented