Additional performance benchmark metrics to give an overall picture when choosing a backend / framework.

Question

Anindyadeep opened this issue a year ago · 2 comments

Along with tokens/sec and memory consumption (mentioned in issue #106), it would be great if we could include at least one popular quality benchmark result (for example, MMLU), so that users can understand the tradeoff between:

performance degradation (loss in benchmark score)
speed
memory consumption

Including all of these key metrics would provide an overall picture and help users understand the potential tradeoffs.
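As a rough sketch of how the three metrics could be reported side by side, the snippet below expresses one configuration relative to a baseline. Everything in it is an assumption for illustration: the `BenchmarkResult` fields, the `tradeoff` helper, and the numbers are placeholders, not measurements and not part of this repository.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    """One row of a hypothetical benchmark table (field names are assumptions)."""
    name: str
    tokens_per_sec: float   # generation throughput
    peak_memory_mb: float   # peak memory during inference
    accuracy: float         # quality score in [0, 1], e.g. MMLU accuracy


def tradeoff(baseline: BenchmarkResult, candidate: BenchmarkResult) -> dict:
    """Express a candidate relative to a baseline on all three axes."""
    return {
        "speedup": candidate.tokens_per_sec / baseline.tokens_per_sec,
        "memory_ratio": candidate.peak_memory_mb / baseline.peak_memory_mb,
        "accuracy_drop": baseline.accuracy - candidate.accuracy,
    }


# Placeholder numbers for illustration only, NOT measured results.
fp16 = BenchmarkResult("fp16", tokens_per_sec=20.0, peak_memory_mb=16000.0, accuracy=0.5)
int4 = BenchmarkResult("int4", tokens_per_sec=40.0, peak_memory_mb=4000.0, accuracy=0.375)

for key, value in tradeoff(fp16, int4).items():
    print(f"{key}: {value:.3f}")
```

Reporting ratios against a single baseline keeps the speed, memory, and quality columns comparable across backends, which is the comparison this issue is asking for.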

Answer 1 · 2024-01-26T07:01:57.000Z

Here is a good reference for the kinds of metrics required:

https://github.com/huggingface/optimum/tree/main/tests/benchmark#gptq-benchmark

Answer 2 · 2024-04-13T18:22:08.000Z

Closing this issue, since memory profiling support is provided by PR #160.

cc: @nsosio