premAI-io/benchmarks

Additional performance benchmark metrics to give an overall picture when choosing a backend / framework.

Anindyadeep opened this issue · 2 comments

Along with tokens/sec and memory consumption (mentioned in issue #106), it would be great if we could include at least one popular benchmark result (for example, MMLU), so that users can understand the tradeoff between:

  1. performance degradation (model quality, e.g. the score on the chosen benchmark)
  2. speed
  3. memory consumption

Including all of these key metrics would give an overall picture and help users understand the potential tradeoffs.
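
As a rough illustration of how the three metrics could be reported together, here is a minimal sketch that expresses a backend's results relative to a baseline run. Everything in it is an assumption made for illustration: `BackendResult`, its field names, and `tradeoff_vs_baseline` are hypothetical and are not part of this repo.

```python
from dataclasses import dataclass


@dataclass
class BackendResult:
    # Hypothetical container; field names are assumptions for this sketch.
    name: str
    tokens_per_sec: float   # throughput, higher is better
    peak_memory_mb: float   # peak memory consumption, lower is better
    mmlu_accuracy: float    # benchmark score as a fraction in [0, 1]


def tradeoff_vs_baseline(baseline: BackendResult, candidate: BackendResult) -> dict:
    """Express the candidate's three metrics relative to the baseline."""
    return {
        "name": candidate.name,
        # > 1.0 means the candidate generates tokens faster than the baseline
        "speedup": candidate.tokens_per_sec / baseline.tokens_per_sec,
        # < 1.0 means the candidate uses less memory than the baseline
        "memory_ratio": candidate.peak_memory_mb / baseline.peak_memory_mb,
        # > 0.0 means the candidate lost accuracy relative to the baseline
        "accuracy_drop": baseline.mmlu_accuracy - candidate.mmlu_accuracy,
    }
```

One row per backend or precision, computed this way, would let a reader see speed, memory, and quality loss side by side.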

Here is a good reference for the kinds of metrics required:

https://github.com/huggingface/optimum/tree/main/tests/benchmark#gptq-benchmark

Closing this issue, since memory profiling support is provided by PR #160.

cc: @nsosio