[Feature Request]: Evaluation Metrics

Question

Opened this issue a month ago · 0 comments

Add evaluation metrics such as F1, precision, recall, exact match (EM), possibly fuzzy match, pass@k, and any others relevant to our currently supported benchmarks.
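To make the request concrete, here is a rough sketch of what these metrics could look like in plain Python. Everything here is an assumption for illustration, not existing project code: the function names (`exact_match`, `token_prf`, `fuzzy_match`, `pass_at_k`) are hypothetical, normalization is just lowercasing and whitespace splitting, fuzzy match uses the standard library's `difflib.SequenceMatcher` ratio, and pass@k uses the commonly cited unbiased estimator 1 - C(n-c, k) / C(n, k) for n samples of which c are correct.

```python
from collections import Counter
from difflib import SequenceMatcher
from math import comb
from typing import Tuple


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the trimmed, lowercased strings are identical, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())


def token_prf(prediction: str, reference: str) -> Tuple[float, float, float]:
    """Token-level (precision, recall, F1) over whitespace-split, lowercased tokens."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Multiset intersection counts each shared token at most as often as it
    # appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def fuzzy_match(prediction: str, reference: str) -> float:
    """Similarity in [0, 1] from difflib; one possible definition of 'fuzzy match'."""
    return SequenceMatcher(None, prediction.lower(), reference.lower()).ratio()


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate given n samples, c of them correct, and budget k."""
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("expected 0 <= c <= n and 1 <= k <= n")
    # Probability that a random size-k subset contains no correct sample,
    # subtracted from 1. comb() returns 0 when k > n - c.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

Per-example scores like these would then be averaged over a benchmark. Which normalization each benchmark needs (punctuation stripping, article removal, and so on) is left open here and would have to follow each benchmark's own convention.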
