Pinned Repositories
infembed
Find the samples, in the test data, on which your (generative) model makes mistakes.
Liger-Kernel
Efficient Triton Kernels for LLM Training
olmes
Reproducible, flexible LLM evaluations
safety-eval-fork
A simple evaluation of generative language models and safety classifiers.
Guide Labs's Repositories
guidelabs/infembed
Find the samples, in the test data, on which your (generative) model makes mistakes.
guidelabs/safety-eval-fork
A simple evaluation of generative language models and safety classifiers.
guidelabs/Liger-Kernel
Efficient Triton Kernels for LLM Training
guidelabs/olmes
Reproducible, flexible LLM evaluations