Pinned Repositories
BIG-bench
Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models
BIG-bench
Beyond the Imitation Game collaborative benchmark for enormous language models
CalibratedMath
Teaching Models to Express Their Uncertainty in Words
TruthfulQA
TruthfulQA: Measuring How Models Imitate Human Falsehoods
sylinrl's Repositories
sylinrl/TruthfulQA
TruthfulQA: Measuring How Models Imitate Human Falsehoods
sylinrl/CalibratedMath
Teaching Models to Express Their Uncertainty in Words
sylinrl/BIG-bench
Beyond the Imitation Game collaborative benchmark for enormous language models