Pinned Repositories
abel
SOTA Math Open-Source LLM
alignment-for-honesty
anole
Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation
auto-j
Generative Judge for Evaluating Alignment
Entropy-ABF
Official implementation for 'Extending LLMs’ Context Window with 100 Samples'
factool
FacTool: Factuality Detection in Generative AI
MathPile
[NeurIPS D&B 2024] Generative AI for Math: MathPile
OlympicArena
This is the official repository of the paper "OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI"
OpenResearcher
ReAlign
Reformatted Alignment
Generative Artificial Intelligence Research Lab (GAIR)'s Repositories
GAIR-NLP/factool
FacTool: Factuality Detection in Generative AI
GAIR-NLP/anole
Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation
GAIR-NLP/OpenResearcher
GAIR-NLP/MathPile
[NeurIPS D&B 2024] Generative AI for Math: MathPile
GAIR-NLP/abel
SOTA Math Open-Source LLM
GAIR-NLP/auto-j
Generative Judge for Evaluating Alignment
GAIR-NLP/ReAlign
Reformatted Alignment
GAIR-NLP/OlympicArena
This is the official repository of the paper "OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI"
GAIR-NLP/Entropy-ABF
Official implementation for 'Extending LLMs’ Context Window with 100 Samples'
GAIR-NLP/alignment-for-honesty
GAIR-NLP/OPO
GAIR-NLP/weak-to-strong-reasoning
GAIR-NLP/scaleeval
Scalable Meta-Evaluation of LLMs as Evaluators
GAIR-NLP/benbench
Benchmarking Benchmark Leakage in Large Language Models
GAIR-NLP/MetaCritique
Evaluate the Quality of Critique
GAIR-NLP/ReasonEval
Evaluating Mathematical Reasoning Beyond Accuracy
GAIR-NLP/MoPS
[ACL 2024] Code for "MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation"
GAIR-NLP/BeHonest
BeHonest: Benchmarking Honesty in Large Language Models
GAIR-NLP/cs2916
GAIR-NLP/Preference-Dissection
GAIR-NLP/SimulateBench
GPT as Human
GAIR-NLP/Safety-J
Safety-J: Evaluating Safety with Critique
GAIR-NLP/self-improvement-reversal
GAIR-NLP/ChineseFactEval