hank0316's Stars
ysymyth/awesome-language-agents
List of language agents based on the paper "Cognitive Architectures for Language Agents"
huggingface/open-r1
Fully open reproduction of DeepSeek-R1
allenai/olmes
Reproducible, flexible LLM evaluations
yizhongw/truthfulqa_reeval
MiuLab/FactAlign
Source code of our EMNLP 2024 paper "FactAlign: Long-form Factuality Alignment of Large Language Models"
JayZhang42/SLED
SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models https://arxiv.org/pdf/2411.02433
openai/simple-evals
MiuLab/DogeRM
The code used in the paper "DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging"
facebookresearch/lingua
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
Oxen-AI/Self-Rewarding-Language-Models
This is work done by the Oxen.ai community to reproduce the Self-Rewarding Language Models paper from Meta AI.
princeton-nlp/SimPO
[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward
sail-sg/dice
Official implementation of Bootstrapping Language Models via DPO Implicit Rewards
argilla-io/distilabel
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
codefuse-ai/Awesome-Code-LLM
[TMLR] A curated list of language modeling research for code (and other software engineering activities), plus related datasets.
bigcode-project/bigcode-evaluation-harness
A framework for the evaluation of autoregressive code generation language models.
atfortes/Awesome-LLM-Reasoning
Reasoning in LLMs: Papers and Resources, including Chain-of-Thought, OpenAI o1, and DeepSeek-R1 🍓
d223302/Over-Reasoning-of-LLMs
Data and code for EACL'24 paper: Over-Reasoning and Redundant Calculation of Large Language Models
maitrix-org/llm-reasoners
A library for advanced large language model reasoning
OpenBMB/Eurus
RLHFlow/RLHF-Reward-Modeling
Recipes to train reward model for RLHF.
facebookresearch/rlfh-gen-div
This is code for most of the experiments in the paper "Understanding the Effects of RLHF on LLM Generalisation and Diversity"
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
meta-llama/llama-cookbook
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using the Llama model family and how to use the models on various provider services.
meta-math/MetaMath
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
GAIR-NLP/ReasonEval
[AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy
allenai/reward-bench
RewardBench: the first evaluation tool for reward models.
tatsu-lab/alpaca_farm
A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.
tlc4418/llm_optimization
A repo for RLHF training and BoN over LLMs, with support for reward model ensembles.
LAION-AI/Open-Assistant
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.