hank0316

MS student from National Taiwan University.

hank0316's Stars

ysymyth/awesome-language-agents
List of language agents based on paper "Cognitive Architectures for Language Agents"
Language: TeX · 86264
huggingface/open-r1
Fully open reproduction of DeepSeek-R1
Language: Python · 20.1k · 1.7k
allenai/olmes
Reproducible, flexible LLM evaluations
Language: Python · 15914
yizhongw/truthfulqa_reeval
Language: Python · 103
MiuLab/FactAlign
Source code of our EMNLP 2024 paper "FactAlign: Long-form Factuality Alignment of Large Language Models"
Language: Jupyter Notebook · 161
JayZhang42/SLED
SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Model https://arxiv.org/pdf/2411.02433
Language: Python · 211
openai/simple-evals
Language: Python · 2.3k · 204
MiuLab/DogeRM
The code used in the paper "DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging"
Language: Python · 4
facebookresearch/lingua
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
Language: Python · 4.4k · 239
Oxen-AI/Self-Rewarding-Language-Models
This is work done by the Oxen.ai Community, trying to reproduce the Self-Rewarding Language Model paper from MetaAI.
Language: Python · 1179
princeton-nlp/SimPO
[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward
Language: Python · 81955
sail-sg/dice
Official implementation of Bootstrapping Language Models via DPO Implicit Rewards
Language: Python · 423
argilla-io/distilabel
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Language: Python · 2.4k · 175
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Language: Python · 40.6k · 5k
codefuse-ai/Awesome-Code-LLM
[TMLR] A curated list of language modeling researches for code (and other software engineering activities), plus related datasets.
2.1k · 134
bigcode-project/bigcode-evaluation-harness
A framework for the evaluation of autoregressive code generation language models.
Language: Python · 885229
atfortes/Awesome-LLM-Reasoning
Reasoning in LLMs: Papers and Resources, including Chain-of-Thought, OpenAI o1, and DeepSeek-R1 🍓
2.5k · 144
d223302/Over-Reasoning-of-LLMs
Data and code for EACL'24 paper: Over-Reasoning and Redundant Calculation of Large Language Models
Language: Python · 9
maitrix-org/llm-reasoners
A library for advanced large language model reasoning
Language: Python · 1.9k · 166
OpenBMB/Eurus
Language: Python · 30414
RLHFlow/RLHF-Reward-Modeling
Recipes to train reward model for RLHF.
Language: Python · 1.2k · 83
facebookresearch/rlfh-gen-div
This is code for most of the experiments in the paper Understanding the Effects of RLHF on LLM Generalisation and Diversity
Language: Python · 407
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python · 38.1k · 5.7k
meta-llama/llama-cookbook
Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We also show you how to solve end to end problems using Llama model family and using them on various provider services
Language: Jupyter Notebook · 16.2k · 2.3k
meta-math/MetaMath
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
Language: Python · 40639
GAIR-NLP/ReasonEval
[AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy
Language: Python · 483
allenai/reward-bench
RewardBench: the first evaluation tool for reward models.
Language: Python · 50458
tatsu-lab/alpaca_farm
A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.
Language: Python · 79359
tlc4418/llm_optimization
A repo for RLHF training and BoN over LLMs, with support for reward model ensembles.
Language: Python · 353
LAION-AI/Open-Assistant
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
Language: Python · 37.2k · 3.3k
