tzteyang's Stars
ADaM-BJTU/OpenRFT
OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning
huggingface/smol-course
A course on aligning smol models.
HICAI-ZJU/SciKnowEval
SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models
chenzomi12/AIFoundation
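AIFoundation covers the full-stack core technologies by which AI systems, from the lowest layer to the highest, provide system-level support for large-model training and inference.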
ADaM-BJTU/O1-CODER
AN O1 REPLICATION FOR CODING
openai/spinningup
An educational resource to help anyone learn deep reinforcement learning.
plageon/SlimPlm
Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs
gomate-community/TrustRAG
TrustRAG: The RAG Framework with Reliable Input, Trusted Output
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
waydabber/BetterDisplay
Unlock your displays on your Mac! Flexible HiDPI scaling, XDR/HDR extra brightness, virtual screens, DDC control, extra dimming, PIP/streaming, EDID override and lots more!
unslothai/unsloth
Finetune Llama 3.3, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
microsoft/DeepSpeedExamples
Example models using DeepSpeed
Freder-chen/ReasonGenRM
A simple implementation of ReasonGenRM.
datawhalechina/tiny-universe
A White-Box Guide to Building Large Models: a fully hand-built Tiny-Universe
openreasoner/openr
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
huggingface/trl
Train transformer language models with reinforcement learning.
opendilab/awesome-RLHF
A curated list of reinforcement learning with human feedback resources (continually updated)
hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
quchangle1/LLM-Tool-Survey
This is the repository for the Tool Learning survey.
maitrix-org/llm-reasoners
A library for advanced large language model reasoning
karpathy/LLM101n
LLM101n: Let's build a Storyteller
HarlynDN/WebCiteS
[ACL'24] WebCiteS: Attributed Query-Focused Summarization on Chinese Web Search Results with Citations
OpenRLHF/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
meta-llama/llama3
The official Meta Llama 3 GitHub site
chanchimin/RQ-RAG
Code for our paper "RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation"
imoneoi/openchat
OpenChat: Advancing Open-source Language Models with Imperfect Data
THUNLP-MT/StableToolBench
A new tool learning benchmark aiming at well-balanced stability and reality, based on ToolBench.
AviSoori1x/makeMoE
From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)
AGI-Edgerunners/LLM-Agents-Papers
A repo listing papers related to LLM-based agents
tatsu-lab/stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.