tzteyang

BJTU, majoring in AI

Beijing Jiaotong University

tzteyang's Stars

ADaM-BJTU/OpenRFT
OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning
Language:Python57
huggingface/smol-course
A course on aligning smol models.
Language: Jupyter Notebook · 3.5k stars · 1.1k forks
HICAI-ZJU/SciKnowEval
SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models
Language:Python162
chenzomi12/AIFoundation
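AIFoundation covers what happens when AI systems meet large models: the full-stack core technologies that provide system-level support, from the bottom layer to the top, for large-model training and inference.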
Language:Python48159
ADaM-BJTU/O1-CODER
AN O1 REPLICATION FOR CODING
Language:Python27017
openai/spinningup
An educational resource to help anyone learn deep reinforcement learning.
Language: Python · 10.3k stars · 2.2k forks
plageon/SlimPlm
Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs
Language:Python505
gomate-community/TrustRAG
TrustRAG: The RAG Framework within Reliable input, Trusted output
Language:Python59252
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python · 32.6k stars · 5k forks
waydabber/BetterDisplay
Unlock your displays on your Mac! Flexible HiDPI scaling, XDR/HDR extra brightness, virtual screens, DDC control, extra dimming, PIP/streaming, EDID override and lots more!
21.4k stars · 375 forks
unslothai/unsloth
Finetune Llama 3.3, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
Language: Python · 19.7k stars · 1.4k forks
microsoft/DeepSpeedExamples
Example models using DeepSpeed
Language: Python · 6.2k stars · 1.1k forks
Freder-chen/ReasonGenRM
A simple implementation of ReasonGenRM.
Language:Python3
datawhalechina/tiny-universe
"A White-Box Guide to Building Large Models": a fully hand-built Tiny-Universe
Language: Python · 1.8k stars · 178 forks
openreasoner/openr
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
Language: Python · 1.3k stars · 106 forks
huggingface/trl
Train transformer language models with reinforcement learning.
Language: Python · 10.4k stars · 1.3k forks
opendilab/awesome-RLHF
A curated list of reinforcement learning with human feedback resources (continually updated)
3.6k stars · 217 forks
hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
5.9k stars · 322 forks
quchangle1/LLM-Tool-Survey
This is the repository for the Tool Learning survey.
27812
maitrix-org/llm-reasoners
A library for advanced large language model reasoning
Language: Python · 1.6k stars · 138 forks
karpathy/LLM101n
LLM101n: Let's build a Storyteller
30.6k stars · 1.7k forks
HarlynDN/WebCiteS
[ACL'24] WebCiteS: Attributed Query-Focused Summarization on Chinese Web Search Results with Citations
Language:Python123
OpenRLHF/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
Language: Python · 3.4k stars · 313 forks
meta-llama/llama3
The official Meta Llama 3 GitHub site
Language: Python · 27.6k stars · 3.2k forks
chanchimin/RQ-RAG
Codes for our paper "RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation"
Language:Python14621
imoneoi/openchat
OpenChat: Advancing Open-source Language Models with Imperfect Data
Language: Python · 5.3k stars · 401 forks
THUNLP-MT/StableToolBench
A new tool learning benchmark aiming at well-balanced stability and reality, based on ToolBench.
Language:Python12215
AviSoori1x/makeMoE
From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)
Language:Jupyter Notebook60464
AGI-Edgerunners/LLM-Agents-Papers
A repo that lists papers related to LLM-based agents
Language: Python · 1.2k stars · 76 forks
tatsu-lab/stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
Language: Python · 29.7k stars · 4.1k forks
