1JZER's Stars
LeslieTrue/SFTvsRL
Official implementation of paper: SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
OpenRLHF/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
huggingface/open-r1
Fully open reproduction of DeepSeek-R1
istio/istio
Connect, secure, control, and observe services.
unslothai/unsloth
Finetune Llama 3.3, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥
haoel/haoel.github.io
Jiayi-Pan/TinyZero
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
EvolvingLMMs-Lab/open-r1-multimodal
A fork to add multimodal model training to open-r1
openai/lm-human-preferences
Code for the paper Fine-Tuning Language Models from Human Preferences
Marker-Inc-Korea/AutoRAG
AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation
FreedomIntelligence/Evaluation-of-ChatGPT-on-Information-Extraction
An evaluation of ChatGPT on information extraction tasks, including Named Entity Recognition (NER), Relation Extraction (RE), Event Extraction (EE) and Aspect-based Sentiment Analysis (ABSA).
OneSizeFitsQuorum/MIT6.824-2021
4 labs + 2 challenges + 4 docs
wangzhengquan/MIT6.824
avelino/awesome-go
A curated list of awesome Go frameworks, libraries and software
gorilla/websocket
Package gorilla/websocket is a fast, well-tested and widely used WebSocket implementation for Go.
golangci/golangci-lint
Fast linters runner for Go
istio/community
Istio governance material.
romkatv/powerlevel10k
A Zsh theme
golang-standards/project-layout
Standard Go Project Layout
meta-llama/llama
Inference code for Llama models
Tongji-KGLLM/RAG-Survey
AIoT-MLSys-Lab/Efficient-LLMs-Survey
[TMLR 2024] Efficient Large Language Models: A Survey
FreedomIntelligence/OVM
opendilab/awesome-RLHF
A curated list of reinforcement learning with human feedback resources (continually updated)
xtaci/kcptun
A Quantum-Safe Secure Tunnel based on QPP, KCP, FEC, and N:M multiplexing.
wenge-research/YAYI-UIE
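YAYI information extraction large model: instruction-tuned on million-scale, manually constructed, high-quality information extraction data, developed by the Zhongke Wenge algorithm team. (Repo for YAYI Unified Information Extraction Model)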
fatedier/frp
A fast reverse proxy to help you expose a local server behind a NAT or firewall to the internet.
datawhalechina/easy-rl
Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄); read online at https://datawhalechina.github.io/easy-rl/
huggingface/trl
Train transformer language models with reinforcement learning.
jasonvanf/llama-trl
LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA