kk-machine-learning's Stars
lm-sys/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
SerenityOS/serenity
The Serenity Operating System 🐞
apache/skywalking
APM, Application Performance Monitoring System
brendangregg/FlameGraph
Stack trace visualizer
pwndbg/pwndbg
Exploit Development and Reverse Engineering with GDB Made Easy
sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
imoneoi/openchat
OpenChat: Advancing Open-source Language Models with Imperfect Data
deepseek-ai/DeepSeek-V2
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
hellogcc/100-gdb-tips
A collection of gdb tips. "100" may just mean "many" here.
microsoft/CodeBERT
CodeBERT
OpenRLHF/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
PKU-Alignment/safe-rlhf
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
GanjinZero/RRHF
[NIPS2023] RRHF & Wombat
naver/splade
SPLADE: sparse neural search (SIGIR21, SIGIR22)
NVIDIA/NeMo-Aligner
Scalable toolkit for efficient model alignment
salesforce/CodeRL
This is the official code for the paper CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning (NeurIPS22).
allenai/OLMoE
OLMoE: Open Mixture-of-Experts Language Models
RLHFlow/Online-RLHF
A recipe for online RLHF and online iterative DPO.
OpenBMB/Eurus
apache/skywalking-rover
Monitor and profiler powered by eBPF to monitor network traffic, and diagnose CPU and network performance.
KiraMelody/nemu
If you are copying from this nemu code, how about leaving a star?
deepseek-ai/ESFT
Expert Specialized Fine-Tuning
juyongjiang/CodeUp
CodeUp: A Multilingual Code Generation Llama2 Model with Parameter-Efficient Instruction-Tuning on a Single RTX 3090
sl1673495/bytedance-apm-group
ByteDance APM team pre-recruitment community: come and chat about big-tech interview experience, how to write a résumé, technology……
imbue-ai/carbs
Cost aware hyperparameter tuning algorithm
sail-sg/sdft
[ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning".
louieworth/awesome-rlhf
An index of algorithms for reinforcement learning from human feedback (RLHF)
hkust-nlp/dart-math
[NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*
wang2226/FOLK
bravikov/parallel-stacks