sunying2018

University of Chicago, Chicago, IL

sunying2018's Stars

allenai/open-instruct
Language: Python 1.7k 198
mindspore-lab/mindrl
A high-performance, scalable MindSpore reinforcement learning framework.
Language: Python 428
mindspore-lab/mindrlhf
Language: Python 2912
haonan3/AnchorContext
AnchorAttention: Improved attention for LLMs long-context training
Language: Python 1754
bryanchrist/MathNeuro
Codebase for Math Neurosurgery: Isolating LLMs' Math Reasoning Abilities Using Only Forward Passes
Language: Python 8
SimpleBerry/LLaMA-O1
Large Reasoning Models
Language: Python 64739
GAIR-NLP/auto-j
Generative Judge for Evaluating Alignment
Language: Python 21914
mit-han-lab/duo-attention
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
Language: Python 38718
XueFuzhao/awesome-mixture-of-experts
A collection of AWESOME things about mixture-of-experts
97674
arpita8/Awesome-Mixture-of-Experts-Papers
Survey: A collection of AWESOME papers and resources on the latest research in Mixture of Experts.
70
OpenCoder-llm/OpenCoder-llm
The Open Cookbook for Top-Tier Code Large Language Model
Language: Python 1.3k 76
volcengine/verl
veRL: Volcano Engine Reinforcement Learning for LLM
Language: Python 34322
OpenRLHF/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention)
Language: Python 2.8k 260
infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Language: Python 23.9k 2.3k
Tencent/Tencent-Hunyuan-Large
Language: Python 1.1k 55
wdndev/ai_interview_note
DL & ML & RS
Language: Python 45
wdndev/mllm_interview_note
Mainly records multimodal-related knowledge for large language model (LLM) algorithm (application) engineers
Language: HTML 891
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
Language: Python 10.7k 2.4k
VikParuchuri/marker
Convert PDF to markdown quickly with high accuracy
Language: Python 18k 1k
epfLLM/Megatron-LLM
distributed trainer for LLMs
Language: Python 54577
BatsResearch/bonito
A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.
Language: Python 70446
OpenGVLab/Vision-RWKV
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures
Language: Python 37315
princeton-nlp/ProLong
Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"
Language: Python 1233
princeton-nlp/HELMET
The HELMET Benchmark
Language: Python 769
GAIR-NLP/ProX
Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale"
Language: Python 19315
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
Language: C++ 8.8k 1k
sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
Language: Python 6.2k 538
lyhue1991/eat_pytorch_in_20_days
Pytorch🍊🍉 is delicious, just eat it! 😋😋
Language: Jupyter Notebook 5.3k 1.2k
facebookresearch/fastText
Library for fast text representation and classification.
Language: HTML 26k 4.7k
huggingface/datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
Language: Python 2.1k 150
