yukang2017's Stars
xai-org/grok-1
Grok open release
ludwig-ai/ludwig
Low-code framework for building custom LLMs, neural networks, and other AI models
LargeWorldModel/LWM
allenai/OLMo
Modeling, training, eval, and inference code for OLMo
dvlab-research/MiniGemini
Official implementation for Mini-Gemini
FasterDecoding/Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
S-LoRA/S-LoRA
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
XueFuzhao/OpenMoE
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
deepseek-ai/DeepSeek-MoE
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
uclaml/SPIN
The official implementation of Self-Play Fine-Tuning (SPIN)
mayuelala/FollowYourClick
[arXiv 2024] Official implementation of "Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts"
horseee/DeepCache
[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free
jzhang38/EasyContext
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
SkunkworksAI/hydra-moe
zhuzilin/ring-flash-attention
Ring attention implementation with flash attention
FranxYao/Long-Context-Data-Engineering
Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context"
xingyaoww/code-act
Official repo for the ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhang, Yunzhu Li, Hao Peng, and Heng Ji.
VIRL-Platform/VIRL
Code for V-IRL: Grounding Virtual Intelligence in Real Life
HKUNLP/ChunkLlama
[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"
for-ai/parameter-efficient-moe
hkust-nlp/AgentBoard
An Analytical Evaluation Board of Multi-turn LLM Agents
dyabel/AnyTool
argilla-io/notus
Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first approach
OSU-NLP-Group/TravelPlanner
[ICML'24] "TravelPlanner: A Benchmark for Real-World Planning with Language Agents"
THUDM/LongAlign
LongAlign: A Recipe for Long Context Alignment Encompassing Data, Training, and Evaluation
jeffreysijuntan/lloco
The official repo for "LLoCo: Learning Long Contexts Offline"
Lucky-Lance/Expert_Sparsity
lixin4ever/CUHK-PHD-Thesis-Template
CUHK PhD Thesis Template
EnVision-Research/DDSM
Denoising Diffusion Step-aware Models (ICLR 2024)
FengZicai/LSK3DNet
Official implementation of "LSK3DNet: Towards Effective and Efficient 3D Perception with Large Sparse Kernels" (accepted at CVPR 2024).