chillingche's Stars
All-Hands-AI/OpenHands
🙌 OpenHands: Code Less, Make More
ShishirPatil/gorilla
Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
srush/GPU-Puzzles
Solve puzzles. Learn CUDA.
OpenBMB/MiniCPM
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
EleutherAI/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
HaujetZhao/CapsWriter-Offline
The offline version of CapsWriter, a handy voice-input tool for PC
ARM-software/ComputeLibrary
The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies.
ModelTC/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
HazyResearch/ThunderKittens
Tile primitives for speedy kernels
huggingface/nanotron
Minimalistic large language model 3D-parallelism training
facebookresearch/MobileLLM
MobileLLM: Optimizing Sub-billion-Parameter Language Models for On-Device Use Cases (ICML 2024).
mlfoundations/dclm
DataComp for Language Models
kvcache-ai/ktransformers
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
IST-DASLab/marlin
FP16xINT4 LLM inference kernel that achieves near-ideal ~4x speedups at small-to-medium batch sizes of 16-32 tokens.
RUC-GSAI/YuLan-Chat
YuLan: An Open-Source Large Language Model
allenai/OLMoE
OLMoE: Open Mixture-of-Experts Language Models
Cornell-RelaxML/quip-sharp
huggingface/cosmopedia
pcg-mlp/KsanaLLM
MegEngine/MegPeak
bytedance/ABQ-LLM
An acceleration library that supports arbitrary bit-width combinatorial quantization operations
OpenGVLab/EfficientQAT
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
quic/qidk
n-o-o-n/idp_hexagon
Hexagon processor module for IDA Pro disassembler
tlc-pack/libflash_attn
Standalone Flash Attention v2 kernel without libtorch dependency
Cornell-RelaxML/qtip
HandH1998/QQQ
QQQ is a hardware-optimized W4A8 quantization solution for LLMs.
LLM360/amber-data-prep
Data preparation code for Amber 7B LLM
quic/efficient-transformers
This library lets users seamlessly port pretrained models and checkpoints from the Hugging Face (HF) Hub (built with the HF transformers library) into inference-ready formats that run efficiently on Qualcomm Cloud AI 100 accelerators.
ise-uiuc/xft
XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts