muse-coder's Stars
meta-llama/llama
Inference code for Llama models
tatsu-lab/stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
karpathy/llm.c
LLM training in simple, raw C/CUDA
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
QwenLM/Qwen
The official repo of Qwen (通义千问), the chat and pretrained large language models proposed by Alibaba Cloud.
liguodongiot/llm-action
This project shares the technical principles behind large language models along with hands-on practical experience.
jadore801120/attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
adam-maj/tiny-gpu
A minimal GPU design in Verilog to learn how GPUs work from the ground up
bitsandbytes-foundation/bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
NVIDIA-AI-IOT/torch2trt
An easy-to-use PyTorch-to-TensorRT converter
AutoGPTQ/AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
InternLM/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
turboderp/exllamav2
A fast inference library for running LLMs locally on modern consumer-class GPUs
hyunwoongko/transformer
Transformer: PyTorch Implementation of "Attention Is All You Need"
mit-han-lab/llm-awq
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
IST-DASLab/marlin
FP16×INT4 LLM inference kernel that achieves near-ideal ~4x speedups at batch sizes up to 16-32 tokens.
Tlntin/Qwen-TensorRT-LLM
LeiWang1999/ZYNQ-NVDLA
NVDLA (An Opensource DL Accelerator Framework) implementation on FPGA.
accel-sim/accel-sim-framework
This is the top-level repository for the Accel-Sim framework.
Yinghan-Li/YHs_Sample
Yinghan's Code Sample
pigirons/sgemm_hsw
An implementation of an SGEMM kernel tuned for the L1d cache.
hsharma35/dnnweaver2
Open Source Specialized Computing Stack for Accelerating Deep Neural Networks.
Guangxuan-Xiao/torch-int
This repository contains integer operators on GPUs for PyTorch.
TRT2022/MST-plus-plus-TensorRT
TensorRT 2022 finals solution: TensorRT inference optimization for MST++, the first Transformer-based image-restoration model.
dreamgonfly/transformer-pytorch
A PyTorch implementation of Transformer in "Attention is All You Need"
zeasa/nvdla-compiler
Sanskar777/QRS-peak-detection-in-ECG-signals-using-verilog
riple/dnnweaver2.drone