helloyongyang's Stars
DefTruth/CUDA-Learn-Notes
🎉CUDA/C++ notes / hand-written CUDA kernels for large models / technical blog, updated occasionally: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.
ModelTC/llmc
This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".
ModelTC/msbench
A tool for model sparsification based on torch.fx
NVIDIA/CUDALibrarySamples
CUDA Library Samples
HeKun-NVIDIA/TensorRT-Developer_Guide_in_Chinese
liguodongiot/llm-action
This project aims to share the technical principles behind large models and hands-on experience with them.
Tongyi-EconML/FinQwen
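FinQwen: a project dedicated to building an open, stable, high-quality financial large model. It builds an intelligent question-answering system for financial scenarios on top of large models and uses open source to advance "AI + finance".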
zjunlp/LLMAgentPapers
Must-read Papers on LLM Agents.
liucongg/ChatGLM-Finetuning
Fine-tuning of the ChatGLM-6B, ChatGLM2-6B, and ChatGLM3-6B models for specific downstream tasks, covering Freeze, LoRA, P-tuning, full-parameter fine-tuning, etc.
srush/GPU-Puzzles
Solve puzzles. Learn CUDA.
DefTruth/Awesome-LLM-Inference
📖A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
HuangOwen/Awesome-LLM-Compression
Awesome LLM compression research papers and tools.
OpenGVLab/OmniQuant
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
ModelTC/Dipoorlet
Offline quantization tools for deployment.
fpgaminer/GPTQ-triton
GPTQ inference Triton kernel
mbehm/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
THUDM/ChatGLM2-6B
ChatGLM2-6B: An Open Bilingual Chat LLM
agrechnev/trt-cpp-min
TensorRT 7 C++ (almost) minimal examples
dgSPARSE/dgSPARSE-Lib
PyTorch-Based Fast and Efficient Processing for Various Machine Learning Applications with Diverse Sparsity
CompVis/stable-diffusion
A latent text-to-image diffusion model
OpenPPL/ppq
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
chenyaofo/pytorch-cifar-models
Pretrained models on CIFAR10/100 in PyTorch
bearpaw/pytorch-classification
Classification with PyTorch.
qfgaohao/pytorch-ssd
MobileNetV1, MobileNetV2, and VGG based SSD/SSD-lite implementation in PyTorch 1.0 / PyTorch 0.4. Out-of-the-box support for retraining on the Open Images dataset. ONNX and Caffe2 support. Experimental ideas like CoordConv.
houqb/ssdlite-pytorch-mobilenext
A PyTorch implementation of SSDLite on COCO
yhenon/pytorch-retinanet
PyTorch implementation of RetinaNet object detection.
multimodallearning/pytorch-mask-rcnn
HKUST-Aerial-Robotics/FUEL
An Efficient Framework for Fast UAV Exploration
bzantium/pytorch-admm-pruning
Prune DNN using Alternating Direction Method of Multipliers (ADMM)
HobbitLong/RepDistiller
[ICLR 2020] Contrastive Representation Distillation (CRD), and benchmark of recent knowledge distillation methods