chouxianyu

happy everyday!

chouxianyu's Stars

Vahe1994/SpQR
Language:Python52542
NVIDIA/TensorRT
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
Language: C++ | Stars: 10.6k | Forks: 2.1k
NVIDIA/TensorRT-Model-Optimizer
TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, sparsity, distillation, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.
Language:Python45528
DefTruth/Awesome-LLM-Inference
📖 A curated list of Awesome LLM Inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
Stars: 2.6k | Forks: 173
liuyubing233/zhihu-custom
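A Zhihu modifier, currently for Tampermonkey. Main features: custom hiding of page modules; filtering of lists and answer content; browsing history; caching of recommendation-page content; strong filtering by list type and keyword, with automatic calls to the "Not interested" API; blocking users' answers; downloading answer videos; sorting answers by upvote count and comment count; automatically collapsing all long answers or expanding all answers; removing the login prompt pop-up; filtering out answers from official Zhihu accounts such as 故事档案局 and 盐选科普; manually adjusting text size; switching themes and adjusting night mode; hiding Zhihu trending searches for a clean search experience; adding type tags to lists; removing ads; setting how purchase links are displayed; exporting favorites to PDF; removing all blocking options with one click; opening external links directly; more features can be explored in the plugin...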
Language:JavaScript25015
ahmetbersoz/chatgpt-prompts-for-academic-writing
This list of writing prompts covers a range of topics and tasks, including brainstorming research ideas, improving language and style, conducting literature reviews, and developing research plans.
Stars: 2.8k | Forks: 243
Alvin9999/new-pac
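Circumventing the Great Firewall: unrestricted internet access, free circumvention, YouTube, fanqiang, software, VPN, one-click circumvention browsers, one-click VPS scripts/tutorials for setting up circumvention servers, free shadowsocks/ss/ssr/v2ray/goflyway accounts/nodes, circumvention "ladders", circumvention on PC, mobile, iOS, Android, Windows, Mac, Linux, and routers, YouTube video downloading, shared US-region Apple ID accounts
Stars: 54.7k | Forks: 9.3k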
ZuodaoTech/everyone-can-use-english
Everyone Can Use English
Language: TypeScript | Stars: 24.5k | Forks: 3.7k
zuiran/SpliceMix
Language:Python4
graphcore-research/unit-scaling
A library for unit scaling in PyTorch
Language:Jupyter Notebook967
flashinfer-ai/flashinfer
FlashInfer: Kernel Library for LLM Serving
Language: Cuda | Stars: 1.2k | Forks: 111
casper-hansen/AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
Language: Python | Stars: 1.7k | Forks: 202
Eddie-Wang1120/HPC-Learning-Notes
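Study notes on high-performance computing, including the notes themselves and code demos for related topics; still being improved. If you find it helpful, please give it a Star. It helps the author a lot. Thank you!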
Language:Jupyter Notebook35933
donnemartin/system-design-primer
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
Language: Python | Stars: 271k | Forks: 45.8k
mlabonne/llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Language: Jupyter Notebook | Stars: 37.7k | Forks: 4k
LC044/WeChatMsg
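Extracts WeChat chat history and exports it to HTML, Word, and Excel documents for permanent storage; analyzes the chat history to generate an annual chat report; uses the chat data to train a personal AI chat assistant.
Language: Python | Stars: 33.6k | Forks: 3.5k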
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
Language: C++ | Stars: 8.3k | Forks: 931
allegro/allRank
allRank is a framework for training learning-to-rank neural models based on PyTorch.
Language:Python857119
Tencent/NeuralNLP-NeuralClassifier
An Open-source Neural Hierarchical Multi-label Text Classification Toolkit
Language: Python | Stars: 1.8k | Forks: 402
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python | Stars: 27.7k | Forks: 4.1k
neuralmagic/deepsparse
Sparsity-aware deep learning inference runtime for CPUs
Language: Python | Stars: 3k | Forks: 173
vdumoulin/conv_arithmetic
A technical report on convolution arithmetic in the context of deep learning
Language: TeX | Stars: 14k | Forks: 2.3k
InternLM/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Language: Python | Stars: 4.3k | Forks: 390
thanhkaist/MeanShiftClustering
Mean shift clustering implemented with NumPy, scikit-learn, and GPU PyTorch
Language:Python11
TannerGilbert/Machine-Learning-Explained
Learn the theory, math and code behind different machine learning algorithms and techniques.
Language:Python6521
Zhen-Dong/Awesome-Quantization-Papers
List of papers related to neural network quantization in recent AI conferences and journals.
42637
bitsandbytes-foundation/bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
Language: Python | Stars: 6.1k | Forks: 614
ModelTC/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Language: Python | Stars: 2.4k | Forks: 194
ModelTC/awesome-lm-system
Summary of system papers, frameworks, code, and tools for training or serving large models
565
SqueezeAILab/SqueezeLLM
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
Language:Python63242
