chouxianyu's Stars
Vahe1994/SpQR
NVIDIA/TensorRT
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
NVIDIA/TensorRT-Model-Optimizer
TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, sparsity, distillation, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.
DefTruth/Awesome-LLM-Inference
📖 A curated list of Awesome LLM Inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
liuyubing233/zhihu-custom
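Zhihu modifier, currently for Tampermonkey. Main features: custom hiding of page modules; filtering of list and answer content; browsing history; caching of recommended-page content; strong filtering by list type and keyword that automatically calls the "Not interested" API; blocking answers from specific users; downloading answer videos; sorting answers by upvote and comment counts; automatically collapsing all long answers or expanding all answers; removing login prompt popups; filtering answers from official Zhihu accounts such as 故事档案局 and 盐选科普; manual font size adjustment; theme switching and night mode tuning; hiding Zhihu trending searches for a clean search experience; adding type tags to lists; removing ads; setting how purchase links are displayed; exporting favorites to PDF; one-click removal of all blocking options; opening external links directly; more features are available in the plugin...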
ahmetbersoz/chatgpt-prompts-for-academic-writing
This list of writing prompts covers a range of topics and tasks, including brainstorming research ideas, improving language and style, conducting literature reviews, and developing research plans.
Alvin9999/new-pac
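Bypassing the firewall: unrestricted internet access, free internet access, free firewall circumvention, YouTube, fanqiang, software, VPN, one-click circumvention browsers, scripts/tutorials for one-click setup of a circumvention server on a VPS, free shadowsocks/ss/ssr/v2ray/goflyway accounts/nodes, circumvention tools for PC, mobile, iOS, Android, Windows, Mac, Linux, and routers, YouTube video downloads, shared US-region Apple ID accounts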
ZuodaoTech/everyone-can-use-english
Everyone Can Use English
zuiran/SpliceMix
graphcore-research/unit-scaling
A library for unit scaling in PyTorch
flashinfer-ai/flashinfer
FlashInfer: Kernel Library for LLM Serving
casper-hansen/AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
Eddie-Wang1120/HPC-Learning-Notes
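Study notes on high-performance computing, including notes and code demos for related topics; still being improved. If it helps, please give it a Star; it means a lot to the author. Thanks!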
donnemartin/system-design-primer
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
mlabonne/llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
LC044/WeChatMsg
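Extract WeChat chat history and export it to HTML, Word, or Excel documents for permanent storage; analyze the chat history to generate an annual chat report; use the chat data to train a personal AI chat assistant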
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
allegro/allRank
allRank is a framework for training learning-to-rank neural models based on PyTorch.
Tencent/NeuralNLP-NeuralClassifier
An Open-source Neural Hierarchical Multi-label Text Classification Toolkit
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
neuralmagic/deepsparse
Sparsity-aware deep learning inference runtime for CPUs
vdumoulin/conv_arithmetic
A technical report on convolution arithmetic in the context of deep learning
InternLM/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
thanhkaist/MeanShiftClustering
Mean shift clustering implemented with NumPy, sklearn, and GPU PyTorch
TannerGilbert/Machine-Learning-Explained
Learn the theory, math and code behind different machine learning algorithms and techniques.
Zhen-Dong/Awesome-Quantization-Papers
List of papers related to neural network quantization in recent AI conferences and journals.
bitsandbytes-foundation/bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
ModelTC/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
ModelTC/awesome-lm-system
Summary of system papers/frameworks/code/tools on training or serving large models
SqueezeAILab/SqueezeLLM
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization