LingYeAI's Stars
DefTruth/Awesome-LLM-Inference
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, continuous batching, FlashAttention, PagedAttention, etc.
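As a toy sketch of the PagedAttention idea listed above (not any engine's actual code): the KV cache is stored in fixed-size physical blocks, and each sequence keeps a "block table" mapping its logical token positions to those blocks, so memory is allocated on demand instead of reserving one large contiguous region per sequence. All names and sizes below are illustrative.

```python
BLOCK_SIZE = 4  # tokens per physical block (real engines use e.g. 16)

class PagedKVCache:
    """Toy paged KV cache: sequences map logical positions to physical blocks."""

    def __init__(self, num_blocks):
        self.free_blocks = list(range(num_blocks))  # pool of physical block ids
        self.block_tables = {}                      # seq_id -> [block ids]
        self.seq_lens = {}                          # seq_id -> token count

    def append_token(self, seq_id):
        """Reserve cache space for one new token of a sequence."""
        table = self.block_tables.setdefault(seq_id, [])
        n = self.seq_lens.get(seq_id, 0)
        if n % BLOCK_SIZE == 0:                 # current block is full: allocate
            table.append(self.free_blocks.pop(0))
        self.seq_lens[seq_id] = n + 1

    def slot(self, seq_id, pos):
        """Physical (block id, offset) holding logical position `pos`."""
        return self.block_tables[seq_id][pos // BLOCK_SIZE], pos % BLOCK_SIZE

cache = PagedKVCache(num_blocks=8)
for _ in range(6):                  # decode 6 tokens for one sequence
    cache.append_token("seq0")
print(cache.slot("seq0", 5))        # (1, 1): second block, offset 1
```

Because blocks are allocated one at a time, many sequences with different lengths can share the pool, which is what makes continuous batching memory-efficient.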
alibaba/rtp-llm
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
intel/MTMC-Temporal-Profiler
wsdjeg/Learn-Vim_zh_cn
Learn Vim the smart way (Chinese translation)
LingYeAI/AdderNetCUDA
A CUDA implementation of AdderNet.
TouchFishPioneer/SEU-master-thesis
LaTeX template for Southeast University master's theses
danielbayevski/PE_for_Eyeriss
dldldlfma/eyeriss_v1
yzhaiustc/Optimizing-SGEMM-on-NVIDIA-Turing-GPUs
Optimizing SGEMM kernels on NVIDIA GPUs to close-to-cuBLAS performance.
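The central trick behind fast SGEMM kernels like those in the repo above is tiling: computing C in small blocks so the reused sub-tiles of A and B stay in fast memory (shared memory and registers on a GPU, caches on a CPU). A minimal pure-Python sketch of the blocked loop structure, with an illustrative tile size:

```python
def matmul_tiled(A, B, n, tile=2):
    """Multiply two n x n matrices (lists of lists) with tile x tile blocking."""
    C = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, tile):            # tile row of C
        for j0 in range(0, n, tile):        # tile column of C
            for k0 in range(0, n, tile):    # reduction dimension, one tile at a time
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = C[i][j]       # accumulate the partial dot product
                        for k in range(k0, min(k0 + tile, n)):
                            acc += A[i][k] * B[k][j]
                        C[i][j] = acc
    return C

print(matmul_tiled([[1, 2], [3, 4]], [[5, 6], [7, 8]], 2))
# [[19.0, 22.0], [43.0, 50.0]]
```

In Python the blocking changes nothing, but the same six-loop nest maps onto a CUDA kernel where the three outer loops become the grid and the inner tiles live in shared memory.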
arasi15/CNN-Accelerator-Implementation-based-on-Eyerissv2
alibaba/heterogeneity-aware-lowering-and-optimization
heterogeneity-aware-lowering-and-optimization
pConst/basic_verilog
Must-have verilog systemverilog modules
taoyilee/clacc
Deep Learning Accelerator (Convolution Neural Networks)
nvdla/hw
RTL, Cmodel, and testbench for NVDLA
michaeltinsley/awesome-binary-neural-networks
A curated list of binary neural network research papers and software packages.
hpi-xnor/BMXNet-v2-examples
Examples for BMXNet v2 (https://github.com/hpi-xnor/BMXNet-v2)
huawei-noah/Efficient-AI-Backbones
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
Lyken17/pytorch-OpCounter
Count the MACs / FLOPs of your PyTorch model.
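A back-of-the-envelope version of the count that tools like pytorch-OpCounter report, for a single (non-grouped) Conv2d layer: each output element costs C_in x K x K multiply-accumulates. The function and the example layer shape below are illustrative, not the library's API.

```python
def conv2d_macs(c_in, c_out, k, h_out, w_out):
    """MACs for a plain Conv2d producing a c_out x h_out x w_out feature map."""
    return c_out * h_out * w_out * (c_in * k * k)

# Example: a 3->64 channel, 7x7-kernel layer with a 112x112 output map
macs = conv2d_macs(3, 64, 7, 112, 112)
print(macs)  # 118013952, i.e. ~118M MACs (~0.24 GFLOPs at 2 FLOPs per MAC)
```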
huawei-noah/AdderNet
Code for the paper "AdderNet: Do We Really Need Multiplications in Deep Learning?"
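A minimal sketch of the AdderNet idea (not the repo's implementation): replace the multiply-accumulate of convolution with a negative L1 distance between input patch and filter, Y = -sum |X - F|, so the similarity measure uses only additions and subtractions. A 1D pure-Python example with illustrative names:

```python
def adder_conv1d(x, f):
    """Valid 1D 'adder convolution': y[i] = -sum_j |x[i + j] - f[j]|."""
    k = len(f)
    return [-sum(abs(x[i + j] - f[j]) for j in range(k))
            for i in range(len(x) - k + 1)]

x = [1, 2, 3, 2, 1]
f = [1, 2, 3]
print(adder_conv1d(x, f))  # [0, -3, -4]: the filter matches x exactly at i=0
```

The output is maximal (zero) where the patch equals the filter, mirroring how a cross-correlation peaks at a matching pattern while avoiding multiplications.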
NVlabs/timeloop
Timeloop performs modeling, mapping and code-generation for tensor algebra workloads on various accelerator architectures.
MingSun-Tse/Efficient-Deep-Learning
Collection of recent methods on (deep) neural network compression and acceleration.
Accelergy-Project/accelergy
Accelergy is an energy estimation infrastructure for accelerator designs.
iamhankai/ghostnet.pytorch
[CVPR2020] GhostNet: More Features from Cheap Operations
ultralytics/yolov3
YOLOv3 in PyTorch > ONNX > CoreML > TFLite
pprp/voc2007_for_yolo_torch
:punch: Prepare VOC format datasets for ultralytics/yolov3 & yolov5
HewlettPackard/cacti
An integrated cache and memory access time, cycle time, area, leakage, and dynamic power model
fengbintu/Neural-Networks-on-Silicon
This began as a collection of papers on neural network accelerators; it is now more of a curated selection of research on deep learning and computer architecture.
basicmi/AI-Chip
A list of ICs and IPs for AI, Machine Learning and Deep Learning.
vineodd/PIMSim
PIMSim is a Process-In-Memory Simulator with the compatibility of GEM5 full-system simulation.