LingYeAI's Stars
DefTruth/Awesome-LLM-Inference
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, continuous batching, FlashAttention, PagedAttention, etc.
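As a toy sketch of the PagedAttention idea listed above (not any engine's actual code): the KV cache is stored in fixed-size physical blocks, and each sequence keeps a "block table" mapping its logical token positions to those blocks, so memory is allocated on demand instead of reserving one large contiguous region per sequence. All names and sizes below are illustrative.

```python
BLOCK_SIZE = 4  # tokens per physical block (real engines use e.g. 16)

class PagedKVCache:
    """Toy paged KV cache: sequences map logical positions to physical blocks."""

    def __init__(self, num_blocks):
        self.free_blocks = list(range(num_blocks))  # pool of physical block ids
        self.block_tables = {}                      # seq_id -> [block ids]
        self.seq_lens = {}                          # seq_id -> token count

    def append_token(self, seq_id):
        """Reserve cache space for one new token of a sequence."""
        table = self.block_tables.setdefault(seq_id, [])
        n = self.seq_lens.get(seq_id, 0)
        if n % BLOCK_SIZE == 0:                 # current block is full: allocate
            table.append(self.free_blocks.pop(0))
        self.seq_lens[seq_id] = n + 1

    def slot(self, seq_id, pos):
        """Physical (block id, offset) holding logical position `pos`."""
        return self.block_tables[seq_id][pos // BLOCK_SIZE], pos % BLOCK_SIZE

cache = PagedKVCache(num_blocks=8)
for _ in range(6):                  # decode 6 tokens for one sequence
    cache.append_token("seq0")
print(cache.slot("seq0", 5))        # (1, 1): second block, offset 1
```

Because blocks are allocated one at a time, many sequences with different lengths can share the pool, which is what makes continuous batching memory-efficient.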
alibaba/rtp-llm
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
intel/MTMC-Temporal-Profiler
wsdjeg/Learn-Vim_zh_cn
Learn Vim the smart way (Chinese translation)
LingYeAI/AdderNetCUDA
A CUDA implementation of AdderNet.
TouchFishPioneer/SEU-master-thesis
LaTeX template for Southeast University master's theses
danielbayevski/PE_for_Eyeriss
dldldlfma/eyeriss_v1
yzhaiustc/Optimizing-SGEMM-on-NVIDIA-Turing-GPUs
Optimizing SGEMM kernels on NVIDIA GPUs to close-to-cuBLAS performance.
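The central trick behind fast SGEMM kernels like those in the repo above is tiling: computing C in small blocks so the reused sub-tiles of A and B stay in fast memory (shared memory and registers on a GPU, caches on a CPU). A minimal pure-Python sketch of the blocked loop structure, with an illustrative tile size:

```python
def matmul_tiled(A, B, n, tile=2):
    """Multiply two n x n matrices (lists of lists) with tile x tile blocking."""
    C = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, tile):            # tile row of C
        for j0 in range(0, n, tile):        # tile column of C
            for k0 in range(0, n, tile):    # reduction dimension, one tile at a time
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = C[i][j]       # accumulate the partial dot product
                        for k in range(k0, min(k0 + tile, n)):
                            acc += A[i][k] * B[k][j]
                        C[i][j] = acc
    return C

print(matmul_tiled([[1, 2], [3, 4]], [[5, 6], [7, 8]], 2))
# [[19.0, 22.0], [43.0, 50.0]]
```

In Python the blocking changes nothing, but the same six-loop nest maps onto a CUDA kernel where the three outer loops become the grid and the inner tiles live in shared memory.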
arasi15/CNN-Accelerator-Implementation-based-on-Eyerissv2
alibaba/heterogeneity-aware-lowering-and-optimization
heterogeneity-aware-lowering-and-optimization
pConst/basic_verilog
Must-have verilog systemverilog modules
taoyilee/clacc
Deep Learning Accelerator (Convolution Neural Networks)
nvdla/hw
RTL, Cmodel, and testbench for NVDLA
michaeltinsley/awesome-binary-neural-networks
A curated list of binary neural network research papers and software packages.
hpi-xnor/BMXNet-v2-examples
Examples for BMXNet v2 (https://github.com/hpi-xnor/BMXNet-v2)
huawei-noah/Efficient-AI-Backbones
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
Lyken17/pytorch-OpCounter
Count the MACs / FLOPs of your PyTorch model.
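A back-of-the-envelope version of the count that tools like pytorch-OpCounter report, for a single (non-grouped) Conv2d layer: each output element costs C_in x K x K multiply-accumulates. The function and the example layer shape below are illustrative, not the library's API.

```python
def conv2d_macs(c_in, c_out, k, h_out, w_out):
    """MACs for a plain Conv2d producing a c_out x h_out x w_out feature map."""
    return c_out * h_out * w_out * (c_in * k * k)

# Example: a 3->64 channel, 7x7-kernel layer with a 112x112 output map
macs = conv2d_macs(3, 64, 7, 112, 112)
print(macs)  # 118013952, i.e. ~118M MACs (~0.24 GFLOPs at 2 FLOPs per MAC)
```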
huawei-noah/AdderNet
Code for the paper "AdderNet: Do We Really Need Multiplications in Deep Learning?"
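A minimal sketch of the AdderNet idea (not the repo's implementation): replace the multiply-accumulate of convolution with a negative L1 distance between input patch and filter, Y = -sum |X - F|, so the similarity measure uses only additions and subtractions. A 1D pure-Python example with illustrative names:

```python
def adder_conv1d(x, f):
    """Valid 1D 'adder convolution': y[i] = -sum_j |x[i + j] - f[j]|."""
    k = len(f)
    return [-sum(abs(x[i + j] - f[j]) for j in range(k))
            for i in range(len(x) - k + 1)]

x = [1, 2, 3, 2, 1]
f = [1, 2, 3]
print(adder_conv1d(x, f))  # [0, -3, -4]: the filter matches x exactly at i=0
```

The output is maximal (zero) where the patch equals the filter, mirroring how a cross-correlation peaks at a matching pattern while avoiding multiplications.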
NVlabs/timeloop
Timeloop performs modeling, mapping and code-generation for tensor algebra workloads on various accelerator architectures.
MingSun-Tse/Efficient-Deep-Learning
Collection of recent methods on (deep) neural network compression and acceleration.
Accelergy-Project/accelergy
Accelergy is an energy estimation infrastructure for accelerator designs.
iamhankai/ghostnet.pytorch
[CVPR2020] GhostNet: More Features from Cheap Operations
ultralytics/yolov3
YOLOv3 in PyTorch > ONNX > CoreML > TFLite
pprp/voc2007_for_yolo_torch
:punch: Prepare VOC format datasets for ultralytics/yolov3 & yolov5
HewlettPackard/cacti
An integrated cache and memory access time, cycle time, area, leakage, and dynamic power model
fengbintu/Neural-Networks-on-Silicon
This began as a collection of papers on neural network accelerators; it is now more of a curated selection of research on deep learning and computer architecture.
basicmi/AI-Chip
A list of ICs and IPs for AI, Machine Learning and Deep Learning.
vineodd/PIMSim
PIMSim is a Process-In-Memory Simulator with the compatibility of GEM5 full-system simulation.