jason-huh's Stars
huawei-noah/Efficient-NLP
FLHonker/Awesome-Knowledge-Distillation
Awesome Knowledge-Distillation. Knowledge distillation papers (2014–2021), organized by category.
bytedance/effective_transformer
Running BERT without Padding
triton-inference-server/server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
cs217/cs217.github.io
Course Webpage for CS 217 Hardware Accelerators for Machine Learning, Stanford University
apache/tvm
Open deep learning compiler stack for CPU, GPU, and specialized accelerators
gilshm/sparq
Post-training sparsity-aware quantization
NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
t-vi/pytorch-tvmisc
Totally Versatile Miscellanea for Pytorch
microsoft/onnxruntime
ONNX Runtime: cross-platform, high-performance ML inferencing and training accelerator
intel/neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
kssteven418/I-BERT
[ICML'21 Oral] I-BERT: Integer-only BERT Quantization
kingoflolz/mesh-transformer-jax
Model parallel transformers in JAX and Haiku
amirgholami/ZeroQ
[CVPR'20] ZeroQ: A Novel Zero Shot Quantization Framework
Efficient-ML/Awesome-Model-Quantization
A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research and is continuously improving. PRs for works (papers, repositories) missed by the repo are welcome.
karpathy/minGPT
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
parachutel/cs224n-stanford-winter2021
Stanford Winter 2021
leehanchung/cs224n
Stanford CS224n: Natural Language Processing with Deep Learning, Winter 2020
rishikksh20/LightSpeech
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search
jik876/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
ming024/FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
chester256/Model-Compression-Papers
Papers for deep neural network compression and acceleration
thunlp/PromptPapers
Must-read papers on prompt-based tuning for pre-trained language models.
facebookresearch/LAMA
LAnguage Model Analysis
hiaoxui/soft-prompts
Yeachan-Heo/HSC2021-AlphaSolar
fastai/course-nlp
A Code-First Introduction to NLP course
deeppavlov/DeepPavlov
An open source library for deep learning end-to-end dialog systems and chatbots.
google-research/bert
TensorFlow code and pre-trained models for BERT
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.