model-quantization
There are 15 repositories under the model-quantization topic.
htqin/awesome-model-quantization
A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research and is continuously improved. PRs adding works (papers, repositories) that the repo has missed are welcome.
horseee/Awesome-Efficient-LLM
A curated list for Efficient Large Language Models
inferflow/inferflow
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
sayakpaul/Adventures-in-TensorFlow-Lite
This repository contains notebooks that show the usage of TensorFlow Lite for quantizing deep neural networks.
htqin/awesome-efficient-aigc
A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, covering both language and vision, and is continuously improved. PRs adding works (papers, repositories) that the repo has missed are welcome.
RodolfoFerro/psychopathology-fer-assistant
[WINNER! 🏆] Psychopathology FER Assistant. Because mental health matters. My project submission for #TFWorld TF 2.0 Challenge at Devpost.
datawhalechina/awesome-compression
An introductory tutorial on model compression for complete beginners
htqin/BiBench
This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binarization.
htqin/QuantSR
This project is the official implementation of our accepted NeurIPS 2023 (spotlight) paper QuantSR: Accurate Low-bit Quantization for Efficient Image Super-Resolution.
seonglae/llama2gptq
Chat with LLaMa 2, with responses that also cite reference documents retrieved from a vector database. Runs a locally available model using GPTQ 4-bit quantization.
nbasyl/OFQ
The official implementation of the ICML 2023 paper OFQ-ViT
HaoranREN/TensorFlow_Model_Quantization
A tutorial on model quantization using TensorFlow
SRDdev/Model-Quantization
Quantization is a technique to reduce the computational and memory costs of running inference by representing the weights and activations with low-precision data types like 8-bit integer (int8) instead of the usual 32-bit floating point (float32).
dslisleedh/NCNet-flax
Unofficial implementation of NCNet using flax and jax
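
The SRDdev/Model-Quantization description above characterizes quantization as representing float32 weights and activations with a low-precision type such as int8. As a rough illustration only, here is a minimal pure-Python sketch of one common scheme, symmetric per-tensor quantization. The function names `quantize_int8` and `dequantize_int8` and the sample values are hypothetical and are not taken from any repository listed here.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integer codes in [-127, 127]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # The scale maps the largest magnitude to 127; use 1.0 for empty or all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the int8 range used here.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical example weights, chosen only for illustration.
weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the precision traded for the smaller storage and cheaper integer arithmetic. Real toolkits typically add per-channel scales, zero points for asymmetric ranges, and calibration data, none of which this sketch attempts.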