model-quantization

There are 15 repositories under the model-quantization topic.

  • htqin/awesome-model-quantization

    A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research and is continuously being improved. PRs adding works (papers, repositories) that the repo has missed are welcome.

  • horseee/Awesome-Efficient-LLM

    A curated list for Efficient Large Language Models

    Language: Python
  • inferflow/inferflow

    Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).

    Language: C++
  • sayakpaul/Adventures-in-TensorFlow-Lite

    This repository contains notebooks that show the usage of TensorFlow Lite for quantizing deep neural networks.

    Language: Jupyter Notebook
  • htqin/awesome-efficient-aigc

    A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, covering both language and vision, and is continuously being improved. PRs adding works (papers, repositories) that the repo has missed are welcome.

  • RodolfoFerro/psychopathology-fer-assistant

    [WINNER! 🏆] Psychopathology FER Assistant. Because mental health matters. My project submission for #TFWorld TF 2.0 Challenge at Devpost.

    Language: Jupyter Notebook
  • datawhalechina/awesome-compression

    A beginner-friendly introductory tutorial on model compression.

    Language: Jupyter Notebook
  • htqin/BiBench

    This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binarization.

    Language: Python
  • htqin/QuantSR

    This project is the official implementation of our accepted NeurIPS 2023 (spotlight) paper QuantSR: Accurate Low-bit Quantization for Efficient Image Super-Resolution.

    Language: Python
  • seonglae/llama2gptq

    Chat with LLaMA 2 and get responses that cite reference documents retrieved from a vector database. Uses a locally available model with GPTQ 4-bit quantization.

    Language: Python
  • nbasyl/OFQ

    The official implementation of the ICML 2023 paper OFQ-ViT

    Language: Python
  • HaoranREN/TensorFlow_Model_Quantization

    A tutorial of model quantization using TensorFlow

    Language: Python
  • SRDdev/Model-Quantization

    Quantization is a technique to reduce the computational and memory costs of running inference by representing the weights and activations with low-precision data types like 8-bit integer (int8) instead of the usual 32-bit floating point (float32).

    Language: Jupyter Notebook
  • dslisleedh/NCNet-flax

    Unofficial implementation of NCNet using Flax and JAX

    Language: Python
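
The SRDdev/Model-Quantization description in this list summarizes quantization as storing weights and activations in a low-precision type such as int8 instead of float32. As a rough illustration of that idea, the sketch below applies symmetric per-tensor int8 quantization to a short list of floats. The function names (`quantize_int8`, `dequantize_int8`) and the choice of a symmetric scale are assumptions made for this example; they are not taken from any repository listed here, and real toolkits typically add zero points, per-channel scales, and calibration.

```python
# Illustrative sketch only: symmetric per-tensor int8 quantization in pure Python.
# The function names and the scaling scheme are assumptions for this example,
# not the method of any repository in the list above.

def quantize_int8(values):
    """Map floats to integer codes in [-127, 127] plus one float scale."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Guard against an all-zero (or empty) input, where any scale works.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate floats from integer codes and the scale."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# Rounding to the nearest code bounds the per-value error by half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

Each value is stored as a single signed byte plus one shared scale, which is where the memory saving relative to float32 comes from; the price is a rounding error of at most half a quantization step per value.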