Jzz24's Stars
ggerganov/llama.cpp
LLM inference in C/C++
halfrost/LeetCode-Go
✅ Solutions to LeetCode in Go, with 100% test coverage and runtimes that beat 100% of submissions / LeetCode solutions
NVIDIA/TensorRT
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
Megvii-BaseDetection/YOLOX
YOLOX is a high-performance anchor-free YOLO detector that exceeds YOLOv3 through YOLOv5, with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO support. Documentation: https://yolox.readthedocs.io/
THUDM/GLM-130B
GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
harvardnlp/annotated-transformer
An annotated implementation of the Transformer paper.
TimDettmers/bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
kamyu104/LeetCode-Solutions
🏋️ Python / Modern C++ Solutions of All 3415 LeetCode Problems (Weekly Update)
open-mmlab/mmaction2
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
PINTO0309/PINTO_model_zoo
A repository for storing models that have been inter-converted between various frameworks. Supported frameworks are TensorFlow, PyTorch, ONNX, OpenVINO, TFJS, TFTRT, TensorFlowLite (Float32/16/INT8), EdgeTPU, CoreML.
qwopqwop200/GPTQ-for-LLaMa
4-bit quantization of LLaMA using GPTQ
intel/neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
D-X-Y/Awesome-AutoDL
Automated Deep Learning: Neural Architecture Search Is Not the End (a curated list of AutoDL resources and an in-depth analysis)
IST-DASLab/gptq
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
htqin/awesome-model-quantization
A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research and is continuously being improved. PRs adding works (papers, repositories) missing from the repo are welcome.
SwinTransformer/Swin-Transformer-Object-Detection
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Object Detection and Instance Segmentation.
openppl-public/ppq
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
openppl-public/ppl.nn
A primitive library for neural networks
ModelTC/MQBench
Model Quantization Benchmark
Jermmy/pytorch-quantization-demo
A simple network quantization demo implemented from scratch in PyTorch.
openppl-public/ppl.cv
ppl.cv is a high-performance image processing library from OpenPPL that supports various platforms.
megvii-research/FQ-ViT
[IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer
yhhhli/BRECQ
PyTorch implementation of BRECQ (ICLR 2021)
AI-performance/embedded-ai.bench
Benchmark for embedded-AI deep learning inference engines, such as NCNN, TNN, MNN, and TensorFlow Lite.
ucbrise/actnn
ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training
openppl-public/CuAssembler
An unofficial CUDA assembler, for all generations of SASS, hopefully :)
sony-si/ai-research
gilshm/sparq
Post-training sparsity-aware quantization
openppl-public/ppl.common
Common libraries for PPL projects
openppl-public/ppq_tools