mixed-precision
There are 35 repositories under the mixed-precision topic.
stochasticai/xTuring
Build, customize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our Discord community: https://discord.gg/TgHXuSJEk6
NVIDIA/OpenSeq2Seq
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
Azure/MS-AMP
Microsoft Automatic Mixed Precision Library
Zhen-Dong/HAWQ
Quantization library for PyTorch. Supports low-precision and mixed-precision quantization, with hardware implementation through TVM.
mit-han-lab/haq
[CVPR 2019, Oral] HAQ: Hardware-Aware Automated Quantization with Mixed Precision
hellojialee/Improved-Body-Parts
Simple Pose: Rethinking and Improving a Bottom-up Approach for Multi-Person Pose Estimation
moritztng/prism
High Resolution Style Transfer in PyTorch with Color Control and Mixed Precision :art:
suvojit-0x55aa/mixed-precision-pytorch
Training with FP16 weights in PyTorch
verificarlo/verificarlo
A tool for debugging and assessing floating point precision and reproducibility.
rickiepark/deep-learning-with-python-2nd
Code repository for the book <케라스 창시자에게 배우는 딥러닝 2판> ("Deep Learning from the Creator of Keras, 2nd Edition").
tanyuqian/redco
NAACL '24 (Best Demo Paper Runner-Up) / MLSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference
HuiResearch/tfbert
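Pre-trained model loading based on TensorFlow 1.x, with support for single-machine multi-GPU training, gradient accumulation, XLA acceleration, and mixed precision. Allows flexible training, validation, and prediction.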
andreped/GradientAccumulator
:dart: Accumulated Gradients for TensorFlow 2
enp1s0/ozIMMU
FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme
Zhen-Dong/BitPack
BitPack is a practical tool to efficiently save ultra-low precision/mixed-precision quantized models.
thu-nics/ViDiT-Q
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
EEESlab/CMix-NN
CMix-NN: Mixed Low-Precision CNN Library for Memory-Constrained Edge Devices
qleenju/PDPU
PDPU: An Open-Source Posit Dot-Product Unit for Deep Learning Applications
thu-nics/MixDQ
[ECCV24] MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization
wu-kan/HPL-AI
An implementation of HPL-AI Mixed-Precision Benchmark based on hpl-2.3
tlkh/pycon-sg19-tensorflow-tutorial
PyCon SG 2019 Tutorial: Optimizing TensorFlow Performance
Andras7/gpt2-pytorch
Extremely simple and understandable GPT2 implementation with minor tweaks
at-aaims/OpenMxP
This is the open source version of HPL-MXP. The code performance has been verified on Frontier
kentaroy47/pytorch-cifar10-fp16
Let's train CIFAR-10 in PyTorch with half precision!
sayakpaul/Mixed-Precision-Training-in-tf.keras-2.0
This repository contains notebooks showing how to perform mixed precision training in tf.keras 2.0
enp1s0/cuMpSGEMM
Fast SGEMM emulation on Tensor Cores
lnugraha/CG-Mixed-Precision
Hybrid-Precision Analysis on CG Solver (H.A.C.S). Merging single and double precision to generate a fast yet accurate CG solver
mfuntowicz/RNet
PyTorch RNet implementation with Distributed and Mixed-Precision training support.
AkashSDas/cassava-leaf-disease-classification
Deep learning solution for Cassava Leaf Disease Classification, a Kaggle Research Code Competition, using TensorFlow.
zjykzj/YOLOv1
[CVPR 2016] You Only Look Once: Unified, Real-Time Object Detection
Ahmad-Shawahna/FxP-QNet
A Post-Training Quantizer for the Design of Mixed Low-Precision DNNs with Dynamic Fixed-Point Representation for Efficient Hardware Acceleration on Edge Devices
hinofafa/torch_accelerator
Experiments in accelerating PyTorch training on GPU devices
Behradsadeghi/AlzMRI-Net
AlzMRI-Net: Classify Alzheimer's stages from MRI scans.
Behradsadeghi/flower-classification-efficientnet
Classifying images of flowers into 17 categories using EfficientNet-B0 and PyTorch.
Behradsadeghi/PotholeSegmentation
Pothole image segmentation using YOLOv9.