fp8
There are 8 repositories under the fp8 topic.
NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including support for 8-bit floating point (FP8) precision on Hopper, Ada, and Blackwell GPUs, providing better performance with lower memory utilization in both training and inference.
Azure/MS-AMP
Microsoft Automatic Mixed Precision Library
intel/neural-speed
An innovative library for efficient LLM inference via low-bit quantization
aredden/flux-fp8-api
A Flux diffusion model implementation using quantized FP8 matmuls, with the remaining layers using faster half-precision accumulation; roughly 2x faster on consumer devices.
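The core idea behind FP8 matmuls is simple: pick a per-tensor scale so values fit the FP8 range, round them onto the FP8 value grid, multiply, and undo the scale. The sketch below simulates the OCP E4M3 grid in pure Python to illustrate this; it is an assumption-laden toy, not the aredden/flux-fp8-api implementation (which runs real FP8 kernels on GPU).

```python
import bisect

# Toy sketch of per-tensor scaled FP8 quantization; NOT the flux-fp8-api code.
E4M3_MAX = 448.0  # largest finite value in the OCP FP8 E4M3 format

def e4m3_magnitudes():
    """All finite non-negative E4M3 values (4 exponent, 3 mantissa bits, bias 7)."""
    vals = {0.0}
    for exp in range(16):
        for mant in range(8):
            if exp == 15 and mant == 7:   # this pattern is NaN, skip it
                continue
            if exp == 0:                   # subnormal: no implicit leading 1
                vals.add((mant / 8.0) * 2.0 ** -6)
            else:
                vals.add((1.0 + mant / 8.0) * 2.0 ** (exp - 7))
    return sorted(vals)

GRID = e4m3_magnitudes()

def quantize(x, scale):
    """Round x/scale to the nearest E4M3 magnitude, keeping the sign."""
    m = abs(x) / scale
    i = bisect.bisect_left(GRID, m)
    cands = GRID[max(i - 1, 0):i + 1] or [GRID[-1]]
    q = min(cands, key=lambda g: abs(g - m))
    return (q if x >= 0 else -q) * scale

def fp8_matmul(a, b):
    """Matmul with both operands quantized under per-tensor scales."""
    sa = max(abs(v) for row in a for v in row) / E4M3_MAX
    sb = max(abs(v) for row in b for v in row) / E4M3_MAX
    qa = [[quantize(v, sa) for v in row] for row in a]
    qb = [[quantize(v, sb) for v in row] for row in b]
    return [[sum(qa[i][k] * qb[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

a = [[0.5, -1.25], [2.0, 0.125]]
b = [[1.0, 0.5], [-0.25, 2.0]]
print(fp8_matmul(a, b))  # close to, but not exactly, the float32 product
```

The per-tensor scale is what makes the narrow FP8 range usable: the largest element maps to 448, and everything else lands proportionally on the grid, trading a small rounding error for much cheaper arithmetic.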
graphcore-research/jax-scalify
JAX Scalify: end-to-end scaled arithmetic
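Scaled arithmetic keeps a tensor as a (data, scale) pair so the data stays in a narrow numeric range (where low-bit formats like FP8 are accurate) while the scale carries the magnitude. The toy below shows the idea in plain Python; the `Scaled` class and `scaled_mul` helper are illustrative names, not the jax-scalify API.

```python
from dataclasses import dataclass

# Toy "scaled array": value = data * scale, with data kept near 1.0.
# Illustrative only; not the graphcore-research/jax-scalify API.
@dataclass
class Scaled:
    data: float   # stays in a narrow range, safe for low-bit formats
    scale: float  # power-of-two magnitude carried separately

    def value(self) -> float:
        return self.data * self.scale

def scaled_mul(a: Scaled, b: Scaled) -> Scaled:
    # Multiply the data parts; combine the scales separately, so the
    # data never has to represent the full dynamic range itself.
    return Scaled(a.data * b.data, a.scale * b.scale)

x = Scaled(1.5, 2.0 ** 20)
y = Scaled(0.75, 2.0 ** 30)
z = scaled_mul(x, y)
print(z.value())  # equals (1.5 * 2**20) * (0.75 * 2**30)
```

"End-to-end" refers to propagating these scales through a whole computation graph rather than rescaling only at layer boundaries.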
klessydra/spike-with-minifloat-fp8-support
Spike, a RISC-V ISA Simulator with added 8-bit vector floating point support
zsxkib/cog-step-video-t2v
Cog Single GPU Quantized Implementation of Step-Video-T2V
umangyadav/py_fp8
FP8 dtype enumeration in Python
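An FP8 format has only 256 bit patterns, so it can be enumerated exhaustively. The sketch below decodes every pattern of the OCP E4M3 format (1 sign, 4 exponent, 3 mantissa bits, bias 7, NaN but no infinities); the function name is illustrative and not taken from the umangyadav/py_fp8 repo.

```python
# Hypothetical sketch: enumerate all 256 bit patterns of OCP FP8 E4M3
# and decode each to a Python float. Not the py_fp8 implementation.
def decode_e4m3(bits: int) -> float:
    """Decode an 8-bit E4M3 pattern (1 sign, 4 exponent, 3 mantissa, bias 7)."""
    sign = -1.0 if bits & 0x80 else 1.0
    exp = (bits >> 3) & 0xF
    mant = bits & 0x7
    if exp == 0xF and mant == 0x7:         # E4M3 reserves this for NaN; no infinities
        return float("nan")
    if exp == 0:                            # subnormal: no implicit leading 1
        return sign * (mant / 8.0) * 2.0 ** -6
    return sign * (1.0 + mant / 8.0) * 2.0 ** (exp - 7)

values = [decode_e4m3(b) for b in range(256)]
finite = sorted(v for v in values if v == v)  # NaN != NaN drops the 2 NaN patterns
print(max(finite))  # 448.0, the largest finite E4M3 value
```

Enumerating the grid this way makes the format's trade-off concrete: 254 finite values spanning roughly ±448, dense near zero and coarse near the top of the range.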