fp4

There are 7 repositories under the fp4 topic.

  • NVIDIA/TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including support for 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada, and Blackwell GPUs, delivering better performance with lower memory utilization in both training and inference.

    Language: Python
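To make the FP4 format concrete, here is a hedged sketch of the arithmetic behind 4-bit floating point: the 16 values an E2M1 (1 sign, 2 exponent, 1 mantissa bit) number can take, and how a scale factor maps a tensor onto that range. This is not TransformerEngine's API, only an illustration of the data type; the function names are hypothetical.

```python
# Non-negative values representable in FP4 (E2M1); the full 16-code set
# is these plus their negations.
FP4_POS = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
FP4_VALUES = sorted(set(FP4_POS + [-v for v in FP4_POS]))

def quantize_fp4(x, scale=1.0):
    """Round x/scale to the nearest representable FP4 value, then rescale."""
    return min(FP4_VALUES, key=lambda v: abs(v - x / scale)) * scale

def quantize_tensor(xs):
    """Per-tensor scaling: map the absolute max onto FP4's largest magnitude, 6."""
    scale = (max(abs(x) for x in xs) or 1.0) / 6.0
    return [quantize_fp4(x, scale) for x in xs], scale
```

With only 16 codes, rounding error is large unless the scale is chosen well, which is why FP4 schemes in practice pair the narrow element type with carefully managed scale factors.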
  • intel/neural-compressor

    SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

    Language: Python
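The MXFP4 format named in the description above adds block scaling on top of FP4. A hedged sketch of the scheme, under the assumption of MX-style blocks sharing one power-of-two scale (this is not neural-compressor's API, just an illustration of the data layout):

```python
import math

# FP4 (E2M1) value set, as in any FP4-based format.
FP4_POS = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
FP4_VALUES = sorted(set(FP4_POS + [-v for v in FP4_POS]))

def mxfp4_quantize_block(block):
    """Quantize one block: a shared power-of-two scale + per-element FP4 codes.

    Real MX blocks hold 32 elements; any length works for illustration.
    """
    amax = max(abs(x) for x in block)
    # Choose 2**e so the block's largest magnitude lands in FP4's top
    # binade (E2M1's largest binade is 2**2, holding 4.0 and 6.0).
    scale = 2.0 ** (math.floor(math.log2(amax)) - 2) if amax > 0 else 1.0
    return scale, [min(FP4_VALUES, key=lambda v: abs(v - x / scale))
                   for x in block]

def mxfp4_dequantize_block(scale, codes):
    """Recover approximate values from a scale and its FP4 codes."""
    return [scale * c for c in codes]
```

Sharing one scale per small block, rather than per tensor, keeps the quantization error local: an outlier only degrades its own block.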
  • intel/neural-speed

    An innovative library for efficient LLM inference via low-bit quantization

    Language: C++
  • Tencent/AngelSlim

    Model compression toolkit engineered for enhanced usability, comprehensiveness, and efficiency.

    Language: Python
  • MurrellGroup/Microfloats.jl

    Narrow precision floating point types

    Language: Julia
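A narrow-precision type library like the one above must decode bit patterns into real values. As a hedged illustration (in Python, not Microfloats.jl's Julia API), here is a generic IEEE-style minifloat decoder parameterized by exponent and mantissa width; Inf/NaN encodings are ignored for brevity, and the function name is hypothetical:

```python
def decode_minifloat(bits, exp_bits, man_bits):
    """Decode an unsigned integer bit pattern as a sign/exponent/mantissa float.

    Uses the standard IEEE-style bias 2**(exp_bits - 1) - 1 and subnormal
    handling; does not model Inf/NaN encodings.
    """
    bias = (1 << (exp_bits - 1)) - 1
    sign = -1.0 if (bits >> (exp_bits + man_bits)) & 1 else 1.0
    e = (bits >> man_bits) & ((1 << exp_bits) - 1)
    m = bits & ((1 << man_bits) - 1)
    if e == 0:  # subnormal: no implicit leading 1, fixed exponent 1 - bias
        return sign * m * 2.0 ** (1 - bias - man_bits)
    return sign * (1 + m / (1 << man_bits)) * 2.0 ** (e - bias)
```

For example, decoding FP4 (E2M1) patterns: `decode_minifloat(0b0111, 2, 1)` gives 6.0 (the format's maximum) and `decode_minifloat(0b0001, 2, 1)` gives the subnormal 0.5.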
  • mukullokhande99/XR-NPE

    Python implementations of multi-precision quantization for computer vision and sensor fusion workloads, targeting the XR-NPE Mixed-Precision SIMD Neural Processing Engine. Includes visual-inertial odometry (VIO), object classification, and eye-gaze extraction code in FP4, FP8, Posit4, Posit8, and BF16 formats.

    Language: Jupyter Notebook