mxformat

There are 2 repositories under the mxformat topic.

  • intel/neural-compressor

    SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

    Language: Python

  • intel/neural-speed

    An innovative library for efficient LLM inference via low-bit quantization

    Language: C++