Pinned Repositories
diffusers
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
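A minimal sketch of how a diffusers pipeline is typically loaded and run; the model id, prompt, and output path below are assumed examples, not taken from this listing, and a CUDA GPU is assumed.

```python
import torch
from diffusers import DiffusionPipeline

# Load a pretrained text-to-image pipeline (model id is an assumed example).
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")  # assumes a CUDA-capable GPU

# Generate one image from a text prompt and save it.
image = pipe("an astronaut riding a horse on the moon").images[0]
image.save("astronaut.png")
```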
auto-round
SOTA weight-only quantization algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs".
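A hedged sketch of weight-only quantization with auto-round, following the usage pattern in its README; the example model, argument names (bits, group_size, sym), and output directory are assumptions if the installed release differs.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound  # import path assumed from the project README

model_name = "facebook/opt-125m"  # small example model, an assumption
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Configure 4-bit weight-only rounding tuned via signed gradient descent.
autoround = AutoRound(model, tokenizer, bits=4, group_size=128, sym=True)
autoround.quantize()
autoround.save_quantized("./opt-125m-int4")  # output directory is an assumption
```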
models
A collection of pre-trained, state-of-the-art models in the ONNX format
neural-compressor
onnxruntime
ONNX Runtime: cross-platform, high-performance ML inferencing and training accelerator
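A minimal sketch of running inference with ONNX Runtime; the model path and input shape are placeholders.

```python
import numpy as np
import onnxruntime as ort

# Open a session on an exported model (path and input shape are assumptions).
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

input_name = session.get_inputs()[0].name
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)

# Run the graph; passing None as the output list returns all model outputs.
outputs = session.run(None, {input_name: dummy})
print(outputs[0].shape)
```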
onnxruntime-inference-examples
Examples for using ONNX Runtime for machine learning inferencing.
optimum-habana
Easy and lightning-fast training of 🤗 Transformers on Habana Gaudi processors (HPU)
optimum-intel
🤗 Optimum Intel: Accelerate inference with Intel optimization tools
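A sketch of one optimum-intel path, exporting a 🤗 Transformers checkpoint to OpenVINO for inference; the model id is an assumed example and OVModelForCausalLM is only one of the integrations the library provides.

```python
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

model_id = "gpt2"  # example checkpoint, an assumption
# export=True converts the PyTorch checkpoint to OpenVINO IR on the fly.
model = OVModelForCausalLM.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("The Intel toolchain can", return_tensors="pt")
generated = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```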
onnx
Open standard for machine learning interoperability
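A small sketch using the onnx package to load and validate a serialized model; the file path is a placeholder.

```python
import onnx

# Load a serialized model and check it against the ONNX spec.
model = onnx.load("model.onnx")  # placeholder path
onnx.checker.check_model(model)

# Inspect the graph's declared inputs and outputs.
for tensor in list(model.graph.input) + list(model.graph.output):
    print(tensor.name)
```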
mengniwang95's Repositories
mengniwang95/auto-round
SOTA weight-only quantization algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs".
mengniwang95/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
mengniwang95/models
A collection of pre-trained, state-of-the-art models in the ONNX format
mengniwang95/neural-compressor
mengniwang95/onnxruntime
ONNX Runtime: cross-platform, high-performance ML inferencing and training accelerator
mengniwang95/onnxruntime-inference-examples
Examples for using ONNX Runtime for machine learning inferencing.
mengniwang95/optimum-habana
Easy and lightning-fast training of 🤗 Transformers on Habana Gaudi processors (HPU)
mengniwang95/optimum-intel
🤗 Optimum Intel: Accelerate inference with Intel optimization tools