Issues
Non-strict loading of the state dict
#278 opened by BenjaminBossan - 0
TypeError: _to_copy() takes from 2 to 3 positional arguments but 4 were given
#289 opened by arseniybelkov - 1
unsupported Microsoft Visual Studio version!
#288 opened by MMundane - 2
This is not allowed since there's already a kernel registered from python overriding unpack's behavior for CPU dispatch key and quanto_ext namespace.
#285 opened by arseniybelkov - 4
Moving qint4 models takes a large amount of time
#270 opened by gabe56f - 5
Use correct float8 quantization range in MaxOptimizer
#240 opened by dacorvo - 6
Verify extension behaviour in Google Colab
#206 opened by dacorvo - 0
Support for FP8 Matmuls
#275 opened by maktukmak - 7
Switch to ruff native formatter
#186 opened by dacorvo - 3
Support for new diffuser: flux1.schnell
#272 opened by KoppAlexander - 7
`qint4` failing with PixArt Transformer
#228 opened by sayakpaul - 1
Errors when applied to Lumina-Next
#269 opened by phil329 - 5
optimized kernel for quanto::dqmm not found
#203 opened by kechan - 3
Inference from a reloaded quantized OpenCLIP model (via .load_state_dict) results in IndexError
#217 opened by kechan - 3
Incompatibility with `torch.compile()`
#221 opened by sanchit-gandhi - 5
Investigate: densely pack scale+shift tensors into the weight tensors for highly quantized tensors
#266 opened by maruel - 0
When running with optimum-quanto, why isn't there a large reduction in GPU memory?
#265 opened by lonngxiang - 2
Is vLLM supported?
#220 opened by RanchiZhao - 2
Error: conv2d() received an invalid combination of arguments after quantizing the model
#256 opened by KhaoKhao - 3
fp8 leads to black images (numerical instabilities) for transformer diffusion models
#231 opened by sayakpaul - 11
Why is the quantized net slower?
#184 opened by theguardsgod - 1
Should we stop using quanto without optimum?
#215 opened by kechan - 1
Your tool seemed useless?
#233 opened by LumenScopeAI - 2
Why missing?
#224 opened by xalteropsx - 1
CUDA Kernel
#214 opened by satabios - 2
Unable to quantize a single linear layer: throws ValueError: Cannot quantize Tensor of shape torch.Size([1, 10]) along axis 0 of size 1
#192 opened by rajat-008 - 2
[Feature Request] FP6 🤗
#189 opened by NicolasMejiaPetit - 0
quanto_cuda.so: cannot open shared object file: No such file or directory
#207 opened by nuclear-missile - 8
Quantized CLIPModel inference not noticeably faster (or even slower) than non-quantized
#202 opened by kechan - 7
Got stuck when training resnet50 with QAT
#183 opened by catsled - 3
ValueError: The model is quantized with QuantizationMethod.QUANTO and is not serializable - check out the warnings from the logger on the traceback to understand the reason why the quantized model is not serializable.
#188 opened by gospacedev - 2
[Feature Request] INT16 🤗
#190 opened by duanshengliu