I'm building a GPTQ quantizer on MPS as a side project to better understand both LLM quantization and Metal programming. The end goal is to quantize a small model on an M3 MacBook and then train it there too.
This is a work in progress and will stay that way for some time, so expect lots of commented-out code, empty files, and whatnot.