/Quantization-Fundamentals-with-Hugging-Face

Learn linear quantization techniques using the Quanto library and downcasting methods with the Transformers library to compress and optimize generative AI models effectively.
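
For readers new to the term, linear quantization can be sketched as follows. This is a minimal, dependency-free illustration and does not use the Quanto or Transformers APIs; the function names `linear_quantize` and `linear_dequantize` are hypothetical, and the asymmetric scale and zero-point scheme shown here is one common formulation, assumed for illustration.

```python
def linear_quantize(values, bits=8):
    """Map floats to signed integers with a scale and zero point (illustrative sketch)."""
    q_min, q_max = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    r_min, r_max = min(values), max(values)

    # Scale maps the real-valued range onto the integer range.
    # Fall back to 1.0 when all values are identical to avoid dividing by zero.
    scale = (r_max - r_min) / (q_max - q_min) if r_max > r_min else 1.0

    # Zero point is the integer that represents the real value r_min at q_min.
    zero_point = int(round(q_min - r_min / scale))
    zero_point = max(q_min, min(q_max, zero_point))

    # Quantize: scale, shift, round, then clamp to the representable range.
    quantized = [
        max(q_min, min(q_max, int(round(v / scale + zero_point))))
        for v in values
    ]
    return quantized, scale, zero_point


def linear_dequantize(quantized, scale, zero_point):
    """Recover approximate floats from the quantized integers."""
    return [scale * (q - zero_point) for q in quantized]


if __name__ == "__main__":
    weights = [-1.0, 0.0, 0.5, 2.0]
    q, scale, zp = linear_quantize(weights)
    print("quantized:", q)
    print("restored: ", linear_dequantize(q, scale, zp))
```

The restored values differ from the originals by a small rounding error, which is the accuracy cost traded for storing each value in 8 bits instead of 32.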

Primary Language: Jupyter Notebook
