Pipeline for quantizing and compressing LLMs to optimize them for deployment.
VectorInstitute/vector-llm-compressor
Python, Apache-2.0