SqueezeAILab
SqueezeAILab is part of the Berkeley AI Research (BAIR) Lab at UC Berkeley, focused on AI systems research.
Pinned Repositories
KVQuant
[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
LLM2LLM
[ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement
LLMCompiler
[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling
open_source_projects
Open Source Projects from Pallas Lab
SqueezeLLM
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
TinyAgent
[EMNLP 2024 Demo] TinyAgent: Function Calling at the Edge!
Tool2Vec
Efficient and Scalable Estimation of Tool Representations in Vector Space