Alexyskoutnev/TurboInference
Welcome to TurboInference, a high-performance inference toolkit written in C++ for fast, efficient deployment of large language models (LLMs). This GitHub repository provides a set of tools and utilities designed to make inference fast and resource-efficient.
License: MIT