Pinned Repositories
efficient-ai-study
vllm-fork
A high-throughput and memory-efficient inference and serving engine for LLMs
vits
VITS implementation of Japanese, Chinese, Korean, Sanskrit and Thai
.github
mlperf_inference_results_v4.0
owlite
OwLite is a low-code compression toolkit for AI models.
owlite-examples
The OwLite Examples repository offers illustrative example code to help users seamlessly compress PyTorch deep learning models and convert them into TensorRT engines.
QUICK
QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference