Pinned Repositories
LeshengJin's Repositories
LeshengJin/chocopy-wasm-compiler-B
LeshengJin/FastChat
The release repo for "Vicuna: An Open Chatbot Impressing GPT-4"
LeshengJin/mlc-llm
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
LeshengJin/mlc-relax
LeshengJin/models
Models and examples built with TensorFlow
LeshengJin/relax
Temporary repo for prototyping Relax (Relay Next); the effort will be upstreamed. We use the wiki pages on this repo to host design docs.
LeshengJin/tvm
Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators
LeshengJin/CTranslate2
Fast inference engine for Transformer models
LeshengJin/faster-whisper
Faster Whisper transcription with CTranslate2
LeshengJin/flashinfer
FlashInfer: Kernel Library for LLM Serving
LeshengJin/libflash_attn
Standalone Flash Attention v2 kernel without libtorch dependency
LeshengJin/llm-perf-bench
LeshengJin/rocm_test
LeshengJin/sglang
SGLang is a fast serving framework for large language models and vision language models.
LeshengJin/Teradata-Smartix
LeshengJin/web-llm
Bringing large-language models and chat to web browsers. Everything runs inside the browser with no server support.
LeshengJin/whisper-jax
JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
LeshengJin/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)