sunggg

OctoML

Pinned Repositories

EAGLE
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty
Language: Python
leaderboard-backend
Open sourced backend for Martian's LLM Inference Provider Leaderboard
Language: Python
mlc-ai-package
Language: Shell
mlc-dev
Language: Python
mlc-llm
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
Language: Python
octoml_relax
A fork of tlc-pack/relax
Language: Python
One-Shot-Learning-with-Siamese-Networks
Implementation of One Shot Learning using Convolutional Siamese Networks on Omniglot Dataset
Language: Jupyter Notebook
relax
temp repo for prototyping, the effort will be upstreamed
Language: Python
SRTuner
SRTuner is a python library that provides efficient auto-tuning building blocks.
Language: Python
tvm
Open deep learning compiler stack for cpu, gpu and specialized accelerators
Language: Python

sunggg/SRTuner
SRTuner is a python library that provides efficient auto-tuning building blocks.
Language: Python
sunggg/EAGLE
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty
Language: Python
sunggg/leaderboard-backend
Open sourced backend for Martian's LLM Inference Provider Leaderboard
Language: Python
sunggg/mlc-ai-package
Language: Shell
sunggg/mlc-dev
Language: Python
sunggg/mlc-llm
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
Language: Python
sunggg/octoml_relax
A fork of tlc-pack/relax
Language: Python
sunggg/One-Shot-Learning-with-Siamese-Networks
Implementation of One Shot Learning using Convolutional Siamese Networks on Omniglot Dataset
Language: Jupyter Notebook
sunggg/relax
temp repo for prototyping, the effort will be upstreamed
Language: Python
sunggg/tvm
Open deep learning compiler stack for cpu, gpu and specialized accelerators
Language: Python
sunggg/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python
sunggg/web-llm
Bringing large-language models and chat to web browsers. Everything runs inside the browser with no server support.
Language: Python