Pinned Repositories
attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
auto-gptq-debug
Books-1
compute-sanitizer-samples
Samples demonstrating how to use the Compute Sanitizer Tools and Public API
FasterTransformer_llama_torch
Transformer-related optimization, including BERT, GPT
gperftools
Main gperftools repository
incubator-tvm
Open deep learning compiler stack for CPU, GPU and specialized accelerators
learn_cutlass
lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
mlc-llm
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
sleepwalker2017's Repositories
sleepwalker2017/compute-sanitizer-samples
Samples demonstrating how to use the Compute Sanitizer Tools and Public API
sleepwalker2017/attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
sleepwalker2017/auto-gptq-debug
sleepwalker2017/Books-1
sleepwalker2017/FasterTransformer_llama_torch
Transformer-related optimization, including BERT, GPT
sleepwalker2017/gperftools
Main gperftools repository
sleepwalker2017/incubator-tvm
Open deep learning compiler stack for CPU, GPU and specialized accelerators
sleepwalker2017/learn_cutlass
sleepwalker2017/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
sleepwalker2017/mlc-llm
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
sleepwalker2017/MobileNet-v2-caffe
MobileNet-v2 experimental network description for caffe
sleepwalker2017/sanitizers
AddressSanitizer, ThreadSanitizer, MemorySanitizer
sleepwalker2017/ptb_text_only
sleepwalker2017/triton
sleepwalker2017/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs