Pinned Repositories
ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
.tmux
🇫🇷 Oh my tmux! My self-contained, pretty & versatile tmux configuration made with ❤️
6.828
Homework for https://pdos.csail.mit.edu/6.828
awesome-tensor-compilers
A list of awesome compiler projects and papers for tensor computation and deep learning.
clucene
my fork of clucene
codebase
my code base
db-readings
Readings in Databases
dmv
Greasemonkey script running in Firefox that automatically makes a behind-the-wheel test appointment
interview
ray
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
scv119's Repositories
scv119/ray
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
scv119/awesome-tensor-compilers
A list of awesome compiler projects and papers for tensor computation and deep learning.
scv119/openmlsys-zh
《Machine Learning Systems: Design and Implementation》- Chinese Version
scv119/punica
scv119/CUDA-PPT
scv119/cutlass-kernels
scv119/FasterTransformer
Transformer related optimization, including BERT, GPT
scv119/flash-attention
Fast and memory-efficient exact attention
scv119/flashinfer
FlashInfer: Kernel Library for LLM Serving
scv119/grouped_gemm
PyTorch bindings for CUTLASS grouped GEMM.
scv119/how-to-optim-algorithm-in-cuda
How to optimize some algorithms in CUDA.
scv119/learn-rust
scv119/learning-nn
scv119/learning-triton
scv119/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
scv119/Lightrails
Yet another distributed training/inferencing framework.
scv119/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
scv119/megablocks
scv119/Megatron-LM
Ongoing research training transformer models at scale
scv119/mini-redis
Incomplete Redis client and server implementation using Tokio - for learning purposes only
scv119/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
scv119/og-equity-compensation
Stock options, RSUs, taxes — read the latest edition: www.holloway.com/ec
scv119/open_llama
scv119/orbit
A Python package for Bayesian forecasting with object-oriented design and probabilistic models under the hood.
scv119/r4cppp
Rust for C++ programmers
scv119/ScaleLLM
A high-performance inference system for large language models, designed for production environments.
scv119/scv119
scv119/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
scv119/The-Art-of-Linear-Algebra
Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"
scv119/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs