Pinned Repositories
Abstract-Algebra
Learning math
alpa
Training and serving large-scale neural networks with auto parallelization.
awesome-database-learning
A list of learning materials for understanding database internals
Awesome-Places-for-Food-Drinks
cheetah-fastclick
FastClick with the Cheetah elements
huggingface-utils
MCU-project
VE373 final project on a microprocessor-based system
model-inference
Utilities and tests for model inference
Multi-thread_DB
An experiment with multithreading, implemented as a database
MoE-Infinity
PyTorch library for cost-effective, fast and easy serving of MoE models.
drunkcoding's Repositories
drunkcoding/huggingface-utils
drunkcoding/model-inference
Utilities and tests for model inference
drunkcoding/alpa
Training and serving large-scale neural networks with auto parallelization.
drunkcoding/Awesome-Places-for-Food-Drinks
drunkcoding/cheetah-fastclick
FastClick with the Cheetah elements
drunkcoding/core
The core library and APIs implementing the Triton Inference Server.
drunkcoding/CS411-Database-System
Project for the database systems course: an interactive website
drunkcoding/DeeperSpeed
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
drunkcoding/DLRM
drunkcoding/efficient-nlp
drunkcoding/eudyptula
Linux kernel challenge
drunkcoding/falcon
FALCON - Fast Analysis of LTE Control channels
drunkcoding/flaxformer
drunkcoding/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
drunkcoding/jiant
jiant is an NLP toolkit
drunkcoding/MIT-6.824-Distributed-System
Spring 2020
drunkcoding/mixtral-offloading
Run Mixtral-8x7B models in Colab or consumer desktops
drunkcoding/model-finetune
Fine-tune pre-trained models
drunkcoding/onnxruntime_backend
The Triton backend for the ONNX Runtime.
drunkcoding/open-moe-llm-leaderboard
drunkcoding/power-meter
A software power measurement tool for both CPU and GPU using vendor-provided APIs
drunkcoding/pytorch_backend
The Triton backend for the PyTorch TorchScript models.
drunkcoding/server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
drunkcoding/ServerlessLLM
Fast, easy and cost-efficient multi-LLM serving.
drunkcoding/simple-shell
A simple functioning shell implemented in C
drunkcoding/swap-engine
drunkcoding/time-series-forecast
drunkcoding/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
drunkcoding/transformers-utils
Transformers utilities made easy
drunkcoding/wasmint
Library for interpreting/debugging Wasm code