Pinned Repositories
a100_workshop
Abacus
Awesome-DL-Scheduling-Papers
booksim
casio
cocktail
Cocktail: A Multidimensional Optimization for Model Serving in Cloud (NSDI'22)
MArk-Project
Exploiting Cloud Services for Cost-Effective, SLO-Aware Machine Learning Inference Serving
ML-Accelerators
Topics in Machine Learning Accelerator Design
ray
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
synergy
wyhmhs's Repositories
wyhmhs/synergy
wyhmhs/booksim
wyhmhs/casio
wyhmhs/ML-Accelerators
Topics in Machine Learning Accelerator Design
wyhmhs/confidential-computing-zoo
Confidential Computing Zoo provides confidential computing solutions based on Intel SGX, TDX, HEXL, and related technologies.
wyhmhs/EarlyRobust
wyhmhs/FlameGraph
Stack trace visualizer
wyhmhs/hack-SysML
The road to hack SysML and become an system expert
wyhmhs/igniter
iGniter, an interference-aware GPU resource provisioning framework for achieving predictable performance of DNN inference in the cloud.
wyhmhs/llama.cpp
LLM inference in C/C++
wyhmhs/llama3-from-scratch
A llama3 implementation, one matrix multiplication at a time
wyhmhs/LLMPerf-for-TiledArch
Analytical Performance Model for Tiled Accelerators/Dies in Spatial Architecture Running Large Language Models (LLMs)
wyhmhs/LoRA-ViT
Low rank adaptation for Vision Transformer
wyhmhs/megablocks
wyhmhs/mixture-of-experts
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
wyhmhs/Muri
Artifacts for our SIGCOMM'22 paper Muri
wyhmhs/NeuPIMs
NeuPIMs Simulator
wyhmhs/open-gpu-kernel-modules
NVIDIA Linux open GPU kernel module source
wyhmhs/PaLM-rlhf-pytorch
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT, but with PaLM.
wyhmhs/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
wyhmhs/PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
wyhmhs/Prompt-Engineering-Guide
🐙 Guides, papers, lectures, notebooks, and resources for prompt engineering
wyhmhs/Proteus
Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling
wyhmhs/rafiki
Rafiki is a distributed system that supports training and deployment of machine learning models using AutoML, built with ease-of-use in mind.
wyhmhs/RobustSSL_Benchmark
Benchmark of robust self-supervised learning (RobustSSL) methods & Code for AutoLoRa (ICLR 2024).
wyhmhs/serve
Serve, optimize and scale PyTorch models in production
wyhmhs/TEESlice-artifact
wyhmhs/tutel
Tutel MoE: An Optimized Mixture-of-Experts Implementation
wyhmhs/vmoe
wyhmhs/wyhmhs.github.io
Personal Website