ziyang-arch

University of California, Riverside

ziyang-arch's Stars

meta-llama/llama
Inference code for Llama models
Language: Python 56.8k 526 1k 9.6k
lm-sys/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Language: Python 37.3k 352 1.8k 4.6k
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Language: Python 36.2k 215 5.5k 4.5k
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Language: Python 35.9k 346 2.9k 4.2k
pytorch/examples
A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.
Language: Python 22.5k 399 6419.6k
pyg-team/pytorch_geometric
Graph Neural Network Library for PyTorch
Language: Python 21.6k 254 3.6k 3.7k
triton-lang/triton
Development repository for the Triton language and compiler
Language: C++ 13.7k 195 1.5k 1.7k
AI4Finance-Foundation/FinRL
FinRL: Financial Reinforcement Learning. 🔥
Language: Jupyter Notebook 10.3k 207 7232.5k
OpenMathLib/OpenBLAS
OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
Language: C 6.5k 203 2.3k 1.5k
ROCm/HIP
HIP: C++ Heterogeneous-Compute Interface for Portability
Language: C++ 3.8k 142 881540
NVIDIA/nccl
Optimized primitives for collective multi-GPU communication
Language: C++ 3.3k 154 1.3k 836
juncongmoo/pyllama
LLaMA: Open and Efficient Foundation Language Models
Language: Python 2.8k 35 93309
HuaizhengZhang/Awesome-System-for-Machine-Learning
A curated list of research in machine learning systems (MLSys). Paper notes are also provided.
2.6k 124 30301
NiuTrans/ABigSurvey
A collection of 1000+ survey papers on Natural Language Processing (NLP) and Machine Learning (ML).
2k 111 6240
NVIDIA-developer-blog/code-samples
Source code examples from the Parallel Forall Blog
Language: HTML 1.2k 115 25634
pytorch/torchdynamo
A Python-level JIT compiler designed to make unmodified PyTorch programs faster.
Language: Python 1k 45 567124
matrix-profile-foundation/matrixprofile
A Python 3 library making time series data mining tasks, utilizing matrix profile algorithms, accessible to everyone.
Language: Python 363 18 6462
microsoft/msccl
Microsoft Collective Communication Library
Language: C++ 325 12 2830
Zilize/DrawCV
Awesome CV template based on Draw.io. Resume template drawn with Draw.io.
313 1 0 26
KernelTuner/kernel_tuner
Kernel Tuner
Language: Python 294 10 10750
yzhaiustc/Optimizing-SGEMM-on-NVIDIA-Turing-GPUs
Optimizing SGEMM kernel functions on NVIDIA GPUs to a close-to-cuBLAS performance.
Language: Cuda 289 7 745
astra-sim/astra-sim
ASTRA-sim2.0: Modeling Hierarchical Networks and Disaggregated Systems for Large-model Training at Scale
Language: C++ 286 14 117119
ROCm/rccl
ROCm Communication Collectives Library (RCCL)
Language: C++ 280 35 103124
msr-fiddle/philly-traces
Language: Jupyter Notebook 182 6 531
tukl-msd/DRAMPower
Fast and accurate DRAM power and energy estimation tool
Language: C++ 137 14 5647
yzhaiustc/Optimizing-DGEMM-on-Intel-CPUs-with-AVX512F
Stepwise optimizations of DGEMM on CPU, reaching performance faster than Intel MKL eventually, even under multithreading.
Language: C 116 5 123
microsoft/msccl-tools
Synthesizer for optimal collective communication algorithms
Language: Python 99 9 2124
parasailteam/coconet
Language: HTML 73 3 911
AMDResearch/DAGEE
Directed Acyclic Graph Execution Engine (DAGEE) is a C++ library that enables programmers to express computation and data movement, as task graphs that are scheduled concurrently and asynchronously on both CPUs and GPUs.
Language: C++ 44 6 38
ziyang-arch/Hybrid-Cooling-For-Data-Center
Language: MATLAB 3 0 0 0
