wuchangping's Stars
CompVis/stable-diffusion
A latent text-to-image diffusion model
google/jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
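A minimal sketch of the three transformations named in that description (grad, jit, vmap); the toy loss function, shapes, and values are illustrative only.

```python
import jax
import jax.numpy as jnp

def loss(w, x):
    # Toy quadratic loss over a linear model.
    return jnp.sum((x @ w) ** 2)

grad_loss = jax.grad(loss)                        # differentiate w.r.t. the first argument (w)
fast_grad = jax.jit(grad_loss)                    # JIT-compile for CPU/GPU/TPU
batched_loss = jax.vmap(loss, in_axes=(None, 0))  # vectorize over a batch of x

w = jnp.ones(3)
x = jnp.arange(12.0).reshape(4, 3)
print(fast_grad(w, x[0]))   # gradient for a single example
print(batched_loss(w, x))   # per-example losses for the whole batch
```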
ccfos/nightingale
An all-in-one observability solution that aims to combine the advantages of Prometheus and Grafana. It manages alert rules and visualizes metrics, logs, and traces in a beautiful web UI.
facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
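A minimal sketch of one such building block, the memory-efficient attention op; the tensor shapes (batch, seq_len, num_heads, head_dim), float16 dtype, and CUDA-device requirement are assumptions about a typical setup, not taken from the repository.

```python
import torch
import xformers.ops as xops

# Shapes: (batch, seq_len, num_heads, head_dim) on a CUDA device in float16.
q = torch.randn(2, 128, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 128, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 128, 8, 64, device="cuda", dtype=torch.float16)

# Computes softmax(q @ k^T / sqrt(head_dim)) @ v without materializing
# the full attention matrix.
out = xops.memory_efficient_attention(q, k, v)
print(out.shape)  # torch.Size([2, 128, 8, 64])
```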
Syllo/nvtop
GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel, NVIDIA and Qualcomm
NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
NVIDIA/DALI
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
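A minimal sketch of a DALI image pipeline built with the pipeline_def decorator; the dataset directory is a hypothetical placeholder and the chosen ops reflect common usage rather than anything in this list.

```python
from nvidia.dali import pipeline_def, fn

@pipeline_def(batch_size=32, num_threads=4, device_id=0)
def image_pipeline(data_dir):
    jpegs, labels = fn.readers.file(file_root=data_dir, random_shuffle=True)
    images = fn.decoders.image(jpegs, device="mixed")  # hybrid CPU/GPU JPEG decoding
    images = fn.resize(images, resize_x=224, resize_y=224)
    return images, labels

pipe = image_pipeline("/path/to/images")  # hypothetical dataset directory
pipe.build()
images, labels = pipe.run()               # one GPU-resident batch per call
```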
XuehaiPan/nvitop
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
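Beyond the interactive viewer, nvitop exposes a Python API; a minimal sketch follows, where the Device method names reflect my understanding of that interface and should be treated as an assumption.

```python
from nvitop import Device

for device in Device.all():                  # one Device object per visible GPU
    print(device.index, device.name(),
          f"{device.gpu_utilization()}%", device.memory_used_human())
    for pid, process in device.processes().items():  # GPU processes keyed by PID
        print("   ", pid, process)
```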
tianshiyeben/wgcloud
A Linux operations and monitoring tool. It supports monitoring of system hardware information, memory, CPU, temperature, disk space and I/O, disk SMART, GPU, firewall, and network traffic rates, plus service endpoint checks, large-screen dashboards, topology maps, port monitoring, process monitoring, Docker monitoring, log monitoring, file tamper detection, database monitoring, batch command dispatch and execution, web SSH, a Linux panel (agent), alerting, SNMP monitoring, K8S, Redis, Nginx, Kafka, asset management, scheduled tasks, password management, and work notes.
facebookincubator/AITemplate
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
volcano-sh/volcano
A Cloud Native Batch System (Project under CNCF)
huggingface/optimum
🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy-to-use hardware optimization tools
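A minimal sketch of one such optimization path, exporting a Transformers model to ONNX Runtime through Optimum; the model name is illustrative and the flow assumes the optimum[onnxruntime] extra is installed.

```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

model_id = "distilbert-base-uncased-finetuned-sst-2-english"
# Export the PyTorch checkpoint to ONNX and load it with ONNX Runtime.
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The accelerated model drops into the familiar Transformers pipeline API.
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(classifier("Hardware-accelerated inference behind the same pipeline API."))
```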
NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
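A minimal sketch of wrapping a te.Linear layer in Transformer Engine's FP8 autocast; the layer sizes and batch shape are illustrative, and actual FP8 execution assumes a Hopper or Ada GPU as noted above.

```python
import torch
import transformer_engine.pytorch as te

layer = te.Linear(768, 3072, bias=True).cuda()   # drop-in replacement for nn.Linear
x = torch.randn(16, 768, device="cuda")

with te.fp8_autocast(enabled=True):              # run supported matmuls in FP8
    y = layer(x)

y.sum().backward()                               # gradients flow as usual
print(y.shape)  # torch.Size([16, 3072])
```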
alibaba/SREWorks
Cloud Native DataOps & AIOps Platform
facebook/CacheLib
Pluggable in-process caching engine to build and scale high-performance services
tkestack/gpu-manager
peci1/nvidia-htop
A tool for enriching the output of nvidia-smi.
MegEngine/MegCC
MegCC is a deep learning model compiler with an ultra-lightweight runtime that is efficient and easy to port.
NVIDIA/DCGM
NVIDIA Data Center GPU Manager (DCGM) is a project for gathering telemetry and measuring the health of NVIDIA GPUs
microsoft/msccl
Microsoft Collective Communication Library
Mellanox/nv_peer_memory
lsds/KungFu
Fast and Adaptive Distributed Machine Learning for TensorFlow, PyTorch and MindSpore.
openucx/ucc
Unified Collective Communication Library
alibaba/GPU-scheduler-for-deep-learning
GPU-scheduler-for-deep-learning
n9e/fe-v5
The web project for n9e
SymbioticLab/Salus
Fine-grained GPU sharing primitives
intel/MLSL
Intel(R) Machine Learning Scaling Library is a library providing an efficient implementation of communication patterns used in deep learning.
Hsword/Hetu
A high-performance distributed deep learning system targeting large-scale and automated distributed training. If you are interested, please visit/star/fork https://github.com/PKU-DAIR/Hetu
microsoft/msccl-tools
Synthesizer for optimal collective communication algorithms
sbates130272/linux-p2pmem
A fork of the Linux kernel adding p2pmem support for devices that can expose BARs, such as NVMe devices with CMBs and the Microsemi NVRAM card, along with an NVMe-oF target driver that uses them. For user-space test code, see the p2pmem-test repository.