wuchangping's Stars
CompVis/stable-diffusion
A latent text-to-image diffusion model
google/jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
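A minimal sketch of the three transformations named in that description (grad, jit, vmap); the toy loss function, shapes, and values are illustrative only.

```python
import jax
import jax.numpy as jnp

def loss(w, x):
    # Toy quadratic loss over a linear model.
    return jnp.sum((x @ w) ** 2)

grad_loss = jax.grad(loss)                        # differentiate w.r.t. the first argument (w)
fast_grad = jax.jit(grad_loss)                    # JIT-compile for CPU/GPU/TPU
batched_loss = jax.vmap(loss, in_axes=(None, 0))  # vectorize over a batch of x

w = jnp.ones(3)
x = jnp.arange(12.0).reshape(4, 3)
print(fast_grad(w, x[0]))   # gradient for a single example
print(batched_loss(w, x))   # per-example losses for the whole batch
```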
ccfos/nightingale
An all-in-one observability solution that aims to combine the advantages of Prometheus and Grafana. It manages alert rules and visualizes metrics, logs, and traces in a beautiful web UI.
facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
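A minimal sketch of one such building block, the memory-efficient attention op; the tensor shapes (batch, seq_len, num_heads, head_dim), float16 dtype, and CUDA-device requirement are assumptions about a typical setup, not taken from the repository.

```python
import torch
import xformers.ops as xops

# Shapes: (batch, seq_len, num_heads, head_dim) on a CUDA device in float16.
q = torch.randn(2, 128, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 128, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 128, 8, 64, device="cuda", dtype=torch.float16)

# Computes softmax(q @ k^T / sqrt(head_dim)) @ v without materializing
# the full attention matrix.
out = xops.memory_efficient_attention(q, k, v)
print(out.shape)  # torch.Size([2, 128, 8, 64])
```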
Syllo/nvtop
GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel, NVIDIA and Qualcomm
NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
NVIDIA/DALI
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
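A minimal sketch of a DALI image pipeline built with the pipeline_def decorator; the dataset directory is a hypothetical placeholder and the chosen ops reflect common usage rather than anything in this list.

```python
from nvidia.dali import pipeline_def, fn

@pipeline_def(batch_size=32, num_threads=4, device_id=0)
def image_pipeline(data_dir):
    jpegs, labels = fn.readers.file(file_root=data_dir, random_shuffle=True)
    images = fn.decoders.image(jpegs, device="mixed")  # hybrid CPU/GPU JPEG decoding
    images = fn.resize(images, resize_x=224, resize_y=224)
    return images, labels

pipe = image_pipeline("/path/to/images")  # hypothetical dataset directory
pipe.build()
images, labels = pipe.run()               # one GPU-resident batch per call
```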
XuehaiPan/nvitop
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
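Beyond the interactive viewer, nvitop exposes a Python API; a minimal sketch follows, where the Device method names reflect my understanding of that interface and should be treated as an assumption.

```python
from nvitop import Device

for device in Device.all():                  # one Device object per visible GPU
    print(device.index, device.name(),
          f"{device.gpu_utilization()}%", device.memory_used_human())
    for pid, process in device.processes().items():  # GPU processes keyed by PID
        print("   ", pid, process)
```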
tianshiyeben/wgcloud
A Linux operations and monitoring tool. It supports monitoring of system hardware information, memory, CPU, temperature, disk space and I/O, disk SMART, GPU, firewall, and network traffic rates, plus service endpoint checks, large-screen dashboards, topology maps, port monitoring, process monitoring, Docker monitoring, log monitoring, file tamper detection, database monitoring, batch command dispatch and execution, web SSH, a Linux panel (agent), alerting, SNMP monitoring, K8S, Redis, Nginx, Kafka, asset management, scheduled tasks, password management, and work notes.
facebookincubator/AITemplate
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
volcano-sh/volcano
A Cloud Native Batch System (Project under CNCF)
huggingface/optimum
🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy-to-use hardware optimization tools
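A minimal sketch of one such optimization path, exporting a Transformers model to ONNX Runtime through Optimum; the model name is illustrative and the flow assumes the optimum[onnxruntime] extra is installed.

```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

model_id = "distilbert-base-uncased-finetuned-sst-2-english"
# Export the PyTorch checkpoint to ONNX and load it with ONNX Runtime.
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The accelerated model drops into the familiar Transformers pipeline API.
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(classifier("Hardware-accelerated inference behind the same pipeline API."))
```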
NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
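A minimal sketch of wrapping a te.Linear layer in Transformer Engine's FP8 autocast; the layer sizes and batch shape are illustrative, and actual FP8 execution assumes a Hopper or Ada GPU as noted above.

```python
import torch
import transformer_engine.pytorch as te

layer = te.Linear(768, 3072, bias=True).cuda()   # drop-in replacement for nn.Linear
x = torch.randn(16, 768, device="cuda")

with te.fp8_autocast(enabled=True):              # run supported matmuls in FP8
    y = layer(x)

y.sum().backward()                               # gradients flow as usual
print(y.shape)  # torch.Size([16, 3072])
```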
alibaba/SREWorks
Cloud Native DataOps & AIOps Platform
facebook/CacheLib
Pluggable in-process caching engine to build and scale high-performance services
tkestack/gpu-manager
peci1/nvidia-htop
A tool for enriching the output of nvidia-smi.
MegEngine/MegCC
MegCC is a deep learning model compiler with an ultra-lightweight runtime that is efficient and easy to port.
NVIDIA/DCGM
NVIDIA Data Center GPU Manager (DCGM) is a project for gathering telemetry and measuring the health of NVIDIA GPUs
microsoft/msccl
Microsoft Collective Communication Library
Mellanox/nv_peer_memory
lsds/KungFu
Fast and Adaptive Distributed Machine Learning for TensorFlow, PyTorch and MindSpore.
openucx/ucc
Unified Collective Communication Library
alibaba/GPU-scheduler-for-deep-learning
GPU-scheduler-for-deep-learning
n9e/fe-v5
The web project for n9e
SymbioticLab/Salus
Fine-grained GPU sharing primitives
intel/MLSL
Intel(R) Machine Learning Scaling Library is a library providing an efficient implementation of communication patterns used in deep learning.
Hsword/Hetu
A high-performance distributed deep learning system targeting large-scale and automated distributed training. If you are interested, please visit/star/fork https://github.com/PKU-DAIR/Hetu
microsoft/msccl-tools
Synthesizer for optimal collective communication algorithms
sbates130272/linux-p2pmem
A fork of the Linux kernel adding p2pmem support for devices that can expose BARs, such as NVMe devices with CMBs and the Microsemi NVRAM card, along with an NVMe-oF target driver that uses them. For user-space test code, see the p2pmem-test repository.