Pinned Repositories
RecoNIC
RecoNIC is a software/hardware shell used to enable network-attached processing within an RDMA-featured SmartNIC for scale-out computing.
ai_and_memory_wall
AI and Memory Wall blog post
byteps
A high-performance and generic framework for distributed DNN training
cocotb
cocotb, a coroutine-based cosimulation library for writing VHDL and Verilog testbenches in Python
cocotb-bus
Pre-packaged testbenching tools and reusable bus interfaces for cocotb
Dagger
HW/SW co-designed end-host RPC stack
fcc-intro-to-llms
High-Precision-Congestion-Control
how-to-optimize-gemm
k-diffusion
Karras et al. (2022) diffusion models for PyTorch
zhguanw-amd's Repositories
zhguanw-amd/ai_and_memory_wall
AI and Memory Wall blog post
zhguanw-amd/byteps
A high-performance and generic framework for distributed DNN training
zhguanw-amd/Dagger
HW/SW co-designed end-host RPC stack
zhguanw-amd/fcc-intro-to-llms
zhguanw-amd/High-Precision-Congestion-Control
zhguanw-amd/how-to-optimize-gemm
zhguanw-amd/k-diffusion
Karras et al. (2022) diffusion models for PyTorch
zhguanw-amd/msccl-tools
Synthesizer for optimal collective communication algorithms
zhguanw-amd/octopus
An RDMA-enabled Distributed Persistent Memory File System
zhguanw-amd/open-nic-shell
AMD OpenNIC Shell includes the HDL source files
zhguanw-amd/page-info
Programmatically obtain information about the pages backing a given memory region
zhguanw-amd/pcie_qdma_ats_example
zhguanw-amd/pciebench-netfpga
pcie-bench code for NetFPGA/VCU709 cards
zhguanw-amd/PyTorch-parameter-server
Implementation of Parameter Server using PyTorch communication lib
zhguanw-amd/qep-drivers
zhguanw-amd/rccl
ROCm Communication Collectives Library (RCCL)
zhguanw-amd/rdma_bench
A framework to understand RDMA
zhguanw-amd/rpclib
rpclib is a modern C++ msgpack-RPC server and client library
zhguanw-amd/wukong
A graph-based distributed in-memory store that leverages efficient graph exploration to provide highly concurrent and low-latency queries over big linked data
zhguanw-amd/xstore
Fast RDMA-based Ordered Key-Value Store using Remote Learned Cache