rocm
There are 130 repositories under the rocm topic.
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
apache/tvm
Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators
cupy/cupy
NumPy & SciPy for GPU
lshqqytiger/stable-diffusion-webui-amdgpu
Stable Diffusion web UI
deepmodeling/deepmd-kit
A deep learning package for many-body potential energy representation and molecular dynamics
stotko/stdgpu
stdgpu: Efficient STL-like Data Structures on the GPU
PygmalionAI/aphrodite-engine
Large-scale LLM inference engine
ROCm/ROCm-docker
Dockerfiles for the various software layers defined in the ROCm software platform
alpaka-group/alpaka
Abstraction Library for Parallel Kernel Acceleration
ROCm/rocBLAS
Next-generation BLAS implementation for the ROCm platform
agenium-scale/nsimd
Agenium Scale vectorization library for CPUs and GPUs
JuliaGPU/AMDGPU.jl
AMD GPU (ROCm) programming in Julia
ROCm/k8s-device-plugin
Kubernetes (k8s) device plugin to enable registration of AMD GPUs in a container cluster
ROCm/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
LLNL/hiop
HPC solver for nonlinear optimization problems
ROCm/aomp
AOMP is an open source Clang/LLVM based compiler with added support for the OpenMP® API on Radeon™ GPUs. Use this repository for releases, issues, documentation, packaging, and examples.
eth-cscs/COSMA
Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm
ROCm/MIVisionX
MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimized open-source implementation of the Khronos OpenVX™ and OpenVX™ Extensions.
ROCm/rocFFT
Next generation FFT implementation for ROCm
ROCm/gpufort
GPUFORT: source-to-source (S2S) translation tool for CUDA Fortran and Fortran+X in the spirit of hipify
ROCm/rocPRIM
ROCm Parallel Primitives
GPUOpen-ProfessionalCompute-Libraries/amdovx-core
AMD OpenVX Core, a sub-module of amdovx-modules
virchau13/automatic1111-webui-nix
AUTOMATIC1111/stable-diffusion-webui for CUDA and ROCm on NixOS
l1na-forever/stable-diffusion-rocm-docker
Stable Diffusion Docker image preconfigured for usage with AMD Radeon cards
patientx/ComfyUI-Zluda
The most powerful and modular Stable Diffusion GUI, API, and backend with a graph/nodes interface. Now ZLUDA-enhanced for better AMD GPU performance.
electronic-structure/SIRIUS
Domain specific library for electronic structure calculations
ROCm/hipBLAS
ROCm BLAS marshalling library
ROCm/rocRAND
RAND library for the HIP programming language
GPUOpen-ProfessionalCompute-Libraries/amdovx-modules
AMD OpenVX modules, such as neural network inference and 360 video stitching
Grench6/RX580-rocM-tensorflow-ubuntu20.4-guide
Installation guide for ROCm and TensorFlow on Ubuntu for the RX580
ROCm/rocSOLVER
Next-generation LAPACK implementation for the ROCm platform
sukhmeetbawa/OpenCL-AMD-Fedora
AMD OpenCL userspace drivers for Fedora. Currently not working on Fedora 37.
EmbeddedLLM/vllm-rocm
vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs
GPUOpen-Tools/radeon_compute_profiler
The Radeon Compute Profiler (RCP) is a performance analysis tool that gathers data from the API run-time and GPU for OpenCL™ and ROCm/HSA applications. This information can be used by developers to discover bottlenecks in the application and to find ways to optimize the application's performance.
PennyLaneAI/pennylane-lightning
The PennyLane-Lightning plugin provides a fast state-vector simulator written in C++ for use with PennyLane