gpu-acceleration
There are 699 repositories under the gpu-acceleration topic.
tensorflow/tfjs
A WebGL accelerated JavaScript library for training and deploying ML models.
NVIDIA/TensorRT
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
tensorflow/tfjs-core
WebGL-accelerated ML // linear algebra // automatic differentiation for JavaScript.
raphamorim/rio
A hardware-accelerated GPU terminal emulator designed to run on desktops and in browsers.
cornellius-gp/gpytorch
A highly efficient implementation of Gaussian Processes in PyTorch
NVIDIA/GenerativeAIExamples
Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
Hedgehog-Computing/hedgehog-lab
Run, compile, and execute JavaScript for scientific computing and data visualization entirely in your browser. An open source scientific computing environment for JavaScript with GPU-accelerated matrix operations, TeX support, data visualization, and symbolic computation.
BlazingDB/blazingsql
BlazingSQL is a lightweight, GPU-accelerated SQL engine for Python, built on RAPIDS cuDF.
TianZerL/Anime4KCPP
A high performance anime upscaler
NVIDIA/cccl
CUDA Core Compute Libraries
coreylowman/dfdx
Deep learning in Rust, with shape checked tensors and neural networks
emacs-ng/emacs-ng
A new approach to Emacs - Including TypeScript, Threading, Async I/O, and WebRender.
calebwin/emu
The write-once-run-anywhere GPGPU library for Rust
beehive-lab/TornadoVM
TornadoVM: A practical and efficient heterogeneous programming framework for managed languages
stotko/stdgpu
stdgpu: Efficient STL-like Data Structures on the GPU
Liu-xiandong/How_to_optimize_in_GPU
A series of GPU optimization topics that explains in detail how to optimize CUDA kernels. It covers several basic kernel optimizations, including elementwise, reduce, sgemv, and sgemm. The performance of these kernels is at or near the theoretical limit.
EMI-Group/evox
Distributed GPU-Accelerated Framework for Evolutionary Computation. Comprehensive Library of Evolutionary Algorithms & Benchmark Problems.
NVlabs/sionna
Sionna: An Open-Source Library for Research on Communication Systems
Jaysmito101/TerraForge3D
Cross Platform Professional Procedural Terrain Generation & Texturing Tool
hughperkins/VeriGPU
Open source GPU, in Verilog, loosely based on the RISC-V ISA
NVIDIA-Merlin/HugeCTR
HugeCTR is a high-efficiency GPU framework designed for training click-through-rate (CTR) estimation models
dgasmith/opt_einsum
⚡️ Optimizing einsum functions in NumPy, TensorFlow, Dask, and more with contraction order optimization.
coreylowman/cudarc
Safe Rust wrapper around the CUDA toolkit
eszdman/PhotonCamera
Android camera app that uses enhanced image processing
NVIDIA-Merlin/Merlin
NVIDIA Merlin is an open source library providing end-to-end GPU-accelerated recommender systems, from feature engineering and preprocessing to training deep learning models and running inference in production.
limbo018/DREAMPlace
Deep learning toolkit-enabled VLSI placement
iot-salzburg/gpu-jupyter
GPU-Jupyter: Your GPU-accelerated JupyterLab with a rich data science toolstack, TensorFlow and PyTorch for your reproducible deep learning experiments.
ttddee/Cascade
Node-based image editor with GPU acceleration.
Sergio0694/NeuralNetwork.NET
A TensorFlow-inspired neural network library built from scratch in C# 7.3 for .NET Standard 2.0, with GPU support through cuDNN
DavidDiazGuerra/gpuRIR
Python library for Room Impulse Response (RIR) simulation with GPU acceleration
philferriere/dlwin
GPU-accelerated deep learning running natively on Windows 10
MegviiRobot/MegBA
MegBA: A GPU-Based Distributed Library for Large-Scale Bundle Adjustment
ProjectPhysX/OpenCL-Wrapper
OpenCL is the most powerful programming language ever created. Yet the OpenCL C++ bindings are cumbersome and the code overhead prevents many people from getting started. I created this lightweight OpenCL-Wrapper to greatly simplify OpenCL software development with C++ while keeping functionality and performance.
uncomplicate/bayadera
High-performance Bayesian Data Analysis on the GPU in Clojure
andrewmilson/ministark
🏃♂️💨 GPU accelerated STARK prover built on @arkworks-rs
Glavnokoman/vuh
Vulkan compute for people