Deepware

High Performance Computing on FPGA

Pinned Repositories

AccDNN
A compiler from AI model to RTL (Verilog) accelerator in FPGA hardware with auto design space exploration.
Language: Verilog
antares
Antares: an automatic engine for multi-platform kernel generation and optimization. Supporting CPU, CUDA, ROCm, DirectX12, GraphCore, SYCL for CPU/GPU, OpenCL for AMD/NVIDIA, Android CPU/GPU backends.
Language: Python
awesome-real-time-AI
A list of awesome edge AI inference-related papers.
edge-ai
A curated list of resources for embedded AI
gemm_spmm
Hardware accelerator for pruned networks
Language: C++
How_to_optimize_in_GPU
A series of GPU optimization topics that explains in detail how to optimize programs on the GPU. It covers several basic kernel optimizations, including elementwise, reduce, sgemv, and sgemm. The performance of these kernels is at or near the theoretical limit.
Language: Cuda
onnx-simplifier
Simplify your ONNX model
Language: C++
SEAsynth
A synthesizable CNN accelerator based on systolic arrays 🌊
Language: Verilog
TENNA
TENNA: Tiny Embedded Neural Network Accelerator
xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
Language: C++

Deepware doesn’t have any repository yet.