hardware-acceleration

There are 221 repositories under the hardware-acceleration topic.

  • fpgaconvnet-hls

    Language: Python · 23
  • H264Dxva2Decoder

    A program that decodes H.264 video from scratch with DirectX Video Acceleration 2 (DXVA2), reading MP4 files in AVCC format. Topics: movie atoms, NAL units, DXVA2, Media Foundation, IDirectXVideoDecoder, IDirectXVideoProcessor.

    Language: C++ · 87
  • tappas

    High-performance, optimized pre-trained template AI application pipelines for systems using Hailo devices

    Language: C++ · 77
  • spatten

    [HPCA'21] SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning

    Language: Scala · 66
  • la-core

    Linear algebra accelerators for RISC-V (published in ICCD 17)

  • hailort

    An open-source, lightweight, high-performance inference framework for Hailo devices

    Language: C++ · 61
  • community

    ROS 2 Hardware Acceleration Working Group community governance model & list of projects

  • bnn-icestick

    Binary Neural Network on IceStick FPGA.

    Language: Jupyter Notebook · 48
  • doppiodb

    doppioDB - A hardware accelerated database

    Language: C · 48
  • blaze

    A Rustified OpenCL Experience

    Language: Rust · 47
  • KRS

    The Kria Robotics Stack (KRS) is a ROS 2 superset for industry, an integrated set of robot libraries and utilities to accelerate the development, maintenance and commercialization of industrial-grade robotic solutions while using adaptive computing.

    Language: HTML · 46
  • newma

    Implementation of NEWMA: a new method for scalable model-free online change-point detection

    Language: Python · 46
  • GNN-ARCH

    [ASAP 2020; FPGA 2020] Hardware architecture to accelerate GNNs (common IP modules for minibatch training and full batch inference)

    Language: Verilog · 41
  • acceleration_examples

    ROS 2 package examples demonstrating the use of hardware acceleration.

    Language: C++ · 40
  • Virtualization-Emulation-Guide

    Virtualization/Emulation Guide

    Language: C++ · 39
  • specials

    Accurate, Hardware Accelerated, Special Functions in Mojo 🔥

    Language: Mojo · 33
  • ros_msft_onnx

    ONNX Runtime for the Robot Operating System (ROS), works on ROS1 and ROS2

    Language: C++ · 33
  • inference-engine-node

    Bringing the hardware accelerated deep learning inference to Node.js and Electron.js apps.

    Language: JavaScript · 33
  • vivado-hls-broadcast-optimization

    [DAC 2020] Analysis and Optimization of the Implicit Broadcasts in FPGA HLS to Improve Maximum Frequency

    Language: Ada · 32
  • aes

    AES-128 hardware implementation

    Language: VHDL · 30
  • Needle

    Imperative deep learning framework with customized GPU and CPU backends

    Language: Python · 28
  • docker-tornadovm

    Docker build scripts for TornadoVM on GPUs: https://github.com/beehive-lab/TornadoVM

    Language: Shell · 28
  • cs231n-project

    CNN accelerator

    Language: Tcl · 26
  • CNN-ACCELERATOR

    Hardware accelerator for convolutional neural networks

    Language: Verilog · 23
  • arithmetic-encoder-av1

    This project is being developed as part of Master's degree research sponsored by Brazil's CNPQ. Its goal is to design a hardware architecture to accelerate the AV1 arithmetic encoder.

    Language: Verilog · 23
  • hailort-drivers

    The Hailo PCIe driver is required for interacting with a Hailo device over the PCIe interface

    Language: C · 21
  • nexus

    Open source RTL simulation acceleration on commodity hardware

    Language: Python · 21
  • opu-benchmarks

    ML benchmark performance of LightOn's Optical Processing Unit (OPU) vs. CPU and GPU.

    Language: Python · 21
  • GeneSys

    An open-source parameterizable NPU generator with full-stack multi-target compilation stack for intelligent workloads.

    Language: Python · 20
  • STM32_NeuralNet_MovementDetection

    Motion recognition with artificial intelligence on STM32

    Language: C · 18
  • zynqmp-hailo-ai

    Reference design combining the Zynq UltraScale+ MPSoC with the Hailo AI accelerator

    Language: C++ · 17
  • CompressedLUT

    A tool to generate optimized hardware files for univariate functions.

    Language: C++ · 17
  • ExtendedBitPlaneCompression

    Provides the code for the paper "EBPC: Extended Bit-Plane Compression for Deep Neural Network Inference and Training Accelerators" by Lukas Cavigelli, Georg Rutishauser, Luca Benini.

    Language: Jupyter Notebook · 17
  • Voice-ML

    MobileNet trained with VoxCeleb dataset and used for voice verification

    Language: Python · 17
  • hardware-sort

    Hardware-accelerated sorting algorithm

    Language: VHDL · 17
  • android-qemu-launcher

    Utility and docs to run hardware-accelerated Android images on Linux QEMU KVM

    Language: Shell · 16