simd-parallelism
There are 38 repositories under the simd-parallelism topic.
google/highway
Performance-portable, length-agnostic SIMD with runtime dispatch
jfalcou/eve
Expressive Vector Engine - SIMD in C++ Goes Brrrr
lilohuang/PyTurboJPEG
PyTurboJPEG is a highly optimized Python wrapper of libjpeg-turbo (TurboJPEG API) which supports x86 and ARM architectures.
zeam-vm/pelemay
Pelemay is a native compiler for Elixir that generates SIMD instructions. GPU code generation is planned.
gyrdym/ml_linalg
SIMD-based linear algebra and statistics for data science with Dart
PatwinchIR/ultra-sort
DSL for SIMD Sorting on AVX2 & AVX512
Applied-Scientific-Research/Omega2D
Two-dimensional flow solver with GUI using vortex particle and boundary element methods
fzqneo/ByteSlice
"Byteslice: Pushing the envelop of main memory data processing with a new storage layout" (SIGMOD'15)
MarioSieg/Corium
Corium is a modern scripting language which combines simple, safe and efficient programming.
pleiszenburg/gravitation
n-body-simulation performance test suite
ZL-Su/Matrice
A portable modern C++ primitive performance library for 3D Vision & Photo-Mechanics.
Applied-Scientific-Research/Omega3D
GPU-accelerated 3D vortex methods solver with easy GUI
gregyjames/tsunami
A High Performance C# wrapper that allows you to get the benefits of SIMD Intrinsics on List<T>.
ms0g/vml
SIMD-accelerated Vector math lib
sahmad98/vstring
Vectorized String Helper Functions
sfegan/dft_simd
SIMD discrete Fourier transform tests and discussion
artem0/benchmarking
System benchmarks over JVM with JMH - SIMD (superscalar processing), Branch prediction, False sharing.
whtcorpsinc/einsteindb-prod
EinsteinDB is a Hybrid memory system consisting of DRAM and Non-Volatile Memory configured to persist data fast.
jeffamstutz/psimd
(experiments with) pragma-based SIMD C++ types
n-roussos/Parallel-Programming-with-OpenMP
This repository lists 4 problems solved using C. Each problem has its own serial and parallel implementations. For the latter, the OpenMP API was utilized.
cuongvng/Optimizing-Convolution-with-NEON-Intrinsics
Optimizing convolution function using ARM's NEON Intrinsics
nahuelcastro/Digital-Image-Processing-SSE
Image filters using SSE Instructions (Streaming SIMD Extensions) of Intel® x86-64 Architecture.
UCL-ARC/cluster_club_accelerated_python
Materials for ARC's cluster club session on accelerating scientific python codes
ell-hol/simd-parallelized-haar-transform
8x speedup of 1D Haar-Transform using Intel SIMD intrinsics
falarion08/Dot-Product-Implementation
An implementation of the dot product in C using CUDA, x64, and SIMD, operating on 32-bit integers.
frederik-hoeft/simd
A fast and simple C# hex-decode function using AVX2 and SSSE3 Intel intrinsics.
kwanCCC/sorted-rs
Check whether a sequence is sorted, using SIMD
Nten0/parallel_computing
In this project we modify the code of the Smith-Waterman algorithm to achieve parallel computing in different ways. University project for the course "Parallel Processing". Course Code: CEID_NY408
sunsided/mongo
The MongoDB database with SIMD-based dot_product aggregation on IEEE 754 single-precision vectors.
tugrul512bit/InverseFX
Computing a function when only its inverse is known, using the Newton-Raphson method for 1D, 2D, and 3D arrays in parallel.
huangfcn/dnnsimd
Deep learning convolutional neural network implemented with SIMD acceleration (auto-vectorization)
kloongyu/ByteSlice
"Byteslice: Pushing the envelop of main memory data processing with a new storage layout" (SIGMOD'15)
t0re199/ARCHP_PROJECT
C & Assembly optimized version of the Stochastic Gradient Descent x SoftSVM x Polynomial Kernel Method algorithm
Blattvorhang/Parallel-Computing
Optimizing array computations through parallel computing across distributed systems and hardware resources for accelerated performance in C++.
kavindaperera/Distributed-Memory-Programming-with-MPI
Examples of Distributed-Memory Programming with MPI
ramesh-adhikari/HPC
High Performance Computing exercises