A curated list of awesome SIMD frameworks, libraries and software.
This list showcases projects that have achieved roughly 10x performance improvements by using SIMD (Single Instruction, Multiple Data) instructions. In general, this should translate into throughput on the order of gigabytes per second on modern CPUs for the task at hand.
- sneller - Go: Fast, simple, schemaless SQL for JSON with AVX-512
- bitmap - Go: Dense, zero-allocation, SIMD-enabled bitmap/bitset
- simdjson - C++: Parsing gigabytes of JSON per second
- dictionary - C++: High-performance dictionary coding
- simdjson-go - Go: Parsing gigabytes of JSON per second
- simdcomp - C: A simple library for compressing lists of integers using binary packing
- SIMDCompressionAndIntersection - C++: A library to compress and intersect sorted lists of integers using SIMD instructions
- simdutf - C++: Unicode routines (UTF8, UTF16, UTF32)
- Ada - C++: WHATWG-compliant and fast URL parser
- StringZilla - C: Substring search, edit-distances, sorting, fuzzy matching, etc.
- Hyperscan - C++: High-performance regular expression matching library
- Various string algorithms - C: Repository for string algorithms, snippets, toy programs, etc.
- sse-popcount - SIMD (SSE) population count
- xxHash - C: Extremely fast hash algorithm, processing at RAM speed limits
- Reed-Solomon - Go: Erasure Coding in Go
- highwayhash - Go: Optimized HighwayHash implementation for Intel (over 10 GB/sec), ARM and Power9
- sha256-simd - Go: Optimized SHA256 computations for Intel, ARM and Power9
- ncnn - C++: High-performance NN inference framework optimized for mobile
- mkl-dnn - C++: Math Kernel Library for Deep Neural Networks
- nnpack - C/C++: Acceleration package for neural networks on multi-core CPUs
- SimSIMD - C: Similarity measures for high-dimensional vectors
- Simd - C++: Image processing library making use of SIMD
- Pillow-SIMD - Python: SIMD version of PIL (Python Imaging Library)
- ComputeLibrary - C++: Library for Computer Vision and Machine Learning (ARM only)
- HPC-Class - High Performance Computing (HPC) class taught at FSU Jena by the Scalable Data- and Compute-intensive Analyses lab
- CPUlator - CPUlator Computer System Simulator (ARMv7, MIPS, RISC-V)
- SIMD-Visualiser - JavaScript: Graphically visualize SIMD code
- Visual ARM emulator - VisUAL: a highly visual ARM emulator
- faster - Rust: SIMD for humans
- Vectorized Emulation - Accelerated taint tracking at 2 trillion instructions per second
- Agner Fog - Software optimization resources
- uops.info - Latency, throughput, and port usage information
- Felix Cloutier - x86 and amd64 instruction reference
- Compiler Explorer - Run compilers interactively from the browser and interact with the assembly
- awesome-asm - A curated list of awesome assembler resources
- awesome-llvm - Curated list of awesome LLVM related docs, tools, and other resources
- awesome-decompilation - Curated list of awesome decompilation resources and projects
- Intel Manual vol 1 (HTML)
- Intel Manual vol 2 (HTML)
- Intel Manual vol 3 (HTML)
- x86 documentation - x86 documentation
- Go assembly reference - Go assembly language complementary reference
- Intel® Intrinsics Guide - A list of all Intel® intrinsic functions for x86.
- avo - Go: Generate x86 Assembly with Go
- PeachPy - Python: x86-64 assembler embedded in Python
- c2goasm - Go: C to Go Assembly
- LLVM MCA - LLVM Machine Code Analyzer
- Highway - C++: Performance-portable, length-agnostic SIMD with runtime dispatch
- Eve - C++: Expressive Vector Engine
- SIMDe - C++: Header-only implementations of SIMD instruction sets (SSE*, AVX{,2,512}, Neon, and more) for systems which don't natively support them.
- xsimd - C++: Wrappers for SIMD intrinsics and math implementations (SSE, AVX, NEON, AVX512)
- Intel SDE debugging - Debugging with AVX-512
- Asm-Dude - VS extension for assembly syntax highlighting and code completion
- Intrinsics-Dude - VS extension for compiler intrinsics in C/C++
- Intel® Implicit SPMD Program Compiler - An LLVM compiler for a C-like language, with C linkage, that generates very good SIMD instructions for a wide range of platforms and ISAs (Windows, iOS, Linux, ARM, PS5, Xbox, SSE, AVX, AVX2, AVX-512, ARM NEON)
- Online (dis-)assembler - Online assembler and disassembler
- ODA - Online disassembler (disassembler.io)
- x86/x64 SIMD Instructions (AVX512) - AVX-512 overview
- Golang's AVX512 - Go 1.11 introduction of AVX-512 support
- Golang AVX512 test data - Golang AVX-512 test instructions
- alexcrichton - AVX-512 overview
- Colfax: Capabilities of Intel AVX-512 - Capabilities of AVX-512
- Golang's ARM64 NEON support - Intro to arm64 assembler for Golang
- Golang ARM64 test data - Golang ARM64 (incl. NEON) test instructions
- SVE overview - SVE overview
- ARM SVE tools - ARM SVE tools
- AArch64 SoC features - AArch64 SoC features