avx2
There are 227 repositories under the avx2 topic.
simdjson/simdjson
Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks
HJLebbink/asm-dude
Visual Studio extension for assembly syntax highlighting and code completion in assembly files and the disassembly window
google/highway
Performance-portable, length-agnostic SIMD with runtime dispatch
OpenNMT/CTranslate2
Fast inference engine for Transformer models
simd-everywhere/simde
Implementations of SIMD instruction sets for systems which don't natively support them.
microsoft/DirectXMath
DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps
RoaringBitmap/CRoaring
Roaring bitmaps in C (and C++), with SIMD (AVX2, AVX-512 and NEON) optimizations: used by Apache Doris, ClickHouse, and StarRocks
VcDevel/Vc
SIMD Vector Classes for C++
p12tic/libsimdpp
Portable header-only C++ low level SIMD library
simdutf/simdutf
Unicode routines (UTF8, UTF16, UTF32) and Base64: billions of characters per second using SSE2, AVX2, NEON, AVX-512, RISC-V Vector Extension. Part of Node.js and Bun.
mind/wheels
Performance-optimized wheels for TensorFlow (SSE, AVX, FMA, XLA, MPI)
jfalcou/eve
Expressive Vector Engine - SIMD in C++ Goes Brrrr
minio/highwayhash
Native Go version of HighwayHash with optimized assembly implementations on Intel and ARM. Able to process over 10 GB/sec on a single core on Intel CPUs - https://en.wikipedia.org/wiki/HighwayHash
libxsmm/libxsmm
Library for specialized dense and sparse matrix operations, and deep learning primitives.
intel/x86-simd-sort
C++ template library for high performance SIMD based sorting algorithms
ashvardanian/SimSIMD
Up to 200x Faster Inner Products and Vector Similarity — for Python, JavaScript, Rust, C, and Swift, supporting f64, f32, f16 real & complex, i8, and binary vectors using SIMD for both x86 AVX2 & AVX-512 and Arm NEON & SVE 📐
powturbo/TurboPFor-Integer-Compression
Fastest Integer Compression
EgorBo/SimdJsonSharp
C# bindings for lemire/simdjson (and full C# port)
Auburn/FastNoiseSIMD
C++ SIMD Noise Library
rusticstuff/simdutf8
SIMD-accelerated UTF-8 validation for Rust.
lemire/fastbase64
SIMD-accelerated base64 codecs
Alex313031/Thorium-Win-AVX2
Repo to serve AVX2 Windows builds of Thorium. https://github.com/Alex313031/Thorium/
agenium-scale/nsimd
Agenium Scale vectorization library for CPUs and GPUs
WojciechMula/sse-popcount
SIMD (SSE) population count --- http://0x80.pl/articles/sse-popcount.html
WojciechMula/toys
Storage for my snippets, toy programs, etc.
kimwalisch/libpopcnt
🚀 Fast C/C++ bit population count library
powturbo/Turbo-Run-Length-Encoding
TurboRLE: Fastest Run Length Encoding
RRZE-HPC/OSACA
Open Source Architecture Code Analyzer
powturbo/Turbo-Base64
Turbo Base64 - Fastest Base64 SIMD: SSE/AVX2/AVX512/Neon/Altivec - Faster than memcpy!
agenium-scale/boost.simd
Boost SIMD
altimesh/hybridizer-basic-samples
Examples of C# code compiled to GPU by hybridizer
WojciechMula/sse4-strstr
SIMD (SWAR/SSE/SSE4/AVX2/AVX512F/ARM Neon) implementations of a modified Karp-Rabin algorithm
lovell/highwayhash
Node.js implementation of HighwayHash, Google's fast and strong hash function
minio/md5-simd
Accelerate aggregated MD5 hashing performance up to 8x for AVX512 and 4x for AVX2. Useful for server applications that need to compute many MD5 sums in parallel.
Zielon/CPURasterizer
CPU Based Rasterizer Engine
manodeep/Corrfunc
⚡️⚡️⚡️Blazing fast correlation functions on the CPU.