avx2
There are 261 repositories under the avx2 topic.
simdjson/simdjson
Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks
google/highway
Performance-portable, length-agnostic SIMD with runtime dispatch
HJLebbink/asm-dude
Visual Studio extension for assembly syntax highlighting and code completion in assembly files and the disassembly window
OpenNMT/CTranslate2
Fast inference engine for Transformer models
simd-everywhere/simde
Implementations of SIMD instruction sets for systems which don't natively support them.
microsoft/DirectXMath
DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps
RoaringBitmap/CRoaring
Roaring bitmaps in C (and C++), with SIMD (AVX2, AVX-512 and NEON) optimizations: used by Apache Doris, ClickHouse, Redpanda, YDB and StarRocks
simdutf/simdutf
Unicode routines (UTF8, UTF16, UTF32) and Base64: billions of characters per second using SSE2, AVX2, NEON, AVX-512, RISC-V Vector Extension, LoongArch64, POWER. Part of Node.js, WebKit/Safari, Ladybird, Chromium, Cloudflare Workers and Bun.
VcDevel/Vc
SIMD Vector Classes for C++
ashvardanian/SimSIMD
Up to 200x Faster Dot Products & Similarity Metrics for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD for AVX2, AVX-512, NEON, SVE, & SVE2 📐
p12tic/libsimdpp
Portable header-only C++ low level SIMD library
jfalcou/eve
Expressive Vector Engine - SIMD in C++ Goes Brrrr
intel/x86-simd-sort
C++ template library for high performance SIMD based sorting algorithms
minio/highwayhash
Native Go version of HighwayHash with optimized assembly implementations on Intel and ARM. Able to process over 10 GB/sec on a single core on Intel CPUs - https://en.wikipedia.org/wiki/HighwayHash
libxsmm/libxsmm
Library for specialized dense and sparse matrix operations, and deep learning primitives.
mind/wheels
Performance-optimized wheels for TensorFlow (SSE, AVX, FMA, XLA, MPI)
powturbo/TurboPFor-Integer-Compression
Fastest Integer Compression
shibatch/sleef
SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
EgorBo/SimdJsonSharp
C# bindings for lemire/simdjson (and full C# port)
Auburn/FastNoiseSIMD
C++ SIMD Noise Library
rusticstuff/simdutf8
SIMD-accelerated UTF-8 validation for Rust.
lemire/fastbase64
SIMD-accelerated base64 codecs
Alex313031/Thorium-Win-AVX2
Repo to serve AVX2 Windows builds of Thorium. https://github.com/Alex313031/Thorium/
RobRich999/Chromium_Clang
Chromium browser compiled with the Clang/LLVM compiler.
WojciechMula/toys
Storage for my snippets, toy programs, etc.
kimwalisch/libpopcnt
🚀 Fast C/C++ bit population count library
WojciechMula/sse-popcount
SIMD (SSE) population count --- http://0x80.pl/articles/sse-popcount.html
RRZE-HPC/OSACA
Open Source Architecture Code Analyzer
agenium-scale/nsimd
Agenium Scale vectorization library for CPUs and GPUs
powturbo/Turbo-Base64
Turbo Base64 - Fastest Base64 SIMD: SSE/AVX2/AVX512/Neon/Altivec - Faster than memcpy!
powturbo/Turbo-Run-Length-Encoding
TurboRLE - Fastest Run Length Encoding
WojciechMula/sse4-strstr
SIMD (SWAR/SSE/SSE4/AVX2/AVX512F/ARM Neon) implementations of a modification of the Karp-Rabin algorithm
hybridizer-io/hybridizer-basic-samples
Examples of C# code compiled to GPU by hybridizer
agenium-scale/boost.simd
Boost SIMD
lovell/highwayhash
Node.js implementation of HighwayHash, Google's fast and strong hash function
minio/md5-simd
Accelerate aggregated MD5 hashing performance up to 8x for AVX512 and 4x for AVX2. Useful for server applications that need to compute many MD5 sums in parallel.