Primary language: C++. License: MIT.

Intel Intrinsics

CHM documentation

This project builds a C# command-line app that parses Intel’s documentation into the developer-friendly CHM format.

The code quality is mediocre, because I don’t care: I only need the output.

You’ll find the compiled CHM file on the “Releases” page.

If the fonts are too small for you, here’s a fix.

If you aren’t using Windows and don’t have an HTML Help viewer application, rename the .chm to .zip, unpack the content of the “html” subfolder, and open the “misc_All.html”, “misc_Categories.html” or “misc_Technologies.html” page in a web browser. You won’t have an index or search, but it’s still usable this way.

If you have downloaded the .chm from the releases page of this repository, but instead of the content you see a big white nothing, that’s a security restriction Windows applies to help files downloaded from the Internet. Right-click the CHM you’ve downloaded, click “Properties”, check the “Unblock” checkbox, and press OK.

Screenshot

C++ Wrappers

TLDR: press the green “Clone or download” button, choose “Download ZIP”, and copy IntelIntrinsics-master\CppDemo\Intrinsics into your project. See also the FAQ.

I made the first version some time ago, and since then I’ve been using the documentation extensively while working on my C++ code.

After using these intrinsics for a couple of years, some issues started to annoy me.

  1. Worst of all, there are no compile-time platform checks. Because of this, it’s very easy to accidentally use, say, an SSSE3 intrinsic in the code and introduce a fatal “invalid instruction” crash on the PC of a customer who’s using an old AMD CPU that doesn’t support Supplemental SSE3.

  2. The API is C. This means type prefixes instead of namespaces, no overloaded operators, no templates. When used wisely, these C++ features make code easier to both read and write.

  3. Some names, like _mm_srli_epi16, are non-obvious. You can probably figure out that it’s a right bit shift, but to learn how it differs from _mm_sra_epi16 you’ll have to spend time reading the documentation.
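To make the third point concrete, here is a minimal sketch that contrasts the two shifts using the raw intrinsics. It assumes an x86-64 compiler where SSE2 is enabled by default. The helper names `firstLane`, `shiftLogical` and `shiftArithmetic` are made up for this example and are not part of the project.

```cpp
#include <emmintrin.h> // SSE2 intrinsics
#include <cstdint>

// Hypothetical helper: extract the lowest 16-bit lane as a signed value.
inline int16_t firstLane( __m128i v )
{
	return (int16_t)_mm_cvtsi128_si32( v );
}

// "srli" = shift right logical, immediate count.
// Zeros are shifted in from the left, so the sign bit is lost.
inline int16_t shiftLogical( int16_t x )
{
	const __m128i v = _mm_set1_epi16( x );
	return firstLane( _mm_srli_epi16( v, 4 ) );
}

// "sra" = shift right arithmetic, count passed in a vector register.
// Copies of the sign bit are shifted in, so negative values stay negative.
inline int16_t shiftArithmetic( int16_t x )
{
	const __m128i v = _mm_set1_epi16( x );
	const __m128i count = _mm_cvtsi32_si128( 4 );
	return firstLane( _mm_sra_epi16( v, count ) );
}
```

For the input -256 (0xFF00), `shiftLogical` returns 4080 (0x0FF0) and `shiftArithmetic` returns -16 (0xFFF0). For non-negative inputs both give the same result. Nothing in the two names hints at either the sign handling or the different way the shift count is passed.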

At some point I realized that Intel’s documentation XML contains enough information to solve these problems. And because of this CHM documentation project, I already parse that XML into a format my code can work with.

So, this project now builds a set of C++ wrappers around SIMD intrinsics. All problems in computer science can be solved by another level of abstraction. The generated wrappers form a header-only library made of a bunch of inline C++ functions. Everything should be inlined, i.e. this particular layer of abstraction is zero-cost at runtime.

It solves #1 because the wrapped functions are grouped into different headers based on the instruction set. Unless you include the ssse3.hpp header, calls to abs_epi32() and the rest of the SSSE3 wrappers won’t compile.

It solves #2 by replacing prefixes with C++ namespaces. This allows some degree of metaprogramming, i.e. the same “max_ps” C++ function can be compiled to an SSE instruction that processes 4 values, or to an AVX instruction that processes 8 values. In each header file, intrinsics are grouped into namespaces by register size, i.e. the “avx2.hpp” header contains two namespaces: “Intrinsics::Avx” with AVX2 intrinsics that process 32-byte registers, and “Intrinsics::Sse” with AVX2 intrinsics that process 16-byte registers.
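The namespace idea can be sketched roughly as follows. This is an illustration only, not a copy of the generated headers: the wrapper signatures, the `simd` alias, the `lanes` constant and the `maxArrays` function are assumptions made for the example. The AVX part is compiled only when the compiler defines `__AVX__` (e.g. with `-mavx` on GCC and Clang).

```cpp
#include <immintrin.h>
#include <cstddef>

namespace Intrinsics
{
	namespace Sse
	{
		// Load 4 floats from unaligned memory
		inline __m128 loadu_ps( const float* p ) { return _mm_loadu_ps( p ); }
		// Store 4 floats to unaligned memory
		inline void storeu_ps( float* p, __m128 v ) { _mm_storeu_ps( p, v ); }
		// Per-lane maximum of 4 packed floats
		inline __m128 max_ps( __m128 a, __m128 b ) { return _mm_max_ps( a, b ); }
	}
#ifdef __AVX__
	namespace Avx
	{
		// Load 8 floats from unaligned memory
		inline __m256 loadu_ps( const float* p ) { return _mm256_loadu_ps( p ); }
		// Store 8 floats to unaligned memory
		inline void storeu_ps( float* p, __m256 v ) { _mm256_storeu_ps( p, v ); }
		// Per-lane maximum of 8 packed floats
		inline __m256 max_ps( __m256 a, __m256 b ) { return _mm256_max_ps( a, b ); }
	}
#endif
}

// Pick the register width once; the code below never names it again.
#ifdef __AVX__
namespace simd = Intrinsics::Avx;
constexpr size_t lanes = 8;
#else
namespace simd = Intrinsics::Sse;
constexpr size_t lanes = 4;
#endif

// Element-wise maximum of two arrays.
// For simplicity, count must be a multiple of `lanes`.
inline void maxArrays( float* dest, const float* a, const float* b, size_t count )
{
	for( size_t i = 0; i < count; i += lanes )
	{
		const auto va = simd::loadu_ps( a + i );
		const auto vb = simd::loadu_ps( b + i );
		simd::storeu_ps( dest + i, simd::max_ps( va, vb ) );
	}
}
```

The body of `maxArrays` is identical for both register widths. Only the namespace alias and the lane count change, which is hard to achieve with the `_mm_` and `_mm256_` prefixes of the C API.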

It doesn’t solve #3 completely, but it helps by including documentation comments in the header files. Some IDEs show the comment in tooltips. All IDEs have a “go to definition” command. These single-line comments look trivial, but they make it easier to write SIMD code.