rust-lang/rust

Tracking issue for stable SIMD in Rust

alexcrichton opened this issue · 98 comments

This is a tracking issue for RFC 2325, adding SIMD support to stable Rust. There's a number of components here, including:

The initial implementation of this is being added in #48513 and the next steps would be:


Known issues

My one request for the bikeshed (which the current PR already does and may be obvious, but I'll write it down anyway): Please ensure they're not all in the same module as things like undefined_behaviour and [un]likely, so that those rust-defined things don't get lost in the sea of vendor intrinsics.

What will be the story for external LLVM? (lacking MCSubtargetInfo::getFeatureTable())

@scottmcm certainly! I'd imagine that if we ever stabilized Rust-related intrinsics they'd not go into the same module (they probably wouldn't even be platform-specific).

@cuviper currently it's an unresolved question, so if it doesn't get fixed it means that using an external LLVM would basically mean that #[cfg(target_feature = ...)] would always expand to false (or the equivalent thereof)
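Concretely (a minimal sketch; "avx2" is just an example feature), code like the following would then always take the second branch, because the compiler couldn't know which features are available:

#[cfg(target_feature = "avx2")]
fn describe() -> &'static str { "built with AVX2 statically enabled" }

#[cfg(not(target_feature = "avx2"))]
fn describe() -> &'static str { "built without AVX2" }

fn main() {
    println!("{}", describe());
}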

I'd imagine that if we ever stabilized Rust-related intrinsics they'd not go into the same module (they probably wouldn't even be platform-specific).

One option raised in the RFC thread (that I personally quite like) was stabilizing std::intrinsics (only the module), keeping the stable Rust intrinsics in that module (they can already be imported from that location due to a long-standing bug in stability checking), and putting these new platform-specific intrinsics in submodules. IIUC this would also satisfy @scottmcm's request.

To be explicit, under that plan the rustdoc page for std::intrinsics would look like this:


Modules

  • x86_64
  • arm
  • ...

Functions

  • copy
  • copy_nonoverlapping
  • drop_in_place
  • ...

Another naming idea I've just had. Right now the feature detection macro is is_target_feature_enabled!, but since it's so target-specific it may be more apt to call it is_x86_target_feature_enabled!. This'll make it a pain to call on x86/x86_64, though, which could be a bummer.

nox commented

Why keep all the leading underscores for the intrinsics? Surely even if we keep the same names as what the vendors chose, we can still remove those signs, right?

The point is to expose vendor APIs. The vendor APIs have underscores. Therefore, ours do too.

nox commented

It is debatable that those underscores are actually part of the name. They only have one because C has no modules and namespacing, AFAICT.

nox commented

I would be happy dropping the topic if it was discussed at length already, but I couldn't find any discussion specific to the leading underscores.

@nox rust-lang/stdarch#212 --- My comment above is basically a summary of that. I probably won't say much else on the topic.

@nox, @BurntSushi Continuing the discussion from there... since it hasn't been mentioned before:

Leading _ for identifiers in rust often means "this is not important" - so just taking the names directly from the vendors may wrongly give this impression.

@nox @Centril the recurring theme of stabilizing SIMD in Rust is "it's not our job to make this nice". Any attempt made to make SIMD different than what the vendors define has ended with uncomfortable questions and intrinsics that end up being left out. To that end the driving force for SIMD intrinsics in Rust is to get anything compiling on stable.

Crates like faster are explicitly targeted at making SIMD usage easy, fast, and ergonomic. The standard library's intrinsics are not intended to be widely used nor used for "intro level" problems. Leveraging the SIMD intrinsics is quite unsafe (due to target feature detection/selection) and can come at a high cost if used incorrectly.

Overall, again, the goal is to not enable ergonomic SIMD in Rust right now, but any SIMD in Rust. Following exactly what the vendors say is the easiest way for us to guarantee that all Rust programs will always have access to vendor intrinsics.

I agree that the leading underscores are a C artifact, not a vendor choice (the C standard reserves identifiers of this form, so that's what C compilers use for intrinsics). Removing them is neither "trying to make it nicer/more ergonomic" (it's really only a minor aesthetic difference) nor involves any per-intrinsic judgement calls. It's a dead simple mechanical translation for a difference in language rules, almost as much as __m128 _mm_foo(); is mechanically translated to fn _mm_foo() -> __m128;.

@rkruppe do we have a rock solid guarantee that no vendor will ever in the future add the same name without underscores?

@alexcrichton

@rkruppe do we have a rock solid guarantee that no vendor will ever in the future add the same name without underscores?

Can't speak for CPU vendors, but the probability seems very, very low. Why would they add an intrinsic where the difference is only an underscore? Further, as Rust's influence grows, they might not do this simply because of Rust.

A name like mm_foo (no leading underscore at all) is not reserved in the C language, so it can't be used for compiler-supplied extensions without breaking legal C programs. There are a few theoretical possibilities for a vendor to nevertheless create intrinsics without leading underscores:

  • they could expose it only in C++ (with namespacing) -- or, for that matter, another language that isn't C
  • they could break legal C programs (very unlikely, and I'll eat my hat if GCC or Clang developers accept this)
  • A future version of C adds some way of doing namespacing, and people start using it for intrinsics

All extremely unlikely. The first one seems like the only one that doesn't sound like science fiction to me, and if that happens we'd have other problems anyway (such intrinsics may use function overloading and other features Rust doesn't have).

It is debatable that those underscores are actually part of the name. They only have one because C has no modules and namespacing, AFAICT.

This. The whole point is that the underscore-leading names were chosen so as to specifically not clash with user-defined functions. Which means they should never be using non-underscore names. It's against well-established C conventions. Hence, we should just rename them to follow Rust conventions, with no real chance there will be any name clash in the future, providing the vendors stay sane and respect C conventions.

@Centril "probability seems very very low" is what I would say as well, but we're talking about stability of functions in the standard library, so "low probability" won't cut it unfortunately.

@rkruppe I definitely agree, yeah, but "extremely unlikely" to me says "follow the vendor spec to the letter and we can figure out ergonomics later".

Another point worth mentioning for staying exactly to the upstream spec is that I believe it actually boosts learnability. You'll have instant familiarity with any SIMD/intrinsic code written in C, of which there's already quite a lot!

If we stray from the beaten path then we'll have to have a section of the documentation which is very clear about defining the mappings between intrinsic names and what we actually expose in Rust.

I don't think renaming (no leading underscore or any other alteration) is useful. This is simply not the goal and only introduces pain points. I cannot think of a reason other than "I like that more" to justify it. It only introduces the possibility of naming clashes, and "very very unlikely" is not convincing when we can prevent this 100% by not doing it at all.

I think it's the best choice to follow the vendor naming schema as closely as possible, and I think we should even break compatibility if we ever introduce an error in the "public API", rather than doing some renaming like _mm_intr_a to _mm_intr_a2 and starting to diverge from the exact naming schema introduced by the vendor.

nox commented

@alexcrichton But as @rkruppe said, removing the leading underscore isn't about ergonomics, it's about not porting C defects to Rust blindly.

nox commented

Sorry for the double post, but I also want to add that arguing that a vendor may release an unprefixed intrinsic with the same name as a prefixed one is to me as hypothetical as arguing that bool may not be a single byte on some platform we would like to support.

@nox but why stop at the _? We could also fully rename the functions, turning ps and pd into f32 and f64, which would be "more Rust". It's somewhat arbitrary to just remove the leading underscore. And we could argue back and forth about what is ergonomics and what isn't, but I don't think there is a line clear enough that everybody would agree on it.

nox commented

@pythoneer Because the name is what the vendor decided, with a leading underscore because of nondescript limitations of C.

@nox and the explicit goal of stdsimd is to expose this (however defect) vendor defined interface.

@nox and the explicit goal of stdsimd is to expose this (however defect) vendor defined interface.

Interface, sure, but not necessarily the naming conventions!

@alexreg ps is also a naming convention, do you want that also to be changed?

@alexcrichton

"low probability" won't cut it unfortunately.

I think it should. This low probability isn't like 10% or even 1%, but like 0.00001% or so (yeah; I added a bunch of 0s, but I think it is justified). We can also make the probability 0% by notifying vendors of our naming convention so that they never add both _abc and abc.

@Centril I tend to agree... I mean, have you ever seen a version of the C stdlib or a compiler that defines intrinsics without using underscores to prefix their names? I haven't.

@Centril
Somehow I doubt vendors care what the Rust community thinks about how they should name their intrinsics. You think they're gonna keep a list of how everyone wants things to be named and follow that strictly?

Who cares if it has underscores? All this vendor specific stuff will just be wrapped by cleaner nicer to use APIs anyways. If you want it to have Rust-like names for your types, publish a crate with type aliases over the vendor intrinsic names, put it on your resume and call it a day.

There is also tons of documentation using the names those vendors created. Do we really want to create extra confusion and add the burden of having to maintain our own documentation on the SIMD intrinsics? Seems like there's a lot more useful things people could be working on than arguing over two characters preceding the type names... Like maybe... implementing SIMD.
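For what it's worth, that kind of thin wrapper can be tiny. A minimal sketch (the module, alias, and function names here are made up for illustration):

#[cfg(target_arch = "x86_64")]
mod nicer_names {
    use core::arch::x86_64::{__m128i, _mm_add_epi32};

    // A friendlier alias over the vendor type name.
    pub type I32x4 = __m128i;

    // Safe wrapper: SSE2 is part of the x86_64 baseline, so no runtime
    // feature detection is needed before calling this intrinsic.
    pub fn add(a: I32x4, b: I32x4) -> I32x4 {
        unsafe { _mm_add_epi32(a, b) }
    }
}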

@tdbgamer

Somehow I doubt vendors care what the Rust community thinks about how they should name their intrinsics.

All I can say is that we shouldn't underestimate the pull a language like Rust can have, especially as our reach grows.

@Centril
Sure, and if this were something intended for everyday users of Rust I'd agree that intuitive, rust specific names matter. But my understanding is that this isn't really intended to be used by everyone. This will need to be wrapped in a much more user-friendly library for us mortals to use regardless of how these types are named. The whole point of this is to just give people access to it now so we can start building more ergonomic libraries on top of it.

I want to emphasize again that I don't think there are any "ergonomics" gains or any "accessibility" gains or anything like that to be had from removing the leading underscore from these identifiers. The names are far, far down the list of problems with this kind of intrinsic, and removing the underscore is only a minuscule change to the names. It would be solely a consistency thing.

And I'm increasingly unconvinced whether that's worth doing. Muscle memory might make it so that the kind of person most likely to use these intrinsics in Rust would be more annoyed than anything by the difference, even though it is completely mechanical.

Porting naming conventions simply does not make sense though.

but why stop at the _

Because you can google for mm_add_pi32 and it'll include _mm_add_pi32. If you google for _mm_add_f32 you get nothing.

But I do think overall that having the exact same name as https://software.intel.com/sites/landingpage/IntrinsicsGuide/ is a good plan.

@alexreg

have you ever seen a version of the C stdlib or a compiler that defines intrinsics without using underscores to prefix their names?

FWIW the Arm NEON intrinsics don't have an underscore but, AFAIK, they're only available if you include the appropriate header.

@alexcrichton Hypothetically speaking, if a vendor specified how they preferred the intrinsics to be modularized and named in Rust, would you be fine with that?

CryZe commented

Underscore prefixes aren't just a convention; they are reserved by the C Standard for compilers and vendors to use. So for vendors to use anything else would straight up be breaking, as those identifiers are not reserved:

1 Each header declares or defines all identifiers listed in its associated subclause, and optionally declares or defines identifiers listed in its associated future library directions subclause and identifiers which are always reserved either for any use or for use as file scope identifiers.

— All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use.
— All identifiers that begin with an underscore are always reserved for use as identifiers with file scope in both the ordinary and tag name spaces.
— Each macro name in any of the following subclauses (including the future library directions) is reserved for use as specified if any of its associated headers is included; unless explicitly stated otherwise (see 7.1.4).
— All identifiers with external linkage in any of the following subclauses (including the future library directions) and errno are always reserved for use as identifiers with external linkage.
— Each identifier with file scope listed in any of the following subclauses (including the future library directions) is reserved for use as a macro name and as an identifier with file scope in the same name space if any of its associated headers is included.

2 No other identifiers are reserved. If the program declares or defines an identifier in a context in which it is reserved (other than as allowed by 7.1.4), or defines a reserved identifier as a macro name, the behavior is undefined.

Therefore this is clearly just a C artifact.

Thanks for providing the supporting evidence, @CryZe. I still strongly believe we shouldn't be porting a "C artifact", as you correctly point out.

Underscore prefixes aren't just a convention, they are reserved by ...

Do the historical details of why the underscores are there really matter?
We are implementing a well-known third-party interface and they are part of the spec, that's what is important.

CryZe commented

The point is that the underscore prefixes are a namespacing scheme introduced in the C standard for the compilers and vendors. So for all intents and purposes, that's just how C mangles its namespaces into the identifiers.

I'm definitely not entirely sure whether it really does make sense to remove those prefixes in Rust, as all the C-based documentation has them. However, I also want to point out that when looking at C documentation you already have to do a certain amount of "translation" from the C documentation to actually using it in Rust, as there are not only syntactical differences, but also already differences in the type names, as seen in Rust's libc crate: c_int, c_short, c_long, ...

So it wouldn't be unprecedented to remove the namespacing from the vendor instructions, as we 1. already have different names in libc for the type names at least and 2. the underscore already has a different meaning prescribed to it in Rust, which could lead to potential footguns where you forget to use a value, but the warning isn't shown because you thought you were supposed to use an underscore prefix with these vendor instructions.
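To illustrate the Rust convention being referred to (a self-contained example, unrelated to stdsimd itself):

fn main() {
    let unused = 1;    // warning: unused variable: `unused`
    let _ignored = 2;  // no warning: the leading `_` marks it as intentionally unused
}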

  1. the underscore already has a different meaning prescribed to it in Rust, which could lead to potential footguns where you forget to use a value, but the warning isn't shown because you thought you were supposed to use an underscore prefix with these vendor instructions.

If it weren't for this, I'd be in favour of keeping them for documentation consistency since they're so low-level.

However, given all of the arguments on the topic of how harmless it would be to remove them, I think that integrating properly with Rust's "meant to be unused" conventions is more important.

(After all, the consistency and comprehensiveness of these sorts of compile-time checks and lints are the reason Rust's strengths can't simply be retrofitted into established languages like C++ which need to retain backwards compatibility with their old design shortcomings.)

Procedural note: this tracking issue has gained ~40 comments about leading underscores in the space of less than 24 hours. Many of the posts appear to be re-iterating points that have already been made earlier in the thread. Before commenting, please consider whether your argument (or something very similar) has already been made.

I've posted a PR for renaming the is_target_feature_detected! macro.

@rfcbot fcp merge

While we merged this into the standard library relatively recently, SIMD has been in the works for a very long time now and I think we're in a very good place to stabilize it. I'd like to propose that this tracking issue is stabilized, namely the std::arch module and the x86 and x86_64 submodules. Language features stabilized here are things like #[target_feature], with the full list of things being stabilized still at the top of this issue.
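For anyone following along, the pattern this enables on stable looks roughly like the following sketch (x86_64 assumed; the function names are invented):

#[cfg(target_arch = "x86_64")]
fn sum(xs: &[i32]) -> i32 {
    if is_x86_feature_detected!("avx2") {
        // Sound: we just verified at runtime that AVX2 is available.
        unsafe { sum_avx2(xs) }
    } else {
        xs.iter().sum()
    }
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(xs: &[i32]) -> i32 {
    // A real implementation would use intrinsics from std::arch::x86_64
    // (e.g. _mm256_add_epi32); the body is kept scalar to keep the sketch short.
    xs.iter().sum()
}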

Team member @alexcrichton has proposed to merge this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot approved

@rfcbot reviewed

(Whoops)

🔔 This is now entering its final comment period, as per the review above. 🔔

Assuming the FCP progresses smoothly I've posted a PR implementing the stabilization here at #49664

Will it be possible to migrate to something like #[target(feature="..")] and related macros in the future? I know it's a bit late to rename things, but it will be unfortunate if we'll end up with duplicated functionality.

Zoxc commented

Is cfg(target_feature) still supposed to be context sensitive? That is still a terrible idea, but it doesn't appear to be so on nightly.

@Zoxc you may wish to comment on #42515, the issue dedicated to that.

@newpavlov at this point I think it's a bit late to hold up stabilization of SIMD on a pre-RFC, but if that ends up being stabilized we can always rename via deprecation.

The final comment period is now complete.

I don't know if it's too late to still tune things here, but the original RFC had two features that were changed during the discussion over there:

  • the submitted RFC put all intrinsics in std::arch::*, the revised RFC in std::arch::{arch_name}.
  • the submitted RFC used is_feature_detected! for run-time feature detection, the revised RFC uses is_{arch_name}_feature_detected!

The RFC was accepted before those changes were made. The changes were made in the RFC at the end of February, implemented at the beginning of March, and the FCP went through mid April. Right now we have ~2 months of experience with these changes.

In any case, going through the RFC, I cannot pinpoint any concrete argument about why:

  • the intrinsics of each architecture should be in a different std::arch::{arch_name} module,
  • the architecture name should be part of the is_..._feature_detected! macros.

In particular, std::arch only contains one single module, the one of the current architecture, and that's it. Also, there is only one is_..._feature_detected! macro re-exported, the one of the current architecture.

These last-minute changes make it more painful than necessary to write code even for x86, where one has to:

#[target_feature(enable = "sse3")]
unsafe fn foo() {
    #[cfg(target_arch = "x86")] use core::arch::x86::*;
    #[cfg(target_arch = "x86_64")] use core::arch::x86_64::*;
    /* ... */
}

all over the place, or at the top level, to avoid having to do this all over the place. Things don't get better when targeting multiple architectures. What before was horrible:

#[cfg_attr(any(target_arch = "x86", target_arch = "x86_64"), target_feature(enable = "sse4.2"))]
#[cfg_attr(any(target_arch = "arm", target_arch = "aarch64"), target_feature(enable = "neon"))]
unsafe fn foo() {
    use core::arch::*;

    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))] {
        if is_feature_detected!("avx2") { ... } else { ... }
    }
    #[cfg(any(target_arch = "arm", target_arch = "aarch64"))] {
        if is_feature_detected!("crypto") { ... } else { ... }
    }
}

now is worse:

#[cfg_attr(any(target_arch = "x86", target_arch = "x86_64"), target_feature(enable = "sse4.2"))]
#[cfg_attr(any(target_arch = "arm", target_arch = "aarch64"), target_feature(enable = "neon"))]
unsafe fn foo() {
    #[cfg(target_arch = "x86")] use core::arch::x86::*;
    #[cfg(target_arch = "x86_64")] use core::arch::x86_64::*;
    #[cfg(target_arch = "arm")] use core::arch::arm::*;
    #[cfg(target_arch = "aarch64")] use core::arch::aarch64::*;

    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))] {
        if is_x86_feature_detected!("avx2") { ... } else { ... }
    }
    #[cfg(target_arch = "arm")] {
        if is_arm_feature_detected!("crypto") { ... } else { ... }
    }
    #[cfg(target_arch = "aarch64")] {
        if is_aarch64_feature_detected!("crypto") { ... } else { ... }
    }
}

This is particularly worrying if we want to add new "feature sets" for ergonomics like simd128 and simd256 since before the changes the above would just become:

#[target_feature(enable = "simd128")]
unsafe fn foo() {
    use core::arch::*;
    if is_feature_detected!("crypto") { ... } else { ... }
}

I remember that to me they sounded like a potentially good idea back then, so I did not give them more thought (I was more in the "I want SIMD now" mood). But now that the love story has faded and I've had the chance to use them a couple of times, I've clashed against them every single time:

Anyways, can somebody summarize why doing those two changes were a good idea?

In particular for the first change of putting the intrinsics in std::arch::{arch_name}, AFAIK we are never going to add more modules to std::arch because that would mean that the current code is being compiled for two archs at the same time, and in that case, one arch shouldn't be able to access the intrinsics of the other anyways. For the run-time feature detection macros, the benefits are smaller (but still there), since each arch has different intrinsics. But one idiom I would like to use is:

#[cfg(target_arch = "arm")]
#[target_feature(enable = "simd128")]
unsafe fn bar() { ... }

#[cfg(target_arch = "aarch64")]
#[target_feature(enable = "simd128")]
unsafe fn bar() { ... }

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "simd128")]
unsafe fn bar() { ... }

fn foo() {
   if is_feature_detected!("simd128") { unsafe { bar() } } else { fallback() }
}

and the named macros wouldn't allow that.


There are two backwards-compatible ways of fixing this:

  • re-exporting all of std::arch::{arch_name}::* via, e.g., std::arch::current::*
  • adding an is_feature_detected!("...") macro that dispatches to the named ones depending on the architecture (a rough sketch follows below).
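A rough user-side sketch of that second option (not a proposed std API, just an illustration of the dispatch; only x86/x86_64 shown):

macro_rules! is_feature_detected {
    ($feature:tt) => {{
        #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
        { is_x86_feature_detected!($feature) }
        #[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
        { false } // other architectures would forward to their own macros here
    }};
}

fn has_avx2() -> bool {
    // The caller no longer has to name the architecture.
    is_feature_detected!("avx2")
}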

So I don't think we should block landing this on these ergonomic issues. In any case, I don't feel I understand the real reasons behind the change, so maybe adding these conveniences defeats their purpose.


cc @alexcrichton @rkruppe @eddyb @hsivonen @BurntSushi @Ericson2314 (those who had opinions about this in the RFC)

@gnzlbg this was something I personally forgot about in the original RFC. In the standard library, anything that isn't portable currently requires, stylistically, the "non-portable part of it" to appear in the path through which you use it. For example, Windows-specific functionality is at std::os::windows. Following suit for SIMD, it was natural to place architecture-specific intrinsics in submodules of std::arch as a warning that what you're using is indeed not portable and is specific to only one platform.

The name of the macro followed the same rationale: it ensures that you aren't tricked into thinking it can be invoked in a portable context, and instead explicitly signals that it's not portable.
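As a concrete illustration of that convention, a small sketch (the helper function is made up): the platform appears in the import path, so the non-portability is visible at the use site:

#[cfg(windows)]
fn file_attributes(path: &std::path::Path) -> std::io::Result<u32> {
    // The `windows` segment in the path signals "this is not portable".
    use std::os::windows::fs::MetadataExt;
    Ok(std::fs::metadata(path)?.file_attributes())
}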

In the standard library, anything that isn't portable currently requires, stylistically, the "non-portable part of it" to appear in the path through which you use it. For example, Windows-specific functionality is at std::os::windows. Following suit for SIMD, it was natural to place architecture-specific intrinsics in submodules of std::arch as a warning that what you're using is indeed not portable and is specific to only one platform.

Is this something that will be covered with the new portability lint? Also, by that rationale, should everything in std::arch be in target feature submodules?

@parched ideally, yes! If that exists we could perhaps consider moving everything wholesale to different modules.

we could perhaps consider moving everything wholesale to different modules.

For x86/x86_64 this should be easily doable since we already do this internally in stdsimd. For other platforms we can do this on a best-effort basis.

vks commented

core::simd::FromBits still points to this issue. Shouldn't it point to an open issue?

So should we do the changes? (add is_x86_64_feature_detected, expose the feature submodules instead of all intrinsics directly, ...) We don't have much time to do this if we want to, and I could do this on Friday this week.

Er, sorry, I misread, I think. I do not think we should change anything. Perhaps one day intrinsics can live directly in std::arch and be easier to use with the portability lint, but we don't have the portability lint yet.

Is there any word on when we can stabilize instrinsics like https://doc.rust-lang.org/core/arch/x86_64/fn.cmpxchg16b.html ?
I am running into some issues implementing some lockfree algorithms without it.

comex commented

Would stabilizing AtomicU128 (theoretically tracked in #32976) satisfy your use case, or is there some reason you specifically need the x86 intrinsic?

That would do it as long as it has weak compare-and-exchange or compare-and-swap. I really just need a 128-bit compare-and-swap to fit a pointer and refcount. How is that implemented on archs like SPARC and PPC that don't support it that easily? LL/SC?
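To make the use case concrete, a small sketch of just the layout (the 128-bit compare-and-swap itself would need cmpxchg16b or a future AtomicU128, neither of which is stable at this point):

fn pack(ptr: *mut u8, refcount: u64) -> u128 {
    // Low 64 bits: pointer, high 64 bits: reference count.
    ((refcount as u128) << 64) | (ptr as usize as u128)
}

fn unpack(word: u128) -> (*mut u8, u64) {
    ((word as usize) as *mut u8, (word >> 64) as u64)
}

fn main() {
    let mut value = 42u8;
    let word = pack(&mut value, 1);
    let (ptr, refcount) = unpack(word);
    assert_eq!(refcount, 1);
    assert_eq!(unsafe { *ptr }, 42);
}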

AtomicU128 will only be available on targets that support it. AFAIK that's only x86_64 and AArch64.

Ah, it could theoretically be implemented with double-width LL/SC on other architectures, I think. Is that a possible thing to do?

Only AArch64 has 2x64-bit LL/SC.

Are the half-precision x86/x86_64 functions intended to remain unstable? The compiler errors and the documentation point to this issue, but it was closed quite a while ago along with the stabilization PR.

EDIT: I also noticed that the f16c feature isn't reported in CARGO_CFG_TARGET_FEATURE in the stable compiler when it's explicitly requested: RUSTFLAGS="-C target-cpu=x86-64 -C target-feature=+sse3,+sse4.1,+avx,+f16c" cargo test. However, it does show up in nightly.

I think someone just needs to send a stabilization PR for that feature. But first we need to ensure that all the intrinsics covered by the f16c feature are properly implemented.

Any updates on stabilizing the F16C instructions?

@novacrazy I don't think there's anything blocking F16C intrinsics, feel free to send a stabilization PR for them.

There are four occurrences of #[unstable(feature = "stdsimd", issue = "48556")] in the codebase (this issue number is 48556). This seems to conflict with the fact that this issue is closed. Should these occurrences be referencing a different issue? See also: #76412

I'm going to reopen this issue. SIMD was only stabilized on x86/x86_64, not on other architectures.

I believe the FCP label should be removed; that was for something nearly three years ago.

@Amanieu I see that the part about aarch64 and arm in stdarch is still unstable. The main reason for it is the lack of neon instructions? I can help add them if so.

Yes that is the main reason why arm/aarch64 is still unstable.

Is it possible to stabilize the existing instructions without waiting for everything to be done?

It looks like rust-lang/cargo#9181 is going to break the use of NEON intrinsics in qcms, so it would be nice to have a partial solution sooner rather than later.

@jrmuizel you could send a PR stabilizing those intrinsics; I can't say whether it would be accepted, but if the intrinsics aren't planned to change it seems reasonable to me.

I filed rust-lang/stdarch#1125 to see if it's possible.

I'm happy to help out with SIMD issues as I have extensive knowledge of instruction sets.

Give me a ping if you want to discuss.

Hi, I wonder what the current status of SIMD in Rust is. I have to write SIMD on ARM chips (Android and iOS), so what should I do?

To my best knowledge, I should first switch to the nightly channel and then use https://github.com/rust-lang/packed_simd (or should I use https://github.com/rust-lang/portable-simd ?). Is this the suggested way, or is there a better approach?

Thanks for any suggestions!

On the nightly channel you can use the NEON intrinsics in std::arch::aarch64/std::arch::arm.
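For reference, a minimal sketch of what that looks like on nightly (aarch64 shown; the function name is made up):

#[cfg(target_arch = "aarch64")]
fn add4(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    use std::arch::aarch64::{vaddq_f32, vld1q_f32, vst1q_f32};
    // NEON is part of the baseline for the aarch64 targets, so calling these
    // intrinsics without runtime feature detection is sound here.
    unsafe {
        let sum = vaddq_f32(vld1q_f32(a.as_ptr()), vld1q_f32(b.as_ptr()));
        let mut out = [0.0f32; 4];
        vst1q_f32(out.as_mut_ptr(), sum);
        out
    }
}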

Thank you!

Is there any roadmap or an overview of what remains to bring this to stable?

the portable-simd repo has many tracking issues

https://github.com/rust-lang/portable-simd

We discussed this in today's @rust-lang/lang meeting.

This seems to be the tracking issue for target-specific SIMD (not portable SIMD, which is being tracked and developed elsewhere). And we've shipped target-specific SIMD on x86-64 and aarch64. There will always be more CPUs to support, but that doesn't mean this issue needs to remain open indefinitely.

We've shipped this; closing.

For reference the ARM32 stabilization is tracked in #90972

The unstable book still links to this issue, which is now closed, but the stdsimd feature continues to exist as a catch-all for platform-specific intrinsics that are not stabilized.

Are there any issues tracking unstabilized features on supported platforms (e.g. aes on arm)?

It's confusing that if you try to use AVX512 intrinsics on stable the compiler error references this closed issue. Is there an issue tracking AVX512 stabilization and would it be possible to reference it rather than this one?

What is the state of this feature intended to be? It's not clear from the documentation. In particular:

  • This documentation page contains a warning that the float32x2_t type is a "nightly-only experimental API", and links to this issue
  • The source code linked from that doc page seems to suggest that the type is actually stable (no clue how the stability warning in the rendered documentation got generated...):
    #[cfg_attr(target_arch = "aarch64", stable(feature = "neon_intrinsics", since = "1.59.0"))]
    pub struct float32x2_t(pub(crate) f32, pub(crate) f32);
  • When I compile code using the most recent stable toolchain (specifically, cargo check --target=aarch64-unknown-linux-gnu), it succeeds

@joshlf Looks like it's shown as stable on aarch64 and unstable on other archs. That's what the conditional cfg_attr on stable means. The docs are probably generated for x86_64. That's why you see the warning inside official docs. I am not sure what was really intended. It really does not make sense.

What is the status of AVX 512 registers?

This was closed in #48556 (comment) after a lang conversation, but given that this is still mentioned by unstable attributes (#48556 (comment)), I'm going to re-open it for libs-api, and ask them how they'd like to track those things under this (like https://doc.rust-lang.org/nightly/core/arch/x86_64/struct.__m512i.html).

(Basically I think that if the nightly docs still point here for something, it should probably be open until those things are stable or the mentions are changed to point at a different issue.)

I've got a branch which adds proper tracking issues to all the intrinsics in stdarch, but it's still a work-in-progress.

Can we get the AVX-512 intrinsics stabilized please?

I specifically need _mm_rolv_epi64 to avoid confusing codegen from LLVM, which cost me 4 hours of headaches to figure out while folding a shift+shift+or down into a proper rotate.
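For context, the scalar shape of the pattern in question looks roughly like this sketch (shift amounts assumed to be strictly between 0 and 64); _mm_rolv_epi64 would express the per-lane variable rotate directly once the AVX-512 intrinsics are stable:

fn rotl_each(vals: &mut [u64], amounts: &[u32]) {
    for (v, &k) in vals.iter_mut().zip(amounts) {
        debug_assert!(k > 0 && k < 64); // avoids the shift-by-64 edge case in this sketch
        // Same result as v.rotate_left(k), written in the shift+shift+or form
        // that the optimizer has to recognize and fold into a rotate instruction.
        *v = (*v << k) | (*v >> (64 - k));
    }
}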

Not sure where else to ask: How would I find out about the state of (non-simd) WASM intrinsics? I'm specifically wondering whether the three memory_atomic_ fn's (two of them being required for parking_lot_core's thread parking) could be stabilized soon-ish.

Some builds have been failing on nightly because of this, and just to check if I'm understanding everything correctly, this has been stabilized on nightly, is that correct? Which is why the feature attribute fails.

The stdsimd feature has been split into sub-features: #27731 (comment)

Here is the full set of new tracking issues for what stdsimd was previously tracking:

I'm going to close this issue since it isn't tracking anything anymore.