rust-lang/rust

`cfg_target_feature` and `target_feature` don't interact properly

parched opened this issue ยท 18 comments

This should panic but it doesn`t

#[target_feature = "+avx"]
pub fn should_panic() {
#[cfg(target_feature = "avx")]
    panic!("have_avx");
}

I would like for this to work because I have a macro that generates 2 copys of a function, one with #[target_feature = "+feat"] and one without and I want to conditionally use some assembly when the feature is available.

CC #29717

It does panic.

#![feature(cfg_target_feature, target_feature)]

#[target_feature = "+avx"]
fn should_panic() {
    #[cfg(target_feature = "avx")]
    panic!("have_avx");
}

fn main() {
    should_panic();
}
$ rustc -C target-feature=+avx rust42515.rs
$ ./rust42515 
thread 'main' panicked at 'have_avx', rust42515.rs:6
note: Run with `RUST_BACKTRACE=1` for a backtrace.

If you don't compile with the appropriate target-feature settings, then the cfg is not set and the program (correctly) does not panic. I think you can think of -C target-feature=+avx similarly as gcc's -mavx flag.

But #[target_feature = "+avx"] should set it for that function without passing anything extra on the command line.

What do gcc and clang do? Can you explain more about your use case and why you expect this to happen?

Well gcc and c only have the preprocessor macros so that would behave like cfg_target_feature currently does. However, I assumed rust attributes are more context aware.

My use case is for this procedural macro attribute I've written https://github.com/parched/runtime-target-feature-rs which generates 2 copies of the function, one with #[target_feature = "+feat"] and one without. I would like to have some conditional code based on which copies is being compiled.

@parched gcc and Clang have the equivalent of #[target_feature] as well, e.g., __target__("avx") can be applied to functions. Both gcc and Clang support that. So I would be very interested to see how their preprocessor macros interact with __target__("avx") (if at all).

I assumed rust attributes are more context aware

To give you some context, the #[target_feature] attribute is highly experimental. Its only purpose is to expose an LLVM feature directly so that we can experiment with it to help move along the SIMD stabilization effort. What its iteraction with #[cfg(target_feature)] is isn't clear at this time, so detailed use cases would be very helpful.

I would like to have some conditional code based on which copies is being compiled.

But #[target_feature] isn't a conditional compilation flag. You need to do some kind of runtime dispatch using cpuid to figure out which function to execute. I still don't understand the use case for using cfg(target_feature) inside a #[target_feature] function.

More explicitly the macro I've written, runtime_target_feature turns

#[runtime_target_feature("+avx")]
pub fn sum(input: &[u32]) -> u32 {
    #[cfg(target_feature = "avx")]
    { /* write some assembly using avx */ }

    #[cfg(not(target_feature = "avx"))]
    { /* fallback code */ input.iter().sum()}
}
pub fn sum(input: &[u32]) -> u32 {
    pub extern crate runtime_target_feature_rt as rt;

    static PTR: rt::atomic::Atomic<fn (&[u32]) -> u32> = rt::atomic::Atomic::new(setup);

    fn setup(input: &[u32]) -> u32 {
        let chosen_function = if rt::have_avx( ) { // have_avx reads cpuid
            enable_avx
        } else {
            default
        };
        PTR.store(chosen_function, rt::atomic::Ordering::Relaxed);
        chosen_function(input)
    }

    fn default(input: &[u32]) -> u32 {
        #[cfg(target_feature = "avx")]
        { /* write some assembly using avx */ }

        #[cfg(not(target_feature = "avx"))]
        { /* fallback code */ input.iter().sum()} // I would like this compiled here
    }

    #[target_feature = "+avx"]
    fn enable_avx(input: &[u32] ) -> u32 {
        #[cfg(target_feature = "avx")]
        { /* write some assembly using avx */ } // but this compiled here

        #[cfg(not(target_feature = "avx"))]
        { /* fallback code */ input.iter().sum()}
    }

    PTR.load(rt::atomic::Ordering::Relaxed)(input)
}

@BurntSushi I mean they only have macros for detecting the target features so the couldn't possibly change inside a function with __target__("avx") which is annoying. It would be great if rust cfg could be smarter than that though :)

@parched Sorry, but I still don't understand your use case. What problem are you trying to solve at a high level? For example, having a cfg(not(target_feature = "avx")) inside a #[target_feature = "+avx"] function just doesn't make any sense to me at all. If you wind up calling a #[target_feature = "+avx"] function when avx isn't actually enabled, then you're in trouble no matter what.

And I still don't understand what's fundamentally different between Rust's system and gcc's/Clang's. I think you need to back way up and help me try to understand what specific problems you're facing.

TimNN commented

@BurntSushi: I think the use case here is a matter of convenience: Write a function in which some parts are marked #[cfg(target_feature = "avx")] or #[cfg(not(target_feature = "avx"))], then use a macro to create two versions (copies) of that function, one with #[target_feature = "+avx"] and one without, along with supporting code to choose the correct one at runtime.

The difference to gcc/Clang is, as I understand it, that since the preprocessor runs before any c parsing happens it cannot know if it is inside a function marked __target__("avx"), whereas the rust compiler should be / is / could be well aware that it's currently looking at a #[cfg(target_feature = "avx")] inside a function marked #[target_feature = "+avx"].

@TimNN Thanks for explaining that. It makes a little more sense, although I don't understand why the macro can't just strip out the cfg'd code that's never going to execute.

Popping up a level, I'm not particularly familiar with this aspect of the compiler (can arbitrary attributes impact the resolution of cfgs?) and I don't know if we have any precedent for it.

Yes just what TimNN says. @BurntSushi I was actually just thinking that on my bike ride home and that would definitely work. It would mean the macro would need to know all the target features and their hierarchy. I.e., if the feature enabled was sse3 the macro would have to know to strip out any cfg not sse2.

Does the following help?

#[inline(always)] 
fn foo_impl()  {
  // ...generic implementation here...
  #[cfg(traget_feature = "+avx")] {
     // some avx specific optimizaiton here
  }
  #[cfg(not(target_feature = "+avx"))] {
    // some non-avx generic alternative here
  }
  // ... more generic implementation ...
}

#[target_feature = "+avx"] fn foo_avx() { foo_impl() }
#[target_feature = "+sse3"] fn foo_sse3() { foo_impl() }

That is, could we propagate the target features of foo_avx and foo_sse3 somehow (handwaiving) to foo_impl ?

The trivial case of this is what @parched proposed.

Propagating through function calls as @gnzlbg described is entirely incompatible with how #[cfg] works (they are processed early during compilation, certainly before name resolution). What @parched suggested, restricted to lexical nesting, could theoretically be made work, but requires special treatment of the target_feature cfg -- today, cfgs are crate-global. Therefore it would be a weird special case, which seems rather unappealing.

Edit: Sorry, the following doesn't make sense

Regarding the application @TimNN mentioned, note that one can pass the target_feature attributes to the macro, i.e., instead of macro invocations like

#[target_feature(enable="foo")]
create_fn!(some, args);

one could have:

create_fn!(#[target_feature(enable="foo")], some, args);

Therefore it would be a weird special case, which seems rather unappealing.

Yes, but IMO worth it.

An alternative would be to make #[target_feature] do the preprocessing like how I plan (someday) to make my procedural macro do it to work around this.

From the discussion here, it seems like the code-gen-time context-aware behaviour of cfg!(target_feature = "...") as per RFC 2045 (to statically resolve to true or false based on the target features in the current context) is off the cards. Is that correct? The only way I can think of implementing it reliably would be a new LLVM intrinsic.

I wouldn't necessarily say there is no design that can make this work, but it is definitely is going to require a large amount of work to make it happen, and would most likely need to happen across an edition boundary.

The most appropriate way to get a similar, but not equivalent, behaviour would be to use the is_*_feature_detected macros. These, sadly, share a similar limitation in that they do not unconditionally resolve to a constant true in functions that have a feature enabled (which is not enabled globally). However, unlike with cfg!() and #[cfg()], adjusting the behaviour of these feature_detected macros wouldn't be as precarious.