rust-lang/rust

Consider deprecation of UB-happy `static mut`

eddyb opened this issue Β· 204 comments

eddyb commented

static mut is almost impossible to use correctly, see rust-lang-nursery/lazy-static.rs#117 for an example in the widely-used lazy-static.

You must be able to show that every borrow of the static mut is not reentrant (as opposed to regular interior mutability, which only requires reentrance-freedom when accessing data), which is almost entirely impossible in real-world scenarios.


We have a chance to remove it in Rust 2018 and force people to use a proper synchronization abstraction (e.g. lazy_static! + Mutex) or, in lieu of one, thread_local! / scoped_thread_local!.

If they were using static mut with custom synchronization logic, they should do this:

pub struct CustomSynchronizingAbstraction<T> {
    /* UnsafeCell / Cell / RefCell / etc. around data, e.g. `T` */
}
// Promise that proper synchronization exists *around accesses*.
unsafe impl<T: Sync> Sync for CustomSynchronizingAbstraction<T> {}

And then use CustomSynchronizingAbstraction with regular statics, safely.
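Filling in the sketch above, a minimal runnable version might look like this (the spinlock flag, the `Send` bound, and the `with` method are my own choices for illustration; it also assumes a `const fn` constructor, which wasn't stable when this issue was filed):

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};

pub struct CustomSynchronizingAbstraction<T> {
    locked: AtomicBool,
    data: UnsafeCell<T>,
}

// Promise that proper synchronization exists *around accesses*.
unsafe impl<T: Send> Sync for CustomSynchronizingAbstraction<T> {}

impl<T> CustomSynchronizingAbstraction<T> {
    pub const fn new(value: T) -> Self {
        CustomSynchronizingAbstraction {
            locked: AtomicBool::new(false),
            data: UnsafeCell::new(value),
        }
    }

    /// Run `f` with exclusive access to the data.
    pub fn with<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        // Spin until we acquire the flag; real code would park or disable
        // interrupts, but this keeps the sketch self-contained.
        while self.locked.swap(true, Ordering::Acquire) {}
        // Safety: the flag guarantees exclusive access to `data` here.
        let result = f(unsafe { &mut *self.data.get() });
        self.locked.store(false, Ordering::Release);
        result
    }
}

// A regular (non-`mut`) static, usable from safe code:
static COUNTER: CustomSynchronizingAbstraction<u32> =
    CustomSynchronizingAbstraction::new(0);

pub fn bump() -> u32 {
    COUNTER.with(|c| {
        *c += 1;
        *c
    })
}
```

The point is that all the `unsafe` is confined to the abstraction; callers touch the static through an entirely safe API.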

This matches the "soundness boundary" of Rust APIs, whereas static mut is more like C.

cc @RalfJung @rust-lang/compiler @rust-lang/lang


2023-08 Note from triage, many years later: there's now https://doc.rust-lang.org/1.71.0/std/cell/struct.SyncUnsafeCell.html in the library, added by #95438, which can be used instead of static mut. But static mut is not even soft-deprecated currently.

I don't know at all whether we should do this or not atm. But to make this decision I have an...

...Idea: We should do a crater run to scrutinize how many legitimate usages there are and how many have UB.

cc @rust-lang/wg-unsafe-code-guidelines -- we have this group just for this sort of thing =)

I'm 100% behind getting rid of static mut. Any attempt to use it which isn't UB is replaceable with something else which is safe, likely a Mutex or something thread local. Or, per the original comment, a custom alternate type implementing Sync directly.

We should clearly document the undefined behavior, but I don't think we should remove the ability to have (unsafe) global mutable locations directly, without any layer of indirection. That would make it more difficult to build certain types of lock-free synchronization, for instance.

eddyb commented

@joshtriplett But you can always use unsafe impl Sync + static?

Can you give an example of what you mean? You can't do much more with static mut than you could with a static UnsafeCell.

But you can always use unsafe impl Sync + static?

Yeah, and that's just as unsafe. So TBH I do not see the point.

eddyb commented

@RalfJung No, it's not, and I've explained why. With unsafe impl Sync, you only have to prove the data accesses correct, but with static mut, the references themselves can conflict.

EDIT: Now if "references exist" cannot be UB, ever, then this is not a problem, but IIUC, merely creating conflicting references can be UB.
Also, you can't use a static mut from safe code, only a static. If the static mut is used correctly, why spread unsafety when you can wrap it in a sound API?

It's not actually different from an UnsafeCell if we replace each &var with unsafe {&*var.get()}, right? Then the aliasing rules would be the same; the difference is that with the UnsafeCell you can still keep references to the cell itself?

@eddyb Ah okay I see -- that has nothing to do with unsafe impl though, and everything to do with &UnsafeCell.

I agree &UnsafeCell is safer than static mut.

eddyb commented

@RalfJung Okay I should make it clearer that the unsafe impl is for "custom synchronization abstraction". I'll go edit the issue description.

If we could weaken static mut to only give you raw pointers, that could help...

... though people would probably still turn them into references ASAP.

If they were using static mut with custom synchronization logic, they should do this:

How would they instantiate their custom type in stable Rust? User-defined const fns are unstable, and even if we do stabilize min_const_fn, that doesn't include anything that has bounds, so even with that your example won't compile on stable.

To me this sounds like it would reduce what you can do in the 2018 edition. In the 2015 edition you can create static mut variables that contain primitive types, but in the 2018 edition you cannot do the equivalent, unless I'm missing something like adding a RacyCell type with a const constructor to core. (Here I'm assuming that it's not certain whether min_const_fn will make it into the edition release.)

eddyb commented

@japaric Okay, that's a good point. I removed the bound from the struct definition, does that help? If you use it with a T: !Sync type you just can't put it in a static, but it should be fine otherwise.

Note that you can use an associated constant if you don't need arguments in your constructor (e.g. lazy-static already uses Lazy::<T>::INIT for this pattern).
Otherwise, yes, the min_const_fn requirement is a problem.

Depending on what you're doing you might be able to make your fields public, but as was the case with lazy-static, that's not always possible.

cc @Centril @oli-obk Do we want to force people to replace their "custom synchronization abstractions" involving static muts with ones that require const fn?

In the 2015 edition you can create static mut variables that contain primitive types

What kinds of types? Atomics with relaxed ordering should probably be preferred either way even in single-threaded code, unless you have some sort of proof that interrupts are disabled.
(I'm assuming that you just can't trust anything to be single-threaded in user-space, in a library, and in kernel-space / embedded you have interrupts to worry about, wrt reentrance-safety)

like adding a RacyCell type with const constructor to core

Hmm, RacyCell<PubliclyConstructableType> would be okay, I think; you'd just have to implement methods on it (so you'd wrap it in another publicly-constructible type, or use an extension trait).

Given pub struct Abstraction(pub RacyCell<AbstractionData>);, I think that &'static Abstraction is significantly safer than static mut but only if it gives you a *mut AbstractionData, not a &mut.

I wish the min_const_fn idea popped up sooner and we stabilized it already 😒.
Another 3 years of randomly finding unsound static mut uses doesn't sound fun.

eddyb commented

@rfcbot poll @rust-lang/libs @rust-lang/lang Should we add RacyUnsafeCell?

#[repr(transparent)]
pub struct RacyUnsafeCell<T>(UnsafeCell<T>);

unsafe impl<T: Sync> Sync for RacyUnsafeCell<T> {}

impl<T> RacyUnsafeCell<T> {
    pub const fn new(x: T) -> Self {
        RacyUnsafeCell(UnsafeCell::new(x))
    }
    pub fn get(&self) -> *mut T {
        self.0.get()
    }
}

Team member @eddyb has asked teams: T-lang, T-libs, for consensus on:

Should we add RacyUnsafeCell?
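For concreteness, the proposed type could be used with an ordinary static like this (the COUNTER static and bump function are my own usage sketch; the caller still discharges the safety obligations):

```rust
use std::cell::UnsafeCell;

#[repr(transparent)]
pub struct RacyUnsafeCell<T>(UnsafeCell<T>);

unsafe impl<T: Sync> Sync for RacyUnsafeCell<T> {}

impl<T> RacyUnsafeCell<T> {
    pub const fn new(x: T) -> Self {
        RacyUnsafeCell(UnsafeCell::new(x))
    }
    pub fn get(&self) -> *mut T {
        self.0.get()
    }
}

// A regular static: no `static mut`, and no `&mut` is ever created.
static COUNTER: RacyUnsafeCell<u32> = RacyUnsafeCell::new(0);

pub fn bump() -> u32 {
    // Safety: this example is single-threaded, so there is no data race,
    // and no reference to the data outlives either statement.
    unsafe {
        *COUNTER.get() += 1;
        *COUNTER.get()
    }
}
```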

eddyb commented

@RalfJung What we want is &'static Abstraction, I think, and *mut only in methods of it.

That is, the user of the abstraction should be outside of the "unsafe zone" and be able to use only safe code to interact with the abstraction, otherwise it's not really a good abstraction.

@eddyb

In embedded we use static mut variables with interrupt handlers. These
handlers are invoked by the hardware and are non-reentrant by hardware design
(while executing an interrupt handler, the same handler will not be invoked
again if the interrupt signal is received; also, the devices are single core).
So we have a pattern like the following to let users add state to their
interrupt handlers:

// we generate this function using macros
#[export_name = "SOME_KNOWN_NAME_TO_HAVE_THE_LINKER_PUT_THIS_IN_THE_RIGHT_PLACE"]
pub unsafe extern "C" fn interrupt_handler_that_has_some_random_name_unknown_to_the_user() {
    static mut STATE: usize = 0; // initial value provided by the user

    on_button_pressed(&mut STATE);
}

// the user provides this function along with the initial value for the state
fn on_button_pressed(count: &mut usize) {
    *count += 1;
    println!("Button has been pressed {} times", *count);
}

This way the end user indirectly uses static mut variables in a non-reentrant
fashion. The user can't call the interrupt handler themselves because the name
of the function is an implementation detail and if they call on_button_pressed
themselves there's still no problem because that won't operate on the hidden
static mut variable.

As there are no newtypes involved, the user can use primitives and types
they define themselves (e.g. structs with pub(crate) fields) without requiring
the unstable const fn feature.

I'm not against this feature as long as the above pattern continues to work on stable Rust 2018.

If there's an API that @japaric and libs are happy with, I'm happy to remove static mut.

Personally, I think we should just deprecate static mut across the board, but not remove it from Rust 2018. We don't gain much by making it a hard removal, really -- we don't want to "make space" for some other meaning.

eddyb commented

@japaric Ah, if you're doing code generation, you can have your own version of RacyUnsafeCell with a public field and no const fn constructor, would that work for you?
UnsafeCell being !Sync isn't necessary, but it does avoid forgetting to impl !Sync.


There's a more powerful pattern I prefer using in those situations:

// Lifetime-invariant ZST token. Probably doesn't even need the invariance.
#[repr(transparent)]
#[derive(Copy, Clone)]
pub struct InterruptsDisabled<'a>(PhantomData<&'a mut &'a ()>);

// Public field type to get around the lack of stable `const fn`.
pub struct InterruptLock<T: ?Sized + Send>(pub UnsafeCell<T>);

// AFAIK this should behave like Mutex.
unsafe impl<T: ?Sized + Send> Sync for InterruptLock<T> {}

impl<T: ?Sized + Send> InterruptLock<T> {
    // This gives access to the `T` that's `Send` but maybe not `!Sync`,
    // for *at most* the duration that the interrupts are disabled for.
    // Note: I wanted to implement `Index` but that can't constrain lifetimes.
    fn get<'a>(&'a self, _: InterruptsDisabled<'a>) -> &'a T {
        unsafe { &*self.0.get() }
    }
}

// Note that you can't bake any of these into `InterruptLock` without `const fn`,
// because you would need a private field to ensure nobody can touch it.
// OTOH, `UnsafeCell<T>` makes *only constructing* the field public.
pub type InterruptCell<T: ?Sized + Send> = InterruptLock<Cell<T>>;
pub type InterruptRefCell<T: ?Sized + Send> = InterruptLock<RefCell<T>>;
#[export_name = "SOME_KNOWN_NAME_TO_HAVE_THE_LINKER_PUT_THIS_IN_THE_RIGHT_PLACE"]
pub unsafe extern "C" fn interrupt_handler_that_has_some_random_name_unknown_to_the_user(
    token: InterruptsDisabled,
) {
    on_button_pressed(token);
}

static COUNT: InterruptCell<usize> = InterruptLock(UnsafeCell::new(
    Cell::new(0),
));
fn on_button_pressed(token: InterruptsDisabled) {
    let count = COUNT.get(token);
    count.set(count.get() + 1);
    println!("Button has been pressed {} times", count.get());
}

There's another variation on this, where you make the token non-Copy and can do:

#[export_name = "SOME_KNOWN_NAME_TO_HAVE_THE_LINKER_PUT_THIS_IN_THE_RIGHT_PLACE"]
pub unsafe extern "C" fn interrupt_handler_that_has_some_random_name_unknown_to_the_user(
    token: InterruptsDisabled,
) {
    on_button_pressed(token);
}

static COUNT: InterruptLock<usize> = InterruptLock(UnsafeCell::new(
    0,
));
fn on_button_pressed(mut token: InterruptsDisabled) {
    // This mutable borrow of `token` prevents using it for
    // accessing any `InterruptLock`s, including `COUNT` itself.
    // The previous version was like `&token` from this one.
    let count = COUNT.get_mut(&mut token);
    *count += 1;
    println!("Button has been pressed {} times", *count);
}

(I didn't write the implementation details for the second version for brevity's sake, if desired I can edit this later)
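A minimal guess at what the elided get_mut might look like (this is my own sketch, not eddyb's actual implementation; the lifetime parameter on the token and the actual interrupt-disabling are omitted, and fake_token_for_testing stands in for minting the token inside a real handler):

```rust
use std::cell::UnsafeCell;

// Non-`Copy` zero-sized token; in real code it would only be minted
// while interrupts are actually disabled.
pub struct InterruptsDisabled(());

pub struct InterruptLock<T: Send>(pub UnsafeCell<T>);

unsafe impl<T: Send> Sync for InterruptLock<T> {}

impl<T: Send> InterruptLock<T> {
    // Mutably borrowing the token means no other `InterruptLock` can be
    // opened through it while the returned `&mut T` is alive.
    pub fn get_mut<'a>(&'a self, _token: &'a mut InterruptsDisabled) -> &'a mut T {
        // Safety (sketch): the exclusive borrow of the token serializes
        // all accesses made through it.
        unsafe { &mut *self.0.get() }
    }
}

static COUNT: InterruptLock<usize> = InterruptLock(UnsafeCell::new(0));

pub fn on_button_pressed(token: &mut InterruptsDisabled) -> usize {
    let count = COUNT.get_mut(token);
    *count += 1;
    *count
}

// For demonstration only: handing out tokens freely like this would be
// unsound in a real multi-threaded or interrupt-driven system.
pub fn fake_token_for_testing() -> InterruptsDisabled {
    InterruptsDisabled(())
}
```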

You can even allow creating an InterruptsDisabled out of a &mut InterruptsDisabled, effectively sub-borrowing it (assuming you want to keep it a ZST so it stays zero-cost).
Then you can combine the two versions by allowing creation of the first version's token from a &InterruptsDisabled of the second version (but, again, only if you want to keep it a ZST).

Theoretically you can even create an InterruptsDisabled using HRTB and calling a closure after disabling interrupts, but this only works for the first version, not the second, stronger, one.
Also, IIRC, on some platforms it's impossible to truly turn all interrupts off.


IMO a safe abstraction like this is a good long-term investment, since it can handle any sort of Send + !Sync data that's accessed only from within interrupts.

cc @RalfJung Has anything like this been proven safe?

@eddyb note that the non-reentrancy that @japaric describes is for that particular interrupt handler (and any lower priority, but that’s hard to do anything useful with). Interrupts are not globally disabled during an interrupt handler on architectures like ARMv6-M. It might be possible to use a more complicated token scheme where each interrupt has its own token type, but I don’t know if anyone has looked into some way to make that actually usable.

eddyb commented

@Nemo157 Ah, that's a very interesting detail! Yeah you can just have one token type per level and use From impls to connect them up or something (and the methods in InterruptLock<Level, T> would take impl Into<InterruptsDisabled<Level, 'a>> instead).


Another idea that I came up with while discussing with @arielb1 on IRC:

We could only allow private static mut. So you can keep it e.g. in a module, or within a block inside a fn, const, or another static, etc. - but not export it out of there.

That, combined with deprecation of private static mut, could improve the situation.

In embedded we use static mut variables with interrupt handlers. These
handlers are invoked by the hardware and non-reentrant by hardware design

IIRC, AVR actually allows for recursive interrupt handlers:

#define ISR_NOBLOCK
ISR runs with global interrupts initially enabled. The interrupt enable flag is activated by the compiler as early as possible within the ISR to ensure minimal processing delay for nested interrupts.

This may be used to create nested ISRs, however care should be taken to avoid stack overflows, or to avoid infinitely entering the ISR for those cases where the AVR hardware does not clear the respective interrupt flag before entering the ISR.

Use this attribute in the attributes parameter of the ISR macro.

It's up to the programmer to choose (and thus appropriately handle) this case.

IIRC, AVR actually allows for recursive interrupt handlers:

On every architecture I'm aware of, you can explicitly set the interrupt flag after entering a handler; this is not AVR-specific. (On architectures with interrupt prioritization, like Cortex-M, you'll also need to manipulate priorities.) But the point here is that this needs to be done explicitly, with unsafe code; the architecture guarantees that you wouldn't reenter the ISR if you don't specifically request that.

So this is perfectly in line with the usual safety rules.

Some C APIs expose mutable globals that my Rust code right now accesses via:

extern "C" {
    pub static mut FOO: BAR;
}

Some C APIs require their users to define mutable globals, that the C library then accesses. Right now my Rust code interfaces with those using:

#[export_name = "foo"]
pub static mut FOO: BAR = ..initializer..;

How do I access mutable globals from C libraries, and how do I provide mutable globals to C libraries, without static mut?

An example of a C library that uses both is jemalloc.

I don’t like this. It’s already unsafe. That’s good enough.

@alexreg your comment is somewhat ambiguous / overly brief... can you elaborate?

@gnzlbg Is there a reason that a cell will not work there?

@alercah the types I currently use there are typically libc::c_int and/or fn-types (for C function pointers). I guess that a non-mut static with an UnsafeCell can work there too.

I've just always used the exact same types that C does due to inertia: tooling like ctest, rust-bindgen, etc. can currently easily verify for those types that the ABIs do match. We can update the tooling to support cell types though.

@Centril Is it? All operations involving static mut are already unsafe. We’re thus not providing guarantees about it that aren’t upheld. So I don’t see any problem.

@alexreg What you are saying is that static mut is sound. That's correct. But that's not the topic of discussion here. We don't want to just push the problem on the programmers, not if we can do better. As a matter of fact, static mut is a big footgun because it lets you create very long-lived mutable references. We saw lazy_static doing that wrong, and it is written by very experienced Rust programmers. That's a datapoint for "maybe we can provide an API that's still unsafe but harder to use in an actually wrong way". static UnsafeCell is one such API.

Basically, unsafety is not an excuse for bad API design.


However, I see no reason to tie this to the edition. We can just deprecate static mut with a warning in all editions, and see if that works out, and then crank that warning up later (and then, maybe, use some future edition to make it a hard error). I wouldn't want to use an edition to go from "no complaint" to "hard error", and I see no reason to do so.

UnsafeCell is not Sync though, right? So we cannot just use it directly for a static?

UnsafeCell is not Sync, no, and that's correct: synchronization must be provided externally. If someone wants to use it in a static without internally-provided synchronization, then they could wrap it with e.g. struct ExternallySynchronizedCell<T> { cell: UnsafeCell<T> } unsafe impl<T> Sync for ExternallySynchronizedCell<T> {}.

@RalfJung Nonetheless, I can imagine a static mut being useful in some occasions. What do you think it can always be replaced by, and why?

I believe it can be replaced by cells. The burden is on someone arguing that it can't to demonstrate why not.

It can be replaced by

struct SyncUnsafeCell<T>(UnsafeCell<T>);
unsafe impl<T> Sync for SyncUnsafeCell<T> {}

static NAME: SyncUnsafeCell<T> = ...;

Should libcore provide a RacyUnsafeCell type?

Deprecating this would be less of a pain if there were a clear migration path towards something better, like a note in the deprecation warning suggesting RacyUnsafeCell instead.

Also, +1 on deprecating static mut first, and turning that into an error in the future. There is a lot of tooling in charge of interfacing with C (rust-bindgen, ctest, ...) that might need to start using RacyUnsafeCell instead of static mut for these types of things. cc @emilio

The burden is on someone arguing that it can't to demonstrate why not.

That's completely illogical. On the contrary, the burden is on the person arguing that it can be used to replace an existing feature. The status quo should be the default view. It seems @RalfJung may have made a good case though.

That was already done in August though. Why are we rehashing it?

What was already done in August?

@RalfJung @alercah @gnzlbg would you perhaps be willing to make this actionable by writing an RFC with an "alternatives + deprecation in 2018 -> hard error in Edition.Next"-plan which also takes into account @japaric's concerns? Maybe y'all can collaborate on this?

I got two other RFCs I still want to write that I care about more strongly, so I don't think I'd get to this one any time soon -- sorry.

I also don't have time for this, unfortunately.

I don't see the point of doing this since all the accesses are unsafe by default. I don't know what the FFI story for this is, but please do consider it :)

I'd like to see a demonstration or at the very least hand-wavy explanation that the proposed generic replacement for static mut has equivalent-or-better performance characteristics. This is essential for any proposal to deprecate a feature.

Take a look at the IR for this example. I used FFI functions to prevent optimization of the loads and stores away. If you look at the IR, you'll see that both stores to the statics are optimized down to a single store instruction.

It's not a comprehensive proof, of course, but it demonstrates that the optimizer can figure this out.

@alercah Thanks. That's good to know. I'm somewhat less opposed to the proposal now, especially if this optimisation holds on all platforms... even though there still exist other reasons for maintaining it (namely uniformity, and the fact that one has already opened the door to UB by using unsafe, so why bother).

#55207 (comment) proposes redefining UnsafeCell to be based on a new type that would be Copy (and become the lang item). That same type could also be Sync and play the role of RacyUnsafeCell.

eddyb commented

@alexreg UnsafeCell is supposed to be a zero-cost abstraction, to the point where any observable difference between T and UnsafeCell<T> is an outright codegen bug.

As for FFI, UnsafeCell and RacyUnsafeCell should be #[repr(transparent)], meaning that extern { static X: RacyUnsafeCell<T>; } should work as well as extern { static mut X: T }.

And statics use memory ABI anyway, so I think you could also use a racy version of Cell, or sync::Atomic{Usize,Ptr,...} types, to access FFI statics.
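The layout claim is checkable: with #[repr(transparent)], the wrapper has exactly the layout of T (the extern block below is shown only as a comment, since this example links against nothing):

```rust
use std::cell::UnsafeCell;
use std::mem::{align_of, size_of};
use std::os::raw::c_int;

#[repr(transparent)]
pub struct RacyUnsafeCell<T>(pub UnsafeCell<T>);

unsafe impl<T: Sync> Sync for RacyUnsafeCell<T> {}

// Sketch only (not linked here): a C global `int foo;` could then be
// declared as
//
//     extern "C" {
//         static foo: RacyUnsafeCell<c_int>;
//     }
//
// instead of `extern "C" { static mut foo: c_int; }`.

// `#[repr(transparent)]` guarantees the wrapper is ABI-identical to `T`:
pub fn layout_matches() -> bool {
    size_of::<RacyUnsafeCell<c_int>>() == size_of::<c_int>()
        && align_of::<RacyUnsafeCell<c_int>>() == align_of::<c_int>()
}
```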


Anyway, we're not doing this in Rust 2018, and I agree with suggestions of general deprecation.

@eddyb Okay, fair enough then!

@eddyb asked me to note that const fn as of 1.31 will permit constructing structs with private fields, as long as the struct definition has no bounds and you don't impose additional bounds on the impl itself either.

@Centril I don’t understand what that means or how it relates to min_const_fn

@SimonSapin I'm not sure either; but apparently private struct fields in const fns were a blocker wrt. deprecating static mut somehow; and on beta you can do:

struct Foo {
    field: u8,
}

impl Foo {
    const fn bar() -> Self {
        Self { field: 0 }
    }
}

which apparently unblocks something? @eddyb can elaborate perhaps.

eddyb commented

One of the main points of discussion on this issue has been the lack of stable const fn which limits the ability to write constructors for abstractions to be used with statics (e.g. #53639 (comment)).

Instead of killing static mut entirely, what about just disallowing creating mutable references to them? The footgun here is accidentally creating a &'static mut.

With rust-lang/rfcs#2582, we can tell people that &mut STATIC_MUT as *mut _ is a way to create a raw pointer to a static mut, and we could even make that a safe operation. We could then deprecate creating mutable references that are not directly turned into raw pointers. Because static mut might be mutated at any time, it is probably a good idea to apply the same rule for shared references.
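The raw-pointer-only style eventually stabilized as ptr::addr_of_mut! (and later the &raw mut syntax); a sketch of accessing a static mut that way, never materializing a reference:

```rust
use std::ptr::addr_of_mut;

static mut X: u8 = 10;

// Take a raw pointer to the `static mut` without ever creating a `&mut`,
// sidestepping the reference-aliasing footgun discussed above.
pub fn bump_via_raw() -> u8 {
    unsafe {
        let p = addr_of_mut!(X);
        *p += 1;
        *p
    }
}
```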

That still doesn't solve the footgun of any sort of concurrent modification of a static mut being UB because they're not atomic. Or even with just a single thread, creating a &T and then modifying the original static mut.

@retep998 Neither does removing static mut and recommending people to use RacyUnsafeCell<T> instead.

@retep998 note that creating a &T requires going through a raw pointer.

eddyb commented

@gnzlbg But RacyUnsafeCell<T> only lets you do the equivalent of &mut FOO as *mut T, without any references, mutable or immutable, to T, so it's at least as safe as a take-raw-pointer-only static mut - or is that your point?

One advantage is that a value of type &'static RacyUnsafeCell<T> is safer than a *mut T, as it's guaranteed to be a valid reference to an interior-mutable T, so the duration for which you interact with the T through *mut T tends to be shorter.

The much bigger advantage is that RacyUnsafeCell is just a stopgap solution, where the correct approach (now possible on stable in some cases because of const fn stabilization!) is to build a completely safe API around the T (rarely is this actually impossible, and if it is impossible, you should seriously reconsider whether what you're doing is sound at all).

@eddyb I was only referring to the issue that @retep998 mentioned. Just using RacyUnsafeCell does not really remove the footgun of potential data-races when mutating a static (one needs synchronization for that).

As you mention, using a RacyUnsafeCell does help with other issues that &mut T as *mut T does not address.

Or even with just a single thread, creating a &T and then modifying the original static mut.

@retep998 This is the case I've seen too.

note that creating a &T requires going through a raw pointer.

@RalfJung Do you mean in the current state of things or in proposed one? Currently it doesn't (although does require unsafe).

For example, if we take following Rust code:

pub fn f() -> u8 {
    let mut x = 10;
    
    let x_ref = &x;

    x = 20;
    
    *x_ref
}

Then, as expected, we get a borrow checker error due to mutation of a variable while it's still borrowed immutably:

error[E0506]: cannot assign to `x` because it is borrowed
 --> <source>:6:5
  |
4 |     let x_ref = &x;
  |                  - borrow of `x` occurs here
5 | 
6 |     x = 20;
  |     ^^^^^^ assignment to borrowed `x` occurs here

But, if we use static mut, we can do:

pub unsafe fn g() -> u8 {
    static mut X: u8 = 10;

    let x_ref = &X;

    X = 20;
    
    *x_ref
}

without any errors and without any raw pointers involved, despite mutating an already-borrowed variable.

Do you mean in the current state of things or in proposed one? Currently it doesn't (although does require unsafe).

I meant with my proposal.

As suggested by the comments at #53639 (comment) and #53639 (comment) , there's no need to wait for an edition to pursue this. Deprecation of static mut could be considered as soon as alternatives are ergonomic enough to use, which is to say whenever const fn passes some minimum threshold of completeness (for want of which the insurgency against static mut was thwarted in pre-1.0 times). That threshold may arguably have already been passed; note that thanks to const fn improvements the venerable lazy_static! macro can nowadays be replaced by once_cell::Lazy (which in addition to being usable in more contexts is also more idiomatic and pleasant, IMO).
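As an illustration of that replacement (using std::sync::OnceLock, which today plays the same role as once_cell::Lazy for this pattern; the settings map is a made-up example):

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

// A lazy_static!-style global with no `static mut` and no unsafe:
// initialization happens once, on first access, with synchronization
// provided by `OnceLock`.
fn settings() -> &'static HashMap<&'static str, u32> {
    static SETTINGS: OnceLock<HashMap<&'static str, u32>> = OnceLock::new();
    SETTINGS.get_or_init(|| {
        let mut m = HashMap::new();
        m.insert("retries", 3);
        m
    })
}
```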

That still doesn't solve the footgun of any sort of concurrent modification of a static mut being UB because they're not atomic.

Lock-free algorithms exist that intentionally write through shared (raw) pointers without mutex synchronization.

comex commented

Those algorithms can and must use atomics. Racing non-atomic writes are UB.

@comex Such algorithms are often specifically designed to not use atomics, as atomic operations are slower and incur more synchronization overhead. (Racing non-atomic reads are more common in such algorithms than racing non-atomic writes, but cases of both exist.)

Speaking about architectures for a moment, not about languages: non-atomic reads or writes of word-sized word-aligned values have defined semantics.

It's certainly perfectly acceptable if such algorithms in Rust need to use a raw pointer rather than a reference, of course.

comex commented

That is what Ordering::Relaxed is for - it compiles to the same instruction as a normal load/store but has stronger semantics from the compiler's point of view.

Races without atomics are UB in both Rust and C.

I see; I thought you were implying "must use atomic operations". "must use atomics" does not to me immediately translate to "must use atomic types (but not necessarily atomic operations)".

comex commented

*shrug* Even a Relaxed load/store is an atomic operation in the literal sense that it cannot tear. That's not guaranteed for normal loads/stores, even if the architecture normally guarantees it, because the compiler is allowed to split them. But I'm getting off topic.

Deprecation of static mut could be considered as soon as alternatives are ergonomic enough to use, which is to say whenever const fn passes some minimum threshold of completeness (for want of which the insurgency against static mut was thwarted in pre-1.0 times).

What does const fn have to do with static mut?

@joshtriplett

Racing non-atomic reads are more common in such algorithms than racing non-atomic writes, but cases of both exist

Racing non-atomic reads and writes are both UB in Rust in all circumstances (where "non-atomic read" is "any read not performed via one of the atomic-read intrinsics").

Speaking about architectures for a moment, not about languages: non-atomic reads or writes of word-sized word-aligned values have defined semantics.

As you are aware though, that argument is entirely irrelevant when talking about Rust. Non-aligned reads or writes are also well-defined on x86 and still UB in Rust even when targeting x86.

I thought you were implying "must use atomic operations".

Yes I think they were. What is an "atomic operation" for you?

But also, @joshtriplett do you have any example where using static mut for concurrency works better / is more ergonomic than using static + interior mutability (and then maybe an Atomic* type)? The latter should always be preferred. In particular, the Atomic* types are the only way we expose on stable to perform atomic operations.
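A sketch of that preferred shape, a static with an Atomic* type instead of static mut (the counting function is my own example):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Safe replacement for `static mut COUNT: usize`: every access is a
// non-tearing atomic operation, so concurrent use is not a data race.
static COUNT: AtomicUsize = AtomicUsize::new(0);

pub fn count_from_threads(threads: usize, per_thread: usize) -> usize {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // Relaxed: we only need atomicity here, not ordering
                    // relative to other memory operations.
                    COUNT.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    COUNT.load(Ordering::Relaxed)
}
```

With a plain `static mut usize` this loop would be UB; with the atomic, the final count is exact.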

I see; I thought you were implying "must use atomic operations".

@joshtriplett That's what they were actually implying?

"must use atomics" does not to me immediately translate to "must use atomic types (but not necessarily atomic operations)".

There is no difference. Atomic types are just type wrappers that provide methods for doing atomic operations, there is nothing special about them, e.g., we could add such methods to raw pointers.

Atomic operations are just operations that happen as a unit (no tearing). Their Ordering argument just states how these must be ordered relative to each other, and this determines whether they synchronize with each other in certain ways or not.

This has something to do with the machine code that gets generated (e.g. the compiler can only re-order them in certain ways), but ... for example, on x86, Relaxed, Acquire, and SeqCst loads are all just a mov instruction. So when you say:

Such algorithms are often specifically designed to not use atomics, as atomic operations are slower and incur more synchronization overhead.

what I read is that apparently there are people who prefer to use algorithms that have data-races because they think they can do better than a single mov instruction. I mean, they aren't wrong, we could remove all execution paths leading to those algorithms, making their code infinitely faster than a single mov instruction πŸ˜† I'm however not sure they would like that πŸ˜†

FWIW you are right that, in general, some atomic orderings generate machine code that's more expensive than others. But e.g. on x86 a SeqCst store generates xchg, while all other atomic load and stores only generate a single mov, so the generalization that "all atomic operations are slower" is definitely false - some are, if by some you mean ~1 out of ~10. That's because if your architecture is relatively sane, chances are that load / store hardware instructions already synchronize in a sane way as well. None of this beats code with UB though, for programs with UB, we are not even required to generate a working executable, and the fastest program is obviously the one that never runs at all πŸ˜† .

Atomic operations are just operations that happen as a unit (no tearing).

Note that this is a hardware-centric view of things; for the purposes of Rust, atomic operations are operations on which race conditions are permitted. There is no "tearing" in the Rust Abstract Machine.

In particular, while a u8 read cannot actually tear, it is still non-atomic unless performed with one of the atomic intrinsics. So, I think it is wrong to equate "atomic" and "no tearing".

FWIW you are right that, in general, some atomic orderings generate machine code that's more expensive than others. But e.g. on x86 a SeqCst store generates xchg, while all other atomic load and stores only generate a single mov, so the generalization that "all atomic operations are slower" is definitely false - some are, if by some you mean ~1 out of ~10.

Atomic operations are still slower though even when targeting x86 because they get less optimized. This is why sometimes people resort to UB and use non-atomic accesses. I know of only one such case though, and it has nothing to do with this thread.

Note that this is a hardware-centric view of things; for the purposes of Rust, atomic operations are operations on which race conditions are permitted. There is no "tearing" in the Rust Abstract Machine.

Doesn't the abstract machine also guarantee that races using atomic operations will not observe partial modifications?

In particular, while a u8 read cannot actually tear, it is still non-atomic unless performed with one of the atomic intrinsics. So, I think it is wrong to equate "atomic" and "no tearing".

Yes I agree.

Such algorithms are often specifically designed to not use atomics, as atomic operations are slower and incur more synchronization overhead.

what I read is that apparently there are people who prefer to use algorithms that have data-races because they think they can do better than a single mov instruction. I mean, they aren't wrong, we could remove all execution paths leading to those algorithms, making their code infinitely faster than a single mov instruction πŸ˜† I'm however not sure they would like that πŸ˜†

@gnzlbg To give an actual example I suspect this could be referencing algorithms such as Hogwild, which, while extremely UB even in C, is apparently used to great effect in the ML field. I do happen to know at least one ML startup that has implemented this algorithm in Rust and has told us, to our horrified and confused faces, that it works as expected. Which is absolutely not to say that Rust should try to support this use case, of course--some things are just a bridge too far. :)

But we may be getting off-topic here...

What does const fn have to do with static mut?

@RalfJung The failed pre-1.0 rebellion against static mut was instigated by @eddyb's manifesto, in which the inability to call constructors in static contexts is the largest unresolved question. You'll have to ask eddyb whether or not CTFE has since progressed far enough to satisfy their envisioned use cases.

comex commented

I took a (very) quick look at the Hogwild paper, and it looks like it's based on compare-and-swap, which isn't even possible to use (in any language) without atomics. If there's some part of it that uses standard loads and stores, then, as I said, you can use relaxed atomics to generate identical assembly. Rust should and does support this use case.
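For concreteness, here is a sketch of such a loop using only relaxed atomics (`WEIGHT` and `add_relaxed` are made-up names for illustration, not from the paper). On x86-64 the load is a plain `mov` and the CAS is the same `lock cmpxchg` a C implementation would emit:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical shared parameter, updated concurrently without locks.
static WEIGHT: AtomicU64 = AtomicU64::new(0);

fn add_relaxed(delta: u64) {
    let mut current = WEIGHT.load(Ordering::Relaxed);
    loop {
        match WEIGHT.compare_exchange_weak(
            current,
            current + delta,
            Ordering::Relaxed,
            Ordering::Relaxed,
        ) {
            Ok(_) => return,
            // Another thread updated the value first; retry with what we saw.
            Err(observed) => current = observed,
        }
    }
}
```

No `static mut`, no UB, and no ordering stronger than `Relaxed` anywhere.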

Doesn't the abstract machine also guarantee that races using atomic operations will not observe partial modifications?

There is no concept of partial modifications, so I guess you could say that -- but it's a strange way of saying things.

Something that translates to a hardware atomic operation, such as a lock
or xadd.

There is no operation in Rust that is guaranteed to translate to a hardware atomic operation. ("atomic volatile" would provide that guarantee, but we don't have that.) Even when you use Atomic*, the compiler is allowed to compile that to non-atomic accesses if it can prove that there are no races (e.g. because the address does not escape).

But I agree that this seems to mostly be a terminology difference.

To give an actual example I suspect this could be referencing algorithms such as Hogwild, which, while extremely UB even in C, is apparently used to great effect in the ML field.

I agree with @comex that I see nothing here that wouldn't work with relaxed operations?
The only deliberate use of non-atomics (causing UB in C and Rust) that I am aware of is seqlocks.

(Also it took me a long while here to realize that "ML" is not this ML... after all this is a programming language forum here. ;)

static mut is almost impossible to use correctly

Maybe I don't understand something, but this is not true.

Let's take for example JNI aka Java Native Interface, this is the way to make possible to call "native" code via C API from Java.

The common way to deal with this API is to cache references to Java classes/methods in the JNI_OnLoad function of your cdylib crate, un-cache them in JNI_OnUnload, and safely use these cached references between calls of these functions. These functions are called by the JVM (Java virtual machine).

This is like constructor/destructor and methods. You have a guarantee that JNI_OnLoad (the constructor) will be called before the other methods, so you can be sure that the variables are written before they are read. And it would be a huge waste of CPU resources to check some atomic variable or use mutexes when accessing the cached references after JNI_OnLoad and before JNI_OnUnload.

I suppose the same issue is relevant for many plugin systems, if you decide to write them in Rust instead of C/C++ and you want speed comparable to C/C++.

I'm by no means the person to debate Java implementations, but a quick search suggests that the JVM does not synchronize calls to JNI_On{,Un}Load. If this is indeed true, that would make your example racy without additional locking, and therefore an example of incorrect use.

The cost of using an atomic is negligible especially when there is no contention (and if there was contention then not using an atomic would be UB) and even more so when it is a relaxed atomic (literally exact same instruction as a non-atomic access on x86).

Also, Java inherently makes all reads and writes relaxed atomic operations by default (except non-volatile long and double, which the JLS permits to tear) so that data races are not undefined behavior.

@nagisa @retep998

Your comments look odd. Maybe I need an example:

Rust code:

#[no_mangle]
pub extern "system" fn JNI_OnLoad(java_vm: *mut JavaVM, _reserved: *mut ::std::os::raw::c_void) {
    // write static mut
}

#[no_mangle]
pub extern "C" fn Java_org_example_Foo_do_1f1(
    env: *mut JNIEnv,
    _: jclass,
) {
    // read static mut
}

JNI_OnLoad is called during so/dll/dylib load, so it is called before the JVM calls any other methods, like Java_org_example_Foo_do_1f1 (which is Foo.do_f1 in the Java language). So there is no need for synchronization. The other methods may be called from different threads, depending on the Java program, but they have only read-only access to the global variables. So, yes, using atomics would be a huge waste, because all methods read the global variables and never write them, yet the compiler cannot optimize the atomics away: this is a plugin, so LTO is impossible. Compared to a C/C++ implementation, Rust without static mut would produce very slow code.

In other words, consider such a C/C++ program (the JVM is actually also a C++ program):

int main()
{
    init_plugin();
    // create many threads and call functions from the plugin, except
    // init_plugin and deinit_plugin
    stop_all_threads();
    deinit_plugin();
}

To make your Rust plugin as fast as possible you obviously want to just read global variables without any atomic/mutex in code between init_plugin and deinit_plugin if you write these global variables only in init_plugin and deinit_plugin.

@retep998

The cost of using an atomic is negligible especially when there is no contention

This is obviously not true; each access would contain a check like this:

if (!read_atomic_is_initialized)

If its cost were negligible, then the cost of a bounds check for array access would also be negligible; that is also just a check:

if (i < len)

but bounds checking sometimes causes a 2x slowdown, so why can't a read-only atomic check cause the same problem?

@davemilter

What atomic check are you talking about? Atomics are simple primitives that can be initialized at compile time, so there are no initialization checks on atomics. A relaxed read from an atomic is literally the same instruction on x86 as a read from a non-atomic!

@davemilter

Deprecating static mut would in no way invalidate your use case (or make it less performant). You wouldn't have additional runtime checks, but instead you can encode your unsafe expectations in your code:

static FOO: AtomicPtr<JavaVM> = AtomicPtr::new(ptr::null_mut());

#[no_mangle]
pub extern "system" fn JNI_OnLoad(java_vm: *mut JavaVM, _reserved: *mut ::std::os::raw::c_void) {
       FOO.store(java_vm, Ordering::Relaxed)
}

#[no_mangle]
pub extern "C" fn Java_org_example_Foo_do_1f1(
    env: *mut JNIEnv,
    _: jclass, 
)  {
    let java_vm = FOO.load(Ordering::Relaxed);
    (*java_vm).do_stuff(); // UB if `java_vm` is null
    // but that's impossible since `JNI_OnLoad` is always called before
} 

The additional atomic operation adds zero overhead on x86, and even on platforms where it has some overhead, that overhead is negligible (especially compared to the function call that just happened in order to get here). The value would already be in the local CPU's cache if used frequently, and since you never write to FOO except during initialization, it would never get invalidated.

@retep998

What atomic check are you talking about?

I am talking about std::sync::Once, or an analog from a crate on crates.io, that can be used instead of lazy_static or an explicit static mut.

@oli-obk

The additional atomic operation requires zero

Actually real code looks like:

static mut CLASS1: jclass = ::std::ptr::null_mut();
...
static mut FIELD1: jfieldID = ::std::ptr::null_mut();
...
static mut METHOD1: jmethodID = ::std::ptr::null_mut();

And you can use these variables in a loop.

For example, if you want to call a callback (Java calls Rust code, and the Rust code calls a Java callback) in a loop, you need to access these cached variables in the loop.

struct JavaCallback {}
trait Foo { /* ... */ }
impl Foo for JavaCallback {
    fn foo(&self) {
        // access jmethodID and jclass here
    }
}

// ...
for _ in 0..1000 {
    // call Foo::foo
}
// ...

So this is not one "operation"; these are many operations, possibly in a loop.
And let's suppose this is on Android arm or aarch64.

You can use a single static for all values, and you can fetch the value once in front of the loop. Also, as I said before, your function call will be the expensive operation, not the synchronized fetch of a global. Just because the non-x86 platforms use special instructions instead of the regular unsynchronized ones doesn't mean they are expensive. Would you be so kind as to create a small benchmark showing the slowdown, so we have some actual code to talk about?

Also: ignoring whether your static mut use case is UB or not (I haven't looked closely, but it seems it's at least easy to get wrong), you can get the exact same behaviour back by using a struct MyUnsoundFoo<T>(UnsafeCell<*mut T>); and manually implementing Sync for MyUnsoundFoo. You can then access the memory of a static via the UnsafeCell. This way you have exactly the behaviour (defined or not) that you have with static mut, without static mut having to exist.
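For illustration, a minimal sketch of that wrapper (the names are made up). It reproduces exactly the behaviour, defined or not, of static mut, but as a library type:

```rust
use std::cell::UnsafeCell;

// A wrapper that makes the pointer reachable from a regular `static`.
pub struct MyUnsoundFoo<T>(pub UnsafeCell<*mut T>);

// SAFETY: a blanket promise, just like `static mut` implicitly makes one;
// all responsibility for synchronization falls on the user.
unsafe impl<T> Sync for MyUnsoundFoo<T> {}

// Usable wherever a `static mut *mut u32` was used before.
pub static GLOBAL: MyUnsoundFoo<u32> = MyUnsoundFoo(UnsafeCell::new(std::ptr::null_mut()));
```

Reads and writes then go through `GLOBAL.0.get()` in unsafe blocks, with the same (lack of) guarantees as the static mut it replaces.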

This workaround is suggested in the original post of this issue, so I'm going to fold the discussion as resolved

With rust-lang/rfcs#2582 accepted, I think the best way forward is, after &raw gets stabilized, to lint/deprecate taking references to static mut, but to still allow raw pointers.

So it's basically a speed-bump. One can still write &mut *&raw mut STATIC, but one can also do &mut *STATIC.get() when using a non-mut UnsafeCell -- in unsafe code there'll always be enough rope for the user to hang themselves, but making them go through raw pointers should significantly reduce the risk of that happening accidentally.
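To illustrate the speed-bump (a sketch; `DATA` and `bump` are made-up names): the raw-pointer route never materializes a reference to the static. `core::ptr::addr_of_mut!` is the stable macro form of the `&raw mut` operator:

```rust
use core::ptr::addr_of_mut;

static mut DATA: u32 = 0;

fn bump() -> u32 {
    unsafe {
        // `&mut DATA` would create a `&mut u32`, asserting uniqueness for its
        // whole lifetime -- easy to get wrong. A raw pointer only asserts
        // anything at the moment of each individual access:
        let p: *mut u32 = addr_of_mut!(DATA); // i.e. `&raw mut DATA`
        *p += 1;
        *p
    }
}
```

The caller still has to prove there are no concurrent accesses, but the dangerous reference-creation step is gone.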

We discussed this in our meeting on 2019-10-10. In that meeting, we concluded that we generally like a proposal that @RalfJung put forward -- we would deprecate taking references to static mut, but permit &raw operations that give you raw pointers. We think this should be done through an RFC.

I was looking through the OnceCell docs today, and noticed that one of its examples uses static mut. I remembered this issue and wondered whether that example is still the recommended approach. Or should it be rewritten to use UnsafeCell instead?

I'd argue that should be rewritten using UnsafeCell. We should not be encouraging anyone to use static mut even in cases where it is sound.

In servo/servo#26550 I looked at the uses of static mut in the Servo repository. Some can be replaced with a safe abstraction like AtomicPtr, Mutex, or once_cell::sync::Lazy.

But in some cases the reasoning that convinces us that UB does not happen cannot be encoded in the type system / borrow checker, so the correct way to avoid static mut today is to define our own RacyUnsafeCell / SyncUnsafeCell.

I think it would be good to have this in libcore in order to make static mut less tempting. This thread so far seems to have some consensus in favor. @rust-lang/libs, any preference on the name?

This new type would:

  • Have a T: ?Sized parameter
  • Wrap UnsafeCell<T>
  • Be #[repr(transparent)]
  • Have methods and trait impls identical to those of UnsafeCell<T>, except
  • impl Sync where T: Sync

Like with UnsafeCell, it is the responsibility of callers of the get method to figure out when it is safe to dereference the pointer. Docs would discuss the need for external synchronization when a cell is reachable from multiple threads, such as in a static item.

If this were before Rust 1.0, it might be viable for this type to be UnsafeCell itself, rather than have two almost-identical types. However, at this point there is likely unsafe code in the wild that relies on an UnsafeCell field to make a type !Sync. We shouldn't break that.
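Putting the bullet points above together, a minimal sketch of the proposed type (not necessarily the exact libcore implementation):

```rust
use core::cell::UnsafeCell;

// Identical to `UnsafeCell<T>` except for the `Sync` impl below.
#[repr(transparent)]
pub struct SyncUnsafeCell<T: ?Sized> {
    value: UnsafeCell<T>,
}

// The one difference from `UnsafeCell`: shareable across threads when T is.
// Callers of `get` are still responsible for external synchronization.
unsafe impl<T: ?Sized + Sync> Sync for SyncUnsafeCell<T> {}

impl<T> SyncUnsafeCell<T> {
    pub const fn new(value: T) -> Self {
        Self { value: UnsafeCell::new(value) }
    }
}

impl<T: ?Sized> SyncUnsafeCell<T> {
    pub const fn get(&self) -> *mut T {
        self.value.get()
    }
}
```

Because of the `T: Sync` bound, wrapping a `Cell<T>` or similar in it still does not make the contents shareable, which matches the `impl Sync where T: Sync` bullet above.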

I would be on board with a sync variation of UnsafeCell. RacyUnsafeCell or SyncUnsafeCell both seem fine.

To clarify, what you mean by "in some cases the reasoning that convinces us that UB does not happen cannot be encoded in the type system / borrow checker" is that you need code able to pass around &RacyUnsafeCell to other code and static mut does not provide any way of accomplishing that?

On names, two alternative possibilities to evaluate would be something like UnsafeSync<UnsafeCell<T>> / UnsafeCell<UnsafeSync<T>> and something with a new defaulted type parameter like UnsafeCell<T, Sync>. Neither of these is obviously better but worth considering. Both make it more straightforward to make code that works the same for sync and non-sync UnsafeCell.

I just used a static mut and I should be using a RacyUnsafeCell instead, so I'm empathetic with the motivation. But I also realize this use case is extremely rare. Why shouldn't we just deprecate static mut in 2021 but leave RacyUnsafeCell for other people to define / for the ecosystem? I think the motivation Simon describes goes away (make static mut less tempting) if we just get rid of this misfeature entirely?

To clarify, what you mean by "in some cases the reasoning that convinces us that UB does not happen cannot be encoded in the type system / borrow checker" is that you need code able to pass around &RacyUnsafeCell to other code and static mut does not provide any way of accomplishing that?

servo/servo#26550 is about code that currently uses static mut, and replacing it with static + some kind of interior mutability. &RacyUnsafeCell can be taken from the static item, no need to pass it around.

I mean reasoning such as "It is sound for once_cell::sync::Lazy<T> to give out &'static T references, because it only does so after ensuring (with synchronization) that initialization code has finished running, and nothing will ever take a &mut T reference afterwards."

once_cell can rely on private struct fields in order to encapsulate unsafe code and provide a safe abstraction. The type system and borrow checker help ensure that users of once_cell can never trigger UB without writing unsafe code themselves, unless once_cell is buggy.

In some cases it is not practical to come up with such an abstraction that can be defined by a small module. Rather, we have code all over the place that needs to maintain some invariants. Maybe such code can be improved but that would take a big rewrite. It's not a good situation but such is life. Code might look like:

static FOO: SyncUnsafeCell<Foo> = /* … */;

fn something() {
    // Here we know from external synchronization that no other thread is accessing `FOO`
    unsafe {
        let foo: &mut Foo = &mut *FOO.get();
        // …

        // End of exclusive borrow of FOO
    }

    // …

    // Here we know from external synchronization that no other thread
    // wants exclusive access to `FOO` (but they might take shared references)
    unsafe {
        let foo: &Foo = &*FOO.get();
        // …

        // End of shared borrow of FOO
    }
}

External synchronization might be things like: this crate-private function is only ever called from this particular thread.
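One way to make such a single-thread invariant checkable in debug builds (a sketch; `OWNER` and `assert_on_owner_thread` are hypothetical names, not part of any cited crate):

```rust
use std::sync::OnceLock;
use std::thread::{self, ThreadId};

// Records the first thread that touches the data; stands in for the
// documented invariant "only ever called from this particular thread".
static OWNER: OnceLock<ThreadId> = OnceLock::new();

fn assert_on_owner_thread() {
    let me = thread::current().id();
    let owner = *OWNER.get_or_init(|| me);
    // Compiled out in release builds, so the hot path stays free.
    debug_assert_eq!(owner, me, "accessed from the wrong thread");
}
```

Such a check does not make the code sound by itself, but it turns a silent data race into a loud failure during testing.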

On names, two alternative possibilities to evaluate would be something like UnsafeSync<UnsafeCell<T>> / UnsafeCell<UnsafeSync<T>> and something with a new defaulted type parameter like UnsafeCell<T, Sync>.

For UnsafeSync separated from UnsafeCell to be useful it would have to be Sync unconditionally, but with a "dual-purpose" SyncUnsafeCell type we can have impl Sync for SyncUnsafeCell<T> where T: Sync with a bound. Now that I've typed this out I'm not sure which is preferable.

A type parameter seems less nice to me as it would require at least two new public types (for the two values of the parameter) instead of one. I think ease of standard library maintenance should be secondary to having the public API we prefer, and we can hack things with macros internally to deduplicate definitions.

Why shouldn't we just deprecate static mut in 2021 but leave RacyUnsafeCell for other people to define / for the ecosystem? I think the motivation Simon describes goes away (make static mut less tempting) if we just get rid of this misfeature entirely?

No, those use cases don't go away. I don't see why libcore should not help with them just because the more dangerous option is deprecated.

And full removal can only happen at best at the next edition. We could add RacyUnsafeCell in 1.45.

No, those use cases don't go away. I don't see why libcore should not help with them just because the more dangerous option is deprecated.

But we're talking about a newtype wrapper with a single marker impl. I don't think it's such a big help to those rare use cases. The problem is we've provided syntax in the language for something even less safe than "RacyUnsafeCell" that people just shouldn't use.

Part of the reason to put it in core is so that you can then document it in core with accurate safety guidelines that are known to be trusted because they're kept in core and not foo-random-crate.

It's an unsafe Sync impl on UnsafeCell; the safety guidelines are exactly the same as those for manually implementing Sync. The code is trivial to write, and the onus of safety is entirely on the user whether it's in core or not.