Tracking issue for `thread_local` stabilization
aturon opened this issue · 79 comments
The #[thread_local] attribute is currently feature-gated. This issue tracks its stabilization.
Known problems:
- #[thread_local] translates directly to the thread_local attribute in LLVM. This isn't supported on all platforms, and it's not even supported on all distributions within the same platform (e.g. macOS 10.6 didn't support it but 10.7 does). I don't think this is necessarily a blocker, but I also don't think we have many attributes and such which are so platform specific like this.
- Statics that are thread local shouldn't require Sync - #18001
- Statics that are thread local should either not borrow for the 'static lifetime or should be unsafe to access - #17954
- Statics can currently reference other thread local statics, but this is a bug - #18712
- Unsound with generators #49682
- static mut can be given 'static lifetime with NLL (#54366)
@alexcrichton Can you elaborate on the blockers?
Certainly! Known issues to me:
- #[thread_local] translates directly to the thread_local attribute in LLVM. This isn't supported on all platforms, and it's not even supported on all distributions within the same platform (e.g. 10.6 doesn't support it but 10.7 does). I don't think this is necessarily a blocker, but I also don't think we have many attributes and such which are so platform specific like this.
- Statics that are thread local shouldn't require Sync - #18001
- Statics that are thread local should either not borrow for the 'static lifetime or should be unsafe to access - #17954
- Statics can currently reference other thread local statics, but this is a bug - #18712
That's at least what I can think of at this time!
A note, we've since implemented cfg(target_thread_local) which is in turn itself feature gated, but this may ease the "this isn't implemented on all platforms" worry.
Hi! Is there any update on the status? Nightly still requires statics to be Sync. I tried with:
rustc 1.13.0-nightly (acd3f796d 2016-08-28)
binary: rustc
commit-hash: acd3f796d26e9295db1eba1ef16e0d4cc3b96dd5
commit-date: 2016-08-28
host: x86_64-unknown-linux-gnu
release: 1.13.0-nightly
@alexcrichton Any news on #[thread_local] becoming stabilized? AFAIK, at the moment it is impossible on DragonFly to access the errno variable from stable code, other than directly from libstd. This blocks crates like nix on DragonFly, which want to access errno as well, but libstd is not exposing it, and stable code is not allowed to use feature(thread_local).
@mneumann no, no progress. I'd recommend a C shim for now.
@alexcrichton thanks. I am doing a shim now https://github.com/mneumann/errno-dragonfly-rs.
The optimizations are too aggressive ;)
See this code:
#![feature(thread_local)]

#[thread_local]
pub static FOO: [&str; 1] = ["Hello"];

fn change_foo(s: &'static str) {
    FOO[0] = s;
}

fn main() {
    println!("{}", FOO[0]);
    change_foo("Test");
    println!("{}", FOO[0]);
}
The compiler does not detect the side effect in change_foo and removes the call in release mode. The output is:
Hello
Hello
cc @eddyb, @Boiethios your example shouldn't actually compile because it should require static mut, not just static
It compiles with the last nightly rustc.
Oh, drat, this is from my shortening of the lifetime, i.e
rust/src/librustc/middle/mem_categorization.rs
Lines 657 to 662 in dead08c
@nikomatsakis what should we do here? I want a static lvalue, with a non-'static lifetime.
There's some emulation clang does IIRC, that we might want to do ourselves, to support #[thread_local] everywhere.
And there's #47053 which results from my initial attempt to limit references to thread-local statics to the function they were created in.
@cramertj I've personally been under the impression that we're holding out on stabilizing this for as long as possible. We've stabilized very few (AFAIK) platform-specific attributes like this and I at least personally haven't ever looked too hard into stabilizing this.
One blocker (in my mind at least) is what @eddyb mentioned where this is a "portable" attribute yet LLVM has a bunch of emulation on targets that don't actually support it (I think MinGW is an example). I don't believe we should allow the attribute on such targets, but we'd have to do a lot of investigation to figure out those targets.
Is there motivation for stabilizing this though? That'd certainly provide some good motivation for digging out any remaining issues and looking at it very closely. I'd just personally been under the impression that there's little motivation to stabilize this other than it'd be a "nice to have" in situations here and there.
I am using #[thread_local] in my code in a no_std context: I allocate space for the TLS segment and set up the architecture TLS register to point to it. Therefore I think that it is important to expose the #[thread_local] attribute so that it can at least be used by low-level code.
The only thing that I'm not too happy about is that Sync is no longer required for #[thread_local]: this makes it more difficult to write signal-safe code, since all communication between signal handlers and the main thread should be done through atomic types.
@Amanieu Signal/interrupt-safe code has other requirements which are not satisfied by current Rust.
@eddyb Not really, all you need to do is treat the signal handler as a separate thread of execution. The only requirement you need to enforce for safety is that objects which are accessed by both the main thread and the signal handler must be Sync.
Of course you can still cause deadlocks if the main thread and signal handler both try to grab the same lock, but that's not a safety issue.
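To make the atomics point concrete, here is a minimal sketch that models the handler as a separate thread of execution, as described above. No real signal is installed; on_signal, run_handler_and_poll and SHUTDOWN_REQUESTED are hypothetical names chosen for illustration only.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

// Shared between the stand-in "handler" and the main code. AtomicBool is
// Sync, so an ordinary (non-thread-local) static may hold it.
static SHUTDOWN_REQUESTED: AtomicBool = AtomicBool::new(false);

// Hypothetical stand-in for a signal handler: it only performs an atomic
// store and never takes a lock, so it cannot deadlock against the main code.
fn on_signal() {
    SHUTDOWN_REQUESTED.store(true, Ordering::SeqCst);
}

// Runs the stand-in handler on another thread, then reports what the
// calling thread observes afterwards.
fn run_handler_and_poll() -> bool {
    thread::spawn(on_signal).join().unwrap();
    SHUTDOWN_REQUESTED.load(Ordering::SeqCst)
}
```

A !Sync value such as a Cell could not be placed in that ordinary static, which is the check that a #[thread_local] static without the Sync requirement no longer provides.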
Copying @gnzlbg's comment from rust-lang/libc#1432
> It appears that 1) thread_local! solves the problem for most people, and 2) the main uncertainty is that #[thread_local] isn't portable.
> Maybe we could reduce the scope of an initial version of #[thread_local] to extern static declarations. If the target does not support #[thread_local], well then the extern static declaration is incorrect since there cannot be a definition anywhere, and using it would already be UB. We could add on top of this a "best effort" compilation error, e.g., if the compiler knows that the target doesn't support it.
I agree with the above, thread_local! and #[thread_local] for extern static unblocks virtually all use-cases, and would allow interfaces to errno using stable Rust on all platforms (see the linked libc issue for more context).
Right now there is no way (with stable Rust) to use C thread local variables via FFI without writing additional C glue code, see the errno-dragonfly crate for an example of such a (necessary) hack.
cc @joshtriplett: allowing #[thread_local] on extern statics might be something for the agenda of the WG-FFI, since interfacing with errno is kind of an important part of the C FFI puzzle.
@gnzlbg Thanks! Agreed, we definitely need thread-local-variable support.
Independent of the FFI need for extern static...
Like @Amanieu from last year, I'm using #[thread_local] in a no_std context (an RTOS, in my case), with the OS runtime handling management of the TLS pointer and memory. (They and I differ on one point, which is that I'm delighted that #[thread_local] lifts the Sync requirement for static. It seems good and right.)
#[thread_local] is currently the only unstable feature I have to rely on for program correctness. (I'm using a couple others for convenience, but I could lower them by hand if required. I cannot easily replicate the TLS link behavior by hand.)
I haven't dug into the compiler side of this, so this question may be naive, but is the has_elf_tls LLVM target feature not sufficient to gate this? We have a few other language features (if not attributes per se) that are gated by target support -- for example, my platform doesn't have AtomicU64. So it doesn't seem entirely without precedent.
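For comparison, this is roughly what the existing target-gated precedent looks like on stable: cfg(target_has_atomic = "64") selects code depending on whether the target has 64-bit atomics. The sketch below is only an illustration of that pattern; the counter module, TICKS and bump are hypothetical names, and the Mutex fallback is just one possible choice for targets without AtomicU64.

```rust
// Chosen when the target supports 64-bit atomic operations.
#[cfg(target_has_atomic = "64")]
mod counter {
    use std::sync::atomic::{AtomicU64, Ordering};

    static TICKS: AtomicU64 = AtomicU64::new(0);

    // Increments the counter and returns the new value.
    pub fn bump() -> u64 {
        TICKS.fetch_add(1, Ordering::Relaxed) + 1
    }
}

// Fallback for targets where AtomicU64 does not exist.
#[cfg(not(target_has_atomic = "64"))]
mod counter {
    use std::sync::Mutex;

    static TICKS: Mutex<u64> = Mutex::new(0);

    // Increments the counter and returns the new value.
    pub fn bump() -> u64 {
        let mut ticks = TICKS.lock().unwrap();
        *ticks += 1;
        *ticks
    }
}
```

A gate for #[thread_local] could plausibly follow the same shape with cfg(target_thread_local), which is itself still feature-gated as noted earlier in the thread.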
> @alexcrichton Any news on #[thread_local] becoming stabilized? AFAIK, at the moment it is impossible on DragonFly to access the errno variable from stable code, other than directly from libstd. This blocks crates like nix on DragonFly, which want to access errno as well, but libstd is not exposing it, and stable code is not allowed to use feature(thread_local).
We now provide __errno_location() since this commit:
https://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/60d311380ff2bf02a87700a0f3e6eb53e6034920
The original issue suggests that the thread_local attribute in LLVM isn't supported across all platforms, but has platform support improved in the time since this issue was opened? Or do we expect that it will always remain nonportable? (Is there an LLVM tracking issue?)
Hello, #[thread_local] can currently be applied to fields of a struct without any errors or warnings (it won't work, of course):
#![feature(thread_local)]

use std::sync::{Arc, Mutex};

#[derive(Debug)]
struct Foo {
    #[thread_local]
    bar: &'static str,
}

fn main() {
    let foo = Arc::new(Mutex::new(Foo {
        bar: "bar",
    }));
    dbg!(foo.lock().unwrap().bar);
    let foo2 = foo.clone();
    std::thread::spawn(move || {
        foo2.lock().unwrap().bar = "baz";
    }).join().unwrap();
    dbg!(foo.lock().unwrap().bar);
}
Outputs:
[src/main.rs:16] foo.lock().unwrap().bar = "bar"
[src/main.rs:23] foo.lock().unwrap().bar = "baz"
By looking at the issue description, #![feature(thread_local)] seems unsound, but it's not caught by the incomplete_features lint:
https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=0d95cf5246b5b41ddc6776527fdc89e5
Hm, that's fair... OTOH if we made incomplete_features fire we'd have to enable that lint in library/std, so we'd not easily notice when another unsound feature is enabled.
IMO it would be better to just make all accesses to these statics unsafe.
@RalfJung Well, I just read #71435 (comment) after commenting and am wondering whether I should edit and hide my comment or not.
> Well, I just read #71435 (comment) after commenting and am wondering whether I should edit and hide my comment or not.
Hm, I do not see the relation to be honest (also I was wrong when I posted that -- there is an implementation strategy).
#54366 sounds like it might be a soundness issue if it also happens for #[thread_local] static (without mut).
@RalfJung I think I was confused by confusing comments (not only yours) and probably misread some of them. Sorry. Ignore what I said before.
> Hm, that's fair... OTOH if we made incomplete_features fire we'd have to enable that lint in library/std, so we'd not easily notice when another unsound feature is enabled.
> IMO it would be better to just make all accesses to these statics unsafe.
core, alloc and stdarch have already used #![allow(incomplete_features)]. Is std special, or are the #![allow(incomplete_features)]s in them planned to be removed?
Also, I think adding an attribute that suppresses incomplete_features only for specified features could solve the noticing issue generally (also in core, alloc and stdarch).
> core, alloc and stdarch have already used #![allow(incomplete_features)]
I'd hope there's work happening to get rid of that. :/ I thought since the move to min_specialization, things were better. I guess this is mostly due to const-generics.
At the very least there should be comments explaining why unsound features are enabled in the very foundation of Rust... but it seems not everyone agrees on such a policy, seeing that this code was reviewed and landed. Oh well.
> Also, I think adding an attribute that suppresses incomplete_features only for specified features could solve the noticing issue generally (also in core, alloc and stdarch).
That would help, yes.
With relocation-model=static the compiler uses the fs "segment" to access thread-local variables as expected, but with pic it uses __tls_get_addr().
Shouldn't it use fs in both situations?
// relocation-model=static
example::f:
    mov byte ptr fs:[example::FOO@TPOFF+3], 1
    mov qword ptr fs:[example::BAR@TPOFF], 2
    ret

// relocation-model=pic
example::f:
    sub rsp, 8
    lea rdi, [rip + example::FOO@TLSLD]
    call __tls_get_addr@PLT
    mov byte ptr [rax + example::FOO@DTPOFF+3], 1
    mov qword ptr [rax + example::BAR@DTPOFF], 2
    pop rax
    ret
As far as use cases go: all the bare-metal Arm targets seem to support #[thread_local] definitions. Not using #[thread_local] definitions leads to a lot of trouble because it appears there's no easy way, outside of compiler code, to figure out a type's memory layout in time to tell the linker what to do. It's possible to fully emulate TLS, of course, but that could be a lot of overhead for a bare-metal target.
I don't have a problem, personally, sticking to Nightly for this, though.
Edit: I would worry more about these soundness issues in my project, but accessing static muts is already unsafe, so there's a built-in caveat emptor. (Project docs: "Safety: Don't let this reference escape the thread via...")
@Soveu the POSIX/ELF AMD64 ABI has 4 models for accessing Thread-Local Storage. See this doc on Thread-Local Storage models for more information.
- Global Dynamic (Section 4.1.6)
  - Uses a call to __tls_get_addr() for each global variable access
  - Used for extern/pub variables if Rust doesn't know if the code will be statically linked
- Local Dynamic (Section 4.2.6)
  - Uses a call to __tls_get_addr() to get the current TLS offset, then uses relative offsets for subsequent accesses
  - Used for private variables if Rust doesn't know if the code will be statically linked
- Initial Exec (Section 4.3.6)
  - Uses fs with the GOT and a @GOTTPOFF offset
  - Used for extern/pub variables if Rust knows that the code will be statically linked
- Local Exec (Section 4.4.6)
  - Uses fs without the GOT and a @TPOFF offset
  - Used for private variables if Rust knows that the code will be statically linked
Depending on how you end up linking the binary (and if you use LTO) Rust/LLVM might use one of the "Exec" models even if your relocation model is pic. Using the static relocation model just tells LLVM that the code will definitely be statically linked.
This example playground should be able to show all 4 models (if you switch pic to static).
There's also the -Ztls-model Nightly rustc flag, if you want to force a model to be used: https://doc.rust-lang.org/unstable-book/compiler-flags/tls-model.html.
Interesting, I thought TLS is basically a thread-local copy of .tdata and fs holds an offset between .tdata and the copy
btw, is there a way to tell rustc to use gs for thread locals instead of fs?
> btw, is there a way to tell rustc to use gs for thread locals instead of fs?
I would also really like this (for Kernels and the like), but I don't think LLVM supports this (as it's not one of the 4 TLS models described above). You would have to get a 5th TLS model (maybe called kernel) added to LLVM to support this.
I heard some rumors that Redox does that, but I can't find how
> I heard some rumors that Redox does that, but I can't find how
Looks like they modified their binutils: https://gitlab.redox-os.org/redox-os/binutils-gdb/-/merge_requests/5
> > I heard some rumors that Redox does that, but I can't find how
> Looks like they modified their binutils: https://gitlab.redox-os.org/redox-os/binutils-gdb/-/merge_requests/5
It would have been better if they had made this pull request to upstream binutils, then we all could use it.
Okay, I think I found a way to make things work using just #[thread_local] declarations, instead of full definitions. Only requires some form of asm support. playground
The trick is figuring out how to link the extern declaration to a full definition without LLVM noticing and ignoring the original declaration. The .equiv asm directive seems to accomplish this with minimal voodoo magic.
Would it be possible to partially stabilize this for let's say x86_64 architecture in user space code and by also not supporting problematic types like generators etc?
Marking as impl-incomplete since it may still have soundness issues, and may misbehave on platforms without TLS support.
EDIT: original proposal here was unsound wrt scoped threads, see #29594 (comment)
(click to open the original proposal)
I'm not sure where this should go, and I don't have the time for a complete RFC, but "a lifetime shorter than 'static" came up recently in the context of some rustc internal data structures (which are owned thread-locally, not coincidentally), so:
(with apologies to any previous suggestions similar to this, that may have been dismissed in the past - I couldn't find anything in this very issue, at least)
If we added a 'thread lifetime, like 'static but strictly shorter:
- borrowing a #[thread_local] (and even thread_local!) could produce a &'thread T reference
  - it should even be returnable from functions (as long as it doesn't leave the thread, see below)
- to preserve lifetime parametricity, &'thread T: Send + Sync has to be true if T: Sync (just like with &'a T for any other 'a), and soundness of 'thread usage shouldn't require it
  - in fact, it should be entirely possible to pass such a &'thread T "down" to a scoped thread, just like how f(&THREAD_LOCAL) (with #[thread_local]) or THREAD_LOCAL.with(|data| f(data)) (with thread_local!) work today
- everything that requires T: 'static today would still not allow anything containing these new 'thread lifetimes, and that includes:
  - potentially-detachable (i.e. non-scoped) threads
    - any kind of communication between non-scoped threads is also indirectly affected by that bound, i.e. mpsc::Receiver<T>: 'static requires T: 'static
    - this also extends to thread pools running async tasks: a Future from async fn, that is required to be 'static, cannot keep a &'thread T across a suspension point
  - thread_local!'s data type
    - essentially stating "no &'thread T can get from the normal thread execution, into thread_local! destructor execution"
    - a thread_local! destructor could itself obtain a &'thread T reference, but it would be limited in scope to that destructor's execution, since there would be no place available to stash it
    - for this to be sound, I believe #[thread_local] also needs the 'static bound, otherwise it could hold a reference to a thread_local!, that a different thread_local!'s destructor could read back
- there are two styles for "getting short-lived references" APIs today:
  - fn with(&self, f: impl FnOnce(&T))
    - only option sound today for thread-local Ts, thread_local! uses it
  - fn get(&self) -> &T (Deref-like)
    - with owned self, this is unsound because Box::leak(Box::new(self)).get() returns a &'static T, even if self itself is !Send/!Sync
    - this is (incorrectly) used in a few places in rustc, today, and that's where this whole idea came from
  - but with 'thread, we could have the best of both worlds, and return &'thread T, to be more flexible than today's sound option (with) while (hopefully?) remaining sound
    - only downside is Deref can't be implemented directly (except on a type that has &'thread T as a field itself)
- a struct with PhantomData<&'thread ()> would make that type never pass a 'static bound check, so even Box::leak(Box::new(self)) would only produce a &'thread Self, not a &'static Self, meaning a Deref-like API would actually remain sound
  - though this sort of thing could complicate the implementation, unless it's only allowed by making the struct lifetime-generic and passing 'thread in that way
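The Box::leak point from the list above can be reproduced on stable today. The following is only a sketch of the problem, not of the proposed 'thread lifetime (which does not exist); ThreadOwned and leak_to_static are hypothetical names.

```rust
use std::marker::PhantomData;

// Hypothetical stand-in for a value owned by one thread: the raw-pointer
// PhantomData makes it !Send and !Sync, but the type is still 'static.
struct ThreadOwned {
    value: u32,
    _not_send: PhantomData<*const ()>,
}

impl ThreadOwned {
    fn new(value: u32) -> Self {
        ThreadOwned { value, _not_send: PhantomData }
    }

    // Deref-like accessor: the returned reference borrows from `self`.
    fn get(&self) -> &u32 {
        &self.value
    }
}

// Leaking an owned value stretches that borrow all the way to 'static,
// even though the type is neither Send nor Sync.
fn leak_to_static(owned: ThreadOwned) -> &'static u32 {
    Box::leak(Box::new(owned)).get()
}
```

If the value behind get() were really thread-local data, the &'static u32 would outlive the thread that owns it, which is why only the with-style API is sound for such data today.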
Based on the little I remember about how 'static is implemented, there's a good chance of this being easy to fully implement (copy how 'static is handled, but make it strictly "shorter" in the lattice), but I wouldn't bet on it just yet.
EDIT:
- while looking around for precedents, I found a full-fledged RFC (rust-lang/rfcs#1705) from 2016, that I had completely forgot about - looking in it now to compare it...
- oh, it made the classic mistake of proposing &'thread T: !Send (which is impossible because of lifetime parametricity, see above) instead of relying only on 'thread < 'static:
  > Any type depending on 'thread (i.e., a type product of the type construction from 'thread) is !Send, and thus bounded to the current thread.
  - @nikomatsakis caught onto that issue in this comment: rust-lang/rfcs#1705 (comment)
- overall I feel like that RFC was not arguing for itself very well - the fn-scoped #[thread_local] limitation we ended up with instead is enough for soundness, and there is no talk about letting the LocalKey API of thread_local! use the 'thread lifetime etc.
@eddyb
I really hope for this feature to be stabilized as soon as possible, but I must ask the following question:
> it should be entirely possible to pass such a &'thread T "down" to a scoped thread
Is it guaranteed that the same thread-local pointer stays valid when passed to a different thread? I can imagine a system which maps the same numeric pointer to different physical addresses for different threads via MMU magic.
> Is it guaranteed that the same thread-local pointer stays valid when passed to a different thread? I can imagine a system which has the same numeric pointer to point to different physical addresses for different threads via MMU magic.
That is not allowed. Thread-local variables for different threads must have distinct addresses.
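The distinct-address property can be observed on stable with thread_local!. This is a small sketch under the assumption that both threads are alive at the same time (a Barrier keeps them alive while the addresses are taken); SLOT, probe and addresses_in_two_live_threads are hypothetical names.

```rust
use std::sync::Barrier;
use std::thread;

thread_local! {
    // One byte of thread-local storage per thread.
    static SLOT: u8 = const { 0 };
}

// Records the address of this thread's SLOT, then waits so that the
// thread stays alive until the other thread has recorded its address too.
fn probe(barrier: &Barrier) -> usize {
    let addr = SLOT.with(|slot| slot as *const u8 as usize);
    barrier.wait();
    addr
}

// Returns the SLOT addresses seen by two simultaneously live threads.
fn addresses_in_two_live_threads() -> (usize, usize) {
    let barrier = Barrier::new(2);
    thread::scope(|s| {
        let a = s.spawn(|| probe(&barrier));
        let b = s.spawn(|| probe(&barrier));
        (a.join().unwrap(), b.join().unwrap())
    })
}
```

The two addresses differ while both threads are alive; once a thread has exited, its storage may be reused, so addresses taken from threads that never overlap carry no such guarantee.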
> > Is it guaranteed that the same thread-local pointer stays valid when passed to a different thread? I can imagine a system which has the same numeric pointer to point to different physical addresses for different threads via MMU magic.
> That is not allowed. Thread-local variables for different threads must have distinct addresses.
@newpavlov To expand a bit further: such a pointer could not, in Rust, use the type &T, and instead would have to be some kind of wrapper that only produces a &T if it can compute a "global" pointer (i.e. one valid across all threads).
If T: Sync, then &T: Send holds and the pointer can make its way into a different thread, as long as it's not accessed outside of its original scope. Even with a !Sync pointee, I'd still be wary of any further derived reference existing (&Foo doesn't give Foo as much control over the reference as a separate FooRef type would).
This applies to thread_local!'s .with(|short_lived_ref| ...) method today, which doesn't stop you in any way from e.g. spawning some scoped threads and capturing short_lived_ref in them, inside the closure.
(And #[thread_local] statics have their equivalent, where do_anything_with(&THREAD_LOCAL_STATIC) passes the reference "down the stack" to a function that can't tell it apart from any other reference).
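A minimal stable sketch of that .with + scoped thread pattern, assuming a Sync pointee (AtomicU32 here) so that the reference is Send. VALUE and observe_from_scoped_thread are hypothetical names.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

thread_local! {
    static VALUE: AtomicU32 = const { AtomicU32::new(0) };
}

// Returns (value read through the parent's reference, value of the
// scoped thread's own copy of VALUE).
fn observe_from_scoped_thread() -> (u32, u32) {
    VALUE.with(|parent_slot| {
        // Only the calling thread's copy is modified here.
        parent_slot.store(7, Ordering::Relaxed);
        thread::scope(|s| {
            s.spawn(move || {
                // The captured reference still points at the parent's copy...
                let via_ref = parent_slot.load(Ordering::Relaxed);
                // ...while naming VALUE resolves to this thread's own copy.
                let own = VALUE.with(|child_slot| child_slot.load(Ordering::Relaxed));
                (via_ref, own)
            })
            .join()
            .unwrap()
        })
    })
}
```

The scoped thread reads 7 through the borrowed reference and 0 from its own copy, so the short-lived reference really does cross threads while staying inside the closure passed to with.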
Reading back my sketch for 'thread above (#29594 (comment)), and specifically:
> in fact, it should be entirely possible to pass such a &'thread T "down" to a scoped thread,
I had some thoughts about interesting lattice interpretations of 'thread, which is that it's somewhere between 'static and all other stack-related lifetimes within a thread.
With detached threads requiring 'static bounds to pass data between them, each detached thread ends up with a hierarchy of 'static > 'threadX > ... (for some thread X), so 'thread can be seen as a kind of combination (union/intersection aka join/meet, whichever is correct) of all 'threadX.
But that only works for detached threads - for scoped threads you end up with 'threadX lifetimes that can be arbitrarily small, which means 'thread actually becomes isomorphic to "the empty lifetime" (i.e. it has to be treated as shorter than any other lifetime, kind of like a "bottom" equivalent) and it's impossible to do almost anything with it, if you're to remain sound.
In fewer words, my original proposal was unsound as stated.
As an example, imagine passing &'a Cell<&'thread T> to a scoped thread - if it can place its own TLS references in there, those will become invalid when the scoped thread exits.
But if we take the correct interpretation mentioned earlier, that type is illegal, because 'thread is shorter than any lifetime (including 'a) - problem averted, right?
Well, but now you can never have let x = &THREAD_LOCAL; because x has a scope that's arguably longer than 'thread (it's shorter than the current thread, but with just one 'thread there's no way to distinguish).
So you're either useful and unsound wrt scoped threads, or sound but useless.
There might be a way to salvage the hope of a 'thread that doesn't need to interfere with Sync/Send at all, which I didn't come up with myself but was suggested to me by @eternaleye:
'thread could become an implicit extra lifetime parameter to all functions. Given @tmandry and @nikomatsakis' discussions around varieties of contextual implicits, it could be seen as a compiler-implied with 'thread.
To limit compilation performance impact and whatnot, it would ideally only be nameable anywhere in the function if it shows up in the signature or where clauses, and calling a function that needs 'thread from one that doesn't, could just use the outermost scope of the caller (effectively "statically known top of the thread", which is appropriate, given that the caller could literally be a thread entry-point for all we know).
For detached threads, we have the same solution: 'thread won't pass the 'static bounds required by e.g. thread::spawn, unless you have a 'thread: 'static bound on the caller - which could either never be satisfiable, or, as an interesting twist, could perhaps be satisfiable inside fn main specifically, encoding in the language that "the main thread lives forever" (and yes this would also mean &THREAD_LOCAL from within fn main would be &'static - could have interesting implications).
For scoped threads, it boils down to "only direct calls to functions pass 'thread" - the amount of dynamism (whether fn pointers or dyn FnOnce) involved forces the new thread's entry-point to effectively be 'thread-polymorphic, and anything using 'thread from the parent thread looks like it could be any stack lifetime.
The earlier example of &'a Cell<&'thread T> would just be a &'a Cell<&'b T>, and &'b T could only be created by the scoped thread using any other &'b references it may have gotten from its parent thread.
In fact, without some with 'thread-like abstraction at the trait impl level, even calling a trait method should probably leave the callee 'thread-polymorphic for now.
So writing 'thread anywhere other than in the signature/where clauses of a free function or inherent impl, should be an error (i.e. a type or trait definition should take an explicit lifetime parameter and not refer to 'thread).
A limited version of this feature could be implemented right away, and because it's mostly just desugaring to lifetime parameters (with only the choice of what's passed for the parameter in direct calls, and the lifetime in &THREAD_LOCAL's type, being 'thread-specific semantics), it's way more likely to be sound.
#[thread_local] causes an Internal Compiler Error if used with proc macro-specific types (like proc_macro::{Delimiter, Group, Ident, Punct, Spacing, Span, TokenStream, TokenTree}). Could we either
- document that it's not for proc macro-specific types, or
- have a tracking issue for this, please?
Segmentation fault on Windows. Is it a misuse of the feature, or a bug?
Would it make sense to just reject the attribute on targets where target_thread_local is not set?
> #[thread_local] causes an Internal Compiler Error if used with proc macro-specific types (like proc_macro::{Delimiter, Group, Ident, Punct, Spacing, Span, TokenStream, TokenTree}). Could we either
This has nothing to do with this tracking issue; it's about the entire proc-macro system being incompatible with thread-local state (no matter how that state got implemented).
This tracking issue is about thread-local state specifically implemented via the native mechanism of the linker (as opposed to something like pthread keys). The thread_local! macro has different implementations depending on the target; sometimes it uses linker-native thread-locals (internally using the feature tracked here), sometimes it uses slower but less fragile run-time OS-provided mechanisms.
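Whichever implementation the macro picks on a given target, the observable behaviour is the same. A minimal stable sketch of that behaviour, using a !Sync type (Cell) that an ordinary static would reject; HITS, hit and counts_are_per_thread are hypothetical names.

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    // Cell<u32> is not Sync; each thread gets its own lazily initialized copy.
    static HITS: Cell<u32> = Cell::new(0);
}

// Increments the calling thread's counter and returns the new value.
fn hit() -> u32 {
    HITS.with(|hits| {
        hits.set(hits.get() + 1);
        hits.get()
    })
}

// Returns (parent count, child count) to show that the copies are independent.
fn counts_are_per_thread() -> (u32, u32) {
    HITS.with(|hits| hits.set(10)); // affects only the calling thread
    let child = thread::spawn(hit).join().unwrap(); // fresh copy starts at 0
    let parent = hit();
    (parent, child)
}
```

Here the parent ends at 11 and the child at 1, regardless of whether the target uses linker-native thread-locals or an OS-provided key mechanism underneath.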
I wonder if there's a path towards a minimal thread_local stabilization by being very restrictive?
E.g.:
- all access is by value only
- only allow Copy types (or maybe even restrict it to only a subset of primitives)
- don't allow lifetimes
- don't allow composing thread_local with other lang attributes (they may interact in "interesting" ways)
- make thread_local error if the platform does not support it. This would imply also stabilizing target_thread_local but maybe the name should be subject to bikeshedding first (e.g. has_static_thread_local) to be clear that it may still have runtime thread locals.
What is the motivation for that? Doesn't thread_local! { static NAME = const { ... } } suffice?
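For reference, a sketch of what that const form looks like with by-value access to a Copy type, which is close to the restrictive subset listed above. It needs std, so it does not address the no_std case; DEPTH, depth, set_depth and nested are hypothetical names.

```rust
use std::cell::Cell;

thread_local! {
    // The `const { ... }` initializer is evaluated at compile time; the std
    // documentation notes that this can enable a more efficient
    // implementation that avoids lazy initialization.
    static DEPTH: Cell<usize> = const { Cell::new(0) };
}

// By-value read of the calling thread's DEPTH.
fn depth() -> usize {
    DEPTH.with(|d| d.get())
}

// By-value write of the calling thread's DEPTH.
fn set_depth(value: usize) {
    DEPTH.with(|d| d.set(value))
}

// Runs `f` with DEPTH incremented, then restores the previous value.
fn nested<R>(f: impl FnOnce() -> R) -> R {
    let saved = depth();
    set_depth(saved + 1);
    let result = f();
    set_depth(saved);
    result
}
```

For example, nested(|| nested(depth)) evaluates to 2 on a thread where DEPTH started at 0, and DEPTH is back at 0 afterwards. Every access still goes through a shared reference inside with, which is one of the API differences from #[thread_local] discussed in this thread.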
no_std mainly
Does that form of the macro need anything that requires std? Could there be a core::thread_local! { ... } macro that makes that part of the functionality available without std (i.e., it would require const blocks)?
We could make a macro I guess. It'd be pretty redundant though when/if thread_local proper is stabilized. Unless we decide to just have a macro and not the attribute.
The API provided by #[thread_local] static and const-thread_local is pretty different I think. So your proposal does introduce redundancy that we currently don't have. There'd be two stable ways to do the same thing and two completely different mechanisms to ensure the necessary restrictions (such as "no 'static references").
The "two ways to do things" will occur whatever happens unless we simply don't stabilize #[thread_local] static.
That is true. If the usecases are covered I don't see a reason to stabilize #[thread_local] static. It could be transitioned to an internal feature, an implementation detail of the public macros.
core::thread_local! would also need LocalKey moved into core::thread (or an equivalent to it). One thing that thread_local! forces that #[thread_local] doesn't is going through a shared reference. There would also still need to be a way to know if thread_local is supported in no_std.
> What is the motivation for that? Doesn't thread_local! { static NAME = const { ... } } suffice?
IIRC generated assembly for thread_local! was quite ugly when compared to equivalent #[thread_local] code. I haven't measured performance impact and do not know if it's possible to work around it with std changes, but it's still an example of non-zero-costness. Also, #[thread_local]-based code simply looks nicer and more ergonomic.
For the new thread_local! { static FOO: ... = const { ... } } syntax it should all be optimized down to a minimum. If not then we should really try to fix that.
EDIT: assuming the type of FOO has no drop, of course.
> Also, #[thread_local]-based code simply looks nicer and more ergonomic.
If they only work for Copy types and you can't take any references (and hence also not call any &self/&mut self methods), I am not sure if that's still true.
And taking references would be unsound.
> And taking references would be unsound.
Borrowck limits those references to the current function already I believe.
Yes. The goal would be to eventually make full #[thread_local] stabilized (once bugs, etc are fixed). But that's not happening any time soon and in the meantime a minimal stabilization would be useful.
Oh interesting, I wasn't aware that borrowck had special treatment for thread_local statics.
I'd be very interested in seeing this stabilized.
I use #[thread_local] in the interface code for my WIP OS, as an import/export for thread_local statics (https://github.com/LiliumOS/lilium-sys/blob/main/src/sys/io.rs#L44-L54). The handles here are thread-local (as handles themselves are thread-local resources in the OS) that are initialized by the USI (userspace standard interface) when you create a thread (This specific case is limited to a Copy type and you probably won't be borrowing the handles often at all).
In general, being able to import an external TLS var without shim code written in C is useful (for example, if you want to grab __errno). This cannot be done with LocalKey and std::thread_local!.
#[thread_local] would also enable using TLS in a no-std context, such as a kernel.
If they only work for Copy types and you can't take any references (and hence also not call any &self/&mut self methods), I am not sure if that's still true.
How about introducing a pointer-based variant of #[thread_local]
? Something like this:
// Creates a TLS value which stores 42 and creates a "pointer" `FOO` to it.
#[thread_local_ptr]
static FOO: *mut u64 = 42u64;

fn increment_foo() {
    // SAFETY: the reference to `FOO` does not escape the executing thread
    let foo: &mut u64 = unsafe { &mut *FOO };
    *foo += 1;
}
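The `#[thread_local_ptr]` attribute above is only a proposal and does not exist. As a rough approximation on stable Rust, a similar pointer-based access pattern can be sketched with `thread_local!` and `Cell::as_ptr`. The helper `foo_ptr` is a hypothetical name, and the sketch assumes the pointer is only used on the thread that obtained it, while that thread is still running.

```rust
use std::cell::Cell;

thread_local! {
    static FOO: Cell<u64> = const { Cell::new(42) };
}

/// Returns a raw pointer to the calling thread's `FOO` value.
/// (Hypothetical helper standing in for the proposed attribute.)
fn foo_ptr() -> *mut u64 {
    FOO.with(|cell| cell.as_ptr())
}

/// Increments the calling thread's `FOO` through the raw pointer
/// and returns the updated value.
fn increment_foo() -> u64 {
    let p = foo_ptr();
    // SAFETY: `p` points into this thread's TLS slot, is used only on this
    // thread while it is running, and no reference to the value is live.
    unsafe {
        *p += 1;
        *p
    }
}
```

This keeps the unsafety at the point of dereference, as in the proposal, but still pays for whatever access path `thread_local!` uses on the target.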
@newpavlov it seems someone already taught the borrow checker about thread_local so taking references to them is actually sound. That's pretty cool.
I don't like there being this API duplication and inconsistency between that and the thread_local!
macro, but that's up to libs-api to figure out.
We'd have to be rather careful where we enable this feature; in the past, I think on some Windows targets we switched back-and-forth between "true" thread-local statics and some other implementation for the thread_local!
macro. The macro gives us the flexibility to do that; #[thread_local]
does not, so once we allow it somewhere, if we later figure out there is some platform issue then we are in trouble.
I'm slightly bothered by thread-local statics pretending to be statics. They have very little in common with regular static
in terms of their semantics. &FOO
isn't even a constant, it is a function. We do reflect this in the MIR at least so bugs due to this are hopefully unlikely. And I don't have a proposal for a better syntax either so 🤷
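The observation that the address of a thread-local is not a constant can be seen even with the stable macro. A minimal sketch, with `SLOT`, `slot_addr` and `addrs_on_two_threads` as hypothetical names:

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    static SLOT: Cell<u8> = const { Cell::new(0) };
}

/// Address of the calling thread's copy of `SLOT`.
fn slot_addr() -> usize {
    SLOT.with(|s| s.as_ptr() as usize)
}

/// Returns (address seen by the caller, address seen by a new thread).
/// The calling thread stays alive while the new thread runs, so the two
/// copies exist at the same time and cannot share an address.
fn addrs_on_two_threads() -> (usize, usize) {
    let here = slot_addr();
    let there = thread::spawn(slot_addr).join().expect("thread panicked");
    (here, there)
}
```

Within one thread the address is stable across calls. Across two live threads it differs, which is why taking the address has to be computed at run time rather than resolved as a link-time constant.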
I have a few use cases for #[thread_local]
that can't be satisfied by the standard library's thread_local!
macro:
- The code base is a no_std binary which doesn't depend on libc (it has its own TLS/stack initialization code).
- It exports #[thread_local] variables for use by C code (errno).
- It used to reference #[thread_local] variables in inline assembly (by symbol name), although that code has since been refactored and it no longer does that. It is possible to know the exact instruction sequence to use with the symbol because the whole code base is compiled with -Z tls-model=local-exec.
#[thread_local]
translates directly to the thread_local
attribute in LLVM. This isn't supported on all platforms, and it's not even supported on all distributions within the same platform (e.g. macOS 10.6 doesn't support it but 10.7 does). I don't think this is necessarily a blocker, but I also don't think we have many attributes and such which are so platform specific like this.
What's the status of this? I see a lot of discussion around a hypothetical cfg(target_thread_local)
, but nothing concrete?
It's not hypothetical, cfg(target_thread_local)
exists on nightly. However, historically we have turned this on and off on some platforms to work around various bugs, so we should be careful before just blanket-exposing this on stable.
It used to reference #[thread_local] variables in inline assembly (by symbol name), although that code has since been refactored and it no longer does that. It is possible to know the exact instruction sequence to use with the symbol because the whole code base is compiled with -Z tls-model=local-exec.
If we just allow asm blocks to reference thread-locals, I worry it will lead to much confusion and errors. A thread-local is not just a normal symbol, after all -- but a Rust programmer might think it is, since in Rust code it behaves much like a static, but that's a sweet lie.
Unfortunately the inline asm docs don't even give an example for how sym
is used at all. They do mention this though:
is allowed to point to a #[thread_local] static, in which case the asm code can combine the symbol with relocations (e.g. @plt, @tpoff) to read from thread-local data.
Presumably, it is UB to try to access that symbol without the exactly right set of relocations matching the current target and build flags? Seems like a pretty big footgun.
Presumably, it is UB to try to access that symbol without the exactly right set of relocations matching the current target and build flags? Seems like a pretty big footgun.
It's always safe to use the most general relocations (which involve calling __tls_get_addr
). It's the more specific ones like @tpoff
which are only valid with certain TLS models (for example local-exec
is only valid in executables, not shared libraries).
Using __tls_get_addr
requires you to emit a very specific set of bytes and relocations (on x86 this includes redundant prefixes). Getting anything wrong will likely cause TLS relaxation by the linker to either error or generate a corrupt binary. And even with the most general relocations you still have to deal with different object file formats using different ways of accessing TLS. We currently don't have any cfgs for the object file format, so doing something that works correctly on any OS for a given architecture is not possible.
Just curious, are there any thread-local modes that aren't based on ELF's thread structure system? e.g. on x86_64 will we eventually be able to simply emit instructions that use FS/GS segments directly instead of reading a pointer from a negative offset and de-referencing it?
As far as I can tell that's GNU's/ELF's TLS model that doesn't really translate nicely to #![no_std]
code (sans-OS) on x86_64.