How do the thread local variables in the Rust standard library work?

How do the thread local variables in the Rust standard library work? I looked at the code, but got lost in indirection. It seems that there are different configurations for thread local storage, an OS dependent mode and a fast mode. Which one is the default, and how do I choose which one to use? In particular, what are the implications of using thread local storage in a crate for the user of the crate?

Using thread local storage is simple enough, and the generated assembly looks really efficient, but I cannot use the feature in a library without fully understanding the implications.

I've also asked:

Masturbation answered 19/11, 2019 at 20:15 Comment(8)
Did you click through to the suggested reading of LocalKey? It states: "This key uses the fastest possible implementation available to it for the target platform" and "Initialization is dynamically performed on the first call to with within a thread, and values that implement Drop get destructed when a thread exits", which appear to answer both of your questions. - Akira
Yes, of course I read that. That tells me what it does and that it tries to be fast, but not how it works. E.g. it does not help me answer the question of how many bytes per thread it will cost library users if I use a thread local in a library... - Celestyn
The current close vote is "Too broad: Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer". You have indeed asked a very broad question ("how is something implemented") and have actually asked multiple questions (+ "which is default" + "how do I choose" + "what are the implications" + "how much space does it take"). Any answer that covered these topics to a thorough degree would be a chapter in a book, which is a pretty good sign that the question is overly broad. - Akira
No idea about the title though; the only similar thing I've seen personally is a warning that specific titles have historically been downvoted for being low-quality. As you've seen, I was able to edit it (and I bet you could have too). - Akira
If I ask a concrete question it will probably be closed because it cannot be simply answered. But I will try it anyway, in another question... Feel free to close this one... - Celestyn
Of the ones I've identified, "what is the space overhead of using a thread local" seems reasonable and answerable. However, because there are platform-specific details, it may indeed be very hard to answer, but should be on-topic. It would probably be good if you identified a base set of platforms you were interested in (e.g. 64-bit Mac/Linux/Windows). - Akira
Let us continue this discussion in chat. - Celestyn
As far as I remember, Rust thread locals are based on the pthreads library (on Linux), so the overhead is the same as for pthreads. See the manual for pthread_key_create() and related functions. - Dodecanese

The different configurations you see are related to #[thread_local], a feature that is intended to replace the thread_local! macro to make the user code more straightforward, e.g.:

#![feature(thread_local)]

#[thread_local]
pub static mut VAR: u64 = 42;

However, at the time of writing, the #[thread_local] attribute is not fully implemented yet (you can find the tracking issue here). It is used internally in the standard library though, and that is the magic you see in the actual 'fast' implementation in std::thread::LocalKey:

#[thread_local]
#[cfg(all(
    target_thread_local,
    not(all(target_arch = "wasm32", not(target_feature = "atomics"))),
))]
static __KEY: $crate::thread::__FastLocalKeyInner<$t> =
    $crate::thread::__FastLocalKeyInner::new();

Notice the #[thread_local] attribute at the top. It is translated down to LLVM IR, so the actual implementation of TLS (thread-local storage) is carried out by LLVM, which implements the ELF TLS models. This is the default configuration on targets where target_thread_local is set.
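From user code on stable Rust, all of this stays hidden behind the thread_local! macro. A minimal sketch, assuming an illustrative variable VAR and a hypothetical helper bump_in_new_thread (neither comes from the standard library), showing that each thread works on its own copy:

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    // Each thread gets its own VAR, initialized to 42 on that thread's first access.
    static VAR: Cell<u64> = Cell::new(42);
}

// Hypothetical helper: increments VAR inside a fresh thread and
// returns the value that thread ends up with.
fn bump_in_new_thread() -> u64 {
    thread::spawn(|| {
        VAR.with(|v| {
            v.set(v.get() + 1);
            v.get()
        })
    })
    .join()
    .unwrap()
}

fn main() {
    // Change the main thread's copy; the spawned thread does not see this write.
    VAR.with(|v| v.set(100));
    let other = bump_in_new_thread();
    let here = VAR.with(|v| v.get());
    println!("spawned thread: {}, main thread: {}", other, here);
}
```

Cell is used here only because a thread_local! static is handed out as a shared reference; RefCell would be the usual choice for non-Copy data.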

how do I choose which one to use?

You'll need to compile your own rustc version with the target_thread_local feature omitted. In that case, the os variant of std::thread::LocalKey will be used, and then, depending on the platform, it can use pthreads (Unix), the Windows API, or something else.

WebAssembly is a special case: as it doesn't support threads, TLS will be translated down to simple static variables.
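Whichever variant is selected, the behaviour visible through LocalKey should match the documentation quoted in the comments: the value is created lazily on the first call to with in a thread, and dropped when that thread exits. A small sketch that counts both events; the names INITS, DROPS, Tracked, SLOT and run_one_thread are hypothetical, and reading the drop counter right after join assumes the platform runs TLS destructors before the thread is reported as finished:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Global counters, shared by all threads (illustrative names).
static INITS: AtomicUsize = AtomicUsize::new(0);
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Tracked;

impl Tracked {
    fn new() -> Self {
        INITS.fetch_add(1, Ordering::SeqCst);
        Tracked
    }
}

impl Drop for Tracked {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

thread_local! {
    // The initializer is not run until a thread first calls SLOT.with.
    static SLOT: Tracked = Tracked::new();
}

// Runs one thread and returns how many (initializations, drops) it caused.
fn run_one_thread(touch: bool) -> (usize, usize) {
    let inits_before = INITS.load(Ordering::SeqCst);
    let drops_before = DROPS.load(Ordering::SeqCst);
    thread::spawn(move || {
        if touch {
            SLOT.with(|_| ());
        }
    })
    .join()
    .unwrap();
    (
        INITS.load(Ordering::SeqCst) - inits_before,
        DROPS.load(Ordering::SeqCst) - drops_before,
    )
}

fn main() {
    println!("thread that never touches SLOT: {:?}", run_one_thread(false));
    println!("thread that touches SLOT once:  {:?}", run_one_thread(true));
}
```

This does not measure the per-thread memory cost the question asks about; it only makes the initialization and destruction points observable.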

Checked answered 8/5, 2020 at 14:11 Comment(0)
