Capturing a `thread_local` in a lambda

Capturing a thread_local in a lambda:

#include <iostream>
#include <thread>
#include <string>

struct Person
{
    std::string name;
};

int main()
{
    thread_local Person user{"mike"};
    Person& referenceToUser = user;

    // Works fine - Prints "Hello mike"
    std::thread([&]() {std::cout << "Hello " << referenceToUser.name << std::endl;}).join();

    // Doesn't work - Prints "Hello"
    std::thread([&]() {std::cout << "Hello " << user.name << std::endl;}).join();

    // Works fine - Prints "Hello mike"
    std::thread([&user=user]() {std::cout << "Hello " << user.name << std::endl;}).join();
}

https://godbolt.org/z/zeocG5ohb

It seems that if I use the original name of a thread_local inside the lambda, I get the instance belonging to the thread that runs the lambda. But as soon as I take a reference or pointer to the thread_local, it refers to the originating thread's instance.

What are the rules here? Can I rely on this analysis?

Urga answered 13/6, 2023 at 12:39 Comment(5)
Please explain what "Works fine" and "Doesn't work" mean so people don't have to click through and figure out what you could mean. - Germann
From my reading of cppreference, this just isn't allowed: "A lambda expression can use a variable without capturing it if the variable is a non-local variable or has static or thread local storage duration (in which case the variable cannot be captured), or" (source: en.cppreference.com/w/cpp/language/lambda) - Disenchant
@NathanOliver-IsonStrike I think it's ok, as in example 2 I'm not capturing anything at all (you can remove the & in the [&]). In examples 1 and 3 I'm capturing something which isn't the thread_local. Example 2 still has an issue, as discussed in the accepted answer, though. - Urga
@MikeVine What I was getting at is that 2 doesn't capture user even though you specified &. Instead you have a new thread local variable in your lambda that has never been initialized. - Disenchant
@NathanOliver-IsonStrike: "new thread local variable in your lambda". Well, not exactly. A thread local variable with a new instance in the std::thread, which the lambda finds when it runs inside the std::thread. But it isn't "in the lambda", as that same exact lambda could be executed on the main thread and find the old thread local instance. - Lamoree

Similar to local static objects, local thread_local (implicitly static thread_local) objects are initialized when control passes through their declaration for the first time. For thread_local objects, this happens separately on each thread.

The thread you are creating never executes main, only the main thread does, so you're accessing user before its lifetime has begun on the extra thread.
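
The per-thread initialization can be seen in a well-defined variant where every thread does pass through the declaration. This is only a minimal sketch: greeting and read_on_worker are hypothetical names, not part of the question's code.

```cpp
#include <cassert>
#include <string>
#include <thread>

// Hypothetical helper: returns the calling thread's instance of a block-scope
// thread_local. The initializer runs the first time each thread gets here.
std::string& greeting()
{
    thread_local std::string text{"mike"};
    return text;
}

// Hypothetical demo: changes the caller's instance, then reads the value a
// freshly started worker thread sees on its own first pass.
std::string read_on_worker()
{
    greeting() = "changed on caller";
    std::string seen;
    std::thread([&seen] { seen = greeting(); }).join();
    // The worker initialized and read its own instance; ours is untouched.
    assert(greeting() == "changed on caller");
    return seen;
}
```

Because the worker calls greeting() itself, it passes through the declaration and gets a freshly initialized instance, so read_on_worker() should return "mike" regardless of what the caller did to its own copy.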

Your three cases explained

std::thread([&]() {std::cout << "Hello " << referenceToUser.name << std::endl;}).join();

We are capturing referenceToUser which refers to the user on the main thread. This is okay.

std::thread([&]() {std::cout << "Hello " << user.name << std::endl;}).join();

We are accessing user on the extra thread before its lifetime has begun. This is undefined behavior.

std::thread([&user=user]() {std::cout << "Hello " << user.name << std::endl;}).join();

Once again, we are referencing the user from the main thread here, which is the same as the first case.
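
To make the difference between "reference to the main thread's object" and "the name, looked up on the running thread" observable without undefined behavior, here is a rough sketch that compares addresses. It assumes a namespace-scope thread_local (unlike the question) so that every thread's instance is initialized; slot, Observed and observe_on_worker are made-up names.

```cpp
#include <cassert>
#include <thread>

// Assumption: namespace scope, so each thread's instance is initialized.
thread_local int slot = 0;

struct Observed
{
    bool reference_is_callers; // the captured reference names the caller's instance
    bool name_is_callers;      // the plain name names the caller's instance
};

Observed observe_on_worker()
{
    int* callers = &slot; // address of the calling thread's instance
    int& ref = slot;      // plays the role of referenceToUser
    assert(&ref == callers);

    Observed result{};
    std::thread([&result, &ref, callers] {
        result.reference_is_callers = (&ref == callers);
        // 'slot' is not captured; here it names the worker's own instance.
        result.name_is_callers = (&slot == callers);
    }).join();
    return result;
}
```

The expectation is that reference_is_callers comes out true and name_is_callers comes out false, which matches cases 1 and 3 versus case 2 above.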

Possible fix

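If you declare user outside of main (at namespace scope), then each thread's user is initialized for that thread no later than its first use there, rather than only when control passes through a declaration in main: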

thread_local Person user{"mike"};

int main() {
    // ...
}

Alternatively, declare user inside of your lambda expression.
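
Both suggestions could look roughly like this. It is a sketch under assumptions: global_user and the function names are invented for illustration, and the result is passed back through an explicitly captured local so it can be checked.

```cpp
#include <cassert>
#include <string>
#include <thread>

struct Person
{
    std::string name;
};

// Fix 1: namespace scope. Each thread's instance is initialized no later
// than that thread's first use of it.
thread_local Person global_user{"mike"};

std::string name_via_namespace_scope()
{
    std::string seen;
    std::thread([&seen] { seen = global_user.name; }).join();
    return seen;
}

// Fix 2: declare the variable inside the lambda, so the worker thread itself
// passes through the declaration before using the object.
std::string name_via_local_declaration()
{
    std::string seen;
    std::thread([&seen] {
        thread_local Person user{"mike"};
        seen = user.name;
    }).join();
    return seen;
}

// Hypothetical self-check; call it from main to try the sketch.
void check_fixes()
{
    assert(name_via_namespace_scope() == "mike");
    assert(name_via_local_declaration() == "mike");
}
```

In both variants the worker thread reads an object whose lifetime has begun on that thread, so the access is well-defined.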


Note: It's not necessary to capture thread_local objects. The second example could be [] { ... }.
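
One way to convince yourself that nothing needs to be captured: a lambda with an empty capture list can still use a thread_local, and it converts to a plain function pointer, which only capture-less lambdas do. A small sketch, assuming a namespace-scope variable; counter and bump_on_worker are hypothetical names.

```cpp
#include <cassert>
#include <thread>

// Assumption: namespace scope, so the worker's instance is initialized.
thread_local int counter = 0;

int bump_on_worker()
{
    // No captures at all, yet 'counter' is usable. The conversion to a
    // function pointer compiles only because the lambda is capture-less.
    int (*bump)() = [] { return ++counter; };

    int seen = 0;
    std::thread([&seen, bump] {
        bump();
        seen = bump(); // second increment of the worker's own instance
    }).join();

    // The calling thread's instance was never touched by the worker.
    assert(counter == 0);
    return seen;
}
```

Since each call to bump() resolves counter on whichever thread runs it, the worker ends up at 2 while the caller's instance stays at 0.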

Burly answered 13/6, 2023 at 12:50 Comment(8)
I guess this is a choice the designers of C++ made - whether to (for want of a better phrase) "capture thread_locals by name instead of by address". I guess statics & globals & function-names, etc are all captured by name but for those it really doesn't matter as they don't change across threads. For thread_locals the choice does matter and it seems like they're treated as statics instead of locals. - Urga
I guess not "capturing by name" effectively means they don't need to be stored at all so this lambda in example 2 is stateless. I think I get it now. Not that I think this is the right choice, but I guess the user can use the method I use in example 1 or 3 to change this to the other option so it's a better choice, even if slightly confusing at first look. - Urga
@MikeVine if you used the method from example 1 or 3, then there would be no point to using thread_local, since all threads are using the same object. Also, I think this behavior for static thread_local is a massive footgun in C++, and very bad design. You don't even get a warning and I couldn't get any sanitizer to catch this mistake. - Burly
I don't think this is necessarily not useful - I can think of times where I want to pass in the thread_local by "address" and let other threads use it. Imagine a logger which wrote to the log file on another thread and I wanted that logger to reference this thread's instance of a thread_local. That could (and would) work via example 1 or 3. It would be niche tho. - Urga
Why not just capture the thread_id by value? Is there even a guarantee that the thread that started the logging still "exists" by the time the logging is processed on the background thread? I've done a lot of multithreading programming (about 25 years) and I only really needed thread_local once or twice (so it is not the first solution I look for). - Detrude
Here is an attempt to force manifestation of UB: godbolt.org/z/zvMacxWca - Ramtil
I think there's an aspect this is ignoring -- see godbolt.org/z/eTofxeYz6. There's no UB here, but you still get the same pattern of different behaviour. The behaviour is correct and expected in this case, I think; 1 & 3 are capturing the value from the main thread while 2 is independently accessing the value from the new thread. - Bourguiba
@Bourguiba yes, this behavior is correct and expected. If you look at my answer, specifically the "Your three cases explained" section, you will confirm that the output is different because the main thread's user object is used. - Burly

[&] is a very dangerous thing to do when your lambda executes outside of the immediate context. It is very useful when the lambda runs within that context -- otherwise, it is a bad plan.

In this case, you are being bitten by the fact that [&] does not capture global, local static or thread local variables. So using [] in this case would behave the same.
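
The same rule applies to every capture-default, and it is easiest to observe with [=] rather than [&]: if a static were captured by copy, a later change would not be visible through the lambda. A minimal sketch with made-up function names:

```cpp
#include <cassert>

int read_static_through_lambda()
{
    static int setting = 1;
    // Nothing is captured: 'setting' has static storage duration, so the
    // lambda body refers to the variable itself.
    auto read = [=] { return setting; };
    setting = 2;
    return read();
}

int read_local_through_lambda()
{
    int setting = 1;
    // An automatic variable is captured by copy at lambda creation.
    auto read = [=] { return setting; };
    setting = 2;
    return read();
}

// Hypothetical self-check; call it from main to try the sketch.
void check_capture_rules()
{
    assert(read_static_through_lambda() == 2); // sees the later write
    assert(read_local_through_lambda() == 1);  // sees the copied value
}
```

A thread_local behaves like the static here as far as capturing goes: the name is resolved when the body runs, on whichever thread runs it.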

thread_local Person user{"mike"};
Person& referenceToUser = user;

std::thread([&]() {std::cout << "Hello " << referenceToUser.name << std::endl;}).join();

this captures referenceToUser by reference. referenceToUser in turn refers to the thread_local variable in the main thread.

std::thread([&]() {std::cout << "Hello " << user.name << std::endl;}).join();

this is identical to

std::thread([]() {std::cout << "Hello " << user.name << std::endl;}).join();

the use of [&] here makes you believe it is capturing user by reference, but nothing is captured. So the new thread's instance of the thread_local variable main::user is being used. As that thread never passed the initialization line of that variable, you have just done UB.

std::thread([&user=user]() {std::cout << "Hello " << user.name << std::endl;}).join();

here you explicitly create a new reference variable user at lambda creation, bound to the main thread's instance.

The basic rule is never use [&] when creating a lambda to pass to a std::thread.

This is appropriate use of [&]:

foreach_chicken( [&](chicken& c){ /* process chicken */ } );

the lambda is expected to exist within the current scope, and is going to be executed locally. [&] is safe.

auto pop = [&]()->std::optional<int>{
  if (queue.empty()) return std::nullopt;
  auto x = std::move(queue.front());
  queue.pop_front();
  return x;
};
while (auto x = pop()) {
}

this is another example of a valid use of [&], as this pop operation is being refactored into a helper and maybe run more than once in the local function.

But if the lambda is not being run locally or could live beyond the current scope, [&] is a toxic option that leads to surprises and bugs in pretty much every case I've seen it used.
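
For lambdas handed to a std::thread, one alternative is to spell out each capture so it is obvious what is copied and what is shared. This is just one possible sketch, not the only way to do it; greet_on_worker is a hypothetical name and Person mirrors the question's struct.

```cpp
#include <cassert>
#include <string>
#include <thread>

struct Person
{
    std::string name;
};

std::string greet_on_worker()
{
    thread_local Person user{"mike"}; // initialized on the calling thread
    std::string message;

    // Explicit captures: 'name' is a copy taken on the calling thread, and
    // 'message' is shared by reference and outlives the joined worker.
    std::thread worker([name = user.name, &message] {
        message = "Hello " + name;
    });
    worker.join();
    return message;
}

// Hypothetical self-check; call it from main to try the sketch.
void check_greeting()
{
    assert(greet_on_worker() == "Hello mike");
}
```

Because user never appears inside the lambda body, there is no way to accidentally pick up the worker thread's uninitialized instance.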

Roband answered 14/6, 2023 at 17:19 Comment(0)
