Should thread-safe class have a memory barrier at the end of its constructor?

When implementing a class intended to be thread-safe, should I include a memory barrier at the end of its constructor, in order to ensure that any internal structures have completed being initialized before they can be accessed? Or is it the responsibility of the consumer to insert the memory barrier before making the instance available to other threads?

Simplified question:

Is there a race hazard in the code below that could give erroneous behaviour due to the lack of a memory barrier between the initialization and the access of the thread-safe class? Or should the thread-safe class itself protect against this?

ConcurrentQueue<int> queue = null;

Parallel.Invoke(
    () => queue = new ConcurrentQueue<int>(),
    () => queue?.Enqueue(5));

Note that it is acceptable for the program to enqueue nothing, as would happen if the second delegate executes before the first. (The null-conditional operator ?. protects against a NullReferenceException here.) However, it should not be acceptable for the program to throw an IndexOutOfRangeException, NullReferenceException, enqueue 5 multiple times, get stuck in an infinite loop, or do any of the other weird things caused by race hazards on internal structures.

Elaborated question:

Concretely, imagine that I were implementing a simple thread-safe wrapper for a queue. (I'm aware that .NET already provides ConcurrentQueue<T>; this is just an example.) I could write:

public class ThreadSafeQueue<T>
{
    private readonly Queue<T> _queue;

    public ThreadSafeQueue()
    {
        _queue = new Queue<T>();

        // Thread.MemoryBarrier(); // Is this line required?
    }

    public void Enqueue(T item)
    {
        lock (_queue)
        {
            _queue.Enqueue(item);
        }
    }

    public bool TryDequeue(out T item)
    {
        lock (_queue)
        {
            if (_queue.Count == 0)
            {
                item = default(T);
                return false;
            }

            item = _queue.Dequeue();
            return true;
        }
    }
}

This implementation is thread-safe, once initialized. However, if the initialization itself is raced by another consumer thread, then race hazards could arise, whereby the latter thread would access the instance before the internal Queue<T> has been initialized. As a contrived example:

ThreadSafeQueue<int> queue = null;

Parallel.For(0, 10000, i =>
{
    if (i == 0)
        queue = new ThreadSafeQueue<int>();
    else if (i % 2 == 0)
        queue?.Enqueue(i);
    else
    {
        int item = -1;
        if (queue?.TryDequeue(out item) == true)
            Console.WriteLine(item);
    }
});

It is acceptable for the code above to miss some numbers; however, without the memory barrier, it could also be getting a NullReferenceException (or some other weird result) due to the internal Queue<T> not having been initialized by the time that Enqueue or TryDequeue are called.

Is it the responsibility of the thread-safe class to include a memory barrier at the end of its constructor, or is it the consumer who should include a memory barrier between the class's instantiation and its visibility to other threads? What is the convention in the .NET Framework for classes marked as thread-safe?

Edit: This is an advanced threading topic, so I understand the confusion in some of the comments. An instance can appear half-baked if accessed from other threads without proper synchronization. This topic is discussed extensively in the context of double-checked locking, which is broken under the ECMA CLI specification without the use of memory barriers (such as through volatile). Per Jon Skeet:

The Java memory model doesn't ensure that the constructor completes before the reference to the new object is assigned to instance. The Java memory model underwent a reworking for version 1.5, but double-check locking is still broken after this without a volatile variable (as in C#).

Without any memory barriers, it's broken in the ECMA CLI specification too. It's possible that under the .NET 2.0 memory model (which is stronger than the ECMA spec) it's safe, but I'd rather not rely on those stronger semantics, especially if there's any doubt as to the safety.
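For illustration, here is a minimal sketch of the double-checked locking pattern in C#, with the volatile field that makes it correct under the ECMA CLI memory model (the class and member names are illustrative, not from any particular library):

```csharp
public sealed class LazySingleton
{
    // volatile ensures the write that publishes the instance cannot be
    // observed by another thread before the constructor's writes complete.
    private static volatile LazySingleton _instance;
    private static readonly object _lock = new object();

    private LazySingleton() { }

    public static LazySingleton Instance
    {
        get
        {
            if (_instance == null)            // first check (no lock)
            {
                lock (_lock)
                {
                    if (_instance == null)    // second check (under lock)
                        _instance = new LazySingleton();
                }
            }
            return _instance;
        }
    }
}
```

Without volatile (or explicit barriers), the first unlocked check may observe a non-null reference whose constructor writes have not yet become visible, which is exactly the hazard described above.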

Cullender answered 10/8, 2016 at 19:11 Comment(18)
The source code for ConcurrentQueue<T> that you mentioned doesn't have any protection in its constructor. Make of that what you will. referencesource.microsoft.com/#mscorlib/system/Collections/…Digamma
How about the consumer initializing using Lazy<T>, which makes the initialization thread-safe? :)Jared
Barring a constructor that actually has async calls within it, can a reference even be set to reference an instance before the instance is constructed?Olnee
@Olnee As can be observed from a single thread, no. As can be observed from multiple threads, yes.Stocking
@BradleyUffner: Good point. That makes for a clearer question.Cullender
How exactly can the caller make the instance accessible to other threads before the instance is constructed, since the caller itself does not have access to the instance yet? The only way I see of making the instance accessible before being constructed is if the constructor calls some external code passing this.Peccable
@user3185569: That would be a good solution if a race hazard exists. My question is whether the Lazy<T> is necessary, or whether the thread-safe class should provide that protection itself.Cullender
I think this may only be a problem with static constructors (or constructors that share static state). Instance constructors generally seem safe.Digamma
@Cullender my interpretation has always been that the resource is thread-safe but I don't have the resource until after it is initialized so it's the callers job to protect the initializationImpearl
@IvanStoev In a single threaded context, yes, in a multithreaded context, you can observe the orders of operations to be different than the guarantees are for a single threaded program. Your CPU is allowed to reorder different writes to entirely different values that don't depend on each other.Stocking
@Stocking I know that in general, but could you please provide a link or something that covers the constructors? All I see is It is explicitly not a requirement that a conforming implementation of the CLI guarantee that all state updates performed within a constructor be uniformly visible before the constructor completes. CIL generators can ensure this requirement themselves by inserting appropriate calls to the memory barrier or volatile write instructions. But this is speaking about before the constructor completes.Peccable
@IvanStoev The specs state what is required; they don't provide a comprehensive list of all of the things that they aren't required. The specs aren't going to say that a constructor is allowed to return the object before the constructor has finished, rather the value a constructor returns isn't specifically on the list of operations that different threads are guaranteed to observe in a logically consistent order to a single threaded program.Stocking
@Stocking so you are saying the assignment can occur after allocation, but before construction/initialization, even when all those occur on the same thread?Olnee
@IvanStoev: Given the statement _queue = new Queue<T>();, the compiler / architecture / memory model may cause the memory location of the new object to appear to be assigned to _queue before the Queue<T>() constructor completes.Cullender
@Olnee A single threaded program could not observe those actions happening out of order. Another thread observing the actions performed on another has radically fewer constraints on the observed ordering of operations. As it pertains to this example, a second thread can in fact observe a constructor call from another thread returning the instance before the constructor itself has finished running. The thread that called the constructor can't observe that unusual ordering, but any other thread can.Stocking
@Cullender I upvoted your question because I find it interesting. But if a high level language like C# can't provide such simple guarantee, then I don't see what we are doing here. I quit programming :)Peccable
@IvanStoev: You're far from alone in your frustration. The overwhelming consensus is that the majority of programmers should avoid low-level synchronization techniques, and just rely on lock or higher-level frameworks such as TPL.Cullender
@Theodor Zoulias On x86, stores can't be reordered with respect to each other, so if you see the result of the constructor then you're also seeing the value of _queue. On ARM64, all bets are off. I'm guessing the JIT compiler will insert a barrier to avoid breaking existing programs, but unfortunately I don't know how to check (I don't have an ARM64 cpu, and sharplab only supports x86)Lofty

Lazy<T> is a very good choice for thread-safe initialization. I think it should be left to the consumer to provide that:

var queue = new Lazy<ThreadSafeQueue<int>>(() => new ThreadSafeQueue<int>());

Parallel.For(0, 10000, i =>
{

    if (i % 2 == 0)
        queue.Value.Enqueue(i);
    else
    {
        int item = -1;
        if (queue.Value.TryDequeue(out item) == true)
            Console.WriteLine(item);
    }
});
Jared answered 10/8, 2016 at 19:27 Comment(5)
+1: This is what I currently do, given my uncertainty. But I would like to know whether the code should be expected to work without the thread synchronization from the consumer. In other words: If I were the one implementing the thread-safe class, should I be able to call my class "thread-safe" if consumers still needed to use Lazy<T>?Cullender
I don't understand the edit. My original code would never have executed the initialization twice either.Cullender
@Cullender Have you actually got null queue when running this code ? At least once ?Jared
@user3185569 The code can never print null, but it can drop items on the floor if they're processed before the queue has been initialized. You're assuming that the 0th iteration completes before any of the other iterations start. There is no such guarantee.Stocking
@user3185569: I might have chosen a bad example. It is acceptable for the code to drop some numbers. It is not acceptable for the code to throw NullReferenceException, IndexOutOfRangeException, print repeated numbers, get stuck in infinite loops, or do any of the other weird things caused by race hazards on internal structures.Cullender

No, you don't need a memory barrier in the constructor. Your assumption, even though it demonstrates some creative thought, is wrong. No thread can get a half-baked instance of queue. The new reference is "visible" to the other threads only when the initialization is done. Suppose thread_1 is the first thread to initialize queue: it goes through the ctor code, but queue's reference in the main stack is still null! Only when thread_1 exits the constructor code does it assign the reference.

See comments below and OP elaborated question.

Docker answered 10/8, 2016 at 19:29 Comment(18)
Unfortunately, I think you're missing the intricacies of the ECMA CLI memory model. You can get a half-baked instance of queue visible to other threads.Cullender
This answer is unfortunately basically wishful thinking. It assumes synchronisation when there is none.Bight
@Cullender I have to admit I did not think about it. Yet, you don't find any evidence of memory barriers in System.Collections.Concurrent classes ctors which are by definition thread safe. Your topic extends thread safety definition to new regions. And that's pretty cool :)Docker
System.Collections.Concurrent classes ctors which are by definition thread safe That's not true. They're not safe by definition. They're just implementations of types, like any other. They're much less likely to have mistakes than code you or I write, but it's not impossible for them to have mistakes; their code is not correct by definition, but rather there is a high degree of confidence in its correctness.Stocking
It's always good to be reminded that there is much more you don't know than what you think you don't know :)Docker
@Servy: I assume that shay__ meant that the System.Collections.Concurrent classes provide the de facto standard for what "thread-safe", as a term, should entail. Is there a formal definition / wider consensus for "thread-safe" which covers whether it should include initialization or not?Cullender
@Cullender "Thread Safe" is a pretty meaningless term in general. Probably the best definition you'll manage to get is to say that a program that uses multiple threads does what it's supposed to do. You can't really give any more meaningful definition to the term, and as such, the term tends to be largely useless. If you have an object you plan to access from multiple threads you need to specifically define the operations you plan to perform, and the acceptable and unacceptable behaviors of those operations.Stocking
@Servy: I agree that it's loosely defined. But it appears regularly on MSDN: "All public and protected members of ConcurrentQueue<T> are thread-safe and may be used concurrently from multiple threads." Given, as others have pointed out, that ConcurrentQueue<T> doesn't include a memory barrier at the end of its constructor, then either the documentation or the implementation is incorrect.Cullender
@Cullender The fact that the documentation uses an ill-defined term means that the statement itself is simply meaningless. The assertion that, "All public and protected members of ConcurrentQueue<T> are thread-safe" doesn't actually tell you anything about the type. It's not incorrect, it's just not actually making any real assertion to begin with, right or wrong. It's like if it would have asserted that the type is super cool. That statement isn't wrong; it's not a statement that can either be considered objective correct or incorrect.Stocking
@Servy: It's the "may be used concurrently from multiple threads" part that's incorrect. The constructor (a public member) may not be used concurrently with other public members.Cullender
@Cullender Again, you need to be much more specific for the statements to have meaning. You would need to define specific usages of the type, along with the expected behavior, before you could say whether or not it satisfies those constraints. You're also assuming that just because you have an object instance of the constructor even though the constructor hasn't finished, that that is a problem.Stocking
As long as the queue functions appropriately even if you have an instance whose constructor hasn't finished, then that's not a problem. If you can't observe any operation being performed on the queue happening before the construction has finished, which is likely the case given how those other operations are implemented (namely the use of volatile fields), there is no issue.Stocking
@Servy: Any further specificity would defeat the purpose of the notion of thread-safety. I understand what you're saying re usages and expected behaviour, but corruption of internal state should never be an acceptable outcome for thread-safe collections. If the public members of ConcurrentQueue<T> indeed function correctly even before the constructor completes, then that's fine. But it also imposes the expectation that my ThreadSafeQueue<T> should also use volatile or memory barriers in its constructor to conform with the .NET convention for thread-safety.Cullender
@Cullender Any further specificity would defeat the purpose of the notion of thread-safety. That's quite true. I thought I was pretty clear in saying it's a pretty useless term, on its own, and doesn't actually tell you anything useful. It's not about .NET conventions for thread safety, it's about what you want to do with that type and whether or not it satisfies that. Don't try to meet a completely undefined definition of "thread safe". Write a class that meets the contract you need it to meet (right now it looks like it fails to do so).Stocking
@Servy: I wouldn't agree that the term "thread-safe" is useless if it at least guarantees that concurrent use won't corrupt the internal state or give rise to patently incorrect behaviour (such as a single Enqueue call adding the same item twice). Yes, as you're saying, in most cases, one needs more specificity than that (e.g. whether enumerations are moment-in-time or may include subsequent updates). But it would be quite a shortcoming on the .NET team's part if the term is as undefined as you're implying.Cullender
@Cullender You've just defined, "thread safe" as "behaves correctly". (That is more or less what I defined it as above.) That's just not a particularly useful thing to say, if you have no idea what the correct behavior is, or should be, or needs to be. It's the moral equivalent of begging the question; using that term has added nothing useful to the conversation that you didn't have without it.Stocking
The specs define the minimum memory model that should be supported, but it does not mean that a stronger memory model cannot be implemented. Maybe those System.Collections.Concurrent classes just assume a memory model that is stronger than the specified one because they only have known target CLR implementations. Memory models are also CPU bound. For example, desktop CPUs are unlikely to have that issue... but, maybe that is not the case anymore since I'm quite dated at CPU architectures.Gasaway
Also note that a memory barrier is not needed past the start or end of a lock code block. No reads or writes are allowed to be reordered past lock block boundaries... that means that if you find lock (obj) { } you can assume no writes or reads will move from before/after the whole block to the other side, or from inside/outside to the other side.Gasaway

I'll attempt to answer this interesting and well-presented question, based on the comments by Servy and Douglas, and on information coming from other related questions. What follows are just my assumptions, not solid information from a reputable source.

  1. Thread-safe classes have properties and methods that can be safely invoked by multiple threads concurrently, but their constructors are not thread-safe. This means that it is entirely possible for a thread to "see" an instance of a thread-safe class having an invalid state, provided that the instance is constructed concurrently by another thread.

  2. Adding the line Thread.MemoryBarrier(); at the end of the constructor is not enough to make the constructor thread-safe, because this statement only affects the thread that runs the constructor¹. The other threads that may concurrently access the under-construction instance are not affected. Memory visibility is cooperative: one thread cannot change what another thread "sees" by altering the other thread's execution flow (or by invalidating the local cache of the CPU core that the other thread is running on) in a non-cooperative manner.

  3. The correct and robust way to ensure that all threads see the instance in a valid state is to include proper memory barriers in all threads. This can be achieved either by declaring the instance volatile, in case it is a field of a class, or by using the methods of the static Volatile class:

ThreadSafeQueue<int> queue = null;

Parallel.For(0, 10000, i =>
{
    if (i == 0)
        Volatile.Write(ref queue, new ThreadSafeQueue<int>());
    else if (i % 2 == 0)
        Volatile.Read(ref queue)?.Enqueue(i);
    else
    {
        int item = -1;
        if (Volatile.Read(ref queue)?.TryDequeue(out item) == true)
            Console.WriteLine(item);
    }
});

In this particular example it would be simpler and more efficient to instantiate the queue variable before invoking the Parallel.For method. Doing so would render the explicit Volatile invocations unnecessary. The Parallel.For method uses Tasks internally, and the TPL includes the appropriate memory barriers at the beginning/end of each task. Memory barriers are generated implicitly and automatically by the .NET infrastructure, by any built-in mechanism that starts a thread or causes a delegate to execute on another thread. (citation)
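A sketch of that simpler alternative (ConcurrentQueue<int> is used so the snippet stands alone, and a counter replaces the Console output so the outcome can be checked):

```csharp
// The queue is fully constructed before Parallel.For starts; the
// barriers that the TPL emits when starting each task publish it
// safely to every worker thread, so no Volatile calls are needed.
var queue = new ConcurrentQueue<int>();
int dequeued = 0;

Parallel.For(0, 10000, i =>
{
    if (i % 2 == 0)
        queue.Enqueue(i);
    else
    {
        int item;
        if (queue.TryDequeue(out item))
            Interlocked.Increment(ref dequeued);
    }
});
```

Because the reference is written before any worker thread exists, no worker can ever observe a half-constructed queue.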

I'll repeat that I'm not 100% confident about the correctness of the information presented above.

¹ Quoting from the documentation of the Thread.MemoryBarrier method: Synchronizes memory access as follows: The processor executing the current thread cannot reorder instructions in such a way that memory accesses prior to the call to MemoryBarrier() execute after memory accesses that follow the call to MemoryBarrier().

Cleary answered 1/1, 2021 at 16:1 Comment(8)
"Adding the line Thread.MemoryBarrier(); at the end of the constructor is not enough to make the constructor thread-safe." <- That isn't necessarily true for thread-safe classes like the one I presented. Yes, memory-visibility is cooperative, but all the other methods of the class already ensure this through their lock statements, which generate an implicit memory barrier.Cullender
@Cullender your ThreadSafeQueue<T> class creates a Queue<T> in its constructor. The constructor of the Queue<T> class doesn't do much. It just assigns the static field T[] _emptyArray as the value of the private field T[] _array. But how confident you are that another thread that is handed a reference to a newly constructed ThreadSafeQueue<T>, will not see the _array having its original null value, causing a NullReferenceException?Cleary
The only way of accessing _array is through the Enqueue or TryDequeue methods, both of which take a lock (and hence an implicit memory barrier) before the _array instance is accessed. That said, I'm not sure whether the implicit barrier protects against the x in lock(x) still being null, so you have a point there if that's what you meant.Cullender
@Cullender good point about the memory barrier inserted implicitly by the lock. I hadn't thought about it. It may be the ingredient that allows safe access to a non-volatile variable/field that stores instances of your class. Or may not. Hopefully some expert that has studied the CLR/C# specifications and knows about this stuff, will be able to enlighten us!Cleary
I've been studying this topic for a while, and the conclusion I've arrived at is that the memory model in the ECMA CLI spec is broken. However, the CLR and all other mainstream implementations of the runtime offer a stronger memory model than what is required by the spec, thus avoiding these pitfalls.Cullender
@Cullender not happy to hear about this conclusion! Personally I've adopted the following advice by Igor Ostrovsky: "All code you write should rely only on the guarantees made by the ECMA C# specification, and not on any of the implementation details explained in this article." From this article. But if the spec is broken, I may have to reconsider.Cleary
I haven't read Igor, but many world experts concur that the CLI model is too weak, vague, or outright broken. See Jon Skeet and Joe Duffy for two examples.Cullender
One more confirmation that the official CLR implementation protects against this kind of stuff, even on ARM64: github.com/dotnet/runtime/issues/46911#issuecomment-760004625Lofty

Should thread-safe class have a memory barrier at the end of its constructor?

I do not see a reason for this. The queue is a local variable that is assigned from one thread and accessed from another. Such concurrent access should be synchronized, and it is the responsibility of the accessing code to do so. It has nothing to do with the constructor or the type of the variable: such access should always be explicitly synchronized, or you are entering a dangerous area even for primitive types (even if the assignment is atomic, you may get caught in some cache trap). If the access to the variable is properly synchronized, it does not need any support in the constructor.
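A sketch of what such consumer-side synchronization could look like, mirroring the simplified example from the question: every read and write of the shared reference goes through the same lock, which supplies the needed barriers (the gate object is illustrative):

```csharp
object gate = new object();
ConcurrentQueue<int> queue = null;

Parallel.Invoke(
    () => { lock (gate) { queue = new ConcurrentQueue<int>(); } },
    () =>
    {
        ConcurrentQueue<int> local;
        lock (gate) { local = queue; }  // synchronized read of the shared reference
        local?.Enqueue(5);              // may be null if this delegate ran first
    });
```

Because both the publishing write and the consuming read happen inside lock (gate), the second delegate either sees null or a fully constructed queue, never a half-initialized one.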

Ithaman answered 1/1, 2021 at 17:2 Comment(1)
Without saying that this answer is incorrect, I feel that it ventures outside the context of the asked question. This question is about classes, not primitive types (which are usually structs). In this context the atomicity of assignment should be considered a fact, not something iffy.Cleary
