How to show that the double-checked-lock pattern with Dictionary's TryGetValue is not threadsafe
Asked Answered
L

5

14

Recently I've seen some C# projects that use a double-checked-lock pattern on a Dictionary. Something like this:

private static readonly object _lock = new object();
private static volatile IDictionary<string, object> _cache = 
    new Dictionary<string, object>();

public static object Create(string key)
{
    object val;
    if (!_cache.TryGetValue(key, out val))
    {
        lock (_lock)
        {
            if (!_cache.TryGetValue(key, out val))
            {
                val = new object(); // factory construction based on key here.
                _cache.Add(key, val);
            }
        }
    }
    return val;
}

This code is incorrect, since the Dictionary can be "growing" the collection in _cache.Add() while _cache.TryGetValue (outside the lock) is iterating over the collection. It might be extremely unlikely in many situations, but is still wrong.

Is there a simple program to demonstrate that this code fails?

Does it make sense to incorporate this into a unit test? And if so, how?

Labanna answered 12/4, 2010 at 18:15 Comment(0)
L
14

In this example, exception #1 is thrown almost instantly on my machine:

var dict = new Dictionary<int, string>() { { 1234, "OK" } };

new Thread(() =>
{
    for (; ; )
    {
        string s;
        if (!dict.TryGetValue(1234, out s))
        {
            throw new Exception();  // #1
        }
        else if (s != "OK")
        {
            throw new Exception();  // #2
        }
    }
}).Start();

Thread.Sleep(1000);
Random r = new Random();
for (; ; )
{
    int k;
    do { k = r.Next(); } while (k == 1234);
    Debug.Assert(k != 1234);
    dict[k] = "FAIL";
}

However, the exact behaviour of code that is not designed to be thread-safe is unpredictable.
You cannot rely on it. So the double-checking code is indeed broken.

I'm not sure if I'd unit test this, though, as testing concurrent code (and getting it right) is much more complicated than writing the concurrent code in the first place.

Lamellate answered 12/4, 2010 at 18:33 Comment(5)
@Labanna Yes, Dual Core and Dual Quad Core machines... which one is it going to fail on?Ridden
Fails almost instantly on my (quad core) machine as well. Good test, +1.Umber
Agreed... great test. Sucks that C# and most other languages don't lend themselves to testing concurrent code easily. (Not that it could ever be that easy)Labanna
@Catalin: The double-checked lock is only there to prevent multiple writes. dtb's code only has one writer, so it doesn't make a difference.Umber
@Lamellate I tried your updated version and it did fail for me too :)... you were right +1!Ridden
U
20

Clearly the code is not threadsafe. What we have here is a clear case of the hazards of premature optimization.

Remember, the purpose of the double-checked locking pattern is to improve the performance of code by eliminating the cost of the lock. If the lock is uncontested it is incredibly cheap already. Therefore, the double-checked locking pattern is justified only in the cases (1) where the lock is going to be heavily contested, or (2) where the code is so incredibly performance-sensitive that the cost of an unconstested lock is still too high.

Clearly we are not in the second case. You're using a dictionary for heaven's sake. Even without the lock it will be doing lookups and comparisons that will be hundreds or thousands of times more expensive than the savings of avoiding an uncontested lock.

If we are in the first case then figure out what is causing the contention and eliminate that. If you're doing a lot of waiting around on a lock then figure out why that is and replace the locking with a slim reader-writer-lock or restructure the application so that not so many threads are banging on the same lock at the same time.

In either case there is no justification for doing dangerous, implementation-sensitive low-lock techniques. You should only be using low-lock techniques in those incredibly rare cases where you really, truly cannot take the cost of an uncontested lock.

Unbalanced answered 12/4, 2010 at 20:37 Comment(2)
In general I agree with everything you said, and I don't think people should prematurely optimize. But I'm not sure about your assessment of the performance of Dictionary. If it's small and being called often, it is in the processor's (or core's) cache, and there is no cross processor communication... VERY fast. But lock has to have at least one LOCK prefix (assembler), so it might have to slow down to the bus speed... fast, but not as fast. On my machine an empty lock(){} is twice as slow as TryGetValue<int,int>(). Try it. One solution for this problem is copy-on-write (for small dicts).Labanna
@Amir: You make an excellent point. However, note that getting the hash code of a 32 bit int -- a virtual method on a sealed type which is known to the jitter to be an identity function -- can be optimized away; getting the hash code is essentially O(0) and int comparisons are a small number of instructions. The question asks about a dictionary containing strings, which have an O(n) hash algorithm and an O(n) comparison operator.Unbalanced
L
14

In this example, exception #1 is thrown almost instantly on my machine:

var dict = new Dictionary<int, string>() { { 1234, "OK" } };

new Thread(() =>
{
    for (; ; )
    {
        string s;
        if (!dict.TryGetValue(1234, out s))
        {
            throw new Exception();  // #1
        }
        else if (s != "OK")
        {
            throw new Exception();  // #2
        }
    }
}).Start();

Thread.Sleep(1000);
Random r = new Random();
for (; ; )
{
    int k;
    do { k = r.Next(); } while (k == 1234);
    Debug.Assert(k != 1234);
    dict[k] = "FAIL";
}

However, the exact behaviour of code that is not designed to be thread-safe is unpredictable.
You cannot rely on it. So the double-checking code is indeed broken.

I'm not sure if I'd unit test this, though, as testing concurrent code (and getting it right) is much more complicated than writing the concurrent code in the first place.

Lamellate answered 12/4, 2010 at 18:33 Comment(5)
@Labanna Yes, Dual Core and Dual Quad Core machines... which one is it going to fail on?Ridden
Fails almost instantly on my (quad core) machine as well. Good test, +1.Umber
Agreed... great test. Sucks that C# and most other languages don't lend themselves to testing concurrent code easily. (Not that it could ever be that easy)Labanna
@Catalin: The double-checked lock is only there to prevent multiple writes. dtb's code only has one writer, so it doesn't make a difference.Umber
@Lamellate I tried your updated version and it did fail for me too :)... you were right +1!Ridden
U
8

I don't really think that you need to prove this, you just need to refer people to the documentation for Dictionary<TKey, TValue>:

A Dictionary can support multiple readers concurrently, as long as the collection is not modified. Even so, enumerating through a collection is intrinsically not a thread-safe procedure. In the rare case where an enumeration contends with write accesses, the collection must be locked during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.

It's actually a well-known fact (or should be) that you cannot read from a dictionary while another thread is writing to it. I've seen a few "bizarre multi-threading issue" kinds of questions here on SO where it turned out that the author didn't realize that this wasn't safe.

The problem isn't specifically related to double-checked locking, it's just that the dictionary is not a thread-safe class, not even for a single-writer/single-reader scenario.


I'll go one step further and show you why, in Reflector, this isn't thread-safe:

private int FindEntry(TKey key)
{
    // Snip a bunch of code
    for (int i = this.buckets[num % this.buckets.Length]; i >= 0;
        i = this.entries[i].next)
    // Snip a bunch more code
}

private void Resize()
{
    int prime = HashHelpers.GetPrime(this.count * 2);
    int[] numArray = new int[prime];
    // Snip a whole lot of code
    this.buckets = numArray;
}

Look at what can happen if the Resize method happens to be running while even one reader calls FindEntry:

  1. Thread A: Adds an element, resulting in a dynamic resize;
  2. Thread B: Calculates the bucket offset as (hash code % bucket count);
  3. Thread A: Changes the buckets to have a different (prime) size;
  4. Thread B: Chooses an element index from the new bucket array at the old bucket index;
  5. Thread B's pointer is no longer valid.

And this is exactly what fails in dtb's example. Thread A searches for a key that is known in advance to be in the dictionary, and yet it isn't found. Why? Because the FindValue method picked what it thought was the correct bucket, but before it even had a chance to look inside, Thread B changed the buckets, and now Thread A is looking in some totally random bucket that does not contain or even lead to the right entry.

Moral of the story: TryGetValue is not an atomic operation, and Dictionary<TKey, TValue> is not a thread-safe class. It's not just concurrent writes you need to worry about; you can't have concurrent read-writes either.

In reality the problem actually runs a lot deeper than this, due to instruction reordering by the jitter and CPU, stale caches, etc. - there are no memory barriers whatsoever being used here - but this should prove beyond a doubt that there's an obvious race condition if you have an Add invocation running at the same time as a TryGetValue invocation.

Umber answered 12/4, 2010 at 18:51 Comment(2)
I agree, yet developers do misuse collections, and sometimes are stubborn about you proving it. They also think that the double lock is the implementation that makes it thread-safe. Just look at the comments and answers.Labanna
@Amir: I definitely see your point. It seems as though people believe that TryGetValue is somehow atomic, even though it's nothing of the sort. The Resize method changes all the buckets around, which the FindEntry method needs at the beginning of its execution.Umber
P
3

The reason I guess this question comes up again and again:

Pre-2.0, Before Generics (B.G.), Hashtable was the primary associative container in .NET, which indeed provides some threading guarantees. From MSDN:
"Hashtable is thread safe for use by multiple reader threads and a single writing thread. It is thread safe for multi-thread use when only one of the threads perform write (update) operations, which allows for lock-free reads provided that the writers are serialized to the Hashtable."

Before anyone gets extremely excited, there are some limitations.
See e.g. this post from Brad Abrams, who owns Hashtable.
Some more historical background on Hashtable can be found here (...near the end: "After this lengthy diversion - What about Hashtable?").

Why Dictionary<TKey, TValue> fails in the above case:

To prove that it fails, it is enough to find one example, so I'll try just that.
A resize happens as the table grows.
On resize, a rehash happens and one sees this as the last two lines:

this.buckets = newBuckets;
//One of the problems here.
this.entries = newEntries;

The buckets array holds indexes into the entries array. Let's say we have 10 entries so far and right now we are adding a new.
Let's further pretend for the sake of simplicity that we did not and will not get collisions.
In the old buckets, we had indexes running from 0 to 9 - if we had no collisions.
Now the indexes in the new buckets array run from 0 to 10(!).
We now change the private buckets field to point to the new buckets.
If there is a reader doing TryGetValue() at this moment, it uses the new buckets to get the index, but then uses the new index to read into the old entries array, since the entries field still points to the old entries.
One of the things one can get - besides false reads - is a friendly IndexOutOfRangeException.
Another "great" way to get this is in @Aaronaught's explanation. (...and both can happen, e.g. as in dtb's example).

This is really just one example, Dictonary was not designed and never meant to be thread-safe. It was designed to be fast, however - that means that the lock will not be held for long.

Pettifog answered 12/4, 2010 at 18:15 Comment(0)
L
1

Including the code in the question, you can test it with the following code.

//using System.Collections.Generic;
//using System.Threading;

private static volatile int numRunning = 2;
private static volatile int spinLock = 0;

static void Main(string[] args)
{
    new Thread(TryWrite).Start();
    new Thread(TryWrite).Start();
}

static void TryWrite()
{
    while(true) 
    {
        for (int i = 0; i < 1000000; i++ )
        {
            Create(i.ToString());
        }

        Interlocked.Decrement(ref numRunning);
        while (numRunning > 0) { } // make sure every thread has passed the previous line before proceeding (call this barrier 1)

        while (Interlocked.CompareExchange(ref spinLock, 1, 0) != 0){Thread.Sleep(0);} // Aquire lock (spin lock)
        // only one thread can be here at a time...

        if (numRunning == 0) // only the first thread to get here executes this...
        {
            numRunning = 2; // resets barrier 1
            // since the other thread is beyond the barrier, but is waiting on the spin lock,
            //  nobody is accessing the cache, so we can clear it...
            _cache = new Dictionary<string, object>(); // clear the cache... 
        }

        spinLock = 0; // release lock...
    }

}

This program just tries to get Create to traverse the collection as it is being "grown". It should be run on a machine with at least two cores (or two processors), and will most likely fail after a while with this exception.

System.Collections.Generic.Dictionary`2.FindEntry(TKey key)

Adding this test is difficult since it is a probabilistic test, and you don't know how long it will take to fail (if ever). I guess you could pick a value like 10 seconds and let it run for that long. If it doesn't fail in that amount of time, then the test passes. Not the best, but something. You should also verify that Environment.ProcessorCount > 1 before running the test, otherwise the likelihood of it failing is minuscule.

Labanna answered 12/4, 2010 at 18:42 Comment(6)
@Amir, it wouldn't fail... there is no reason for it to fail.Ridden
@Lirik: Why do you believe that concurrent accesses to an unsynchronized data structure does not fail?Lamellate
@Lamellate It depends on what you mean by fail? It can miss a value, but it will catch it in the synchronized block which follows immediately after. Will it throw an exception on a write?Ridden
@Lirik: The behaviour of methods modifying a data structure without a lock is unpredictable. You cannot rely on unspecified behaviour just because you happen to observe it on your machine. A mathematical proof that some implementation produces the expected result in all possible interleavings is very difficult and such an implementation is very hard to get right.Lamellate
@Lamellate This is an optimistic example... there is no guarantee that the user will find a value at any given time, but no such guarantee is necessary with the example the OP provided.Ridden
@Lirik: I think it's quite important if the method when asked for key X returns the value associated with key X or key Y (or null). The method should return the value associated with X under any circumstance, which is not the case if you rely on unspecified, unpredictable behaviour.Lamellate

© 2022 - 2024 — McMap. All rights reserved.