System.Lazy<T> with different thread-safety mode [duplicate]

.NET 4.0's System.Lazy<T> class offers three thread-safety modes via the LazyThreadSafetyMode enum, which I'll summarise as:

  • LazyThreadSafetyMode.None - Not thread safe.
  • LazyThreadSafetyMode.ExecutionAndPublication - Only one concurrent thread will attempt to create the underlying value. On successful creation, all waiting threads will receive the same value. If an unhandled exception occurs during creation, it will be re-thrown on each waiting thread, cached, and re-thrown on each subsequent attempt to access the underlying value.
  • LazyThreadSafetyMode.PublicationOnly - Multiple concurrent threads will attempt to create the underlying value, but the first to succeed will determine the value passed to all threads. If an unhandled exception occurs during creation, it will not be cached, and concurrent & subsequent attempts to access the underlying value will re-try the creation & may succeed.
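
For reference, the mode is chosen when the Lazy<T> is constructed; a minimal sketch (FetchValue stands in for whatever hypothetical factory method you'd use):

var notThreadSafe = new Lazy<string>(FetchValue, LazyThreadSafetyMode.None);
var execAndPub = new Lazy<string>(FetchValue, LazyThreadSafetyMode.ExecutionAndPublication);
var pubOnly = new Lazy<string>(FetchValue, LazyThreadSafetyMode.PublicationOnly);

// The bool constructor overload maps true to ExecutionAndPublication and false to None.
var viaBool = new Lazy<string>(FetchValue, true);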

I'd like to have a lazy-initialized value which follows slightly different thread-safety rules, namely:

Only one concurrent thread will attempt to create the underlying value. On successful creation, all waiting threads will receive the same value. If an unhandled exception occurs during creation, it will be re-thrown on each waiting thread, but it will not be cached and subsequent attempts to access the underlying value will re-try the creation & may succeed.

So the key difference from LazyThreadSafetyMode.ExecutionAndPublication is that if a "first go" at creation fails, creation can be re-attempted later.

Is there an existing (.NET 4.0) class that offers these semantics, or will I have to roll my own? If I roll my own, is there a smart way to re-use the existing Lazy<T> within the implementation to avoid explicit locking/synchronization?


N.B. For a use case, imagine that "creation" is potentially expensive and prone to intermittent error, involving e.g. getting a large chunk of data from a remote server. I wouldn't want to make multiple concurrent attempts to get the data since they'll likely all fail or all succeed. However, if they fail, I'd like to be able to retry later on.

Medwin answered 30/12/2015 at 12:24. Comments (8):
"Retry later on" is horribly vague. You could use a Timer that recreates the Lazy instance, perhaps. – Buttonwood
Hi @HansPassant. I don't mean that the Lazy<T> itself should retry later on. I mean that if the user calls myLazy.Value multiple times, then if it fails the first time, on the second call it will try again to instantiate the underlying value rather than simply rethrowing the previous exception. – Medwin
When is "later on"? Is it "any time after the main thread has thrown its exception and all waiting threads have been unblocked by observing it"? Is it managed by whatever caller(s) is/are observing your hypothetical LazyWithSprinkles<T>? It sounds like there's a problem that's slightly bigger than what you've posted, which suggests a much different solution than something that looks similar to Lazy<T>. – Lacombe
For example, perhaps you could have a Lazy<Task<T>> that, upon the first request, spins up a Task<T> that makes the request repeatedly until it finally succeeds (or throws an error that it doesn't know how to recover from); callers could then wait on that Task<T> for as long as they can justify. It's not exactly the same as what you're looking for, though, because sometimes you'll time out 100ms before the value would have been ready... but is that actually different from having to retry later if an exception is thrown in your Lazy<T>-like idea? – Lacombe
(Another way it's different is that if all callers get dejected after seeing the first exception and never want to try again, this will waste resources on a background thread retrying perhaps indefinitely... again, not knowing a bigger-picture vision of the use case, it's hard for me to say whether or not that's OK or even something to worry about.) – Lacombe
Good question. Basically, if the thread which attempts creation has thrown an exception and passed that to its caller, then the next subsequent call to Value would initiate a new attempt at creation. I'm not really concerned with whether all waiting threads have fully returned/thrown before a new attempt at creation is made, only that we don't make a new attempt at creation while an existing one has not yet succeeded or failed. – Medwin
Regarding your idea for a background task to keep trying to initialize the value: I can see why it might fit some use cases, but I'd rather stick to the idea of the initialization being done synchronously (at least, synchronously with one of the threads accessing the LazyWithKnobsOn<T>). – Medwin
I am voting to close this question as a duplicate of Lazy<T> without exception caching, because the other question is slightly older and has a more descriptive title. – Coadunate
Answer (score 1)

My attempt at a version of Darin's updated answer that doesn't have the race condition I pointed out... warning: I'm still not completely sure this is free of race conditions.

private static int waiters = 0;
private static volatile Lazy<object> lazy = new Lazy<object>(GetValueFromSomewhere);
public static object Value
{
    get
    {
        Lazy<object> currLazy = lazy;
        if (currLazy.IsValueCreated)
            return currLazy.Value;

        Interlocked.Increment(ref waiters);

        try
        {
            return lazy.Value;

            // just leave "waiters" at whatever it is... no harm in it.
        }
        catch
        {
            if (Interlocked.Decrement(ref waiters) == 0)
                lazy = new Lazy<object>(GetValueFromSomewhere);
            throw;
        }
    }
}
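
For illustration, here's a small harness exercising the retry behavior. GetValueFromSomewhere is left undefined in the snippet above, so this hypothetical version fails on its first call and succeeds afterwards (assumes using System and using System.Threading):

private static int attempts = 0;

private static object GetValueFromSomewhere()
{
    // Hypothetical factory: throw on the first call, succeed on later calls.
    if (Interlocked.Increment(ref attempts) == 1)
        throw new InvalidOperationException("transient failure");
    return new object();
}

public static void Main()
{
    try { object ignored = Value; }
    catch (InvalidOperationException) { /* expected: the first attempt fails */ }

    // The failed Lazy<object> has been replaced, so this access retries and succeeds.
    Console.WriteLine(Value != null); // prints True
}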

Update: I thought I found a race condition after posting this. The behavior should actually be acceptable, as long as you're OK with a presumably rare case where some thread throws an exception it observed from a slow Lazy<T> after another thread has already returned from a successful fast Lazy<T> (future requests will all succeed).

  • waiters = 0
  • t1: comes in and runs up to just before the Interlocked.Decrement (waiters = 1)
  • t2: comes in and runs up to just before the Interlocked.Increment (waiters = 1)
  • t1: does its Interlocked.Decrement and prepares to overwrite (waiters = 0)
  • t2: runs up to just before the Interlocked.Decrement (waiters = 1)
  • t1: overwrites lazy with a new one (call it lazy1) (waiters = 1)
  • t3: comes in and blocks on lazy1 (waiters = 2)
  • t2: does its Interlocked.Decrement (waiters = 1)
  • t3: gets and returns the value from lazy1 (waiters is now irrelevant)
  • t2: rethrows its exception

I can't come up with a sequence of events that will cause something worse than "this thread threw an exception after another thread yielded a successful result".

Update2: declared lazy as volatile to ensure that the guarded overwrite is seen by all readers immediately. Some people (myself included) see volatile and immediately think "well, that's probably being used incorrectly", and they're usually right. Here's why I used it here: in the sequence of events from the example above, t3 could still read the old lazy instead of lazy1 if it was positioned just before the read of lazy.Value at the moment that t1 modified lazy to contain lazy1. volatile protects against that so that the next attempt can start immediately.

I've also reminded myself why I had this thing in the back of my head saying "low-lock concurrent programming is hard, just use a C# lock statement!!!" the entire time I was writing the original answer.

Update3: just changed some text in Update2 pointing out the actual circumstance that makes volatile necessary -- the Interlocked operations used here are apparently implemented full-fence on the important CPU architectures of today and not half-fence as I had originally just sort-of assumed, so volatile protects a much narrower section than I had originally thought.

Lacombe answered 30/12/2015 at 15:11. Comments (3):
Seems like an elegant solution, and I don't mind the race condition you describe. However, if we really wanted to exclude it, would it be sufficient to add a static object syncRoot = new object(); and change Interlocked.Increment(ref waiters); to lock (syncRoot) { ++waiters; } and also change the if-statement to lock (syncRoot) { if (--waiters == 0) lazy = new Lazy<object>(GetValueFromSomewhere); }? – Medwin
@MattBecker82: I don't think that helps. At best (not sure about this one), it lets us remove volatile from the field declaration because Monitor.Enter does a full memory barrier. I can't work out how to prevent this condition in a reasonable way -- I think it would force us to wait for all currently executing catch handlers, if any, to finish (yet another variable, because waiters could all be waiting for a successful run) immediately after anything reads !currLazy.IsValueCreated, and then read lazy again. It's doable, but increases complexity for questionable benefit. – Lacombe
Ok, I'll mark this as correct as I feel it's the most elegant of the bunch and the caveat about the known race condition is clear enough. Props also to Darin's and Damien's answers. Thanks all. – Medwin
Answer (score 4)

Only one concurrent thread will attempt to create the underlying value. On successful creation, all waiting threads will receive the same value. If an unhandled exception occurs during creation, it will be re-thrown on each waiting thread, but it will not be cached and subsequent attempts to access the underlying value will re-try the creation & may succeed.

Since Lazy doesn't support that, you could try to roll your own:

private static object syncRoot = new object();
private static object value = null;
public static object Value
{
    get
    {
        if (value == null)
        {
            lock (syncRoot)
            {
                if (value == null)
                {
                    // Only one concurrent thread will attempt to create the underlying value.
                    // And if `GetTheValueFromSomewhere` throws an exception, then the value field
                    // will not be assigned to anything and later access
                    // to the Value property will retry. As far as the exception
                    // is concerned it will obviously be propagated
                    // to the consumer of the Value getter
                    value = GetTheValueFromSomewhere();
                }
            }
        }
        return value;
    }
}

UPDATE:

In order to meet your requirement that the same exception be propagated to all waiting reader threads:

private static Lazy<object> lazy = new Lazy<object>(GetTheValueFromSomewhere);
public static object Value
{
    get
    {
        try
        {
            return lazy.Value;
        }
        catch
        {
            // We recreate the lazy field so that subsequent readers
            // don't just get a cached exception but rather attempt
            // to call the GetTheValueFromSomewhere() expensive method
            // in order to calculate the value again
            lazy = new Lazy<object>(GetTheValueFromSomewhere);

            // Re-throw the exception so that all blocked reader threads
            // will get this exact same exception thrown.
            throw;
        }
    }
}
Allopath answered 30/12/2015 at 13:49. Comments (12):
Misses the "will be re-thrown on each waiting thread" bit, but I'm guessing that's not a real requirement anyway. – Lacombe
That's true. Actually, if 2 threads are trying to read the value at the same time while it is being assigned to (assuming that GetTheValueFromSomewhere will throw an exception), both threads will get 2 different exceptions, because both of them will hit the GetTheValueFromSomewhere method eventually. So basically the second reading thread in this example will retry creating the value. – Allopath
Joe - it is a real requirement. If there are multiple concurrent threads calling Value, and the first one to get the lock calls GetTheValueFromSomewhere() which throws an exception, I don't want all the other waiting threads to go and call GetTheValueFromSomewhere() too, as it's potentially expensive. – Medwin
@MattBecker82, that's a very good point and a perfectly valid requirement. I have updated my answer to illustrate a possible solution: using Lazy and recreating the static field instance in case of an exception, to avoid caching that exception for all future readers. – Allopath
Is there a reason for syncRoot in the update? It seems like it's either unnecessary, not enough to solve whatever bad thing it's supposed to be protecting us from (unless for some reason writes to reference-type fields are not atomic, but reads are), or both. – Lacombe
Not thread-safe. Two threads might enter the catch at the same time; then the lazy will be recreated twice. – Curtin
Yes, but they won't be 2 different exceptions, but rather the exact same instance (because of the Lazy caching), which is what the OP wants. And it is this same instance that will be re-thrown to all readers blocked at that moment. – Allopath
syncRoot, as it's currently being used, does nothing to prevent any of the bad things mentioned in the previous two comments. All it does is make sure all the waiting threads have to form a single-file line to overwrite the lazy variable with their own brand new instances, one after the other. There's actually a race condition where one of the waiting threads has overwritten lazy, another caller comes in and blocks on that one, and then a later waiting thread overwrites lazy with its own new instance. You need at least another static variable to overcome this, I think. – Lacombe
I agree with you. The lock is not needed in this case. Rewriting the lazy field multiple times should not be an issue. The important part is that all reading threads that entered the catch clause at the same time will get the same original exception. – Allopath
@DarinDimitrov: I've added a version of this as another answer that doesn't have the race condition I pointed out above. – Lacombe
No, I mean multiple Lazy instances can be published and running concurrently in case an exception is thrown. I don't know of a simple fix for this. – Curtin
The use of lazy = new Lazy<object>(GetTheValueFromSomewhere); here is dangerous, since it introduces some fun race conditions. Threads 1, 2, 3 and 4 may run at roughly the same time. Threads 1 and 2 get into the catch block. Thread 1 overwrites the Lazy. Thread 3 accesses the Lazy successfully (and for some reason it doesn't throw an exception). Thread 2 overwrites the Lazy. Thread 4 accesses the Lazy (and for some reason it throws an exception). The dev sits scratching their head as to how 3 was successful and yet 4 wasn't. Consider using Interlocked.CompareExchange to mitigate this. – Wireman
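
A minimal sketch of the Interlocked.CompareExchange mitigation suggested in the last comment (an untested illustration, not from the original answer): each reader captures the Lazy instance it observed, and on failure only swaps out that exact instance, so a newer instance installed by another thread is never clobbered:

private static Lazy<object> lazy = new Lazy<object>(GetTheValueFromSomewhere);
public static object Value
{
    get
    {
        // Capture the instance we are about to observe.
        Lazy<object> currLazy = lazy;
        try
        {
            return currLazy.Value;
        }
        catch
        {
            // Replace the failed instance only if the field still holds it;
            // if another thread already swapped in a fresh Lazy, leave it alone.
            Interlocked.CompareExchange(ref lazy,
                new Lazy<object>(GetTheValueFromSomewhere), currLazy);
            throw;
        }
    }
}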
Answer (score 3)

Something like this might help:

using System;
using System.Threading;

namespace ADifferentLazy
{
    /// <summary>
    /// Basically the same as Lazy with LazyThreadSafetyMode of ExecutionAndPublication, BUT exceptions are not cached 
    /// </summary>
    public class LazyWithNoExceptionCaching<T>
    {
        private Func<T> valueFactory;
        private T value = default(T);
        private readonly object lockObject = new object();
        private bool initialized = false;
        private static readonly Func<T> ALREADY_INVOKED_SENTINEL = () => default(T);

        public LazyWithNoExceptionCaching(Func<T> valueFactory)
        {
            this.valueFactory = valueFactory;
        }

        public bool IsValueCreated
        {
            get { return initialized; }
        }

        public T Value
        {
            get
            {
                //Mimic LazyInitializer.EnsureInitialized()'s double-checked locking, whilst allowing control flow to clear valueFactory on successful initialisation
                if (Volatile.Read(ref initialized))
                    return value;

                lock (lockObject)
                {
                    if (Volatile.Read(ref initialized))
                        return value;

                    value = valueFactory();
                    Volatile.Write(ref initialized, true);
                }
                valueFactory = ALREADY_INVOKED_SENTINEL;
                return value;
            }
        }
    }
}
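
Hypothetical usage, assuming a FetchFromServer method that can fail transiently:

var lazyValue = new LazyWithNoExceptionCaching<string>(FetchFromServer);

// A first access that throws does not poison the instance: initialized stays
// false, so a later access invokes FetchFromServer again instead of
// rethrowing a cached exception.
string s = lazyValue.Value;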
Wireman answered 2/3/2017 at 22:39. Comments (2):
Can you explain a bit why exceptions are not cached here? – Weld
@KininRoza If there is an exception, value is not written to and initialized stays false, so the next access runs the factory again. – Wireman
Answer (score 2)

Lazy does not support this. It's a design problem with Lazy, because exception "caching" means that a failed lazy instance will never provide a real value. That can bring applications down permanently due to transient errors such as network problems, with human intervention usually required to recover.

I bet this landmine exists in quite a few .NET apps...

You need to write your own lazy to do this. Or open a CoreFx GitHub issue for it.

Curtin answered 30/12/2015 at 13:43. Comments (6):
Why do you feel that this is a design problem with Lazy<T>? If you assume that the role of Lazy<T> is to execute the value factory approximately once and cache the result for each time its value is requested, then why shouldn't that include throwing unhandled exceptions if you gave it a value factory that doesn't handle recoverable exceptions? – Lacombe
Thanks, usr. I realise Lazy<T> doesn't support this, so my question was really whether there is another existing mechanism which does support this behaviour (or which can easily be adapted to it). Sorry if this wasn't clear. – Medwin
The answer is no, I should have called that out. @Medwin Not built in. A shame. – Curtin
@JoeAmenta in all this time I have never wanted this. If you want that, you are kind of using exceptions for control flow, so it seems unlikely that someone might want this. If you do want it, simply catch the exception in the lazy body and return it as a value. There you go. – Curtin
@JoeAmenta It depends whether your expectation is that Lazy will invoke the value factory once or once successfully. Many developers assume it does the latter (incorrectly). – Wireman
Lazy is indeed missing an 'ExecutionRetryWithPublication' mode - on the other hand, it was a fun day to write (and write strong tests for!) an "improved" Lazy implementation; sadly, no base interface/class can be used for type signatures :} – Installation
Answer (score 1)

Partially inspired by Darin's answer, but trying to get both the "queue of waiting threads that all observe the exception" and the "try again" features working:

private static Task<object> _fetcher = null;
private static object _value = null;

public static object Value
{
    get
    {
        if (_value != null) return _value;
        //We're "locking" then
        var tcs = new TaskCompletionSource<object>();
        var tsk = Interlocked.CompareExchange(ref _fetcher, tcs.Task, null);
        if (tsk == null) //We won the race to set up the task
        {
            try
            {
                var result = new object(); //Whatever the real, expensive operation is
                tcs.SetResult(result);
                _value = result;
                return result;
            }
            catch (Exception ex)
            {
                Interlocked.Exchange(ref _fetcher, null); //We failed. Let someone else try again in the future
                tcs.SetException(ex);
                throw;
            }
        }
        tsk.Wait(); //Someone else is doing the work
        return tsk.Result;
    }
}

I am slightly concerned though - can anyone see any obvious races here where it will fail in an unobvious way?
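
One caveat worth noting: the waiting threads observe a failure through tsk.Wait()/tsk.Result, which wrap the factory's exception in an AggregateException, so they don't see the exact same exception type that the winning thread rethrows. A minimal sketch of unwrapping it (note that rethrowing InnerException this way loses the original stack trace on .NET 4.0):

try
{
    return tsk.Result;
}
catch (AggregateException ae)
{
    // Surface the original exception type to waiting callers.
    throw ae.InnerException;
}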

Injudicious answered 30/12/2015 at 14:20. Comments (1):
I can't come up with a sequence of operations that would cause this to do... pretty much any undesirable thing. Just a note that tsk.Result does its own wait if the value isn't ready, so the explicit tsk.Wait() is redundant. – Lacombe