Can another thread see partially created collection when using collection initializer?
M

2

6

Imagine this C# code in some method:

SomeClass.SomeGlobalStaticDictionary = new Dictionary<int, string>()
{
    {0, "value"},
};

Let's say no one is using any explicit memory barriers or locking to access the dictionary.

If no optimization takes place, then the global dictionary should be either null (initial value) or a properly constructed dictionary with one entry.

The question is: can the effects of the Add call and the assignment to SomeGlobalStaticDictionary be reordered such that another thread would see an empty, non-null SomeGlobalStaticDictionary (or any other invalid, partially constructed dictionary)?

Does the answer change if SomeGlobalStaticDictionary is volatile?

After reading http://msdn.microsoft.com/en-us/magazine/jj863136.aspx (and also its second part), I learned that, in theory, an assignment that appears in source code may be observed differently by other threads, for many reasons. I looked at the IL code, but the question is whether the JIT compiler and/or the CPU are allowed to not "flush" the effect of the Add call to other threads before the assignment to SomeGlobalStaticDictionary.

Medan answered 22/4, 2013 at 16:20 Comment(3)
I suspect that the answer to both questions is yes in the presence of multiple threads, but I do not understand the memory model well enough to be certain. - Mezzo
Consider reading amazon.com/CLR-via-Microsoft-Developer-Reference/dp/0735667454 - Mezzo
This code is not legal; are you sure you want var in there? Locals aren't volatile and they never have dots in their names; are you intending that to be a field? - Levey
L
4

Let me start by saying that I do not know the answer to your question, but I can help you simplify it down to its essence:

unsafe class C
{
    static int x;  // Assumed to be initialized to zero
    static int *p; // Assumed to be initialized to null
    static void M()
    {
        int* t = &C.x;
        *t = 1;
        C.p = t;
    }
    ...

Here int is standing in for the dictionary, p is standing in for your field that references a dictionary, t is the temporary created, and adding an element to the dictionary is modeled as mutating the value of field x. So the sequence of events here is: obtain storage for the dictionary and save that in a temporary, then mutate the thing referred to, and then publish the result.

The question is whether under the C# memory model, an observer on another thread is permitted to see that C.p is pointing to x and that x is still zero.

Like I said, I do not know for certain the answer to that; I would be interested to find out.

Off the top of my head though: why should that not be possible? p and x can be on completely different pages of memory. Suppose on some processor the value of x has been pre-fetched but p has not. Could that processor observe that p is not null but x is still zero? What's stopping that?
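The observer side of this model can be sketched in safe C#. This is only an illustration of what the question is asking, not a demonstration that the reordering happens: the names Box, Model, Publish, and SawStale are hypothetical, with a Box instance standing in for x and the static field P standing in for p.

```csharp
// Hypothetical safe-code analogue of the unsafe model above.
sealed class Box
{
    public int X; // stands in for x; zero until the writer sets it
}

static class Model
{
    public static Box P; // stands in for p; null until published

    // Writer: obtain storage, mutate it, then publish the reference.
    public static void Publish()
    {
        Box t = new Box();
        t.X = 1;
        P = t;
    }

    // Reader: returns true only in the case the question worries about,
    // namely that the reference is visible but the earlier write is not.
    public static bool SawStale()
    {
        Box q = P; // read the field once so both checks use the same reference
        return q != null && q.X == 0;
    }
}
```

On a single thread SawStale always returns false. The open question is whether it may return true when it runs on a different thread from Publish with no barriers or locks involved.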

Levey answered 22/4, 2013 at 17:18 Comment(3)
Thanks. Yes, that's the essence of the question. However, I'd like the answer specifically for collection initializers, i.e. the two of them might have different answers if the C# or .NET spec says that a memory barrier should be generated just for collection initializers (I can't find such a thing, so currently I assume it doesn't). So, is looking at the generated IL code (through e.g. ildasm) enough, or can the JIT compiler apply different rules "just for collection initializers"? - Medan
@Palo: They don't - the JIT doesn't know anything about collection initializers, and the C# spec makes no collection-initializer-specific guarantees. - Schaeffer
@Palo: The only guarantee that the C# specification makes about visibility of stuff that goes across threads is about special side effects like writing volatile fields or taking out locks or starting threads. Collection initializers don't come into it. - Levey
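Since the comments above point to volatile writes and locks as the places where visibility guarantees come from, one way to avoid depending on the unanswered question is to publish the reference explicitly. The following is a minimal sketch, assuming the release/acquire behaviour documented for System.Threading.Volatile.Write and Volatile.Read; the class SafePublication and its members are hypothetical names, not something from the thread.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper illustrating explicit publication of a populated dictionary.
static class SafePublication
{
    private static Dictionary<int, string> _shared; // null until published

    // Writer: populate the dictionary first, then publish it with a volatile write.
    // Assumption: the volatile write keeps the earlier Add from moving after it.
    public static void Publish()
    {
        var temp = new Dictionary<int, string> { { 0, "value" } };
        Volatile.Write(ref _shared, temp);
    }

    // Reader: a volatile read of the reference, then an ordinary lookup.
    // Returns null if nothing has been published yet.
    public static string TryRead()
    {
        Dictionary<int, string> d = Volatile.Read(ref _shared);
        return d == null ? null : d[0];
    }
}
```

This only covers publication of a dictionary that is never mutated afterwards. Concurrent readers and writers of the same Dictionary instance would still need a lock or a concurrent collection.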
S
6

For local variables, with optimizations turned on, the compiler will (at least sometimes) emit code which assigns to the variable first and then calls Add (or sets properties, for object initializers).

If you use a static or an instance variable, you'll see different behaviour:

class Test
{
    static List<int> StaticList = new List<int> { 1 };
    List<int> InstanceList = new List<int> { 2 };
}

Gives the following type initializer IL:

.method private hidebysig specialname rtspecialname static 
        void  .cctor() cil managed
{
  // Code size       21 (0x15)
  .maxstack  2
  .locals init (class [mscorlib]System.Collections.Generic.List`1<int32> V_0)
  IL_0000:  newobj     instance void class [mscorlib]System.Collections.Generic.List`1<int32>::.ctor()
  IL_0005:  stloc.0
  IL_0006:  ldloc.0
  IL_0007:  ldc.i4.1
  IL_0008:  callvirt   instance void class [mscorlib]System.Collections.Generic.List`1<int32>::Add(!0)
  IL_000d:  nop
  IL_000e:  ldloc.0
  IL_000f:  stsfld     class [mscorlib]System.Collections.Generic.List`1<int32> Test::StaticList
  IL_0014:  ret
} // end of method Test::.cctor

And the following constructor IL:

.method public hidebysig specialname rtspecialname 
        instance void  .ctor() cil managed
{
  // Code size       29 (0x1d)
  .maxstack  3
  .locals init (class [mscorlib]System.Collections.Generic.List`1<int32> V_0)
  IL_0000:  ldarg.0
  IL_0001:  newobj     instance void class [mscorlib]System.Collections.Generic.List`1<int32>::.ctor()
  IL_0006:  stloc.0
  IL_0007:  ldloc.0
  IL_0008:  ldc.i4.2
  IL_0009:  callvirt   instance void class [mscorlib]System.Collections.Generic.List`1<int32>::Add(!0)
  IL_000e:  nop
  IL_000f:  ldloc.0
  IL_0010:  stfld      class [mscorlib]System.Collections.Generic.List`1<int32> Test::InstanceList
  IL_0015:  ldarg.0
  IL_0016:  call       instance void [mscorlib]System.Object::.ctor()
  IL_001b:  nop
  IL_001c:  ret
} // end of method Test::.ctor

In both cases, the collection is populated before the field is set. That doesn't rule out memory model issues, but it's not the same as the field being set to refer to an empty collection and then the Add call being made. From the perspective of the assigning thread, the assignment happens after the Add.

In general, both object initializer and collection initializer expressions are equivalent to constructing the object using a temporary variable - so in the case where you use it in an assignment, the property setters are all called before the assignment takes place.

However, I don't believe any special guarantees are given around visibility to other threads for object/collection initializers. I would suggest that you imagine what the code would look like if written out "long-hand" according to the specification, and then reason from there.
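Written out long-hand, the assignment from the question would look roughly like the sketch below. The field name comes from the question; the method name Initialize, the local temp, and the choice to make SomeClass a static class are assumptions made only so the example is self-contained.

```csharp
using System.Collections.Generic;

static class SomeClass
{
    public static Dictionary<int, string> SomeGlobalStaticDictionary;

    // Long-hand form of:
    //   SomeGlobalStaticDictionary = new Dictionary<int, string>() { {0, "value"} };
    public static void Initialize()
    {
        var temp = new Dictionary<int, string>(); // construct into a temporary
        temp.Add(0, "value");                     // populate via Add
        SomeGlobalStaticDictionary = temp;        // assign the field last
    }
}
```

In program order the field is assigned only after Add has returned, which matches the IL above. Whether another thread is guaranteed to observe the two writes in that order is a separate question that this expansion does not answer.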

There are guarantees given for static initializers and constructors - but primarily within the Microsoft implementation of .NET rather than "general" guarantees (e.g. within the C# specification or the ECMA spec).

Schaeffer answered 22/4, 2013 at 16:26 Comment(5)
SomeClass.SomeGlobalStaticDictionary; he's trying to ask about a static field. And, as I understand, he's specifically asking about memory reordering. - Mezzo
@SLaks: Then the var part is misleading to start with. I suspect the OP is a little confused in general, but I'll clarify. - Schaeffer
That could be, but I think he was just writing quickly and didn't think about var. It sounds like he understands what memory models are. - Mezzo
@SLaks: I suspect there's some knowledge there, but maybe not enough. Unclear. Anyway, hopefully the last paragraph of my answer is appropriate. - Schaeffer
Please ignore the var. I can't write without VS intellisense. Someone already deleted the confusing var. Thanks. - Medan