Why does LayoutKind.Sequential work differently if a struct contains a DateTime field?

Consider the following code (a console app which must be compiled with "unsafe" enabled):

using System;
using System.Runtime.InteropServices;

namespace ConsoleApplication3
{
    static class Program
    {
        static void Main()
        {
            Inner test = new Inner();

            unsafe
            {
                // Cast to long rather than int so the address is not truncated in a 64-bit process.
                Console.WriteLine("Address of struct   = " + ((long)&test).ToString("X"));
                Console.WriteLine("Address of First    = " + ((long)&test.First).ToString("X"));
                Console.WriteLine("Address of NotFirst = " + ((long)&test.NotFirst).ToString("X"));
            }
        }
    }

    [StructLayout(LayoutKind.Sequential)]
    public struct Inner
    {
        public byte First;
        public double NotFirst;
        public DateTime WTF;
    }
}

Now if I run the code above, I get output similar to the following:

Address of struct = 40F2CC
Address of First = 40F2D4
Address of NotFirst = 40F2CC

Note that the address of First is NOT the same as the address of the struct; however, the address of NotFirst is the same as the address of the struct.

Now comment out the "DateTime WTF" field in the struct, and run it again. This time, I get output similar to this:

Address of struct = 15F2E0
Address of First = 15F2E0
Address of NotFirst = 15F2E8

Now "First" does have the same address as the struct.

I find this behaviour surprising given the use of LayoutKind.Sequential. Can anyone provide an explanation? Does this behaviour have any ramifications when doing interop with C/C++ structs that use the Com DATETIME type?

[EDIT] NOTE: I have verified that when you use Marshal.StructureToPtr() to marshal the struct, the data is marshalled in the correct order, with the "First" field being first. This seems to suggest that it will work fine with interop. The mystery is why the internal layout changes - but of course, the internal layout is never specified, so the compiler can do what it likes.

[EDIT2] Removed "unsafe" from struct declaration (it was leftover from some testing I was doing).

[EDIT3] The original source for this question was from the MSDN C# forums:

http://social.msdn.microsoft.com/Forums/en-US/csharplanguage/thread/fb84bf1d-d9b3-4e91-823e-988257504b30

Willard answered 9/11, 2010 at 10:14 Comment(10)
I guess you answered your own question ;) – Stradivarius
Well thank goodness one never has to use DateTime when going unsafe. :) – Foggy
+1 for answering your question. You should create an answer with your own answer and accept it when you can. – Norris
I don't think it is valid to try to include a datetime because it contains string data internally. see social.msdn.microsoft.com/Forums/en/clr/thread/… for more – Pechora
@Kell: Static members do not affect the layout, and that is the only place with string is used. – Foggy
Doh! yeah, you are right :) This is really weird behaviour – Pechora
@Kell: Also, the Marshalling stuff seems to handle DateTime specially, which might be related to this "interesting" behaviour. :) See msdn.microsoft.com/en-us/library/0t2cwe11%28VS.71%29.aspx – Willard
Ah, the MS way of doing things: we are special and special rules apply to our classes :) – Pechora
Maybe @Eric Lippert can explain this? – Foggy
Two remarks: (1) It is actually redundant (but maybe informative?) to specify LayoutKind.Sequential since this is the default in C# for a struct type. (2) This "problem" may be related to the fact that DateTime itself has layout "Auto", since if you use TimeSpan (another struct of the same size) instead of DateTime as the type of your field WTF, the "problem" goes away. – Finnish
F
20

It is related to the (surprising) fact that DateTime itself has layout "Auto" (I asked about this in a separate SO question). This code reproduces the behavior you saw:

static class Program
{
    static unsafe void Main()
    {
        Console.WriteLine("64-bit: {0}", Environment.Is64BitProcess);
        Console.WriteLine("Layout of OneField: {0}", typeof(OneField).StructLayoutAttribute.Value);
        Console.WriteLine("Layout of Composite: {0}", typeof(Composite).StructLayoutAttribute.Value);
        Console.WriteLine("Size of Composite: {0}", sizeof(Composite));
        var local = default(Composite);
        Console.WriteLine("L: {0:X}", (long)(&(local.L)));
        Console.WriteLine("M: {0:X}", (long)(&(local.M)));
        Console.WriteLine("N: {0:X}", (long)(&(local.N)));
    }
}

[StructLayout(LayoutKind.Auto)]  // also try removing this attribute
struct OneField
{
    public long X;
}

struct Composite   // has layout Sequential
{
    public byte L;
    public double M;
    public OneField N;
}

Sample output:

64-bit: True
Layout of OneField: Auto
Layout of Composite: Sequential
Size of Composite: 24
L: 48F050
M: 48F048
N: 48F058

If we remove the attribute from OneField, things behave as expected. Example:

64-bit: True
Layout of OneField: Sequential
Layout of Composite: Sequential
Size of Composite: 24
L: 48F048
M: 48F050
N: 48F058

These examples were compiled for the x64 platform (so the size 24, three times eight, is unsurprising), but with x86 we see the same "disordered" pointer addresses.

So I guess I can conclude that the layout of OneField (resp. DateTime in your example) has influence on the layout of the struct containing a OneField member even if that composite struct itself has layout Sequential. I am not sure if this is problematic (or even required).


According to a comment by Hans Passant in the other thread, the CLR no longer makes an attempt to keep the layout sequential when one of the members is an Auto-layout struct.
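A quick way to check for this situation without unsafe code is to ask reflection about the layout of each field type. The following is only a sketch: the helper FindAutoLayoutFields and the four sample structs are names I made up for illustration. Primitives and enums are skipped on the assumption that their own layout flag is not what matters here (C# emits enums with the "auto" flag, which would otherwise produce false positives).

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Auto)]
struct AutoMember { public long X; }   // explicitly Auto, like DateTime is reported to be

struct PlainMember { public long X; }  // C# default for structs: Sequential

struct WithAuto { public byte L; public double M; public AutoMember N; }
struct WithoutAuto { public byte L; public double M; public PlainMember N; }

static class LayoutProbe
{
    // Hypothetical helper: returns the instance fields whose type is a
    // non-primitive, non-enum value type that is flagged as Auto layout.
    public static FieldInfo[] FindAutoLayoutFields(Type structType)
    {
        const BindingFlags flags =
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;
        return Array.FindAll(
            structType.GetFields(flags),
            f => f.FieldType.IsValueType
                 && !f.FieldType.IsPrimitive
                 && !f.FieldType.IsEnum
                 && f.FieldType.IsAutoLayout);
    }

    static void Main()
    {
        // The thread reports Auto for DateTime and Sequential for TimeSpan.
        Console.WriteLine("DateTime.IsAutoLayout: {0}", typeof(DateTime).IsAutoLayout);
        Console.WriteLine("TimeSpan.IsLayoutSequential: {0}", typeof(TimeSpan).IsLayoutSequential);

        foreach (Type t in new[] { typeof(WithAuto), typeof(WithoutAuto) })
        {
            Console.WriteLine("{0}: {1} Auto-layout field(s)",
                t.Name, FindAutoLayoutFields(t).Length);
        }
    }
}
```

If the helper reports any field for a struct, I would not rely on pointer arithmetic over that struct's managed layout, whatever its own StructLayout attribute says.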

Finnish answered 22/2, 2014 at 21:5 Comment(1)
Finally an answer to this question that actually makes sense. – Taffy
I
7

Go read the specification for layout rules more carefully. Layout rules only govern the layout when the object is exposed in unmanaged memory. This means that the compiler is free to place the fields however it wants until the object is actually exported. Somewhat to my surprise, this is even true for explicit layout (LayoutKind.Explicit)!

Ian Ringrose is right about compiler efficiency issues, and that does account for the final layout that is being selected here, but it has nothing to do with why the compiler is ignoring your layout specification.

A couple of people have pointed out that DateTime has Auto layout. That is the ultimate source of your surprise, but the reason is a bit obscure. The documentation for Auto layout says that "objects defined with [Auto] layout cannot be exposed outside of managed code. Attempting to do so generates an exception." Also note that DateTime is a value type. By incorporating a value type having Auto layout into your structure, you inadvertently promised that you would never expose the containing structure to unmanaged code (because doing so would expose the DateTime, and that would generate an exception). Since the layout rules only govern objects in unmanaged memory, and your object can never be exposed to unmanaged memory, the compiler is not constrained in its choice of layout and is free to do whatever it wants. In this case it is reverting to the Auto layout policy in order to achieve better structure packing and alignment.

There! Wasn't that obvious!
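To see the "cannot be exposed outside of managed code" rule in action, one can ask the marshaller for the unmanaged size of an Auto-layout struct and of a struct that contains one. This is a sketch under assumptions: AutoField, Holder and TrySizeOf are illustrative names, and I catch Exception broadly because I am not certain which exception type every runtime version raises for the containing struct.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Auto)]
struct AutoField { public long X; }

// Sequential by default, but it embeds an Auto-layout value type.
struct Holder { public byte L; public AutoField N; }

static class MarshalProbe
{
    // Hypothetical helper: returns the marshalled size as text, or a
    // description of the exception if the type cannot be marshalled.
    public static string TrySizeOf(Type t)
    {
        try
        {
            return Marshal.SizeOf(t).ToString();
        }
        catch (Exception ex)
        {
            return ex.GetType().Name + ": " + ex.Message;
        }
    }

    static void Main()
    {
        Console.WriteLine("long:      {0}", TrySizeOf(typeof(long)));
        Console.WriteLine("AutoField: {0}", TrySizeOf(typeof(AutoField)));
        Console.WriteLine("Holder:    {0}", TrySizeOf(typeof(Holder)));
    }
}
```

I would expect the last two lines to report an exception rather than a number, which matches the documentation quoted above, but treat that as something to verify on your own runtime.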

All of this, by the way, is recognizable at static compile time. In fact, the compiler is recognizing it in order to decide that it can ignore your layout directive. Having recognized it, a warning here from the compiler would seem to be in order. You haven't actually done anything wrong, but it's helpful to be told when you've written something that has no effect.

The various comments here recommending explicit layout (LayoutKind.Explicit) are generally good advice, but in this case that wouldn't necessarily have any effect, because including the DateTime field exempted the compiler from honoring layout at all. Worse: the compiler isn't required to honor layout, but it is free to honor layout. Which means that successive versions of CLR are free to behave differently on this.

The treatment of layout, in my view, is a design flaw in CLI. When the user specifies a layout, the compiler shouldn't go lawyering around them. Better to keep things simple and have the compiler do what it is told. Especially so where layout is concerned. "Clever", as we all know, is a four letter word.

Infrangible answered 13/5, 2014 at 13:8 Comment(3)
Not a design flaw in CLI IMO. It's not supposed to change the layout of the managed structure at all - that's just a performance optimisation that allows the marshaller to avoid copying the structure in some cases. You're not setting the layout of the structure, you're instructing the marshaller to marshal the structure a certain way - and it is marshalled that way (in this case causing an exception). This is explicitly documented in msdn.microsoft.com/en-us/library/… – Japhetic
The docs state about ExplicitLayout: Use the attribute with LayoutKind.Explicit to control the precise position of each data member. This affects both managed and unmanaged layout, for both blittable and non-blittable types. To me this means you do have control (as long as you don't use or include Auto). – Dogy
Because of the discussion and confusion here, I reported the differences in coverage in the docs and specs here: github.com/dotnet/dotnet-api-docs/issues/4325 – Dogy
B
3

A few factors

  • doubles are a lot faster if they are aligned
  • CPU caches may work better if there are no “holes” in the struct

So the C# compiler has a few undocumented rules it uses to try to get the “best” layout of structs. These rules may take into account the total size of a struct and/or whether it contains another struct, etc. If you need to know the layout of a struct, then you should specify it yourself rather than letting the compiler decide.

However the LayoutKind.Sequential does stop the compiler changing the order of the fields.

Berwick answered 9/11, 2010 at 10:53 Comment(14)
So you just contradicted yourself? – Foggy
@Leppie, no the docs for LayoutKind.Sequential say "... and can be noncontiguous" – Berwick
What about the fields changing order? Check the addresses in play here, not just their spacing, but their values/order as well. – Olympie
It doesn't seem to be anything to do with the packing, since specifying Pack=1 doesn't change the ordering. – Willard
That is to do with packing, not ordering. – Foggy
Another possible factor: typeof(DateTime).IsAutoLayout holds, and that is unusual. If you change to TimeSpan, say, since typeof(TimeSpan).IsLayoutSequential, you won't have that "anomaly". – Finnish
There is an example in the BCL, namely TimeZoneInfo.TransitionTime (a.k.a. System.TimeZoneInfo+TransitionTime). Its fields, in order of declaration, and the order in which they are found with reflection, are: DateTime m_timeOfDay; byte m_month; byte m_week; byte m_day; DayOfWeek m_dayOfWeek; bool m_isFixedDateRule;. But when seen with unsafe pointers, they come in the order |dayOfWeek|month|week|day|isFixedDateRule|timeOfDay|. Those are sizes 4+1+1+1+1+8. This struct has layout sequential, but its fields are not in order. – Finnish
@JeppeStigNielsen StructLayout specifies the way the structure is marshalled, not its managed layout - that's just an implementation detail (a performance optimisation that allows the marshaller to avoid copying the structure in some interop scenarios). Taking pointers to fields in a struct is dealing with the managed layout, which isn't constrained by contract at all. Use Marshal.StructureToPtr and you'll see the "proper" sequential layout in the unmanaged structure. – Japhetic
@Japhetic Very interesting comment. You are right Marshal.StructureToPtr is "better" here. The reason why I took a pointer to a managed layout, I think, was because that was what the asker did in his question (&test etc.). Maybe I will add your comment to my own answer elsewhere on this page. – Finnish
@luaan, except for ExplicitLayout, of which the msdn docs clearly say it applies to managed and unmanaged code, blittable or not – Dogy
@Dogy Nope learn.microsoft.com/en-us/dotnet/api/…. CLR spec says the same thing. It is only guaranteed to change the unmanaged layout of the structure. It may change the managed layout as well, but it's not part of the contract - you shouldn't rely on it. – Japhetic
@luaan, the text here is explicit though: "Use the attribute with LayoutKind.Explicit to control the precise position of each data member. This affects both managed and unmanaged layout, for both blittable and non-blittable types." learn.microsoft.com/en-us/dotnet/api/…. And I've used that in managed code for overlapping fields, so I know it works. – Dogy
@Dogy I also know it works. That doesn't mean it's part of the contract. Overlapping fields are not supported either. That doesn't mean they don't work; it just means you can't rely on them working. – Japhetic
@Luaan, checking ECMA-335, II.22.8, says "The ClassLayout table is used to define how the fields of a class or value type shall be laid out by the CLI. (Normally, the CLI is free to reorder and/or insert gaps between the fields defined for a class or value type.)". Note the "shall" in this sentence. It mostly matches the docs (but other refs back up your claim). I doubt current behavior will ever be changed as that requires a spec change. The precise rules are in the spec as quoted and in the following sections. It also explains why Auto breaks layout-explicitness. – Dogy
W
3

To answer my own questions (as advised):

Question: "Does this behaviour have any ramifications when doing interop with C/C++ structs that use the Com DATETIME type?"

Answer: No, because the layout is respected when using Marshalling. (I verified this empirically.)

Question "Can anyone provide an explanation?".

Answer: I'm still not sure about this, but since the internal representation of a struct is not defined, the compiler can do what it likes.

Willard answered 9/11, 2010 at 14:28 Comment(1)
With the exception of ExplicitLayout, which applies to managed and unmanaged, blittable and non-blittable types. – Dogy
L
2

You're checking the addresses as they are within the managed structure. Marshal attributes have no guarantees for the arrangement of fields within managed structures.

The reason it marshals correctly into native structures is that the data is copied into native memory according to the marshalling attributes.

So, the arrangement of the managed structure has no impact on the arrangement of the native structure. Only the attributes affect the arrangement of the native structure.

If fields setup with marshal attributes were stored in managed data the same way as native data, then there would be no point in Marshal.StructureToPtr, you'd simply byte-copy the data over.
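As a rough illustration of that copy step, here is a small sketch that marshals a struct into unmanaged memory and reads the fields back at their marshalled offsets, without taking any managed pointers. Sample, MarshalCopyDemo and RoundTrip are names invented for this example; the struct deliberately leaves out the DateTime field so that it only shows the offsets the marshaller uses.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
struct Sample
{
    public byte First;
    public double NotFirst;
}

static class MarshalCopyDemo
{
    // Copies the struct to unmanaged memory and reads each field back
    // from the offset that the marshaller reports for it.
    public static (byte First, double NotFirst, int NotFirstOffset) RoundTrip(Sample s)
    {
        IntPtr buffer = Marshal.AllocHGlobal(Marshal.SizeOf<Sample>());
        try
        {
            Marshal.StructureToPtr(s, buffer, false);

            int offset = (int)Marshal.OffsetOf<Sample>(nameof(Sample.NotFirst));
            byte first = Marshal.ReadByte(buffer, 0);
            double notFirst = BitConverter.Int64BitsToDouble(Marshal.ReadInt64(buffer, offset));
            return (first, notFirst, offset);
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }

    static void Main()
    {
        var result = RoundTrip(new Sample { First = 7, NotFirst = 1.5 });
        Console.WriteLine("First at offset 0 = {0}", result.First);
        Console.WriteLine("NotFirst at offset {0} = {1}", result.NotFirstOffset, result.NotFirst);
    }
}
```

The point is that Marshal.OffsetOf describes the unmanaged layout, which is the one the StructLayout attribute is guaranteed to control.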

Lathe answered 8/11, 2011 at 20:57 Comment(1)
Note that the compiler is free to change the managed layout to be the same as the marshalled structure - it's a valid performance optimisation that allows the marshaller to avoid copying the structure in some interop scenarios. It's not a contractual behaviour, though - you should never make any assumptions about the managed layout of structures, regardless of StructLayout, FieldOffset etc. – Japhetic
P
1

If you're going to interop with C/C++, I would always be specific with the StructLayout. Instead of Sequential, I would go with Explicit, and specify each position with FieldOffset. In addition, set the Pack value.

[StructLayout(LayoutKind.Explicit, Pack=1, CharSet=CharSet.Unicode)]
public struct Inner
{
    [FieldOffset(0)]
    public byte First;
    [FieldOffset(1)]
    public double NotFirst;
    [FieldOffset(9)]
    public DateTime WTF;
}

It sounds like DateTime can't be Marshaled anyhow, only to a string (bingle Marshal DateTime).

The Pack variable is especially important in C++ code that might be compiled on different systems that have different word sizes.
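To make the effect of Pack concrete, here is a minimal sketch comparing the marshalled size of the same two fields with and without Pack = 1. The struct names are made up for the example, and the DateTime field is omitted on purpose so that only the packing is being measured.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
struct DefaultPacked
{
    public byte First;
    public double NotFirst;
}

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct TightlyPacked
{
    public byte First;
    public double NotFirst;
}

static class PackDemo
{
    static void Main()
    {
        // With default packing the double is normally aligned, so padding
        // follows the byte; the exact size can depend on the platform.
        Console.WriteLine("Default: size {0}, NotFirst at {1}",
            Marshal.SizeOf<DefaultPacked>(),
            Marshal.OffsetOf<DefaultPacked>(nameof(DefaultPacked.NotFirst)));

        // With Pack = 1 there is no padding: 1 byte + 8 bytes = 9.
        Console.WriteLine("Pack=1:  size {0}, NotFirst at {1}",
            Marshal.SizeOf<TightlyPacked>(),
            Marshal.OffsetOf<TightlyPacked>(nameof(TightlyPacked.NotFirst)));
    }
}
```

Whatever value you choose has to match the packing the C/C++ side was compiled with (for example a #pragma pack setting), otherwise the offsets will disagree.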

I would also ignore the addresses that can be seen when using unsafe code. It doesn't really matter what the compiler does as long as the Marshaling is correct.

Pillar answered 14/2, 2011 at 21:13 Comment(0)