What is the size of a boolean in C#? Does it really take 4 bytes?
I have two structs with arrays of bytes and booleans:

using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, Pack = 4)]
struct struct1
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 3)]
    public byte[] values;
}

[StructLayout(LayoutKind.Sequential, Pack = 4)]
struct struct2
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 3)]
    public bool[] values;
}

And the following code:

class main
{
    public static void Main()
    {
        Console.WriteLine("sizeof array of bytes: " + Marshal.SizeOf(typeof(struct1)));
        Console.WriteLine("sizeof array of bools: " + Marshal.SizeOf(typeof(struct2)));
        Console.ReadKey();
    }
}

That gives me the following output:

sizeof array of bytes: 3
sizeof array of bools: 12

It seems that a boolean takes 4 bytes of storage. Ideally a boolean would only take one bit (false or true, 0 or 1, etc.).

What is happening here? Is the boolean type really so inefficient?

Sashenka answered 14/2, 2015 at 9:52 Comment(3)
This is one of the most ironic clashes in the ongoing battle of hold-reasons: Two excellent answers by John and Hans just made it, even though answers to this question will tend to be almost entirely based on opinions, rather than facts, references, or specific expertise. - Swedish
@TaW: My guess is that the close votes were not due to the answers but the OP's original tone when they first put forth the question - they clearly intended to start a fight and outright showed this in the now-deleted comments. Most of the cruft has been swept under the rug, but check out the revision history to get a glimpse of what I mean. - Lakeshialakey
Why not use a BitArray? - My

The bool type has a checkered history, with many incompatible choices between language runtimes. This started with a historical design choice made by Dennis Ritchie, who invented the C language. C did not have a bool type; the alternative was int, where a value of 0 represents false and any other value is considered true.

This choice was carried forward in the Winapi, the primary reason to use pinvoke. It has a typedef for BOOL, which is an alias for the C compiler's int keyword. If you don't apply an explicit [MarshalAs] attribute, a C# bool is converted to a BOOL, producing a field that is 4 bytes long.

Whatever you do, your struct declaration needs to match the runtime choice made in the language you interop with. As noted, that is BOOL for the Winapi, but most C++ implementations chose a byte, and most COM Automation interop uses VARIANT_BOOL, which is a short.
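
As a rough sketch of the three interop choices named above, the following wraps a single bool in three structs, each with a different [MarshalAs] setting, and prints the marshaled size. The struct and class names are made up for illustration; UnmanagedType.Bool, UnmanagedType.I1 and UnmanagedType.VariantBool are the documented enumeration members.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper structs, one per unmanaged representation of a bool.
[StructLayout(LayoutKind.Sequential)]
struct Win32Bool
{
    [MarshalAs(UnmanagedType.Bool)]        // Win32 BOOL, the default for a bool field
    public bool Value;
}

[StructLayout(LayoutKind.Sequential)]
struct CppBool
{
    [MarshalAs(UnmanagedType.I1)]          // 1-byte C/C++ style bool
    public bool Value;
}

[StructLayout(LayoutKind.Sequential)]
struct ComBool
{
    [MarshalAs(UnmanagedType.VariantBool)] // 2-byte COM VARIANT_BOOL
    public bool Value;
}

public static class InteropBoolSizes
{
    public static int SizeOfWin32Bool() { return Marshal.SizeOf(typeof(Win32Bool)); }
    public static int SizeOfCppBool() { return Marshal.SizeOf(typeof(CppBool)); }
    public static int SizeOfComBool() { return Marshal.SizeOf(typeof(ComBool)); }

    public static void Main()
    {
        Console.WriteLine("BOOL:         " + SizeOfWin32Bool()); // 4
        Console.WriteLine("C++ bool:     " + SizeOfCppBool());   // 1
        Console.WriteLine("VARIANT_BOOL: " + SizeOfComBool());   // 2
    }
}
```

Which attribute is right depends entirely on what the native side declares, so treat this as a way to check the sizes rather than a recommendation.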

The actual size of a C# bool is one byte. A strong design goal of the CLR is that you cannot find out. Layout is an implementation detail that depends too much on the processor. Processors are very picky about variable types and alignment; wrong choices can significantly affect performance and cause runtime errors. By making the layout undiscoverable, .NET can provide a universal type system that does not depend on the actual runtime implementation.

In other words, you always have to marshal a structure at runtime to nail down the layout, at which time the conversion from the internal layout to the interop layout is made. That can be very fast if the layout is identical, and slow when fields need to be rearranged, since that always requires creating a copy of the struct. The technical term for this is blittable: passing a blittable struct to native code is fast because the pinvoke marshaller can simply pass a pointer.

Performance is also the core reason why a bool is not a single bit. Few processors make a bit directly addressable; the smallest unit is a byte. An extra instruction is required to fish the bit out of the byte, and that doesn't come for free. And it is never atomic.
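
To make the extra work concrete, here is a minimal sketch of packing booleans into the bits of one byte. The BitFlags class and its method names are hypothetical, chosen only for this illustration.

```csharp
using System;

public static class BitFlags
{
    // Reading bit n takes a shift and a mask on top of the byte load.
    public static bool GetBit(byte flags, int n)
    {
        return (flags & (1 << n)) != 0;
    }

    // Writing bit n is a read-modify-write of the whole byte,
    // which is why a single-bit update is not atomic.
    public static byte SetBit(byte flags, int n, bool value)
    {
        return value
            ? (byte)(flags | (1 << n))
            : (byte)(flags & ~(1 << n));
    }

    public static void Main()
    {
        byte flags = 0;
        flags = SetBit(flags, 3, true);
        Console.WriteLine(GetBit(flags, 3)); // True
        Console.WriteLine(GetBit(flags, 2)); // False
        Console.WriteLine(flags);            // 8
    }
}
```

Eight flags fit in the space of one bool this way, at the cost of the masking shown in GetBit and SetBit.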

The C# compiler isn't otherwise shy about telling you that it takes 1 byte: use sizeof(bool). This is still not a fantastic predictor of how many bytes a field takes at runtime. The CLR also needs to implement the .NET memory model, which promises that simple variable updates are atomic. That requires variables to be properly aligned in memory so the processor can update them with a single memory-bus cycle. Pretty often, a bool actually requires 4 or 8 bytes in memory because of this: extra padding is added to ensure that the next member is aligned properly.
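
A small sketch of the two numbers side by side: sizeof(bool) reports the managed size, while Marshal.SizeOf reports the size after default marshaling. The BoolSizes class name is only for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

public static class BoolSizes
{
    // Managed size, a compile-time constant for built-in types.
    public static int ManagedSize() { return sizeof(bool); }

    // Size after default marshaling, which maps bool to the Win32 BOOL.
    public static int MarshaledSize() { return Marshal.SizeOf(typeof(bool)); }

    public static void Main()
    {
        Console.WriteLine("sizeof(bool):         " + ManagedSize());   // 1
        Console.WriteLine("Marshal.SizeOf(bool): " + MarshaledSize()); // 4
    }
}
```

Neither number includes the padding that may surround a bool field inside a larger object.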

The CLR actually takes advantage of layout being undiscoverable: it can optimize the layout of a class and rearrange the fields so the padding is minimized. So, say, if you have a class with bool + int + bool members, it would take 1 + (3) + 4 + 1 + (3) bytes of memory, where (3) is the padding, for a total of 12 bytes. 50% waste. Automatic layout rearranges this to 1 + 1 + (2) + 4 = 8 bytes. Only a class has automatic layout; structs have sequential layout by default.
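
The padding arithmetic can be reproduced with sequential structs, whose layout is observable through Marshal.SizeOf. This is only a sketch: it uses byte fields as stand-ins for 1-byte bools (so default bool marshaling does not interfere), it measures the marshaled layout rather than the CLR's internal class layout, and the struct names are invented for the example.

```csharp
using System;
using System.Runtime.InteropServices;

// Declared order: 1 + (3) + 4 + 1 + (3) = 12 bytes.
[StructLayout(LayoutKind.Sequential)]
struct Interleaved
{
    public byte A;
    public int B;
    public byte C;
}

// Small fields grouped by hand: 1 + 1 + (2) + 4 = 8 bytes.
[StructLayout(LayoutKind.Sequential)]
struct Grouped
{
    public byte A;
    public byte C;
    public int B;
}

public static class PaddingDemo
{
    public static int InterleavedSize() { return Marshal.SizeOf(typeof(Interleaved)); }
    public static int GroupedSize() { return Marshal.SizeOf(typeof(Grouped)); }

    public static void Main()
    {
        Console.WriteLine("byte, int, byte: " + InterleavedSize()); // 12
        Console.WriteLine("byte, byte, int: " + GroupedSize());     // 8
    }
}
```

Grouping the small fields by hand mimics what automatic layout does for a class.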

More bleakly, a bool can require as many as 32 bytes in a C++ program compiled with a modern C++ compiler that supports the AVX instruction set. AVX imposes a 32-byte alignment requirement, so the bool variable may end up with 31 bytes of padding. This is also the core reason why a .NET jitter does not emit SIMD instructions unless explicitly wrapped: it can't get the alignment guarantee.

Elam answered 14/2, 2015 at 11:52 Comment(5)
Re: SIMD blogs.msdn.com/b/dotnet/archive/2014/04/07/… - Nebo
For an interested but uninformed reader, would you clarify whether the last paragraph should really read 32 bytes and not bits? - Amateur
Not sure why I just read all this (as I don't need this much detail), but it is fascinating and well written. - Radloff
@Silly - it is bytes. AVX uses 256-bit variables to do math on 8 floating-point values with a single instruction. Such a 256-bit variable requires alignment to 32 bytes. - Elam
Wow! One post gave a whole lot of topics to understand. That's why I like just reading top questions. - Edible

Firstly, this is only the size for interop. It doesn't represent the size in managed code of the array. That's 1 byte per bool - at least on my machine. You can test it for yourself with this code:

using System;
class Program 
{ 
    static void Main(string[] args) 
    { 
        int size = 10000000;
        object array = null;
        long before = GC.GetTotalMemory(true); 
        array = new bool[size];
        long after = GC.GetTotalMemory(true); 

        double diff = after - before; 

        Console.WriteLine("Per value: " + diff / size);

        // Stop the GC from messing up our measurements 
        GC.KeepAlive(array); 
    } 
}

Now, for marshalling arrays by value, as you are, the documentation says:

When the MarshalAsAttribute.Value property is set to ByValArray, the SizeConst field must be set to indicate the number of elements in the array. The ArraySubType field can optionally contain the UnmanagedType of the array elements when it is necessary to differentiate among string types. You can use this UnmanagedType only on an array whose elements appear as fields in a structure.

So we look at ArraySubType, and that has documentation of:

You can set this parameter to a value from the UnmanagedType enumeration to specify the type of the array's elements. If a type is not specified, the default unmanaged type corresponding to the managed array's element type is used.

Now looking at UnmanagedType, there's:

Bool
A 4-byte Boolean value (true != 0, false = 0). This is the Win32 BOOL type.

So that's the default for bool, and it's 4 bytes because that corresponds to the Win32 BOOL type - so if you're interoperating with code expecting a BOOL array, it does exactly what you want.

Now you can specify the ArraySubType as I1 instead, which is documented as:

A 1-byte signed integer. You can use this member to transform a Boolean value into a 1-byte, C-style bool (true = 1, false = 0).

So if the code you're interoperating with expects 1 byte per value, just use:

[MarshalAs(UnmanagedType.ByValArray, SizeConst = 3, ArraySubType = UnmanagedType.I1)]
public bool[] values;

Your code will then show that as taking up 1 byte per value, as expected.
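
For comparison, a self-contained sketch that puts the default and the I1 variant of the question's struct next to each other. The names Struct2Default, Struct2OneByte and ArraySubTypeDemo are invented for this example; the attributes are the ones discussed above.

```csharp
using System;
using System.Runtime.InteropServices;

// Default element marshaling: each bool becomes a 4-byte Win32 BOOL.
[StructLayout(LayoutKind.Sequential, Pack = 4)]
struct Struct2Default
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 3)]
    public bool[] values;
}

// ArraySubType = I1: each bool becomes a 1-byte value.
[StructLayout(LayoutKind.Sequential, Pack = 4)]
struct Struct2OneByte
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 3, ArraySubType = UnmanagedType.I1)]
    public bool[] values;
}

public static class ArraySubTypeDemo
{
    public static int DefaultSize() { return Marshal.SizeOf(typeof(Struct2Default)); }
    public static int OneByteSize() { return Marshal.SizeOf(typeof(Struct2OneByte)); }

    public static void Main()
    {
        Console.WriteLine("default bool[3]: " + DefaultSize()); // 12
        Console.WriteLine("I1 bool[3]:      " + OneByteSize()); // 3
    }
}
```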

Evvy answered 14/2, 2015 at 10:1 Comment(2)
What purpose does GC.KeepAlive serve in the above example? While I do see the comment, I don't understand it. At that point, the measurement has already been made and printed to the console... Commenting the statement out doesn't seem to make a difference. - Camargo
@MarkSeemann: With the statement commented out, the array could be collected immediately after allocation, before long after = GC.GetTotalMemory(true); is executed. The array isn't used after allocation - GC.KeepAlive just prevents it from being collected. - Evvy

The other answers are correct that a bool is 1 byte. This answer adds a working sample showing that a bool reads and writes exactly one byte of memory, no more, no less.

using System;
using System.Runtime.InteropServices;

public class Program
{
    [StructLayout(LayoutKind.Explicit)]
    struct BoolIntUnion
    {
        [FieldOffset(0)]
        public UInt32 i;

        [FieldOffset(0)]
        public bool b;
    }

    public static void Main()
    {
        var u = new BoolIntUnion();

        //first let's see how many bits a boolean reads from memory
        //we will do this by reading/writing an Int32 and a boolean to the same place in memory and observe the results

        //if a bool is only 8 bits, then only the first 8 bits of a UInt32 will make the boolean become true
        u.i = 0b00000000_00000000_00000000_00000001; //try bit 1
        if (u.b) Console.WriteLine("True for " + u.i);
        u.i = 0b00000000_00000000_00000000_10000000; //try bit 8
        if (u.b) Console.WriteLine("True for " + u.i);

        //now set all bits on except for the first 8, the boolean should be false if it only accesses the first 8 bits
        u.i = 0b11111111_11111111_11111111_00000000;
        if (!u.b) Console.WriteLine("False for " + u.i);

        //now let's go the other way and see how many bits a boolean writes to memory
        u.i = 0b11111111_11111111_11111111_11111111;
        u.b = false; //overlay a boolean "false" on top of an UInt32 that has all the bits turned on
        if (u.i == 0b11111111_11111111_11111111_00000000) Console.WriteLine("Overlaying a bool on top of a UInt32 cleared only the first 8 bits");
    }
}
Paraphernalia answered 26/10, 2022 at 21:3 Comment(1)
When marshaling a struct in C#, bool gets converted to a 4-byte value, unlike other types. - Narayan
