Why does Enum's HasFlag method need boxing?
G

9

15

I am reading "CLR via C#" and on page 380, there's a note saying the following:

Note The Enum class defines a HasFlag method defined as follows

public Boolean HasFlag(Enum flag);

Using this method, you could rewrite the call to Console.WriteLine like this:

Console.WriteLine("Is {0} hidden? {1}", file, attributes.HasFlag(FileAttributes.Hidden));

However, I recommend that you avoid the HasFlag method for this reason:

Since it takes a parameter of type Enum, any value you pass to it must be boxed, requiring a memory allocation.

I cannot understand the bolded statement. Why is it that

any value you pass to it must be boxed

The flag parameter's type is Enum, which is a value type, so why would there be boxing? Presumably "any value you pass to it must be boxed" means that boxing happens when you pass a value type to the parameter Enum flag, right?

Gestation answered 26/7, 2012 at 8:27 Comment(3)
It all comes down to a single, but confusing, statement: Enum is not an enum...Guacharo
@MarcGravell Indeed, I've spent a long chain of comments trying to defend my answer from the fact that people refuse to believe that statement. Confused by: ValueType is not a value type lol...Esculent
Note that as of .NET Core 2.1, I believe Enum.HasFlag no longer boxes: blogs.msdn.microsoft.com/dotnet/2018/04/18/…. Although I could still see the box instruction in the IL of a 2.1 app, it doesn't allocate, so I don't see a perf penalty.Garrison
E
9

In this instance, two boxing operations are required before you even get into the HasFlag method. One resolves the method call on the value type to the base type's method; the other passes the value type as a reference type parameter. You can see the same in IL if you write var type = 1.GetType();: the literal int 1 is boxed before the GetType() call. Boxing on a method call seems to occur only when the method is not overridden in the value type definition itself; more can be read here: Does calling a method on a value type result in boxing in .NET?

HasFlag takes an argument of the class type Enum, so boxing occurs here. You are passing a value type into something that expects a reference type, and representing a value as a reference requires boxing.

There is a lot of compiler support for value types and their inheritance (from Enum / ValueType) that confuses the situation when trying to explain it. People seem to think that because Enum and ValueType are in the inheritance chain of value types, boxing suddenly doesn't apply. If that were true, the same could be said of object, since everything inherits from it, but as we know this is false.

This doesn't stop the fact that representing a value type as a reference type will incur boxing.
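The distinction can also be checked at run time with reflection. The following is only a small sketch: it borrows System.IO.FileAttributes from the question, and the class and method names (EnumTypeFacts and its members) are made up for illustration.

```csharp
using System;
using System.IO;

public static class EnumTypeFacts
{
    // System.Enum itself is a class, i.e. a reference type.
    public static bool EnumIsClass() => typeof(Enum).IsClass;

    // System.Enum is not a value type, even though it derives from ValueType.
    public static bool EnumIsValueType() => typeof(Enum).IsValueType;

    // A concrete enum type such as FileAttributes is a value type.
    public static bool ConcreteEnumIsValueType() => typeof(FileAttributes).IsValueType;

    // Converting to Enum boxes, so each conversion produces a separate heap object.
    public static bool TwoBoxesAreDistinctObjects()
    {
        FileAttributes attributes = FileAttributes.Hidden;
        Enum first = attributes;   // box
        Enum second = attributes;  // box again
        return !object.ReferenceEquals(first, second);
    }
}
```

Here EnumIsClass() and ConcreteEnumIsValueType() should return true while EnumIsValueType() returns false, which is the "Enum is not an enum" point in a form you can run.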

And we can prove this in IL (look for the box instructions):

using System;

class Program
{
    static void Main(string[] args)
    {
        var f = Fruit.Apple;
        var result = f.HasFlag(Fruit.Apple);

        Console.ReadLine();
    }
}

[Flags]
enum Fruit
{
    Apple
}



.method private hidebysig static 
    void Main (
        string[] args
    ) cil managed 
{
    // Method begins at RVA 0x2050
    // Code size 28 (0x1c)
    .maxstack 2
    .entrypoint
    .locals init (
        [0] valuetype ConsoleApplication1.Fruit f,
        [1] bool result
    )

    IL_0000: nop
    IL_0001: ldc.i4.0
    IL_0002: stloc.0
    IL_0003: ldloc.0
    IL_0004: box ConsoleApplication1.Fruit
    IL_0009: ldc.i4.0
    IL_000a: box ConsoleApplication1.Fruit
    IL_000f: call instance bool [mscorlib]System.Enum::HasFlag(class [mscorlib]System.Enum)
    IL_0014: stloc.1
    IL_0015: call string [mscorlib]System.Console::ReadLine()
    IL_001a: pop
    IL_001b: ret
} // end of method Program::Main

The same can be seen when representing a value type as ValueType; it also results in boxing:

using System;

class Program
{
    static void Main(string[] args)
    {
        int i = 1;
        ValueType v = i;

        Console.ReadLine();
    }
}


.method private hidebysig static 
    void Main (
        string[] args
    ) cil managed 
{
    // Method begins at RVA 0x2050
    // Code size 17 (0x11)
    .maxstack 1
    .entrypoint
    .locals init (
        [0] int32 i,
        [1] class [mscorlib]System.ValueType v
    )

    IL_0000: nop
    IL_0001: ldc.i4.1
    IL_0002: stloc.0
    IL_0003: ldloc.0
    IL_0004: box [mscorlib]System.Int32
    IL_0009: stloc.1
    IL_000a: call string [mscorlib]System.Console::ReadLine()
    IL_000f: pop
    IL_0010: ret
} // end of method Program::Main
Esculent answered 26/7, 2012 at 8:38 Comment(16)
Yes, you are right, but in this case the boxing is occurring due to the call to Console.WriteLine. Jeff Richter has a large section in the book about avoiding boxing, and I believe this is where it comes from.Frondescence
@Frondescence So why are there boxing IL commands in my sample?Esculent
This isn't an answer; it just repeats the observation leading to this question.Organology
@hvd It explains why there is boxing, Enum is a class...Esculent
@AdamHouldsworth ...which derives from ValueType.Organology
@hvd The question doesn't care about that. Though the mechanics of the true answer might. The ValueType base class is handled specially by the compiler, I've no idea what handling enum the keyword gets.Esculent
@AdamHouldsworth From the question: "flag parameter type is Enum, which is a value type". You're basically saying Enum derives from ValueType, but isn't a value type. You may be right, but that could use some more highlighting.Organology
Actually just read the section in the book, and he is indeed referring to the call to the Enum.HasFlag method. My mistake.Frondescence
@hvd Yes the documentation for Enum, enum, and ValueType state the inheritance chains but offer nothing as to how this is realised by the compiler.Esculent
I think Adam is right. I confused the Enum type with enum types.Gestation
Lowercase enum isn't a type--it's a keyword. (Try typeof(enum) vs typeof(Enum) vs typeof(int) if you need to convince yourself.)Destined
This answer seriously isn't the answer. It's more information on the path to the answer, but we're not there yet. -1 (for now?)Destined
@Destined -1 for correct information? My final statement is my answer. The remaining question is why? Which isn't being asked. int inherits object, so would you expect boxing to occur there?Esculent
@AdamHouldsworth, There is tons of misinformation here. int and Enum have the same type hierarchy. Try Console.WriteLine(typeof(int).BaseType == typeof(Enum).BaseType && typeof(int).BaseType == typeof(ValueType)); if you need proof.Destined
@Destined The misinformation is only in people implying that inheriting from Enum or ValueType means that representing these value types as either Enum or ValueType shouldn't incur boxing... simply, it does. The same argument could be made for object. The answer is truly as simple as it can get: it boxes because you are representing a value type as a reference type.Esculent
@AdamHouldsworth: It's too bad that Enum wasn't implemented as a struct containing an Int64 and a Type. Such a design might not have nicely supported Enum types backed by a UInt64 [how often are those used?] but could otherwise have allowed easy non-boxing widening conversions to Enum from any enumerated type, and narrowing conversions from Enum to any enumerated type (the enumerated types themselves would only need to be a 1, 2, 4, or 8-byte type, since each Enum type would know what its own type was).Synclastic
S
9

It's worth noting that a generic HasFlag<T>(T thing, T flags) that is about 30 times faster than the Enum.HasFlag method can be written in about 30 lines of code. It can even be made into an extension method. Unfortunately, it's not possible in C# to restrict such a method to enumerated types only; consequently, IntelliSense will offer the method even for types to which it does not apply. If one used some language other than C# or VB.NET to write the extension method, it might be possible to make it appear only when it should, but I'm not familiar enough with other languages to try such a thing.

internal static class EnumHelper<T1>
{
    public static Func<T1, T1, bool> TestOverlapProc = initProc;
    public static bool Overlaps(SByte p1, SByte p2) { return (p1 & p2) != 0; }
    public static bool Overlaps(Byte p1, Byte p2) { return (p1 & p2) != 0; }
    public static bool Overlaps(Int16 p1, Int16 p2) { return (p1 & p2) != 0; }
    public static bool Overlaps(UInt16 p1, UInt16 p2) { return (p1 & p2) != 0; }
    public static bool Overlaps(Int32 p1, Int32 p2) { return (p1 & p2) != 0; }
    public static bool Overlaps(UInt32 p1, UInt32 p2) { return (p1 & p2) != 0; }
    public static bool Overlaps(Int64 p1, Int64 p2) { return (p1 & p2) != 0; }
    public static bool Overlaps(UInt64 p1, UInt64 p2) { return (p1 & p2) != 0; }
    public static bool initProc(T1 p1, T1 p2)
    {
        Type typ1 = typeof(T1);
        if (typ1.IsEnum) typ1 = Enum.GetUnderlyingType(typ1);
        Type[] types = { typ1, typ1 };
        var method = typeof(EnumHelper<T1>).GetMethod("Overlaps", types);
        if (method == null) method = typeof(T1).GetMethod("Overlaps", types);
        if (method == null) throw new MissingMethodException("Unknown type of enum");
        TestOverlapProc = (Func<T1, T1, bool>)Delegate.CreateDelegate(typeof(Func<T1, T1, bool>), method);
        return TestOverlapProc(p1, p2);
    }
}
static class EnumHelper
{
    public static bool Overlaps<T>(this T p1, T p2) where T : struct
    {
        return EnumHelper<T>.TestOverlapProc(p1, p2);
    }
}

EDIT: A previous version was broken, because it used (or at least tried to use) EnumHelper<T1, T1>.
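The self-replacing delegate that TestOverlapProc relies on can be easier to follow in isolation. Below is a stripped-down sketch of just that pattern, not of the flag test itself; LazyDescriber, Describe, and InitializeCalls are hypothetical names, and the lambda merely stands in for the reflection lookup.

```csharp
using System;

public static class LazyDescriber<T>
{
    // Starts out pointing at the one-time initializer.
    public static Func<T, string> Describe = Initialize;

    // Counts initializer runs, only so the behavior can be observed.
    public static int InitializeCalls;

    private static string Initialize(T value)
    {
        InitializeCalls++;
        // Stand-in for the expensive lookup done once per closed generic type.
        Func<T, string> resolved = v => typeof(T).Name + ":" + v;
        Describe = resolved;      // later calls bypass Initialize entirely
        return resolved(value);   // still answer the first call
    }
}
```

Calling LazyDescriber<int>.Describe twice should run Initialize only once; the second call goes straight to the resolved delegate, which is the same reason initProc only pays the reflection cost on first use.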

Synclastic answered 18/12, 2012 at 17:26 Comment(12)
That's some magic! going to steal it :) I think you should definitely make it an answer here: https://mcmap.net/q/22285/-hasflag-with-a-generic-enumGarrison
Caching method will make it even faster.Questionless
You can apply the System.Enum constraint in C# 7.3!Nodose
@IvanGarcíaTopete If you're stuck with an older version as I am, you can use struct as a constraint and with that at least prevent Intellisense from offering the extension method for basically anything.Misconstrue
@supercat: Despite being allocation free, your solution is twice as slow as the native HasFlag in my tests (calling each 10,000,000 times in a loop in a unit test). Could it be that calling the delegate outweighs the costs of boxing? Or is my unit test misleading?Misconstrue
@Synclastic Why the second check, if (method == null) method = typeof(T1).GetMethod("Overlaps", types);?Misconstrue
@mike: In 2012, the HasFlag method would generally box not only the object upon which it was invoked, but also the object it was being tested against. Additionally, each invocation would need to use Reflection to determine how the values were represented. My method is much slower than the ideal machine code to test flags, but was faster than the way HasFlag actually worked. It may well be that in newer versions of C# and .NET the performance of HasFlag has been improved to be better than my version, but I've not used .NET lately except for maintaining existing projects.Synclastic
@mike: When the code was written, System.Enum was not available as a generic constraint, and the extension method could be used on a reference of type System.Enum or System.Object. If an invocation with a generic type parameter of System.Object were to look up and cache a method for System.Object, then a later attempt to use the method on a different kind of object would use an inappropriate cached method.Synclastic
@Synclastic I think you are mistakenly not caching the delegate. That explains why @Misconstrue has reported your solution as slower than the native HasFlag. At the time I'm writing this: TestOverlapProc = initProc makes the former simply a reference to the latter. The latter is a method that, whenever invoked, builds a delegate (a costly operation that should not be repeated), runs it, and returns the result. Instead, you could build the delegate once and store it. I'm guessing this may have been your intent.Camarata
@Camarata The first call to TestOverlapProc will call initProc, which creates the correct Overlaps delegate to use, and then updates TestOverlapProc to point to Overlaps, and finally calls it. So the 2nd time you call TestOverlapProc it will directly call the right Overlaps method.Goulden
Oh wow, that is obscure! TestOverlapProc gets initialized to point to initProc, which, on first invocation, creates the delegate, overwrites TestOverlapProc, and invokes it to get the result (performing multiple responsibilities and belying its name). Subsequent invocations then lead to the new value, i.e. the compiled delegate. That is horribly cryptic. A static initializer or simple ??= would have avoided the confusion.Camarata
@Timo: The code is intended to minimize runtime overhead for all but the first invocation of Overlaps for any particular type. The pattern used here isn't universally known, but it's widely used in many kinds of lazy code creation or loading such as e.g. 8087 floating-point emulation. Initially, a compiler will place an INT instruction where a floating-point instruction would go. The interrupt handler will then, if an FPU is present, replace the instruction with the appropriate FPU instruction, subtract 2 from the return address, and resume program execution.Synclastic
B
4

Enum inherits from ValueType which is... a class! Hence the boxing.

Note that the Enum class can represent any enumeration, whatever its underlying type is, as a boxed value. A value such as FileAttributes.Hidden, on the other hand, is represented as a real value type, int.

Edit: let's differentiate the type and the representation here. An int is represented in memory as 32 bits. Its type derives from ValueType. As soon as you assign an int to object or to a class derived from it (the ValueType class, the Enum class), you're boxing it, effectively changing its representation to a class instance that contains those 32 bits plus additional class information.
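To make the change of representation concrete, here is a minimal sketch (the BoxingDemo class and its method names are invented for this example). It shows that boxing copies the value into a separate object and that the box carries its exact runtime type.

```csharp
using System;

public static class BoxingDemo
{
    // Boxing copies the value into a new object; the original variable stays independent.
    public static int BoxedCopyIsIndependent()
    {
        int i = 1;
        ValueType boxed = i;   // box: the 32 bits are copied to the heap with type information
        i = 2;                 // changes only the local variable, not the boxed copy
        return (int)boxed;     // unbox: still the value captured at boxing time
    }

    // The box remembers its exact runtime type, even when referenced as ValueType.
    public static Type RuntimeTypeOfBox()
    {
        ValueType boxed = 1;
        return boxed.GetType();
    }
}
```

BoxedCopyIsIndependent() returns 1 rather than 2, and RuntimeTypeOfBox() returns typeof(int), so the ValueType reference is really a reference to a boxed Int32.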

Brittanybritte answered 26/7, 2012 at 8:31 Comment(7)
I don't understand your first sentence. void f(int i) { } void g() { f(3); } -- int also inherits from ValueType, but there is no boxing there. Same if int is changed to a concrete enum type.Organology
That can't be the whole story. System.Int32 inherits from ValueType, too.Destined
Yes, the method here takes an int, not object.Frondescence
@JulienLebosquain Yes, it does. All structs inherit from ValueType.Organology
@JulienLebosquain See the struct page on msdn msdn.microsoft.com/en-us/library/saxz13w4.aspx :: A struct cannot inherit from another struct or class, and it cannot be the base of a class. All structs inherit directly from System.ValueType, which inherits from System.Object.Ruttger
Edited my post, trying to explain the difference between a Type and its value representation.Brittanybritte
@hvd: Storage locations of types which inherit from System.ValueType or System.Enum, *but are not of those two exact types*, are value types. Storage locations of all other types that derive from Object are class types; that includes System.ValueType and System.Enum themselves.Synclastic
O
4

Since C# 7.3, where the generic Enum constraint was introduced, you can write a fast, non-allocating version that doesn't rely on reflection. It requires the compiler flag /unsafe, but since Enum backing types can only have a fixed set of sizes, it should be perfectly safe to do:

using System;
using System.Runtime.CompilerServices;
public static class EnumFlagExtensions
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static bool HasFlagUnsafe<TEnum>(TEnum lhs, TEnum rhs) where TEnum : unmanaged, Enum
    {
        unsafe
        {
            switch (sizeof(TEnum))
            {
                case 1:
                    return (*(byte*)(&lhs) & *(byte*)(&rhs)) > 0;
                case 2:
                    return (*(ushort*)(&lhs) & *(ushort*)(&rhs)) > 0;
                case 4:
                    return (*(uint*)(&lhs) & *(uint*)(&rhs)) > 0;
                case 8:
                    return (*(ulong*)(&lhs) & *(ulong*)(&rhs)) > 0;
                default:
                    throw new Exception("Size does not match a known Enum backing type.");
            }
        }
    }
}
Osmious answered 28/6, 2021 at 8:28 Comment(1)
Unsafe.As<TFrom, TTo>() should allow for an implementation that does not require the /unsafe compiler flag. E.g. Unsafe.As<TEnum, byte>(ref lhs). (The same goes for MemoryMarshal.Cast<TFrom, TTo>(), although we would first have to turn the value into a span using MemoryMarshal.CreateSpan<T>().)Camarata
C
1

As suggested by Timo, the solution of Martin Tilo Schmitz can be implemented without the /unsafe switch:

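using System;
using System.Runtime.CompilerServices;

// Extension methods must be declared in a static class; the class name here is arbitrary.
public static class EnumExtensions
{
    public static bool HasAnyFlag<E>(this E lhs, E rhs) where E : unmanaged, Enum
    {
        // Reinterpret both values as an unsigned integer of the enum's size and test for any overlap.
        switch (Unsafe.SizeOf<E>())
        {
        case 1:
            return (Unsafe.As<E, byte>(ref lhs) & Unsafe.As<E, byte>(ref rhs)) != 0;
        case 2:
            return (Unsafe.As<E, ushort>(ref lhs) & Unsafe.As<E, ushort>(ref rhs)) != 0;
        case 4:
            return (Unsafe.As<E, uint>(ref lhs) & Unsafe.As<E, uint>(ref rhs)) != 0;
        case 8:
            return (Unsafe.As<E, ulong>(ref lhs) & Unsafe.As<E, ulong>(ref rhs)) != 0;
        default:
            throw new Exception("Size does not match a known Enum backing type.");
        }
    }
}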

The NuGet package System.Runtime.CompilerServices.Unsafe is required to compile this with .NET Framework.

Performance

  • I have to mention that any generic implementation is still almost an order of magnitude slower than a native check, i.e. ((int)lhs & (int)rhs) != 0. I guess that taking references to lhs and rhs prevents the JIT from keeping the function's variables in registers. The runtime dispatch on the enum size adds further overhead.
  • But it is still one order of magnitude faster than HasFlag.
  • That said, the performance benefit compared to HasFlag is almost zero if optimizations are turned off, as in a debug build.
  • There is no significant difference between using unsafe { } and using class Unsafe in optimized (release) builds. Only without optimizations is class Unsafe almost as slow as HasFlag.
  • Using a delegate as supercat recommended is not a reasonable option, since the resulting function pointer call is slow on most CPU architectures and, moreover, it completely prevents inlining.
  • MethodImplOptions.AggressiveInlining adds no value.

Conclusion

There is still no really fast and readable implementation for testing flags in enums.
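Timings aside, the allocation behavior can be probed directly. The sketch below is one rough way to do that, assuming a runtime that provides GC.GetAllocatedBytesForCurrentThread (.NET Core, or .NET Framework 4.8 and later); AllocationProbe and its method are hypothetical names, and the figure it reports will depend on the runtime version and build configuration.

```csharp
using System;
using System.IO;

public static class AllocationProbe
{
    // Returns the number of bytes allocated on the current thread while
    // calling HasFlag 'iterations' times. A result near zero suggests the
    // runtime avoided boxing; a result that grows with 'iterations' suggests it did not.
    public static long BytesAllocatedByHasFlag(int iterations, out bool lastResult)
    {
        FileAttributes attributes = FileAttributes.Hidden | FileAttributes.ReadOnly;
        lastResult = false;

        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int n = 0; n < iterations; n++)
        {
            lastResult = attributes.HasFlag(FileAttributes.Hidden);
        }
        long after = GC.GetAllocatedBytesForCurrentThread();

        return after - before;
    }
}
```

Running this in both debug and release builds gives a quick impression of whether the boxing described in the question still costs anything on the runtime at hand.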

Contempt answered 8/2, 2022 at 17:29 Comment(1)
I would like to add: using tests to evaluate the real-world performance impact is often hard (if not impossible) when it comes to heap allocation and garbage collection. The safest way (albeit not the prettiest) is sadly still (a & b) != 0...Misconstrue
F
0

Whenever you pass a value type to a method that takes object as a parameter, as in the case of Console.WriteLine, there will be an inherent boxing operation. Jeffrey Richter discusses this in detail in the same book you mention.

In this case you are using the format-string overload of Console.WriteLine, which takes its arguments as object (in the general case, a params array of object[]). Your bool will therefore be converted to object, and you get a boxing operation. You can avoid this by calling .ToString() on the bool.
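As a small illustration of that point (the FormatBoxing class and its method names are invented here, and string.Format is used in place of Console.WriteLine so the result can be inspected), both calls below produce the same text, but only the first converts the bool to object:

```csharp
using System;

public static class FormatBoxing
{
    // The bool is converted to object for the format argument, so it is boxed.
    public static string WithBoxing(bool hidden)
    {
        return string.Format("Is hidden? {0}", hidden);
    }

    // ToString() yields a string first; a string is already a reference type, so nothing is boxed.
    public static string WithoutBoxing(bool hidden)
    {
        return string.Format("Is hidden? {0}", hidden.ToString());
    }
}
```

Note that this concerns the boxing of the bool result, which is separate from the boxing of the arguments passed to HasFlag.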

Frondescence answered 26/7, 2012 at 8:34 Comment(5)
Enum.HasFlag's parameter type isn't object.Organology
That's boxing the bool result of Enum.HasFlag(), not the argument to HasFlag()...Destined
Enum.HasFlag() returns bool, a value type, and so, boxing.Frondescence
@Frondescence That isn't what this question is about, and besides, returning a value type doesn't require boxing unless the function is declared as returning a reference type (object).Organology
but here "any value you pass to it must be boxed" should mean boxing happens when you pass value type to parameter "Enum flag", right?Gestation
E
0

Moreover, there is more than a single boxing operation in Enum.HasFlag:

public bool HasFlag(Enum flag)
{
    if (!base.GetType().IsEquivalentTo(flag.GetType()))
    {
        throw new ArgumentException(Environment.GetResourceString("Argument_EnumTypeDoesNotMatch", new object[]
        {
            flag.GetType(),
            base.GetType()
        }));
    }
    ulong num = Enum.ToUInt64(flag.GetValue());
    ulong num2 = Enum.ToUInt64(this.GetValue());
    return (num2 & num) == num;
}

Look at the GetValue method calls.

Update: it looks like Microsoft optimized this method in .NET 4.5 (the source code was downloaded from referencesource):

    [System.Security.SecuritySafeCritical]
    public Boolean HasFlag(Enum flag) { 
        if (flag == null)
            throw new ArgumentNullException("flag"); 
        Contract.EndContractBlock(); 

        if (!this.GetType().IsEquivalentTo(flag.GetType())) { 
            throw new ArgumentException(Environment.GetResourceString("Argument_EnumTypeDoesNotMatch", flag.GetType(), this.GetType()));
        }

        return InternalHasFlag(flag); 
    }

    [System.Security.SecurityCritical]  // auto-generated 
    [ResourceExposure(ResourceScope.None)]
    [MethodImplAttribute(MethodImplOptions.InternalCall)] 
    private extern bool InternalHasFlag(Enum flags);
Elah answered 26/7, 2012 at 8:46 Comment(4)
Is that the actual implementation? Seems needlessly slow. It's possible, and not overly hard, to write a static method bool HasFlag<T>(T p1, T p2) which will run more than 10x as fast as Enum.HasFlag.Synclastic
@supercat: excellent question, indeed. That is the actual implementation for .NET 4.0; for .NET 4.5 it is different. See the updated answer.Elah
I wonder how that affects performance? Using .net 4.0, my generic method seems about 6x slower than using & but 30x faster than Enum.HasFlag. Actually, I would think that testing an enumeration for flag values would be a sufficiently frequent operation that would be worthy of language support, especially since a language could separate out HasAny, HasAll, and Has cases, restricting the latter to operands that were constant powers of two [since it's otherwise ambiguous whether SomeEnum.Has(3) should mean (SomeEnum & 3) != 0 or (SomeEnum & 3)==3.]Synclastic
See my answer for my code, if you'd like to benchmark it against the .NET 4.5 version of HasFlag. (Incidentally, as shown, my version will work with any struct type T that defines an Overlaps(T,T) overload; not sure that's useful, but it shouldn't affect the benchmarks since each type is only evaluated once).Synclastic
D
0

There are two boxing operations involved in this call, not just one. And both are required for one simple reason: Enum.HasFlag() needs type information, not just values, for both this and flag.

Most of the time, an enum value truly is just a set of bits and the compiler has all the type information it needs from the enum types represented in the method signature.

However, in the case of Enum.HasFlag(), the very first thing it does is call this.GetType() and flag.GetType() and make sure they're identical. If you wanted the typeless version, you'd be asking if ((attribute & flag) != 0), instead of calling Enum.HasFlag().
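One detail worth keeping in mind when switching to the bitwise form: (value & flag) != 0 asks whether any bit of flag is set, while HasFlag computes (value & flag) == flag, which requires all of them. The sketch below contrasts the two; the Permissions enum and the FlagChecks helpers are made up for the example.

```csharp
using System;

[Flags]
public enum Permissions
{
    None = 0,
    Read = 1,
    Write = 2,
    ReadWrite = Read | Write
}

public static class FlagChecks
{
    // True if at least one bit of 'flag' is set in 'value'. No boxing: plain integer arithmetic.
    public static bool HasAny(Permissions value, Permissions flag)
    {
        return (value & flag) != 0;
    }

    // True only if every bit of 'flag' is set in 'value'; this is the test HasFlag performs.
    public static bool HasAll(Permissions value, Permissions flag)
    {
        return (value & flag) == flag;
    }
}
```

For single-bit flags the two agree. For a combined flag they differ: with value Permissions.Read and flag Permissions.ReadWrite, HasAny returns true while HasAll (like HasFlag) returns false.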

Destined answered 26/7, 2012 at 9:14 Comment(3)
Some time has passed, but anyway: this is not true. If you call GetType() inside some method, then boxing will occur inside that method. You can easily test this with a simple value type that has a method calling GetType(), for example. You'll see that boxing occurs inside your method, not when you call it from outer code.Imprint
@iw.kuchin: The type System.Enum is a class type, as is the ironically-named System.ValueType. Calling Enum.HasFlag(Enum) requires casting its argument to System.Enum, which means it will be boxed before the HasFlag method gets a chance to execute.Synclastic
@Synclastic Yes, and this is the real reason boxing occurs here, not because type information is needed somewhere inside Enum.HasFlag().Imprint
W
-2

The type System.Enum is the abstract base class of all enum types (this is distinct and different from the underlying type of the enum type), and the members inherited from System.Enum are available in any enum type. A boxing conversion (§10.2.9) exists from any enum type to System.Enum, and an unboxing conversion (§10.3.7) exists from System.Enum to any enum type.

Note that System.Enum is not itself an enum_type. Rather, it is a class_type from which all enum_types are derived. The type System.Enum inherits from the type System.ValueType (§8.3.2), which, in turn, inherits from type object. At run-time, a value of type System.Enum can be null or a reference to a boxed value of any enum type.
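The conversions described in that passage can be written out as a short sketch. The Color enum and the EnumConversions class are hypothetical, introduced only to show the boxing conversion, the unboxing conversion, and the fact that a variable of type System.Enum may hold null.

```csharp
using System;

public enum Color { Red, Green }

public static class EnumConversions
{
    public static bool RoundTrip()
    {
        Enum boxed = Color.Green;      // boxing conversion: enum type -> System.Enum
        Color unboxed = (Color)boxed;  // unboxing conversion: System.Enum -> enum type
        return unboxed == Color.Green;
    }

    public static bool EnumVariableCanBeNull()
    {
        Enum nothing = null;           // legal only because System.Enum is a class type
        return nothing == null;
    }
}
```

A variable of the concrete type Color could not be assigned null (short of using Color?), which is another way to see that System.Enum and an enum type are different kinds of type.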

https://learn.microsoft.com/zh-cn/dotnet/csharp/language-reference/language-specification/enums#the-systemenum-type

Wow answered 18/9, 2023 at 11:55 Comment(1)
Please do not post multilingual answers, they tend to grow apart. Only English is accepted language here.Extraterrestrial
