Why does the C# compiler emit Activator.CreateInstance when calling new with a generic type that has a new() constraint?

When you have code like the following:

static T GenericConstruct<T>() where T : new()
{
    return new T();
}

The C# compiler insists on emitting a call to Activator.CreateInstance, which is considerably slower than a native constructor.
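
To make the claim concrete, here is a rough sketch of what the method above behaves like after compilation. The class and method names are hypothetical, and the exact IL depends on the compiler version (some older compilers reportedly add a value-type check first), so this is an approximation rather than a decompilation.

```csharp
using System;

static class LoweringSketch
{
    // Hypothetical name. Roughly what `return new T();` becomes when T
    // is only known to satisfy the new() constraint.
    public static T GenericConstructLowered<T>() where T : new()
    {
        return Activator.CreateInstance<T>();
    }
}
```

Running ildasm over the original GenericConstruct<T> should show a call to Activator.CreateInstance<T>() in the same position.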

I have the following workaround:

public static class ParameterlessConstructor<T>
    where T : new()
{
    public static T Create()
    {
        return _func();
    }

    private static Func<T> CreateFunc()
    {
        return Expression.Lambda<Func<T>>( Expression.New( typeof( T ) ) ).Compile();
    }

    private static Func<T> _func = CreateFunc();
}

// Example:
// Foo foo = ParameterlessConstructor<Foo>.Create();

But it doesn't make sense to me why this workaround should be necessary.
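
For anyone who wants to check the difference on their own machine, a minimal timing sketch might look like the following. Foo, CachedConstructor, ViaNew and the iteration count are all assumptions made up for illustration, and the numbers will vary with runtime version and hardware (newer runtimes may narrow the gap considerably).

```csharp
using System;
using System.Diagnostics;
using System.Linq.Expressions;

public class Foo { } // hypothetical type with a public parameterless constructor

public static class CachedConstructor<T> where T : new()
{
    // Compiled once per closed type T, when the type is first used.
    private static readonly Func<T> Factory =
        Expression.Lambda<Func<T>>(Expression.New(typeof(T))).Compile();

    public static T Create() { return Factory(); }
}

public static class Program
{
    // The plain generic form that the question is about.
    private static T ViaNew<T>() where T : new() { return new T(); }

    private static long Time(Func<Foo> create, int iterations)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) create();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        const int iterations = 10000000; // arbitrary

        // Warm up both paths so JIT and expression compilation are not timed.
        ViaNew<Foo>();
        CachedConstructor<Foo>.Create();

        Console.WriteLine("new T():         {0} ms", Time(ViaNew<Foo>, iterations));
        Console.WriteLine("cached delegate: {0} ms", Time(CachedConstructor<Foo>.Create, iterations));
    }
}
```

Both paths go through a Func<Foo> here, so the delegate call overhead is the same on each side of the comparison.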

Aflame answered 15/12, 2008 at 5:56 Comment(6)
I noticed the same thing... but I don't know why. - Hollenbeck
I am using Snippet Compiler and the compiler doesn't throw any error. Also, the constructor is called when new T() is called. - Modica
@shahkalpesh: No-one said there'd be an error. The point is that Activator.CreateInstance is slower than the delegate form. - Transmontane
@Jon: Is the call to Activator.CreateInstance inserted at the IL level? If so, I did not get that from the question. - Modica
@shahkalpesh: Yes. Run Reflector or ildasm over code using new T() (with a new() constraint, not a struct constraint) and you'll see it. - Transmontane
BTW, all VB.NET compilers I can test always produce the Activator::CreateInstance call for general, class and structure constraints. - Compensation

I suspect it's a JITting problem. Currently, the JIT reuses the same generated code for all reference type arguments - so a List<string>'s vtable points to the same machine code as that of List<Stream>. That wouldn't work if each new T() call had to be resolved in the JITted code.

Just a guess, but it makes a certain amount of sense.

One interesting little point: in neither case does the parameterless constructor of a value type get called, if there is one (which is vanishingly rare). See my recent blog post for details. I don't know whether there's any way of forcing it in expression trees.

Transmontane answered 15/12, 2008 at 7:0 Comment(0)

This is likely because it is not clear whether T is a value type or a reference type. Creating these two kinds of type in a non-generic scenario produces very different IL. In the face of this ambiguity, C# is forced to use a universal method of type creation. Activator.CreateInstance fits the bill.

Quick experimentation appears to support this idea. If you compile the following code and examine the IL, you'll see that it uses initobj instead of CreateInstance, because there is no ambiguity about the type.

static void Create<T>()
    where T : struct
{
    var x = new T();
    Console.WriteLine(x.ToString());
}

Switching it to a class and new() constraint though still forces an Activator.CreateInstance.

Mafia answered 15/12, 2008 at 7:4 Comment(5)
I guess the immediate followup question would be "why isn't there an appropriate IL instruction for creating an instance of a generic type with an appropriate constraint?" It's not like they couldn't have built that in from the start :) - Transmontane
Agreed, it really seems like they implemented an API instead of an IL instruction. The comment on the MSDN doc page for Activator.CreateInstance specifically says that it should be called for this scenario. Odd choice, but I'm sure there's a good reason. - Mafia
I suspect the reason is to increase JIT'd code sharing. If you had a direct call to a type's constructor in the JIT'd code, then you couldn't share that JIT'd code with another instantiation for a different type, e.g. 'T Create<T>() where T : new() {return new T();}' would share machine code for Create<string>() and Create<ArrayList>(). - Luiseluiza
@JonSkeet Looking back at this five years later, it seems as though this is a growing trend: using static methods to mark places where the JIT should take over, as opposed to creating new instructions. A good example would be CER. - Simpkins
Just a quick note that this is sadly not true anymore. Regardless of constraint, Roslyn outputs Activator.CreateInstance. - Godbeare

Why is this workaround necessary?

Because the new() generic constraint was added to C# 2.0 in .NET 2.0.

Expression<T> and friends, meanwhile, were added to .NET 3.5.

So your workaround is necessary because it wasn't possible in .NET 2.0. Meanwhile, (1) using Activator.CreateInstance() was possible, and (2) IL lacks a way to implement 'new T()', so Activator.CreateInstance() was used to implement that behavior.
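
For comparison, the same caching idea can be sketched without expression trees by emitting the constructor call directly through System.Reflection.Emit.DynamicMethod, which as far as I know has been available since .NET 2.0. Everything named below (Creator<T>, EmitConstructor<T>) is hypothetical, and the sketch is limited to reference types to keep the emitted IL to a single newobj.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical delegate type, declared here so the sketch does not rely on Func<T>.
public delegate T Creator<T>();

public static class EmitConstructor<T> where T : class, new()
{
    // Built once per closed type T.
    public static readonly Creator<T> Create = Build();

    private static Creator<T> Build()
    {
        // new() guarantees a public parameterless constructor on a class.
        ConstructorInfo ctor = typeof(T).GetConstructor(Type.EmptyTypes);

        DynamicMethod method = new DynamicMethod(
            "Create_" + typeof(T).Name,          // name, for diagnostics only
            typeof(T),                           // return type
            Type.EmptyTypes,                     // no parameters
            typeof(EmitConstructor<T>).Module,   // module to associate with
            true);                               // skip visibility checks

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Newobj, ctor); // direct constructor call
        il.Emit(OpCodes.Ret);

        return (Creator<T>)method.CreateDelegate(typeof(Creator<T>));
    }
}

// Example:
// StringBuilder sb = EmitConstructor<StringBuilder>.Create();
```

This is more code than the expression-tree version and easier to get wrong, which may be one reason the Expression-based workaround is the one people usually reach for.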

Luiseluiza answered 10/6, 2009 at 19:12 Comment(0)

Interesting observation :)

Here is a simpler variation on your solution:

static T Create<T>() where T : new()
{
  Expression<Func<T>> e = () => new T();
  return e.Compile()();
}

Obviously naive (and possibly slow) :)

Afterward answered 15/12, 2008 at 7:17 Comment(4)
I don't think that will work, because it's specifically "new T()" that his workaround is trying to avoid. - Mornay
@Joel Mueller Actually it does work. The expression tree contains a NewExpression here. - Unconnected
Yes, it's an Expression of Func<T>, not a Func<T>. The "() => new T()" does not produce IL for the construction (and so no Activator.CreateInstance() call), but an expression tree, which in turn is compiled at runtime when T is known. The only problem here is that each time you call this function, you recompile the expression. - Employee
This is brilliant, I didn't know this could work. For the uninformed, the compiled IL will contain calls to Expression.New and not Activator.CreateInstance. Feels like cheating though... quite unintuitive and less obvious to me. - Godbeare

This is a little bit faster, since the expression is only compiled once:

public class Foo<T> where T : new()
{
    static Expression<Func<T>> x = () => new T();
    static Func<T> f = x.Compile();

    public static T build()
    {
        return f();
    }
}

Analyzing the performance, this method is just as fast as the more verbose compiled expression and much, much faster than new T() (160 times faster on my test PC).

For a tiny bit better performance, the build method call can be eliminated and the functor can be returned instead, which the client could cache and call directly.

public static Func<T> BuildFn { get { return f; } }
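
A caller that caches the delegate could look something like this. It is only a sketch: Widget, Factory<T> and Client are made-up names, and Factory<T> stands in for the Foo<T> class above with the BuildFn property added.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public class Widget { } // hypothetical type to construct

public static class Factory<T> where T : new()
{
    private static readonly Func<T> f = Compile();

    private static Func<T> Compile()
    {
        // The lambda becomes an expression tree, compiled once per closed type T.
        Expression<Func<T>> x = () => new T();
        return x.Compile();
    }

    public static Func<T> BuildFn { get { return f; } }
}

public static class Client
{
    public static List<Widget> MakeMany(int count)
    {
        // Fetch the delegate once, then call it directly inside the loop.
        Func<Widget> make = Factory<Widget>.BuildFn;

        List<Widget> result = new List<Widget>(count);
        for (int i = 0; i < count; i++)
        {
            result.Add(make());
        }
        return result;
    }
}
```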
Guarneri answered 15/8, 2009 at 0:38 Comment(0)
