Restricting a generic type parameter to have a specific constructor

I'd like to know why the new constraint on a generic type parameter can only be applied without parameters. That is, one may constrain the type to have a parameterless constructor, but one cannot constrain it to have, say, a constructor that receives an int as a parameter. I know ways around this, using reflection or the factory pattern, and they work fine. But I'd really like to know why, because I've been thinking about it and I can't think of a difference between a parameterless constructor and one with parameters that would justify this restriction on the new constraint. What am I missing? Thanks a lot


Argument 1: Constructors are methods

@Eric: Let me go here with you for a sec:

Constructors are methods

Then I suppose no one would object if I'd go like this:

public interface IReallyWonderful
{
    new(int a);

    string WonderMethod(int a);
}

But once I have that, then I'd go:

public class MyClass<T>
        where T : IReallyWonderful
{
    public string MyMethod(int a, int b)
    {
        T myT = new T(a);
        return myT.WonderMethod(b);
    }
}

Which is what I wanted to do in the first place. So, sorry, but no, constructors are not methods, or at least not exactly.

On the difficulty of implementing this feature, well, I really wouldn't know, and even if I did, I wouldn't have anything to say on a decision regarding the wise expenditure of shareholder money. Something like that, I would've marked as an answer right away.

From an academic (my) point of view, that is, without any regard for implementation costs, the question really is (I've boiled it down to this in the last few hours):

Should constructors be considered part of the implementation of a class, or part of the semantic contract (in the same way an interface is considered a semantic contract)?

If we consider constructors as part of the implementation, then constraining the constructor of a generic type parameter is not a very generic thing to do, since that would tie your generic type to a concrete implementation, and one could almost ask: why use generics at all?

Example of constructor as part of the implementation (no sense in specifying any of the following constructors as part of the semantic contract defined by ITransformer):

public interface ITransformer
{
    //Operates with a and returns the result;
    int Transform(int a);
}

public class PlusOneTransformer : ITransformer
{
    public int Transform(int a)
    {
        return a + 1;
    }
}

public class MultiplyTransformer : ITransformer
{
    private int multiplier;

    public MultiplyTransformer(int multiplier)
    {
        this.multiplier = multiplier;
    }

    public int Transform(int a)
    {
        return a * multiplier;
    }
}

public class CompoundTransformer : ITransformer
{
    private ITransformer firstTransformer;
    private ITransformer secondTransformer;

    public CompoundTransformer(ITransformer first, ITransformer second)
    {
        this.firstTransformer = first;
        this.secondTransformer = second;
    }

    public int Transform(int a)
    {
        return secondTransformer.Transform(firstTransformer.Transform(a));
    }
}

The problem is that constructors may also be considered as part of the semantic contract, like so:

public interface ICollection<T> : IEnumerable<T>
{
    new(IEnumerable<T> tees);

    void Add(T tee);

    ...
}

This means it's always possible to build a collection from a sequence of elements, right? And that would make a very valid portion of a semantic contract, right?

Without taking into account any aspect of the wise expenditure of shareholder money, I would favour allowing constructors as part of semantic contracts. If some developer messes up and constrains a type to have a semantically incorrect constructor, well, how is that different from the same developer adding a semantically incorrect operation? After all, semantic contracts are just that because we all agreed they are, and because we all document our libraries really well ;)


Argument 2: Supposed problems when resolving constructors

@supercat has been trying to give some examples of how (quote from a comment)

It would also be hard to define exactly how constructor constraints should work, without resulting in surprising behaviors.

but I really must disagree. In C# (well, in .NET) surprises like "How to make a penguin fly?" simply don't happen. There are pretty straightforward rules as to how the compiler resolves method calls, and if the compiler can't resolve it, well, it won't pass, won't compile that is.

His last example was:

If they are contravariant, then one runs into trouble resolving which constructor should be called if a generic type has constraint new(Cat, ToyotaTercel), and the actual type just has constructors new(Animal, ToyotaTercel) and new(Cat, Automobile).

Well, let's try this (which in my opinion is a similar situation to the one proposed by @supercat):

class Program
{
    static void Main(string[] args)
    {
        Cat cat = new Cat();
        ToyotaTercel toyota = new ToyotaTercel();

        FunnyMethod(cat, toyota);
    }

    public static void FunnyMethod(Animal animal, ToyotaTercel toyota)
    {
        Console.WriteLine("Takes an Animal and a ToyotaTercel");
    }

    public static void FunnyMethod(Cat cat, Automobile car)
    {
        Console.WriteLine("Takes a Cat and an Automobile");
    }
}

public class Automobile
{ }

public class ToyotaTercel : Automobile
{ }

public class Animal
{ }

public class Cat : Animal
{ }

And, wow, it won't compile with the error

The call is ambiguous between the following methods or properties: 'TestApp.Program.FunnyMethod(TestApp.Animal, TestApp.ToyotaTercel)' and 'TestApp.Program.FunnyMethod(TestApp.Cat, TestApp.Automobile)'

I don't see why the result should be different if the same problem arose from a solution with parameterized constructor constraints, like so:

class Program
{
    static void Main(string[] args)
    {
        GenericClass<FunnyClass> gc = new GenericClass<FunnyClass>();
    }
}

public class Automobile
{ }

public class ToyotaTercel : Automobile
{ }

public class Animal
{ }

public class Cat : Animal
{ }

public class FunnyClass
{
    public FunnyClass(Animal animal, ToyotaTercel toyota)
    {            
    }

    public FunnyClass(Cat cat, Automobile car)
    {            
    }
}

public class GenericClass<T>
   where T: new(Cat, ToyotaTercel)
{ }

Now, of course, the compiler can't handle the constraint on the constructor, but if it could, why couldn't the error on the line GenericClass<FunnyClass> gc = new GenericClass<FunnyClass>(); be similar to the one obtained when trying to compile the first example, that of FunnyMethod?

Anyway, I'd go one step further. When one overrides an abstract method or implements a method defined on an interface, one is required to do so with exactly the same parameter types; no inheritors or ancestors allowed. So, when a parameterized constructor is required, the requirement should be met with an exact definition, not with anything else. In this case, the class FunnyClass could never be specified as the type argument for the generic type parameter of GenericClass.

Schoening answered 16/3, 2012 at 16:45 Comment(6)
See Lippert's comments here and here.Hostile
Lippert: "I'm often asked why the compiler does not implement this feature or that feature, and of course the answer is always the same: because no one implemented it. Features start off as unimplemented and only become implemented when people spend effort implementing them: no effort, no feature. This is an unsatisfying answer of course, because usually the person asking the question has made the assumption that the feature is so obviously good that we need to have had a reason to not implement it. We actually don't need a reason to not implement any feature no matter how obviously good."Hostile
Yeah, right, obvious. Are you saying then that this feature does not exist simply because no one implemented it? That is, implementing it would not mean introducing some inconsistency or contradiction in the language, and there's really no difference, in this regard, between parameterless constructors and other constructors. Right? If so, it seems a little random to me to go all the way to implementing the new() constraint and then leave it at that. But then again, if you invent .NET you invent it the way you want, right?Schoening
possible duplicate of How to constrain generic type to must have a construtor that takes certain parameters?Galvanometer
@PaulieWaulie: Spending shareholder money wisely is not laziness.Hearts
@EricLippert Apologies Eric, but that was very tongue in cheek, I wasn't actually implying you were lazy or mocking what you had stated in your quote.Shatterproof

Kirk Woll's quote from me of course is all the justification that is required; we are not required to provide a justification for features not existing. Features have enormous costs.

However, in this specific case I can certainly give you some reasons why I would push back on the feature if it came up in a design meeting as a possible feature for a future version of the language.

To start with: consider the more general feature. Constructors are methods. If you expect there to be a way to say "the type argument must have a constructor that takes an int" then why is it not also reasonable to say "the type argument must have a public method named Q that takes two integers and returns a string?"

string M<T>(T t) where T has string Q(int, int)
{
    return t.Q(123, 456);
}

Does that strike you as a very generic thing to do? It seems counter to the idea of generics to have this sort of constraint.

If the feature is a bad idea for methods, then why is it a good idea for methods that happen to be constructors?

Conversely, if it is a good idea for methods and constructors, then why stop there?

string M<T>(T t) where T has a field named x of type string
{
    return t.x;
}

I say that we should either do the whole feature or not do it at all. If it is important to be able to restrict types to have particular constructors, then let's do the whole feature and restrict types on the basis of members in general and not just constructors.

That feature is of course a lot more expensive to design, implement, test, document and maintain.

Second point: suppose we decided to implement the feature, either the "just constructors" version or the "any member" version. What code do we generate? The thing about generic codegen is that it has been carefully designed so that you can do the static analysis once and be done with it. But there is no standard way to describe "call the constructor that takes an int" in IL. We would have to either add a new concept to IL, or generate the code so that the generic constructor call used Reflection.

The former is expensive; changing a fundamental concept in IL is very costly. The latter is (1) slow, (2) boxes the parameter, and (3) is code that you could have written yourself. If you're going to use reflection to find a constructor and call it, then write the code that uses reflection to find a constructor and call it. If this is the code gen strategy then the only benefit that the constraint confers is that the bug of passing a type argument that does not have a public ctor that takes an int is caught at compile time instead of runtime. You don't get any of the other benefits of generics, like avoiding reflection and boxing penalties.
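As a rough illustration of the reflection-based strategy described above, here is a minimal sketch of the code you could write yourself today. The names Widget, NoIntCtor and ReflectiveFactory are hypothetical and exist only for this example; the only library call used is Activator.CreateInstance(Type, params object[]).

```csharp
using System;

// Hypothetical type with a public constructor that takes an int.
public class Widget
{
    public int Size { get; }
    public Widget(int size) { Size = size; }
}

// Hypothetical type that has no constructor taking an int.
public class NoIntCtor
{
    public NoIntCtor() { }
}

public static class ReflectiveFactory
{
    // Roughly what a reflection-backed "new T(int)" would amount to.
    public static T Create<T>(int value)
    {
        // The int is boxed into the object[] argument array, and the
        // constructor lookup happens at run time rather than compile time.
        return (T)Activator.CreateInstance(typeof(T), value);
    }
}
```

With this sketch, ReflectiveFactory.Create<Widget>(7) succeeds, while ReflectiveFactory.Create<NoIntCtor>(7) compiles without complaint and throws a MissingMethodException only when it runs. That run-time failure is the one thing a reflection-backed constraint would move to compile time.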

Hearts answered 16/3, 2012 at 17:27 Comment(15)
But you can already limit a type to have certain method by using interface constraint. I realize it's not the same as the general method constraints you talk about, but it's almost just as good. On the other hand, there is no simple way to limit based on constructor.Galvanometer
@svick: It is almost as good, but not as good. Interfaces cannot require static methods, for example. And the type must implement the interface; you can say "the type must implement ICollection" but not "the type must have a public Add method". Not every type that has an Add method implements ICollection.Hearts
It's not quite so dire - rather than adding anything to IL you could have the C# compiler use the same metadata format that F# uses for exactly that feature (arbitrary member constraints). The F# compiler avoids boxing and reflection, but requires that the "generic" method's body be inlined at all call-sites. That's not to say that this is an appropriate feature for C# given the C# team's priorities, of course. Just that it wouldn't necessarily be a Herculean task to implement.Alonaalone
The types which have an Add method but don't implement ICollection often have an Add method with very different semantics. That makes this feature feel a bit dangerous. I could envision developers not bothering to use interfaces even though interfaces are the right answer, thus leading to other developers erroneously thinking their class fulfills a constraint when it does so syntactically but not semantically.Kent
Constructors would be special because they can be chained but not inherited. That would seem sufficient reason to handle them without handling other types of static methods. I think a bigger difficulty would be with the nested data structures necessary to define parameterized constructor constraints (especially when the constructor parameters themselves could be generic). That plus the fact that one can already use delegates to do 99% of what is needed.Phosphorite
@eric: I've edited my question to go here with you for a sec: "Constructors are methods".Schoening
"Does that strike you as a very generic thing to do? It seems counter to the idea of generics to have this sort of constraint." -- It does not seem counter at all; for instance "static T Parse(string)" is implemented in all sorts of classes, yet it's not derived from any base classes. Why shouldn't I be able to enforce "where T: has static T Parse(string) "? Not to mention constraint checks are done at COMPILE time, and IL codegen is a non-issue, since we have the DLR, and Reflection. That is "I want any class that has a 'public static T Parse(string)' method" is still very generic.Supererogate
I know this is an old question, but I just had a few thoughts. If a type parameter is constrained to inherit from a type with a particular public constructor, wouldn't any type standing in for that parameter be guaranteed to have a constructor with a particular signature? In that case, it shouldn't be too hard to determine which constructor the generic type argument should use at compile time, since you already know which type it inherits from and whether that type has a constructor with a particular signature. I am entirely IL illiterate, so please excuse any fundamental errors.Bernita
@Asad: which compile time? C# compiler compile time or JIT compiler compile time?Hearts
@EricLippert Whenever you're generating your IL (I guess this would be the C# compiler compile time)?Bernita
I mean the constructor that you're supposed to call is a well known, concrete method. The signature and implementation are not ambiguous at compile time. If you know the base class, and you know the signature, and you know that there can only be one public constructor with a particular signature, then you can deduce which function on the base class we're talking about (if indeed it is even there, else build is no bueno). I think I should learn more about IL before trying to communicate about this, because even reading back over my comment it feels like I'm just handwaving.Bernita
@Asad: OK, so you compile your public generic type with the constraint into a library. Then three years later someone uses that library with a particular class with a particular constructor that matches the pattern. How did the compiler of the original library know that years later the generic type was going to be instantiated with that particular type with that particular constructor?Hearts
I don't entirely understand this objection. Why would the original author need to know about derivatives of the type their generic parameter is constrained to inherit from? The person who is implementing the generic class only needs to know what the public constructor of our hypothetical parameter's base type looks like and what it's for. If you're typing class MyGeneric<T> where T : TWithConstructor at your computer, it is reasonable to assume you know what TWithConstructor is, what its exposed constructor looks like, and what you need to pass it, no?Bernita
@Asad: OK, put it another way. What IL do you expect to be generated at a site to the call of the constructor? You have T M() { return new T(123); } what is the IL generated for the body of M() ?Hearts
@EricLippert What IL-Body is generated for T M() { return new T(); } if you have the new()-constraint? And what does the IL-Code for any other method call than a constructor call look like? (Sorry when I dig out a year old discussion)Brawn

Summary

This is an attempt to capture the current information and workarounds on this question and present it as an answer.

I find Generics combined with Constraints one of the most powerful and elegant aspects of C# (coming from a C++ templates background). where T : Foo is great as it introduces capability to T while still constraining it to Foo at compile time. In many cases, it has made my implementation simpler. At first, I was a bit concerned as using generic types in this way can cause generics to grow through the code, but I have allowed it to do so and the benefits have greatly outweighed any downsides. However, the abstraction always falls down when it comes to constructing a Generic type that takes a parameter.

The Problem

When constraining a generic class, you are able to indicate that the generic must have a parameterless constructor and then instantiate it:

public class Foo<T>
   where T : new()
{
    public void SomeOperation()
    {
        T something = new T();
        ...
    }
}

The problem is that one is only able to constrain for parameterless constructors. This means that one of the workarounds suggested below needs to be used for constructors that have parameters. As described below, the workarounds have drawbacks ranging from requiring additional code to being very dangerous. Also, if I have a class with a public parameterless constructor that is used by a generic method, and somewhere down the track that class is changed so that the constructor now takes a parameter, then I need to change the design of the generic and the surrounding code to use one of the workarounds rather than new().

This is something that Microsoft definitely knows about: there are at least seven reports of it on Microsoft Connect, just as a sample (not counting the confused Stack Overflow users asking the question).

They are all closed as 'Won't Fix' or 'By Design'. The sad thing about that is that the issue is then locked and it is no longer possible to vote them up. However you can vote here for the constructor feature.

The Workarounds

There are three main types of workarounds, none of which are ideal:

  1. Use Factories. This requires a whole lot of boilerplate code and overhead.
  2. Use Activator.CreateInstance(typeof(T), arg0, arg1, arg2, ...). This is my least favourite as type safety is lost. What if down the track you add a parameter to the constructor of type T? You get a runtime exception.
  3. Use the Func/Action approach. This is my favourite as it retains type safety and requires less boilerplate code. However, it is still not as simple as new T(a,b,c), and as a generic abstraction often spans many classes, the class that knows the type is often a few classes away from the class that needs to instantiate it, so the Func gets passed around, resulting in unnecessary code.
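For the third workaround, a minimal sketch might look like the following. Gadget and Pool<T> are hypothetical names made up for this example, and only a single int constructor parameter is assumed.

```csharp
using System;

// Hypothetical type whose only constructor takes an int.
public class Gadget
{
    public int Size { get; }
    public Gadget(int size) { Size = size; }
}

// Generic class that cannot write "new T(size)", so it asks its creator
// for a delegate that knows how to build a T from an int.
public class Pool<T>
{
    private readonly Func<int, T> _create;

    public Pool(Func<int, T> create)
    {
        _create = create;
    }

    public T Rent(int size)
    {
        // Type-checked at compile time; no reflection involved.
        return _create(size);
    }
}
```

A caller would write something like new Pool<Gadget>(size => new Gadget(size)). If Gadget's constructor later gains a parameter, that lambda stops compiling, which is the type safety the Activator.CreateInstance approach gives up. The cost is that the delegate has to be threaded through to wherever the Pool is created.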

The Explanations

A standard response is provided on Microsoft Connect which is:

"Thank you for your suggestion. Microsoft has received a number of suggestions on changes to the constraint semantics of generic types, as well as doing its own work in this area. However at this time Microsoft cannot give any undertaking that changes in this area will be part of a future product release. Your suggestion will be noted to help drive decisions in this area. In the meantime the code sample below..."

The workaround offered there is actually not the one I would recommend of all the options, as it is not type safe and results in a runtime exception if you ever happen to add another parameter to the constructor.

The best explanation I can find is offered by Eric Lippert in this very Stack Overflow post. I am very appreciative of this answer, but I think that further discussion is required on this at a user level and then at the technical level (probably by people who know more than me about the internals).

I also recently spotted that there is a good and detailed explanation by Mads Torgersen in this link (see "Posted by Microsoft on 3/31/2009 at 3:29 PM").

The problem is that constructors are different from other methods, in that we can already constrain methods as much as we need to by way of derivation constraints (interface or base class). There may be some cases where method constraints are beneficial; I have never needed them. However, I continually hit the parameterless constructor limitation. Of course, a general (not constructor-only) solution would be ideal, and Microsoft would need to decide on this themselves.

The Suggestions

Regarding the debatable benefit vs difficulty of implementation, I can appreciate this, but would make the following points:

  1. There is great value in catching bugs at compile time rather than runtime (type safety in this case).
  2. There seem to be other options that are not so dire. There have been a few suggestions on how this might be implemented. Notably, Jon Skeet proposed 'static interfaces' as a way to solve this and it appears that explicit member constraints already exist in the CLR, but not in C#, see the comments here and the discussion here. Also, the comment by kvb in Eric Lippert's response about the arbitrary member constraints.

The Status

Not about to happen in any shape or form as far as I can tell.

Halfwitted answered 2/9, 2013 at 4:43 Comment(0)

If one wants to have a method with a generic type T whose instances can be created using a single int parameter, one should have the method accept, in addition to type T, either a Func<int, T> or else a suitably-defined interface, possibly using something like:

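public static class IFactoryProducing<ResultType>
{
    // Nested types are private by default, so the interfaces must be
    // declared public to be usable from outside the static class.
    public interface WithParam<PT1>
    {
        ResultType Create(PT1 p1);
    }
    public interface WithParam<PT1,PT2>
    {
        ResultType Create(PT1 p1, PT2 p2);
    }
}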

(The code would seem nicer if the outer static class could be declared as an interface, but IFactoryProducing<T>.WithParam<int> seems clearer than IFactory<int,T>, since the latter is ambiguous as to which type is the parameter and which is the result.)

In any case, if whenever one passes around type T one also passes around a suitable factory delegate or interface, one can achieve 99% of what one could achieve with parameterized constructor constraints. The run-time cost can be minimized by having each constructable type generate a static instance of a factory, so it won't be necessary to create factory instances in any sort of looping context.
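To make the "static instance of a factory" idea concrete, here is a minimal sketch. For brevity it uses a flat, hypothetical IIntFactory<TResult> interface rather than the nested one above; Counter and Builder are likewise made-up names.

```csharp
using System;

// Hypothetical single-parameter factory interface (int in, TResult out).
public interface IIntFactory<TResult>
{
    TResult Create(int p1);
}

public sealed class Counter
{
    public int Start { get; }
    public Counter(int start) { Start = start; }

    // One shared factory instance per constructable type, created once.
    public static readonly IIntFactory<Counter> Factory = new CounterFactory();

    private sealed class CounterFactory : IIntFactory<Counter>
    {
        public Counter Create(int p1) { return new Counter(p1); }
    }
}

public static class Builder
{
    // Generic code receives the factory alongside T and never needs "new T(int)".
    public static T[] MakeMany<T>(IIntFactory<T> factory, int count)
    {
        var result = new T[count];
        for (int i = 0; i < count; i++)
        {
            // No factory allocation inside the loop; the shared instance is reused.
            result[i] = factory.Create(i);
        }
        return result;
    }
}
```

A call such as Builder.MakeMany(Counter.Factory, 3) infers T as Counter and builds three instances through the one cached factory object.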

BTW, beyond the cost of the feature, there would almost certainly be some substantial limitations which would make it less versatile than the workaround. If constructor constraints are not contravariant with regard to parameter types, it may be necessary to pass around a type parameter for the exact type required for the constructor constraint, in addition to the actual type of the parameter to be used; by the time one does that, one might as well pass around a factory. If they are contravariant, then one runs into trouble resolving which constructor should be called if a generic type has constraint new(Cat, ToyotaTercel), and the actual type just has constructors new(Animal, ToyotaTercel) and new(Cat, Automobile).

PS--To clarify the problem, contravariant constructor constraints lead to a variation of the "double diamond" problem. Consider:

T CreateUsingAnimalAutomobile<T>() where T:IThing,new(Animal,Automobile)
{ ... }

T CreateUsingAnimalToyotaTercel<T>() where T:IThing,new(Animal,ToyotaTercel)
{ return CreateUsingAnimalAutomobile<T>(); }

T CreateUsingCatAutomobile<T>() where T:IThing,new(Cat,Automobile)
{ return CreateUsingAnimalAutomobile<T>(); }

IThing thing1=CreateUsingAnimalToyotaTercel<FunnyClass>(); // FunnyClass defined in question
IThing thing2=CreateUsingCatAutomobile<FunnyClass>(); // FunnyClass defined in question

In processing the call to CreateUsingAnimalToyotaTercel<FunnyClass>(), the "Animal,ToyotaTercel" constructor should satisfy the constraint for that method, and the generic type for that method should satisfy a constraint for CreateUsingAnimalAutomobile<T>(). In processing the call to CreateUsingCatAutomobile<FunnyClass>(), the "Cat,Automobile" constructor should satisfy the constraint for that method, and the generic type for that method should satisfy the constraint for CreateUsingAnimalAutomobile<T>().

The problem is that both calls will invoke a call to the same CreateUsingAnimalAutomobile<FunnyClass>() method, and that method has no way of knowing which constructor should be invoked. Contravariance-related ambiguities aren't unique to constructors, but in most cases they're resolved through compile-time binding.

Phosphorite answered 16/3, 2012 at 22:29 Comment(9)
Yeah, I know the workarounds and have used them. I just wanted to know if there was a, say, conceptual reason for not having parameterized constructor constraints, like "look, you haven't taken into account this case here, where having parameterized constructor constraints may send your app to hell" or "no, no reason whatsoever, just that it was too costly or too lengthy". Thanks.Schoening
The fact that the workaround exists means there is limited payoff to implementing the feature. It would also be hard to define exactly how constructor constraints should work, without resulting in surprising behaviors.Phosphorite
A workaround for properties also exists (getter and setter methods), and it is much more straightforward than this workaround, and yet that does not mean properties didn't get implemented. So, that's not really an answer. After all, there are also workarounds for developing with .NET (like using Java, for instance) and that doesn't mean .NET doesn't get implemented, and thank god for it, right?Schoening
As for your example above of constructors taking animals and automobiles as parameters, well, I think your example is wrong: if the actual type has only one constructor, there's really nothing to resolve; the only constructor that may be called is the only constructor there is. Anyway, I think it would be really straightforward, when hypothetically resolving constructors, to apply exactly the same rules that are applied to same-named methods. I really don't see the problem there.Schoening
@amandatarafa: I got my example backward--fixed. Which workaround for properties are you talking about? I'd really like to see a nice means by which something like MyCollectionOfPoint[whatever].X += 5 could be translated to something like MyCollectionOfPoint.ActOnItem(whatever, (ref Point item) => item.X += 5), giving the collection a chance to wrap the update however it sees fit. Even if the collection holds class items rather than structs, the only way to find out when they're updated is to use all sorts of tricky callback or event logic.Phosphorite
What about MyCollectionOfPoint.ActOnItem(whatever, (ref Point item) => item.SetX(item.GetX() + 5)); it is uglier, true, but that's what a workaround is, no? Anyway, it is probable that, if you look hard enough, you'll find something for which there's no workaround without properties, but that would only fall in the 1% (100%-99%) of cases that, according to you, would mean a limited payoff to implementing the feature.Schoening
@amandatarafa: My point was that it would be nice if one could use the former syntax rather than the latter. To properly expose properties in all cases (e.g. when many appear as ref parameters in the same function call) would require some kind of variadic generic or immutable-collection-of-ref type which does not yet exist, but I would think that allowing collection to expose items by ref and know when modifications are complete would be far nicer than having to use class objects with callbacks.Phosphorite
Well then, I don't think your problem has anything to do with the implementation of properties or the use of the workaround for them (getter and setter methods). That is, your problem exists whether the feature "properties" is implemented or not: in one case you want the setter of the property to somehow notify the holding collection that it has changed, and in the other (when using the workaround) you want the setter method to do the notification, right?Schoening
As for the example on the constructor having a constraint and an actual type meeting that constraint, I've edited my question to elaborate a little on this. See EDIT 2Schoening
