Performance impact of changing to generic interfaces

10

7

I work on applications developed in C#/.NET with Visual Studio. Very often ReSharper advises me to replace the types of my methods' input parameters with more general ones, for instance List<> with IEnumerable<> if I only use the list in a foreach in the body of my method. I can understand why it looks smarter to write that, but I'm quite concerned about performance. I fear that the performance of my apps will decrease if I listen to ReSharper...

Can someone explain to me precisely (more or less) what's happening behind the scenes (i.e. in the CLR) when I write:

public static void myMethod(IEnumerable<string> list)
{
  foreach (string s in list)
  {
    Console.WriteLine(s);
  }
}

static void Main()
{
  List<string> list = new List<string>(new string[] {"a", "b", "c"});
  myMethod(list);
}

and what is the difference with:

public static void myMethod(List<string> list)
{
  foreach (string s in list)
  {
    Console.WriteLine(s);
  }
}

static void Main()
{
  List<string> list = new List<string>(new string[] {"a", "b", "c"});
  myMethod(list);
}
Oatmeal answered 30/6, 2009 at 18:44 Comment(2)
OK, I know this is a very old discussion, but perhaps someone else will land here like me, searching for help on the same issue. Indeed, there IS a performance penalty in using interfaces, and in a way I never would have guessed. I put a description (sorry: in German) on my blog: jochen.jochen-manns.de/index.php/2011/02/05/… - Odor
@JMS: I loved that article (unfortunately, not everyone will be able to read it since it is in German). It just goes to show that complexity makes for unpredictability. It is also a prime example of what Jon Skeet means by "you'd have to dig into the deep details of [...] hazy understanding of JITting, thunking, vtables and how they apply", etc. - Nonprofessional
11

You're worried about performance - but do you have any grounds for that concern? My guess is that you haven't benchmarked the code at all. Always benchmark before replacing readable, clean code with more performant code.

In this case the call to Console.WriteLine will utterly dominate the performance anyway.

While I suspect there may be a theoretical difference in performance between using List<T> and IEnumerable<T> here, I suspect the number of cases where it's significant in real world apps is vanishingly small.

It's not even as if the sequence type is being used for many operations - there's a single call to GetEnumerator() which is declared to return IEnumerator<T> anyway. As the list gets larger, any difference in performance between the two will get even smaller, because it will only have any impact at all at the very start of the loop.

Ignoring the analysis though, the thing to take out of this is to measure performance before you base coding decisions on it.
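
To make the "measure first" advice concrete, here is a minimal Stopwatch sketch. The helper names (SumViaList, SumViaEnumerable, TimeMs), the list size and the repetition count are all made up for illustration, and a dedicated benchmarking harness would give more trustworthy numbers than this. It only shows the shape of such a measurement:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class EnumerationTiming
{
    // Hypothetical helpers: same loop body, different parameter type.
    public static long SumViaList(List<int> items)
    {
        long sum = 0;
        foreach (int i in items) sum += i;
        return sum;
    }

    public static long SumViaEnumerable(IEnumerable<int> items)
    {
        long sum = 0;
        foreach (int i in items) sum += i;
        return sum;
    }

    // Runs the action 'repetitions' times and returns the total elapsed milliseconds.
    public static long TimeMs(Action action, int repetitions)
    {
        action(); // warm-up call so JIT compilation is not part of the timing
        Stopwatch sw = Stopwatch.StartNew();
        for (int r = 0; r < repetitions; r++) action();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        List<int> list = new List<int>();
        for (int i = 0; i < 1000000; i++) list.Add(i);

        Console.WriteLine("List<int>:        {0} ms", TimeMs(() => SumViaList(list), 50));
        Console.WriteLine("IEnumerable<int>: {0} ms", TimeMs(() => SumViaEnumerable(list), 50));
    }
}
```

Run it in a Release build and compare the two figures against the cost of whatever the loop body really does (here, Console.WriteLine in the question) before drawing conclusions.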

As for what happens behind the scenes - you'd have to dig into the deep details of exactly what's in the metadata in each case. I suspect that in the case of an interface there's one extra level of indirection, at least in theory - the CLR would have to work out where in the target object's type the vtable for IEnumerable<T> was, and then call into the appropriate method's code. In the case of List<T>, the JIT would know the right offset into the vtable to start with, without the extra lookup. This is just based on my somewhat hazy understanding of JITting, thunking, vtables and how they apply to interfaces. It may well be slightly wrong, but more importantly it's an implementation detail.
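
One difference that can be observed from C# itself, without looking at vtables, is the static type of the enumerator that foreach ends up using. The sketch below assumes the framework's usual List<T> implementation, where the public GetEnumerator() returns the struct List<T>.Enumerator; the class name EnumeratorKinds is made up for illustration:

```csharp
using System;
using System.Collections.Generic;

public static class EnumeratorKinds
{
    public static void Main()
    {
        List<string> list = new List<string> { "a", "b", "c" };

        // Called on the concrete type: the result is the struct
        // List<string>.Enumerator, so MoveNext/Current are ordinary calls.
        List<string>.Enumerator direct = list.GetEnumerator();

        // Called through the interface: the result is typed as
        // IEnumerator<string>, so MoveNext/Current use interface dispatch.
        IEnumerable<string> seq = list;
        IEnumerator<string> viaInterface = seq.GetEnumerator();

        Console.WriteLine(direct.GetType().IsValueType);
        Console.WriteLine(viaInterface.GetType());

        viaInterface.Dispose();
        direct.Dispose();
    }
}
```

Whether this matters for a given program is still a question for a profiler, not for the type signatures.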

Stanzel answered 30/6, 2009 at 18:50 Comment(5)
Indeed -- particularly in real-world apps where if you're doing anything with networks, databases or disks, it's very unlikely that something like this will be your performance problem. - Invertebrate
OK, maybe the fact that I gave a concrete example hides my real question. And maybe my question was not the right one. Actually I'm more curious about the mechanism of the CLR than about performance in itself. Obviously, when you write tons of code, the systematic use of the most general types and interfaces shouldn't be the thing that costs the most. - Oatmeal
Actually, you could get a performance boost when using interfaces, because the CLR performs caching on interface calls, something it won't do for regular calls. - Cholecystitis
@DiVan: Could you give more details? - Stanzel
It's an undocumented JIT optimization. The interface method dispatcher tries to play smart. From here: link - Cholecystitis
3

You'd have to look at the generated code to be certain, but in this case, I doubt there's much difference. The foreach statement always operates on an IEnumerable or IEnumerable<T>. Even if you specify List<T>, it will still have to get the IEnumerable<T> in order to iterate.

Dekaliter answered 30/6, 2009 at 18:47 Comment(4)
Actually the foreach statement doesn't require an IEnumerable or IEnumerable<T> to operate. The type just has to have a GetEnumerator() method which returns something with an appropriate MoveNext() method and Current property. But in this case of course List<T> does implement IEnumerable<T>. - Stanzel
Also, if the user has a better way of doing things if given a List<T> or Collection<T>, they are free to check if the IEnumerable<T> is one of those classes and special case to improve performance. - Serdab
Although they'd usually be better off checking for IList<T> and ICollection<T> :) - Stanzel
@Jon: I never knew that. I always thought it used the interface. Do you know of examples of types that have a GetEnumerator() that returns something with a MoveNext/Current, but which is not IEnumerable/IEnumerator? - Dekaliter
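
To illustrate the pattern-based behaviour of foreach described in the comments above, here is a small sketch. Countdown and CountdownEnumerator are hypothetical types invented for this example; neither implements IEnumerable or IEnumerator, yet foreach compiles against them because the required methods and property are present:

```csharp
using System;

// Hypothetical type: implements neither IEnumerable nor IEnumerable<T>.
public class Countdown
{
    private readonly int _start;
    public Countdown(int start) { _start = start; }

    // foreach only needs a public GetEnumerator() method...
    public CountdownEnumerator GetEnumerator()
    {
        return new CountdownEnumerator(_start);
    }
}

// ...whose return type has MoveNext() and Current. No IEnumerator here either.
public struct CountdownEnumerator
{
    private int _next;
    private int _current;

    public CountdownEnumerator(int start)
    {
        _next = start;
        _current = 0;
    }

    public int Current { get { return _current; } }

    public bool MoveNext()
    {
        if (_next <= 0) return false;
        _current = _next;
        _next--;
        return true;
    }
}

public static class PatternForeachDemo
{
    public static void Main()
    {
        foreach (int n in new Countdown(3))
        {
            Console.WriteLine(n); // prints 3, then 2, then 1
        }
    }
}
```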
2

In general, I'd say that if you replace a non-generic interface with its generic equivalent (say IList --> IList<T>) you are bound to get better or equivalent performance.

One unique selling point: unlike Java, .NET does not use type erasure and supports true value types (struct), so one of the main differences is in how e.g. a List<int> is stored internally (the ints are stored directly rather than as boxed objects). This could quite quickly become a big difference depending on how intensively the List is being used.


A braindead synthetic benchmark showed:

    for (int j=0; j<1000; j++)
    {
        List<int> list = new List<int>();
        for (int i = 1<<12; i>0; i--)
            list.Add(i);

        list.Sort();
    }

to be faster by a factor of 3.2x than the semi-equivalent non-generic:

    for (int j=0; j<1000; j++)
    {
        ArrayList list = new ArrayList();
        for (int i = 1<<12; i>0; i--)
            list.Add(i);

        list.Sort();
    }

Disclaimer: I realize this benchmark is synthetic and that it doesn't actually focus on the use of interfaces (rather, it directly dispatches virtual method calls on a specific type), etc. However, it illustrates the point I'm making. Don't fear generics (at least not for performance reasons).
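
The storage difference behind that result can be made visible without timing anything. The sketch below is an illustration only (the class and method names are made up): ArrayList.Add takes object, so every int is boxed on the way in and needs a cast on the way out, while List<int> keeps plain ints.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingDemo
{
    public static int SumGeneric(List<int> list)
    {
        int sum = 0;
        foreach (int i in list) sum += i; // elements are read as int, no unboxing
        return sum;
    }

    public static int SumNonGeneric(ArrayList list)
    {
        int sum = 0;
        foreach (object o in list) sum += (int)o; // each element is unboxed by the cast
        return sum;
    }

    public static void Main()
    {
        List<int> generic = new List<int>();
        ArrayList nonGeneric = new ArrayList();
        for (int i = 1; i <= 4; i++)
        {
            generic.Add(i);    // stored as an int in the backing int[]
            nonGeneric.Add(i); // boxed: Add(object) turns each int into a heap object
        }

        Console.WriteLine(SumGeneric(generic));       // 10
        Console.WriteLine(SumNonGeneric(nonGeneric)); // 10
    }
}
```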

Nonprofessional answered 7/7, 2011 at 23:5 Comment(0)
1

In general, the increased flexibility will be worth whatever minor performance difference it might incur.

Keos answered 30/6, 2009 at 18:46 Comment(0)
1

The basic reason for recommending a method that works on IEnumerable rather than List is future flexibility. If in the future you need to create a MySpecialStringsCollection, you could have it implement the IEnumerable interface and still use the same method.

Essentially, I think it comes down to this: unless you're noticing a significant, meaningful performance hit (and I'd be shocked if you noticed any), prefer a more tolerant interface that will accept more than what you're expecting today.
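
A sketch of what that flexibility looks like in practice. MySpecialStringsCollection takes its name from the text above, but its contents (upper-casing on Add) and the CountNonEmpty method are invented purely for illustration:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection; the upper-casing is just a stand-in for "special" behaviour.
public class MySpecialStringsCollection : IEnumerable<string>
{
    private readonly List<string> _items = new List<string>();

    public void Add(string item) { _items.Add(item.ToUpperInvariant()); }

    public IEnumerator<string> GetEnumerator() { return _items.GetEnumerator(); }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class FlexibilityDemo
{
    // Accepts anything that can be enumerated as strings.
    public static int CountNonEmpty(IEnumerable<string> items)
    {
        int count = 0;
        foreach (string s in items)
        {
            if (!string.IsNullOrEmpty(s)) count++;
        }
        return count;
    }

    public static void Main()
    {
        List<string> list = new List<string> { "a", "", "c" };
        MySpecialStringsCollection special = new MySpecialStringsCollection { "x", "y" };
        string[] array = { "p", "q", "r" };

        Console.WriteLine(CountNonEmpty(list));    // 2
        Console.WriteLine(CountNonEmpty(special)); // 2
        Console.WriteLine(CountNonEmpty(array));   // 3
    }
}
```

Had CountNonEmpty been declared to take List<string>, only the first of the three calls would compile.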

Siberson answered 30/6, 2009 at 18:47 Comment(0)
1

The first version (IEnumerable) is more general: you are saying that the method accepts any argument that implements this interface.

In the second version you restrict the method to accepting one specific class type, which is not recommended. And the performance is mostly the same.

Circum answered 30/6, 2009 at 18:49 Comment(0)
1

The definition for List<T> is:

[SerializableAttribute]
public class List<T> : IList<T>, ICollection<T>, 
    IEnumerable<T>, IList, ICollection, IEnumerable

So List<T> implements IList, ICollection, IList<T>, and ICollection<T>, in addition to IEnumerable and IEnumerable<T>.

The IEnumerable interface exposes the GetEnumerator method, which returns an IEnumerator; the IEnumerator in turn exposes a MoveNext method and a Current property. These are the mechanisms used to iterate through the list with foreach.
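
Written out by hand, the iteration mechanism described above looks roughly like the following. This is a sketch of the general shape of a foreach over an IEnumerable<string>, not the exact code the compiler emits, and the Collect method is made up for the example:

```csharp
using System;
using System.Collections.Generic;

public static class ForeachExpansion
{
    // Roughly what "foreach (string s in list) { ... }" does for an IEnumerable<string>.
    public static List<string> Collect(IEnumerable<string> list)
    {
        List<string> seen = new List<string>();
        using (IEnumerator<string> e = list.GetEnumerator()) // one GetEnumerator() call
        {
            while (e.MoveNext())        // advance; false once the sequence is exhausted
            {
                string s = e.Current;   // read the element at the current position
                seen.Add(s);            // stands in for the loop body
            }
        }
        return seen;
    }

    public static void Main()
    {
        foreach (string s in Collect(new List<string> { "a", "b", "c" }))
            Console.WriteLine(s);
    }
}
```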

It follows that, if IList, ICollection, IList<T>, and ICollection<T> are not required to do the job, then it's sensible to use IEnumerable or IEnumerable<T> instead, thereby eliminating the additional plumbing.

Crossquestion answered 30/6, 2009 at 18:59 Comment(0)
0

An interface simply defines the presence and signature of public methods and properties implemented by the class. Since the interface does not "stand on its own", there should be no performance difference for the method itself, and any "casting" penalty - if any - should be almost too small to measure.

Reedy answered 30/6, 2009 at 18:51 Comment(0)
0

There is no performance penalty for a static-upcast. It's a logical construct in program text.

As other people have said, premature optimization is the root of all evil. Write your code and run it through a hotspot analysis before you worry about performance tuning.

Semple answered 30/6, 2009 at 19:3 Comment(0)
0

Accepting an IEnumerable<> might create some trouble, as you could receive a LINQ expression with deferred execution, or a yield return iterator. In both cases you won't have a collection, just something you can iterate over. So when you would like to set some boundaries, you could request an array. It is no problem for the caller to call collection.ToArray() before passing the parameter, and you'll be sure that there are no hidden deferred-execution caveats.
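
A small sketch of the deferred-execution caveat described above; the variable names and the filter are arbitrary. The query is re-evaluated each time it is enumerated, whereas the array produced by ToArray() is a fixed snapshot:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    public static void Main()
    {
        List<int> source = new List<int> { 1, 2, 3 };

        // Deferred: nothing is filtered yet, the query just remembers 'source'.
        IEnumerable<int> query = source.Where(n => n > 1);

        // Materialized now: a snapshot that later changes to 'source' cannot affect.
        int[] snapshot = query.ToArray();

        source.Add(4);

        Console.WriteLine(query.Count());   // 3: the query re-runs and sees the new element
        Console.WriteLine(snapshot.Length); // 2: the array was fixed before the Add
    }
}
```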

Cholecystitis answered 23/6, 2011 at 11:53 Comment(0)
