Why is this List<>.IndexOf code so much faster than the List[i] and manual compare?
Asked Answered
F

3

10

I'm running AQTime on this piece of code, I found that .IndexOf takes 16% of the time vs close to 80% for the other piece... They appear to use the same IsEqual and other routines. Called 116,000 times inserting 30,000 items. None of the List<> objects gets over 200 elements. (I may be using AQTime incorrectly, I'm looking into this)

class PointD : IEquatable<PointD>
{
    public double X, Y, Z;

    bool IEquatable<PointD>.Equals(PointD other)
    {
        return ((X == other.X) && (Y == other.Y) && (Z == other.Z));
    }
}

class PerfTest
{
    readonly List<PointD> _pCoord3Points = new List<PointD>();
    public int NewPoints;
    public int TotalPoints;

    public PerfTest()
    {
        NewPoints = 0;
        TotalPoints = 0;
    }
    public int CheckPointIndexOf(PointD pt)
    {
        int retIndex = _pCoord3Points.IndexOf(pt);
        if (retIndex < 0)
        {
            _pCoord3Points.Add(pt);
            NewPoints++;
        }
        TotalPoints++;
        return retIndex;    
    }

    public int CheckPointForBreak(PointD pt)
    {
        int retIndex = -1;
        for (int i = 0; i < _pCoord3Points.Count; i++)
        {
            PointD otherPt = _pCoord3Points[i];
            if ((pt.X == otherPt.X) &&
                (pt.Y == otherPt.Y) &&
                (pt.Z == otherPt.Z))
            {
                retIndex = i;
                break;
            }
        }
        if (retIndex == -1)
        {
            NewPoints++;
            _pCoord3Points.Add(pt);
        }
        TotalPoints++;
        return retIndex;
    }

    static void Main()
    {
        const int xmax = 300;
        const int ymax = 10;
        const int zmax = 10;
        const int imax = 4;

        var test = new PerfTest();
        //test.Init();
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < imax; i++)
        {
            for (int x = 0; x < xmax; x++)
            {
                for (int y = 0; y < ymax; y++)
                {
                    for (int z = 0; z < zmax; z++)
                    {
                        var pt = new PointD { X = x, Y = y, Z = z };
                        test.CheckPointIndexOf(pt);
                    }
                }
            }

        } 
        sw.Stop();
        string output = string.Format("Total: {0:0} New: {1:0} IndexOf: ", test.TotalPoints, test.NewPoints);
        Console.Write(output);
        Console.WriteLine(sw.Elapsed);

        test = new PerfTest();
        sw = Stopwatch.StartNew();
        for (int i = 0; i < imax; i++)
        {
            for (int x = 0; x < xmax; x++)
            {
                for (int y = 0; y < ymax; y++)
                {
                    for (int z = 0; z < zmax; z++)
                    {
                        var pt = new PointD { X = x, Y = y, Z = z };
                        test.CheckPointForBreak(pt);
                    }
                }
            }

        } 
        sw.Stop();
        output = string.Format("Total: {0:0} New: {1:0} PointD[] ", test.TotalPoints, test.NewPoints);
        Console.Write(output);
        Console.WriteLine(sw.Elapsed);
        Console.ReadLine();
    }
}
Fib answered 7/9, 2010 at 21:46 Comment(3)
Is really IndexOf faster? When I tried to reproduce IndexOf was significantly slower, and I assume Jon's speculation about boxing is correct.Latria
I'm getting the exact opposite result, IndexOf is 20 times slower when PointD is a struct. The Equals() method of a struct is not cheap. Post a real verifiable example.Oberheim
Ah, if IndexOf is faster on your machine but not with anyone else's, it's probably because of your profiler. A lot of profilers don't equally penalize code that they can't poke around in.Shunt
L
11

I made the following assumptions:

  • PointD is a struct
  • IndexOf is indeed slower than manually iterating the list

You can speed up IndexOf by implementing the IEquatable<T> interface:

struct PointD : IEquatable<PointD>
{
    public int X;
    public int Y;
    public int Z;

    public bool Equals(PointD other)
    {
        return (this.X == other.X) &&
                (this.Y == other.Y) &&
                (this.Z == other.Z);
    }
}

Without implementing the IEquatable<T> interface, IndexOf will compare the two PointD elements using ValueType.Equals(object other) which involves expensive boxing operations.

The documentation of the IEquatable<T> interface states:

The IEquatable<T> interface is used by generic collection objects such as Dictionary<TKey, TValue>, List<T>, and LinkedList<T> when testing for equality in such methods as Contains, IndexOf, LastIndexOf, and Remove. It should be implemented for any object that might be stored in a generic collection.

Here is my complete benchmark code:

using System;
using System.Collections.Generic;
using System.Diagnostics;

struct PointD 
{
    public int X;
    public int Y;
    public int Z;
}

class PerfTest
{
    List<PointD> _pCoord3Points = new List<PointD>();

    int checkPointIndexOf(PointD pt)
    {
        return _pCoord3Points.IndexOf(pt);  
    }

    int checkPointForBreak(PointD pt)
    {
        int retIndex = -1;
        for (int i = 0; i < _pCoord3Points.Count; i++)
        {
            PointD otherPt = _pCoord3Points[i];
            if ((pt.X == otherPt.X) &&
                (pt.Y == otherPt.Y) &&
                (pt.Z == otherPt.Z))
                retIndex = i;
            break;
        }
        return retIndex;
    }

    void init()
    {
        for (int x = 0; x < 100; x++)
        {
            for (int y = 0; y < 10; y++)
            {
                for (int z = 0; z < 10; z++)
                {
                    PointD pt = new PointD() { X = x, Y = y, Z = z };
                    _pCoord3Points.Add(pt);
                }
            }
        }
    }

    static void Main(string[] args)
    {
        PerfTest test = new PerfTest();
        test.init();
        Stopwatch sw = Stopwatch.StartNew();
        for (int x = 0; x < 100; x++)
        {
            for (int y = 0; y < 10; y++)
            {
                for (int z = 0; z < 10; z++)
                {
                    PointD pt = new PointD() { X = x, Y = y, Z = z };
                    test.checkPointIndexOf(pt);
                }
            }
        }
        sw.Stop();
        Console.WriteLine(sw.Elapsed);
        sw = Stopwatch.StartNew();
        for (int x = 0; x < 100; x++)
        {
            for (int y = 0; y < 10; y++)
            {
                for (int z = 0; z < 10; z++)
                {
                    PointD pt = new PointD() { X = x, Y = y, Z = z };
                    test.checkPointForBreak(pt);
                }
            }
        }
        sw.Stop();
        Console.WriteLine(sw.Elapsed);
    }
}

On Windows 7, x64, .NET 4.0 x64 build I get the following timings (approx):

IndexOf:  00:00:04.8096623
For Loop: 00:00:00.0014203

When turning PointD into a class the timings change to

IndexOf:  00:00:01.0703627
For Loop: 00:00:00.0014190

When using a struct PointD implementing IEquatable<PointD> I get more "similar" results, but IndexOf is still slower (there is no significant difference when using a class now):

IndexOf:  00:00:00.3904615
For Loop: 00:00:00.0015218
Latria answered 7/9, 2010 at 22:37 Comment(4)
@0xA3: Could you call List<PointD>.IndexOf once before you start the stop-watch to eliminate the JIT cost + one-time initialization (static constructor of EqualityComparer<PointD> to determine the right comparer etc) that happens under the hood? Not that I would consider it would matter, considering the order of magnitude difference we are seeing.Supportable
@Ani: Yes, you are correct that the benchmark is not accurate with respect to JIT overhead. But I decided to neglect the effect. In fact, if you warm up first, you will see that the JIT overhead for the for loop is more significant which seems reasonable as List<T>.IndexOf is contained in the native image of mscorlib.dll.Latria
it ends up IndexOf 8.08... and [] 10.018... that is my question why? Thanks for all the work on your current answerFib
When I changed from struct to class that changed it. Got It. again thanksFib
S
4

Normally, before you access an array element, it checks to make sure that the index is >= 0 and < length -- so that you don't read or overwrite memory that doesn't belong to you. Among other things, it eliminates a slew of severe security flaws called buffer overflows.

Needless to say, that impedes performance if you have very little code within your loop. To lessen this problem, the JIT compiler optimizes for-loops of the form for (i = 0; i < array.Length; i++) { array[i]; } -- that is, any loop that accesses all the indices of an array from 0 to length - 1. It omits the bounds checking for this case. (The optimization fails if you access things like array[i + 1], for the reason that you might step over the bounds.)

Unfortunately this only works with arrays, not with List<>s. So your latter code will not be optimized.

But since a List<> internally contains an array, and List.IndexOf() uses a loop to access each value in the array directly, it can be optimized.

By the way, it's better to say for (int i = 0; i < array.Length; i++) { } than it is to say int length = array.Length; for(int i = 0; i < length; i++) { } -- because it can't be sure that length really is the length of the array.

Edit: looking at the IndexOf code using Reflector, the loop will indeed optimize, but as the other people here have mentioned, it calls Equals() -- which requires a vtable lookup and various sanity checks. So in this case, IndexOf might in fact be slower when you're not running it with a profiler.

The disassembled code:

internal virtual int IndexOf(T[] array, T value, int startIndex, int count)
{
    int num = startIndex + count;
    for (int i = startIndex; i < num; i++)
    {
        if (this.Equals(array[i], value))
        {
            return i;
        }
    }
    return -1;
}
Shunt answered 7/9, 2010 at 22:25 Comment(1)
Crazy random knowledge of .NET there!Numerary
D
0

What's the type of _pCoord3Points? If the element type is a value type which only overrides Equals(object) then it's possible that IndexOf is repeatedly boxing values to check for equality. That might explain it. It's really just guesswork at this point though... if you could provide a short but complete program which demonstrates the problem, that would make it a lot easier to help you.

Donnydonnybrook answered 7/9, 2010 at 21:50 Comment(2)
But if IndexOf involved boxing then it should be slower, not faster as the OP claims, shouldn't it?Latria
Indeed, Jon, you seem to be right. I measured and IndexOf is significantly slower (in my measurement around factor 1000) for value types and still around a factor of 100 slower for reference types.Latria

© 2022 - 2024 — McMap. All rights reserved.