Why is List<T>.IndexOf() a lot faster than List<T>.Contains()?
Asked Answered
S

1

16

I have List which has 150K elements. Average time of work IndexOf() is 4 times lower than Contains(). I tried to use List of int. For List of strings IndexOf is a bit faster.

I found only one main difference, it's attribute TargetedPatchingOptOut. MSDN tells:

Indicates that the .NET Framework class library method to which this attribute is applied is unlikely to be affected by servicing releases, and therefore is eligible to be inlined across Native Image Generator (NGen) images.

Could this attribute be a reason of such behavior? And why doesn't method Contains() have such attribute?

Thanks in advance.

EDIT:

I have code something like this:

List<int> list = CommonHelper.GetRandomList(size);

long min = long.MaxValue;
long max = 0;
long sum = 0;

foreach (var i in list)
{
    m_stopwatch.Reset();
    m_stopwatch.Start();
    list.Contains(i); // list.IndexOf(i);
    m_stopwatch.Stop();

    long ticks = m_stopwatch.ElapsedTicks;

    if (ticks < min)
        min = ticks;

    if (ticks > max)
        max = ticks;

    sum += ticks;
}

long averageSum = sum / size;

EDIT 2:

I have written same code as in IndexOf() and it work slower than Contains().

Scholz answered 28/10, 2010 at 5:2 Comment(12)
What is the data in this case?Doubleripper
And no - I don't think the attribute has anything to do with it.Doubleripper
I use int and string, behavior is same.Scholz
Can you make a short reproducible test for us that exhibits the behavior?Hogen
Why do you have a list of 150K elements? If you're looking through it, you may have the wrong data structure.Morelock
Kobi, I don't have such lists in production, I found it when I played with different data structures.Scholz
As a side note while I look, would a HashSet<T> be usable in your case? or maybe a SortedList?Doubleripper
Marc, Yes as I said in production I use another structures like HashSet<T>, it depends on issues.Scholz
@Pavel: Awesome. I get a repro here on a "Release" build. The results are surprising. As you said to Marc, the difference isn't as big, but it still there for List<string>Hogen
@Merlyn, yes I tried to use Debug and Release builds.Scholz
For info, I tried benchmarking with/without a null-check (boxing), and it wasn't that...Doubleripper
Thanks for the tip. Cut my time in at least half.Rate
F
4

They each arrive at the method to determine equality slightly differently, according to their MSDN entries. Look under the 'remarks' of each of these entries:

List<T>.IndexOf uses EqualityComparer<T>.Default http://msdn.microsoft.com/en-us/library/e4w08k17.aspx

List<T>.Contains uses IEquatable<T>.Equals http://msdn.microsoft.com/en-us/library/bhkz42b3.aspx

Even if they end up calling the same method to determine equality in the very end (as is certainly the case here), they are taking different routes to get there, so that probably 'splains it.

Given that the "4x difference" seems not to be the actual case, some off-handed boxing might account for some difference, particularly with a 150k sized set of data

Fixer answered 28/10, 2010 at 5:22 Comment(7)
Interesting. What is the reasoning behind this difference?Biggin
I really don't know offhand; in fact, it seemed so unlikely they would be different, I almost didn't check to see what methods they used to test equality.Fixer
Wouldn't you expect IEquatable to be faster though? Also a difference of x4 seems quite a lot.Morelock
@Morelock ; those are my initial thoughts, too. My mind is too foggy to dig too much more tonight, though. heheFixer
Sorry, but no: List<T> also uses EqualityComparer<T> - which provides support for IEquatable<T>.Doubleripper
I worded that a bit inartfully; my point being that there is a slight difference in how the comparison is arrived at.Fixer
The comparison is the same (except or some indirection), except for the null-check (which I have profiled, and isn't the cause)Doubleripper

© 2022 - 2024 — McMap. All rights reserved.