C# Time complexity of Array[T].Contains(T item) vs HashSet<T>.Contains(T item)
Asked Answered
S

3

14

HashSet(T).Contains(T) (inherited from ICollection<T>.Contains(T)) has a time complexity of O(1).
So, I'm wondering what the complexity of a class member array containing integers would be as I strive to achieve O(1) and don't need the existence checks of HashSet(T).Add(T).

Since built-in types are not shown in the .NET reference source, I have no chance of finding found the array implementation of IList(T).Contains(T).

Any (further) reading material or reference would be very much appreciated.

Solvency answered 12/5, 2016 at 9:32 Comment(5)
FWIW, Contains is on ICollection<T>, not IList<T>.Lutyens
@JamesThorpe really? msdn.microsoft.com/de-de/library/bb336401(v=vs.110).aspxSolvency
Yes. IList<T> inherits from ICollection<T>.Lutyens
@JamesThorpe It's a lot more likely he meant to ask about List<T>, than ICollection<T>.Aplite
@Aplite Even for List<T>, the Contains method is still defined on the ICollection<T> interface - I was just making a short comment addressing the bold text in the question.Lutyens
N
18

You can see source code of Array with any reflector (maybe online too, didn't check). IList.Contains is just:

Array.IndexOf(this,value) >= this.GetLowerBound(0);

And Array.IndexOf calls Array.IndexOf<T>, which, after a bunch of consistency checks, redirects to

EqualityComparer<T>.Default.IndexOf(array, value, startIndex, count)

And that one finally does:

int num = startIndex + count;
for (int index = startIndex; index < num; ++index)
{
  if (this.Equals(array[index], value))
      return index;
}
return -1;

So just loops over array with average complexity O(N). Of course that was obvious from the beginning, but just to provide some more evidence.

Nannette answered 12/5, 2016 at 9:46 Comment(0)
G
6

Array source code for the .Net Framework (up to v4.8) is available in reference source, and can be de-compiled using ILSpy.

In reference source, you find at line 2753 then 2809:

// -----------------------------------------------------------
// ------- Implement ICollection<T> interface methods --------
// -----------------------------------------------------------

...

[SecuritySafeCritical]
bool Contains<T>(T value) {
    //! Warning: "this" is an array, not an SZArrayHelper. See comments above
    //! or you may introduce a security hole!
    T[] _this = JitHelpers.UnsafeCast<T[]>(this);
    return Array.IndexOf(_this, value) != -1;
}

And IndexOf ends up on this IndexOf which is a O(n) algorithm.

internal virtual int IndexOf(T[] array, T value, int startIndex, int count)
{
    int endIndex = startIndex + count;
    for (int i = startIndex; i < endIndex; i++) {
        if (Equals(array[i], value)) return i;
    }
    return -1;
}

Those methods are on a special class SZArrayHelper in same source file, and as explained at line 2721, this is the implementation your are looking for.

// This class is needed to allow an SZ array of type T[] to expose IList<T>,
// IList<T.BaseType>, etc., etc. all the way up to IList<Object>. When the following call is
// made:
//
//   ((IList<T>) (new U[n])).SomeIListMethod()
//
// the interface stub dispatcher treats this as a special case, loads up SZArrayHelper,
// finds the corresponding generic method (matched simply by method name), instantiates
// it for type <T> and executes it.

About achieving O(1) complexity, you should convert it to a HashSet:

var lookupHashSet = new HashSet<T>(yourArray);
...
var hasValue = lookupHashSet.Contains(testValue);

Of course, this conversion is an O(n) operation. If you do not have many lookup to do, it is moot.

Note from documentation on this constructor:

If collection contains duplicates, the set will contain one of each unique element. No exception will be thrown. Therefore, the size of the resulting set is not identical to the size of collection.

Goodish answered 12/5, 2016 at 9:46 Comment(1)
I am adding all items during initialization and therefore I don't have to use an array at all. But nice to now it supports a ctor for that :)Solvency
A
0

You actually can see the source for List<T>, but you need to look it up online. Here's one source.

Any pure list/array bool Contains(T item) check is O(N) complexity, because each element needs to be checked. .NET is no exception. (If you designed a data structure that manifested as a list but also contained a bloom filter helper data structure, that would be another story.)

Aplite answered 12/5, 2016 at 9:41 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.