C# Set collection?
Asked Answered
H

7

570

Does anyone know if there is a good equivalent to Java's Set collection in C#? I know that you can somewhat mimic a set using a Dictionary or a HashTable by populating but ignoring the values, but that's not a very elegant way.

Hacker answered 8/10, 2008 at 16:33 Comment(0)
K
188

Try HashSet:

The HashSet(Of T) class provides high-performance set operations. A set is a collection that contains no duplicate elements, and whose elements are in no particular order...

The capacity of a HashSet(Of T) object is the number of elements that the object can hold. A HashSet(Of T) object's capacity automatically increases as elements are added to the object.

The HashSet(Of T) class is based on the model of mathematical sets and provides high-performance set operations similar to accessing the keys of the Dictionary(Of TKey, TValue) or Hashtable collections. In simple terms, the HashSet(Of T) class can be thought of as a Dictionary(Of TKey, TValue) collection without values.

A HashSet(Of T) collection is not sorted and cannot contain duplicate elements...

Kasper answered 8/10, 2008 at 16:35 Comment(1)
Unfortunately, HashSets weren't added until just recently. If you're working in an older version of the framework, you're going to have to stick with your munged Dictionary<> or Hashtable.Jetpropelled
A
449

If you're using .NET 3.5, you can use HashSet<T>. It's true that .NET doesn't cater for sets as well as Java does though.

The Wintellect PowerCollections may help too.

Anders answered 8/10, 2008 at 16:36 Comment(8)
does anyone know why it's called HashSet and not just Set?Indirection
I suspect that Set is a keyword in some languages, which could cause issues.Anders
set is also a keyword in C#Sinuation
@Manish: No it's not. See section 2.4.3 of the C# 3 spec. It only has special meaning for properties.Anders
@Jon Thanks for the enlightenment. I always thought it was a keyword. Downloaded the C# 3 spec.Sinuation
The reason for calling it HashSet, instead of just Set, is the same as in Java- "Set" describes an interface, whereas "HashSet" describes an implementation- specifically, this is a Set backed by a Hash Map. In this way, we know (or should strongly expect) that insert and access should take O(1) access time, vs a "LinkedListSet" which would lead us to expect insert and access to take O(n) time.Petiolate
what do you mean ".NET doesn't cater for sets as well as Java does though."? Is this Set somehow imperfect compared to Java's ?Actinomorphic
@Louis: Which Set are you talking about? Java has lots of different implementations of Set for various situations. .NET had one in .NET 3.5 (HashSet) and two in .NET 4 (HashSet and SortedSet). The fact that we had to wait until .NET 3.5 to start with is pretty surprising.Anders
K
188

Try HashSet:

The HashSet(Of T) class provides high-performance set operations. A set is a collection that contains no duplicate elements, and whose elements are in no particular order...

The capacity of a HashSet(Of T) object is the number of elements that the object can hold. A HashSet(Of T) object's capacity automatically increases as elements are added to the object.

The HashSet(Of T) class is based on the model of mathematical sets and provides high-performance set operations similar to accessing the keys of the Dictionary(Of TKey, TValue) or Hashtable collections. In simple terms, the HashSet(Of T) class can be thought of as a Dictionary(Of TKey, TValue) collection without values.

A HashSet(Of T) collection is not sorted and cannot contain duplicate elements...

Kasper answered 8/10, 2008 at 16:35 Comment(1)
Unfortunately, HashSets weren't added until just recently. If you're working in an older version of the framework, you're going to have to stick with your munged Dictionary<> or Hashtable.Jetpropelled
T
36

If you're using .NET 4.0 or later:

In the case where you need sorting then use SortedSet<T>. Otherwise if you don't, then use HashSet<T> since it's O(1) for search and manipulate operations. Whereas SortedSet<T> is O(log n) for search and manipulate operations.

Tema answered 31/8, 2013 at 18:1 Comment(0)
S
16

I use Iesi.Collections http://www.codeproject.com/KB/recipes/sets.aspx

It's used in lot of OSS projects, I first came across it in NHibernate

Silverside answered 8/10, 2008 at 16:36 Comment(0)
S
13

I use a wrapper around a Dictionary<T, object>, storing nulls in the values. This gives O(1) add, lookup and remove on the keys, and to all intents and purposes acts like a set.

Speedwriting answered 26/11, 2009 at 1:8 Comment(1)
You must mean it's roughly equivalent to std::unordered_set. std::set is ordered. For example, you can quickly find the start and end point of a range and iterate from the start to the end, visiting items in key-order. SortedDictionary is roughly equivalent to std::set.Hayward
E
12

Have a look at Power Collections. Apart from Set and OrderedSet it has a few other usefull collection types such as Deque, MultiDictionary, Bag, OrderedBag, OrderedDictionary and OrderedMultiDictionary.

For more collections, there is also the C5 Generic Collection Library.

Enameling answered 8/10, 2008 at 16:38 Comment(0)
F
-8

I know this is an old thread, but I was running into the same problem and found HashSet to be very unreliable because given the same seed, GetHashCode() returned different codes. So, I thought, why not just use a List and hide the add method like this

public class UniqueList<T> : List<T>
{
    public new void Add(T obj)
    {
        if(!Contains(obj))
        {
            base.Add(obj);
        }
    }
}

Because List uses the Equals method solely to determine equality, you can define the Equals method on your T type to make sure you get the desired results.

Francoise answered 4/2, 2012 at 22:50 Comment(2)
The reason you wouldn't want to use this is because List.Contains is of O(n) complexity which means that your Add method now becomes O(n) complexity as well. Assuming the inner collection doesn't need to be resized, Add for both List and HashMap should be of O(1) complexity. TLDR: This will work, but it's hacky and less efficient.Snifter
Sure, if your objects don't return an appropriate value for GetHashCode, you should not put them into a hash-based container. It would be better to fix GetHashCode than to use a less efficient container.Assiduous

© 2022 - 2024 — McMap. All rights reserved.