How to retrieve actual item from HashSet<T>?
Asked Answered
M

11

100

I've read this question about why it is not possible, but haven't found a solution to the problem.

I would like to retrieve an item from a .NET HashSet<T>. I'm looking for a method that would have this signature:

/// <summary>
/// Determines if this set contains an item equal to <paramref name="item"/>, 
/// according to the comparison mechanism that was used when the set was created. 
/// The set is not changed. If the set does contain an item equal to 
/// <paramref name="item"/>, then the item from the set is returned.
/// </summary>
bool TryGetItem<T>(T item, out T foundItem);

Searching the set for an item with such a method would be O(1). The only way to retrieve an item from a HashSet<T> is to enumerate all items which is O(n).

I haven't find any workaround to this problem other then making my own HashSet<T> or use a Dictionary<K, V>. Any other idea?

Note:
I don't want to check if the HashSet<T> contains the item. I want to get the reference to the item that is stored in the HashSet<T> because I need to update it (without replacing it by another instance). The item I would pass to the TryGetItem would be equal (according to the comparison mechanism that I've passed to the constructor) but it would not be the same reference.

Mcconnell answered 13/10, 2011 at 21:0 Comment(8)
Why not use Contains and return the item you passed as an input?Calceolaria
possible duplicate of Why can't I retrieve an item from a HashSet without enumeration?Entrenchment
If you need to look up an object based on a key value, then Dictionary<T> may be the more appropriate collection to store it in.Patagium
@ThatBlairGuy: You are right. I think I will implement my own Set collection using a Dictionary internally to store my items. The key will be the HashCode of the item. I will have approximately same performance as a HashSet and it will save me having to provide a key each time I need to add/remove/get an item from my collection.Mcconnell
@chaf, The hashcode as a key is not a good idea. Using it for uniqueness is using it wrong, as it is never intended to be unique. Don't get in a pattern of using the wrong tools for the wrong jobs. Encapsulating a dictionary in a friendlier-to-use collection is not a bad idea otherwise.Heterotrophic
For more on why the hashcode-as-key is a bad idea, read here. And for general guidelines around GetHashCode, read here.Heterotrophic
possible duplicate of How to access the reference values of a HashSet<TValue> without enumeration?Dutton
@mathias Because the hashset might contain an item that equals the input, but is not actually the same. For example you might want to have a hashset of reference types but you want to compare the content, not the reference for equality.Zilla
B
47

What you're asking for was added to .NET Core a year ago, and was recently added to .NET 4.7.2:

In .NET Framework 4.7.2 we have added a few APIs to the standard Collection types that will enable new functionality as follows.
- ‘TryGetValue‘ is added to SortedSet and HashSet to match the Try pattern used in other collection types.

The signature is as follows (found in .NET 4.7.2 and above):

    //
    // Summary:
    //     Searches the set for a given value and returns the equal value it finds, if any.
    //
    // Parameters:
    //   equalValue:
    //     The value to search for.
    //
    //   actualValue:
    //     The value from the set that the search found, or the default value of T when
    //     the search yielded no match.
    //
    // Returns:
    //     A value indicating whether the search was successful.
    public bool TryGetValue(T equalValue, out T actualValue);

P.S.: In case you're interested, there is related function they're adding in the future - HashSet.GetOrAdd(T).

Birdcage answered 18/6, 2018 at 12:50 Comment(0)
H
72

This is actually a huge omission in the set of collections. You would need either a Dictionary of keys only or a HashSet that allows for the retrieval of object references. So many people have asked for it, why it doesn't get fixed is beyond me.

Without third-party libraries the best workaround is to use Dictionary<T, T> with keys identical to values, since Dictionary stores its entries as a hash table. Performance-wise it is the same as the HashSet, but it wastes memory of course (size of a pointer per entry).

Dictionary<T, T> myHashedCollection;
...
if(myHashedCollection.ContainsKey[item])
    item = myHashedCollection[item]; //replace duplicate
else
    myHashedCollection.Add(item, item); //add previously unknown item
...
//work with unique item
Halda answered 1/2, 2013 at 17:27 Comment(2)
I would suggest that the keys to his dictionary should be whatever he has currently placed in his EqualityComparer for the hashset. I feel it is dirty to use an EqualityComparer when you aren't really saying the items are equal (otherwise you could just use the item you created for the purpose of the comparison). I'd make a class/struct that represents the key. This comes at the cost of more memory of course.Mornay
Since key is stored inside Value I suggest using collection inherited from KeyedCollection instead of Dictionary. msdn.microsoft.com/en-us/library/ms132438(v=vs.110).aspxSatirist
B
47

What you're asking for was added to .NET Core a year ago, and was recently added to .NET 4.7.2:

In .NET Framework 4.7.2 we have added a few APIs to the standard Collection types that will enable new functionality as follows.
- ‘TryGetValue‘ is added to SortedSet and HashSet to match the Try pattern used in other collection types.

The signature is as follows (found in .NET 4.7.2 and above):

    //
    // Summary:
    //     Searches the set for a given value and returns the equal value it finds, if any.
    //
    // Parameters:
    //   equalValue:
    //     The value to search for.
    //
    //   actualValue:
    //     The value from the set that the search found, or the default value of T when
    //     the search yielded no match.
    //
    // Returns:
    //     A value indicating whether the search was successful.
    public bool TryGetValue(T equalValue, out T actualValue);

P.S.: In case you're interested, there is related function they're adding in the future - HashSet.GetOrAdd(T).

Birdcage answered 18/6, 2018 at 12:50 Comment(0)
M
12

This method has been added to .NET Framework 4.7.2 (and .NET Core 2.0 before it); see HashSet<T>.TryGetValue. Citing the source:

/// <summary>
/// Searches the set for a given value and returns the equal value it finds, if any.
/// </summary>
/// <param name="equalValue">The value to search for.
/// </param>
/// <param name="actualValue">
/// The value from the set that the search found, or the default value
/// of <typeparamref name="T"/> when the search yielded no match.</param>
/// <returns>A value indicating whether the search was successful.</returns>
/// <remarks>
/// This can be useful when you want to reuse a previously stored reference instead of 
/// a newly constructed one (so that more sharing of references can occur) or to look up
/// a value that has more complete data than the value you currently have, although their
/// comparer functions indicate they are equal.
/// </remarks>
public bool TryGetValue(T equalValue, out T actualValue)
Manvel answered 7/5, 2018 at 17:27 Comment(1)
As well as for SortedSet as well.Dutton
M
5

What about overloading the string equality comparer:

  class StringEqualityComparer : IEqualityComparer<String>
{
    public string val1;
    public bool Equals(String s1, String s2)
    {
        if (!s1.Equals(s2)) return false;
        val1 = s1;
        return true;
    }

    public int GetHashCode(String s)
    {
        return s.GetHashCode();
    }
}
public static class HashSetExtension
{
    public static bool TryGetValue(this HashSet<string> hs, string value, out string valout)
    {
        if (hs.Contains(value))
        {
            valout=(hs.Comparer as StringEqualityComparer).val1;
            return true;
        }
        else
        {
            valout = null;
            return false;
        }
    }
}

And then declare the HashSet as:

HashSet<string> hs = new HashSet<string>(new StringEqualityComparer());
Metic answered 21/9, 2015 at 8:58 Comment(4)
This is all about memory management - returning the actual item that is in the hashset rather than an identical copy. So in the above code we find the string with the same content and then return a reference to this. For strings this is a similar to what interning does.Metic
@zumalifeguard @Metic this is not guaranteed to work as-is. It would require someone instantiating the HashSet to provide the specific value converter. An optimal solution would be for TryGetValue to pass in a new instance of the specialized StringEqualityComparer (otherwise the as StringEqualityComparer could result in a null causing the .val1 property access to throw). In doing so, StringEqualityComparer can become a nested private class within HashSetExtension. Futher, in case of an overridden equality comparer, the StringEqualityComparer should call into the default.Cichocki
you need to declare your HashSet as: HashSet<string> valueCash = new HashSet<string>(new StringEqualityComparer())Metic
Dirty hack. I know how it works but its lazy just make it work kind of solutionWhitefish
M
2

Ok, so, you can do it like this

YourObject x = yourHashSet.Where(w => w.Name.Contains("strin")).FirstOrDefault();

This is to get a new Instance of the selected object. In order to update your object, then you should use:

yourHashSet.Where(w => w.Name.Contains("strin")).FirstOrDefault().MyProperty = "something";
Mansuetude answered 23/2, 2016 at 18:51 Comment(2)
This is an interesting way, just you need to wrap the second in a try - so that if you searching for something that is not in the list you will get a NullReferenceExpection . But its a step in the correct direction?Larghetto
LINQ traverses the collection in a foreach loop, i.e. O(n) lookup time. While it is a solution to the problem, it kind of defeats the purpose of using a HashSet in the first place.Whittle
P
2

Another Trick would do Reflection, by accessing the internal function InternalIndexOf of HashSet. Keep in mind the fieldnames are hardcoded, so if those change in upcoming .NET versions this will break.

Note: If you use Mono, you should change field name from m_slots to _slots.

internal static class HashSetExtensions<T>
{
    public delegate bool GetValue(HashSet<T> source, T equalValue, out T actualValue);

    public static GetValue TryGetValue { get; }

    static HashSetExtensions() {
        var targetExp = Expression.Parameter(typeof(HashSet<T>), "target");
        var itemExp   = Expression.Parameter(typeof(T), "item");
        var actualValueExp = Expression.Parameter(typeof(T).MakeByRefType(), "actualValueExp");

        var indexVar = Expression.Variable(typeof(int), "index");
        // ReSharper disable once AssignNullToNotNullAttribute
        var indexExp = Expression.Call(targetExp, typeof(HashSet<T>).GetMethod("InternalIndexOf", BindingFlags.NonPublic | BindingFlags.Instance), itemExp);

        var truePart = Expression.Block(
            Expression.Assign(
                actualValueExp, Expression.Field(
                    Expression.ArrayAccess(
                        // ReSharper disable once AssignNullToNotNullAttribute
                        Expression.Field(targetExp, typeof(HashSet<T>).GetField("m_slots", BindingFlags.NonPublic | BindingFlags.Instance)), indexVar),
                    "value")),
            Expression.Constant(true));

        var falsePart = Expression.Constant(false);

        var block = Expression.Block(
            new[] { indexVar },
            Expression.Assign(indexVar, indexExp),
            Expression.Condition(
                Expression.GreaterThanOrEqual(indexVar, Expression.Constant(0)),
                truePart,
                falsePart));

        TryGetValue = Expression.Lambda<GetValue>(block, targetExp, itemExp, actualValueExp).Compile();
    }
}

public static class Extensions
{
    public static bool TryGetValue2<T>(this HashSet<T> source, T equalValue,  out T actualValue) {
        if (source.Count > 0) {
            if (HashSetExtensions<T>.TryGetValue(source, equalValue, out actualValue)) {
                return true;
            }
        }
        actualValue = default;
        return false;
    }
}

Test:

var x = new HashSet<int> { 1, 2, 3 };
if (x.TryGetValue2(1, out var value)) {
    Console.WriteLine(value);
}
Priapitis answered 4/12, 2016 at 16:7 Comment(0)
D
2

Now .NET Core 2.0 has this exact method.

HashSet.TryGetValue(T, T) Method

Dorpat answered 29/3, 2018 at 0:41 Comment(0)
A
1

SortedSet would probably have O(log n) lookup time in that circumstance, if using that is an option. Still not O(1), but at least better.

Ally answered 13/10, 2011 at 21:5 Comment(0)
C
1

Modified implementation of @mp666 answer so it can be used for any type of HashSet and allows for overriding the default equality comparer.

public interface IRetainingComparer<T> : IEqualityComparer<T>
{
    T Key { get; }
    void ClearKeyCache();
}

/// <summary>
/// An <see cref="IEqualityComparer{T}"/> that retains the last key that successfully passed <see cref="IEqualityComparer{T}.Equals(T,T)"/>.
/// This class relies on the fact that <see cref="HashSet{T}"/> calls the <see cref="IEqualityComparer{T}.Equals(T,T)"/> with the first parameter
/// being an existing element and the second parameter being the one passed to the initiating call to <see cref="HashSet{T}"/> (eg. <see cref="HashSet{T}.Contains(T)"/>).
/// </summary>
/// <typeparam name="T">The type of object being compared.</typeparam>
/// <remarks>This class is thread-safe but may should not be used with any sort of parallel access (PLINQ).</remarks>
public class RetainingEqualityComparerObject<T> : IRetainingComparer<T> where T : class
{
    private readonly IEqualityComparer<T> _comparer;

    [ThreadStatic]
    private static WeakReference<T> _retained;

    public RetainingEqualityComparerObject(IEqualityComparer<T> comparer)
    {
        _comparer = comparer;
    }

    /// <summary>
    /// The retained instance on side 'a' of the <see cref="Equals"/> call which successfully met the equality requirement agains side 'b'.
    /// </summary>
    /// <remarks>Uses a <see cref="WeakReference{T}"/> so unintended memory leaks are not encountered.</remarks>
    public T Key
    {
        get
        {
            T retained;
            return _retained == null ? null : _retained.TryGetTarget(out retained) ? retained : null;
        }
    }


    /// <summary>
    /// Sets the retained <see cref="Key"/> to the default value.
    /// </summary>
    /// <remarks>This should be called prior to performing an operation that calls <see cref="Equals"/>.</remarks>
    public void ClearKeyCache()
    {
        _retained = _retained ?? new WeakReference<T>(null);
        _retained.SetTarget(null);
    }

    /// <summary>
    /// Test two objects of type <see cref="T"/> for equality retaining the object if successful.
    /// </summary>
    /// <param name="a">An instance of <see cref="T"/>.</param>
    /// <param name="b">A second instance of <see cref="T"/> to compare against <paramref name="a"/>.</param>
    /// <returns>True if <paramref name="a"/> and <paramref name="b"/> are equal, false otherwise.</returns>
    public bool Equals(T a, T b)
    {
        if (!_comparer.Equals(a, b))
        {
            return false;
        }

        _retained = _retained ?? new WeakReference<T>(null);
        _retained.SetTarget(a);
        return true;
    }

    /// <summary>
    /// Gets the hash code value of an instance of <see cref="T"/>.
    /// </summary>
    /// <param name="o">The instance of <see cref="T"/> to obtain a hash code from.</param>
    /// <returns>The hash code value from <paramref name="o"/>.</returns>
    public int GetHashCode(T o)
    {
        return _comparer.GetHashCode(o);
    }
}

/// <summary>
/// An <see cref="IEqualityComparer{T}"/> that retains the last key that successfully passed <see cref="IEqualityComparer{T}.Equals(T,T)"/>.
/// This class relies on the fact that <see cref="HashSet{T}"/> calls the <see cref="IEqualityComparer{T}.Equals(T,T)"/> with the first parameter
/// being an existing element and the second parameter being the one passed to the initiating call to <see cref="HashSet{T}"/> (eg. <see cref="HashSet{T}.Contains(T)"/>).
/// </summary>
/// <typeparam name="T">The type of object being compared.</typeparam>
/// <remarks>This class is thread-safe but may should not be used with any sort of parallel access (PLINQ).</remarks>
public class RetainingEqualityComparerStruct<T> : IRetainingComparer<T> where T : struct 
{
    private readonly IEqualityComparer<T> _comparer;

    [ThreadStatic]
    private static T _retained;

    public RetainingEqualityComparerStruct(IEqualityComparer<T> comparer)
    {
        _comparer = comparer;
    }

    /// <summary>
    /// The retained instance on side 'a' of the <see cref="Equals"/> call which successfully met the equality requirement agains side 'b'.
    /// </summary>
    public T Key => _retained;


    /// <summary>
    /// Sets the retained <see cref="Key"/> to the default value.
    /// </summary>
    /// <remarks>This should be called prior to performing an operation that calls <see cref="Equals"/>.</remarks>
    public void ClearKeyCache()
    {
        _retained = default(T);
    }

    /// <summary>
    /// Test two objects of type <see cref="T"/> for equality retaining the object if successful.
    /// </summary>
    /// <param name="a">An instance of <see cref="T"/>.</param>
    /// <param name="b">A second instance of <see cref="T"/> to compare against <paramref name="a"/>.</param>
    /// <returns>True if <paramref name="a"/> and <paramref name="b"/> are equal, false otherwise.</returns>
    public bool Equals(T a, T b)
    {
        if (!_comparer.Equals(a, b))
        {
            return false;
        }

        _retained = a;
        return true;
    }

    /// <summary>
    /// Gets the hash code value of an instance of <see cref="T"/>.
    /// </summary>
    /// <param name="o">The instance of <see cref="T"/> to obtain a hash code from.</param>
    /// <returns>The hash code value from <paramref name="o"/>.</returns>
    public int GetHashCode(T o)
    {
        return _comparer.GetHashCode(o);
    }
}

/// <summary>
/// Provides TryGetValue{T} functionality similar to that of <see cref="IDictionary{TKey,TValue}"/>'s implementation.
/// </summary>
public class ExtendedHashSet<T> : HashSet<T>
{
    /// <summary>
    /// This class is guaranteed to wrap the <see cref="IEqualityComparer{T}"/> with one of the <see cref="IRetainingComparer{T}"/>
    /// implementations so this property gives convenient access to the interfaced comparer.
    /// </summary>
    private IRetainingComparer<T> RetainingComparer => (IRetainingComparer<T>)Comparer;

    /// <summary>
    /// Creates either a <see cref="RetainingEqualityComparerStruct{T}"/> or <see cref="RetainingEqualityComparerObject{T}"/>
    /// depending on if <see cref="T"/> is a reference type or a value type.
    /// </summary>
    /// <param name="comparer">(optional) The <see cref="IEqualityComparer{T}"/> to wrap. This will be set to <see cref="EqualityComparer{T}.Default"/> if none provided.</param>
    /// <returns>An instance of <see cref="IRetainingComparer{T}"/>.</returns>
    private static IRetainingComparer<T> Create(IEqualityComparer<T> comparer = null)
    {
        return (IRetainingComparer<T>) (typeof(T).IsValueType ? 
            Activator.CreateInstance(typeof(RetainingEqualityComparerStruct<>)
                .MakeGenericType(typeof(T)), comparer ?? EqualityComparer<T>.Default)
            :
            Activator.CreateInstance(typeof(RetainingEqualityComparerObject<>)
                .MakeGenericType(typeof(T)), comparer ?? EqualityComparer<T>.Default));
    }

    public ExtendedHashSet() : base(Create())
    {
    }

    public ExtendedHashSet(IEqualityComparer<T> comparer) : base(Create(comparer))
    {
    }

    public ExtendedHashSet(IEnumerable<T> collection) : base(collection, Create())
    {
    }

    public ExtendedHashSet(IEnumerable<T> collection, IEqualityComparer<T> comparer) : base(collection, Create(comparer))
    {
    }

    /// <summary>
    /// Attempts to find a key in the <see cref="HashSet{T}"/> and, if found, places the instance in <paramref name="original"/>.
    /// </summary>
    /// <param name="value">The key used to search the <see cref="HashSet{T}"/>.</param>
    /// <param name="original">
    /// The matched instance from the <see cref="HashSet{T}"/> which is not neccessarily the same as <paramref name="value"/>.
    /// This will be set to null for reference types or default(T) for value types when no match found.
    /// </param>
    /// <returns>True if a key in the <see cref="HashSet{T}"/> matched <paramref name="value"/>, False if no match found.</returns>
    public bool TryGetValue(T value, out T original)
    {
        var comparer = RetainingComparer;
        comparer.ClearKeyCache();

        if (Contains(value))
        {
            original = comparer.Key;
            return true;
        }

        original = default(T);
        return false;
    }
}

public static class HashSetExtensions
{
    /// <summary>
    /// Attempts to find a key in the <see cref="HashSet{T}"/> and, if found, places the instance in <paramref name="original"/>.
    /// </summary>
    /// <param name="hashSet">The instance of <see cref="HashSet{T}"/> extended.</param>
    /// <param name="value">The key used to search the <see cref="HashSet{T}"/>.</param>
    /// <param name="original">
    /// The matched instance from the <see cref="HashSet{T}"/> which is not neccessarily the same as <paramref name="value"/>.
    /// This will be set to null for reference types or default(T) for value types when no match found.
    /// </param>
    /// <returns>True if a key in the <see cref="HashSet{T}"/> matched <paramref name="value"/>, False if no match found.</returns>
    /// <exception cref="ArgumentNullException">If <paramref name="hashSet"/> is null.</exception>
    /// <exception cref="ArgumentException">
    /// If <paramref name="hashSet"/> does not have a <see cref="HashSet{T}.Comparer"/> of type <see cref="IRetainingComparer{T}"/>.
    /// </exception>
    public static bool TryGetValue<T>(this HashSet<T> hashSet, T value, out T original)
    {
        if (hashSet == null)
        {
            throw new ArgumentNullException(nameof(hashSet));
        }

        if (hashSet.Comparer.GetType().IsInstanceOfType(typeof(IRetainingComparer<T>)))
        {
            throw new ArgumentException($"HashSet must have an equality comparer of type '{nameof(IRetainingComparer<T>)}' to use this functionality", nameof(hashSet));
        }

        var comparer = (IRetainingComparer<T>)hashSet.Comparer;
        comparer.ClearKeyCache();

        if (hashSet.Contains(value))
        {
            original = comparer.Key;
            return true;
        }

        original = default(T);
        return false;
    }
}
Cichocki answered 2/10, 2016 at 15:54 Comment(5)
Since you're using the Linq extension method Enumerable.Contains, it will enumerate all elements of the set and compare them, losing any benefits the hash implementation of the set provides. Then you might as well just write set.SingleOrDefault(e => set.Comparer.Equals(e, obj)), which has the same behavior and performance characteristics as your solution.Uncounted
@Virtlink Good catch -- You're absolutely right. I'll modify my answer.Cichocki
However, if you were to wrap a HashSet that uses your comparator internally, it would work. Like this: Utillib/ExtHashSetUncounted
@Virtlink thank you! I ended up wrapping HashSet as one option but providing the comparers and an extension method for additional versatility. It's now thread-safe and will not leak memory... but it's quite a bit more code than I had hoped!Cichocki
@Francois Writing the code above was more an exercise of figuring out an "optimal" time/memory solution; however, I don't suggest you go with this method. Using a Dictionary<T,T> with a custom IEqualityComparer is much more straight-forward and future-proof!Cichocki
M
-2

HashSet has a Contains(T) method.

You can specify an IEqualityComparer if you need a custom comparison method (e.g., store a person object, but use the SSN for equality comparison).

Maltzman answered 13/10, 2011 at 21:3 Comment(0)
C
-11

You can also use ToList() method and apply an indexer to that.

HashSet<string> mySet = new HashSet();
mySet.Add("mykey");
string key = mySet.toList()[0];
Casket answered 24/2, 2015 at 22:13 Comment(2)
I am not sure why you got down votes for when I applied this logic it worked. I needed to extract values from a structure that started with Dictionary<string,ISet<String>> where the ISet contained x number of values. The most direct way to get those values was to loop through the dictionary pulling the key and the ISet Value. Then I looped through the ISet to display the individual values. It is not elegant, but it worked.Nijinsky
Because someone would use HashSet to achieve O(1) complexity and ToList() make this method O(n)Confluent

© 2022 - 2024 — McMap. All rights reserved.