LINQ's Distinct() on a particular property
I am playing with LINQ to learn about it, but I can't figure out how to use Distinct when I do not have a simple list (a simple list of integers is pretty easy to do, this is not the question). What if I want to use Distinct on a List<TElement> on one or more properties of the TElement?

Example: If an object is Person, with property Id. How can I get all Person and use Distinct on them with the property Id of the object?

Person1: Id=1, Name="Test1"
Person2: Id=1, Name="Test1"
Person3: Id=2, Name="Test2"

How can I get just Person1 and Person3? Is that possible?

If it's not possible with LINQ, what would be the best way to have a list of Person depending on some of its properties?

Bovid answered 28/1, 2009 at 20:45 Comment(1)
Similar: Distinct not working with LINQ to Objects.Sackcloth
B
1534

EDIT: This is now part of MoreLINQ.

What you need is a "distinct-by" effectively. I don't believe it's part of LINQ as it stands, although it's fairly easy to write:

public static IEnumerable<TSource> DistinctBy<TSource, TKey>
    (this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
    HashSet<TKey> seenKeys = new HashSet<TKey>();
    foreach (TSource element in source)
    {
        if (seenKeys.Add(keySelector(element)))
        {
            yield return element;
        }
    }
}

So to find the distinct values using just the Id property, you could use:

var query = people.DistinctBy(p => p.Id);

And to use multiple properties, you can use anonymous types, which implement equality appropriately:

var query = people.DistinctBy(p => new { p.Id, p.Name });

Untested, but it should work (and it now at least compiles).

It assumes the default comparer for the keys though - if you want to pass in an equality comparer, just pass it on to the HashSet constructor.
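The comparer variant mentioned above could look roughly like this. It is a minimal sketch, not the MoreLINQ implementation: the method name DistinctByKey, the Person class and the Demo helper are assumptions made for illustration, and the name was chosen to avoid clashing with the DistinctBy that ships with .NET 6 and later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model type, shaped like the Person in the question.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class DistinctByComparerSketch
{
    // Same iterator block as in the answer, plus an optional key comparer.
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        IEqualityComparer<TKey> comparer = null)
    {
        // A null comparer makes HashSet fall back to EqualityComparer<TKey>.Default.
        var seenKeys = new HashSet<TKey>(comparer);
        foreach (TSource element in source)
        {
            if (seenKeys.Add(keySelector(element)))
            {
                yield return element;
            }
        }
    }

    // Distinct by Name, ignoring case: keeps the first element seen per key.
    public static List<string> Demo()
    {
        var people = new List<Person>
        {
            new Person { Id = 1, Name = "Test1" },
            new Person { Id = 2, Name = "TEST1" },
            new Person { Id = 3, Name = "Test2" },
        };

        return people
            .DistinctByKey(p => p.Name, StringComparer.OrdinalIgnoreCase)
            .Select(p => p.Name)
            .ToList();
    }
}
```

Here Demo() keeps "Test1" and "Test2"; "TEST1" is dropped because the case-insensitive comparer treats it as a key that was already seen.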

Bodily answered 28/1, 2009 at 21:17 Comment(26)
This is a good solution, if you assume that when there are multiple non-distinct values (like in his example) you're aiming to return the first one that you see in the enumeration.Coronado
Yes, that's what I was assuming based on the question. If he'd requested Person2 and Person3 it would be harder :)Bodily
Jon, I have a serious question, probably on LINQ design principles. You are taking in an IEnumerable and iterating over it; doesn't that mean it's not fitting to use this along with LINQ, or in other words, doesn't this spoil all the goodness of deferred execution? I would say taking an ICollection as the parameter is better since the caller would know directly that the LINQ expression is executed and no magic/side effects take place inside the method. I sometimes feel the idea of mixing lazy execution along with C#'s otherwise imperative style is broken! :(Exegete
Or is it that, just because you have used yield statement, it prevents foreach from executing (the enumerable source) ?Exegete
@nawfal: No, it's exactly the opposite - it's because it's deferred and streaming that it fits in with how LINQ works. LINQ uses deferred execution all over the place. It would be very odd to require that it executed eagerly.Bodily
@JonSkeet Yes you are right, I just tested it. I didn't know that its the yield keyword that's helping "deferred execution" though there's a foreach enclosing it. Am I right if I say had there been no yield statement in your foreach, the foreach would have executed the source IEnumerable?Exegete
But I believe having both lazy and eager together in an environment makes it very easy for programmers to not benefit from advantages of the former. I often see IEnumerables taken as argument and iterated over, and that would be very bad had the enumerable been a left over of some linq queryExegete
@nawfal: It's hard to answer you if you keep adding two comments at a time. Also, comment threads aren't really ideal for this. If you want to know about the behaviour of yield, read the specification, experiment, and then ask a new question afterwards. It's not clear what you mean by a "left over" of some linq query... but anyone using LINQ needs to understand deferred execution. There's no getting around that.Bodily
I didn't DV, but I can see one reason why someone would -- you can't use additional libraries (like MoreLINQ) and need "raw" LINQ only.Nelly
@ashes999: I'm not sure what you mean. The code is present in the answer and in the library - depending on whether you're happy to take on a dependency.Bodily
@JonSkeet I used GroupBy since it's simpler: no extra variables (hashset) to declare. However, I do like the extensibility of being able to pass a comparer there.Nelly
@ashes999: If you're only doing this in a single place, ever, then sure, using GroupBy is simpler. If you need it in more than one place, it's much cleaner (IMO) to encapsulate the intention.Bodily
The loop could also be written more briefly as return source.Where(e => seenKeys.Add(keySelector(e))).Saxophone
@emodendroket: Only if you made it not an iterator block... at which point if you iterated over it twice, the second time you'd get an empty sequence. Not ideal.Bodily
DistinctBy is now part of Microsoft.Ajax.UtilitiesNucleoprotein
this extension method should use the groupby pattern under the covers. This is just bad as it will only work in memory.Adynamia
@MatthewWhited: How do you think GroupBy works? (Hint: codeblog.jonskeet.uk/2011/01/01/…) This approach uses less memory, as it doesn't require all the duplicate elements to be retained.Bodily
I know how GroupBy works. I also know that using a HashSet against an IQueryable would end up with everything working on the client side. This is not a good example, plain and simple.Adynamia
@MatthewWhited: Given that there's no mention of IQueryable<T> here, I don't see how it's relevant. I agree that this wouldn't be suitable for EF etc, but within LINQ to Objects I think it's more suitable than GroupBy. The context of the question is always important.Bodily
The project moved on github, here's the code of DistinctBy: github.com/morelinq/MoreLINQ/blob/master/MoreLinq/DistinctBy.csParaphrast
Me: "What an elegant solution." [Notices the poster is @JonSkeet and smiles inwardly.]Yanina
I think this is a superior solution to the numerous GroupBy()/group by/ToLookup() answers because, like Distinct(), this is able to yield an element as soon as it's encountered (the first time), whereas those other methods can't return anything until the entire input sequence has been consumed. I think that's an important, er, distinction worth pointing out in the answer. Also, as far as memory, by the final element this HashSet<> will be storing only unique elements, whereas the other methods will somewhere be storing unique groups with unique + duplicates elements.Molini
I've used this solution a few times, and I really like it. I would like to add though. That you should only use this solution if you have already implemented IEquatable on the objects you are interested in but you need distinctness that the .Equals doesn't provide. Default solution should be to implement IEquatable and just use the built in .DistinctHying
@Tenderdude: Not everything naturally has anything to be equal on. I don't typically implement IEquatable<T> unless there's some natural idea of equality to work with beyond the default reference equalityBodily
@JonSkeet By the way, did you write the DistinctBy in Microsoft.Ajax.Utilities.AjaxMinExtensions in WebGrease? It's very similar to this, but instead of yielding it does this: return from p in source where ((HashSet<TKey>)hash).Add(keySelector(p)) select p;Bilicki
@alfoks: No, I haven't written anything in any Microsoft package. I'd expect the two to basically be equivalent.Bodily
A
2425

What if I want to obtain a distinct list based on one or more properties?

Simple! You want to group them and pick a winner out of the group.

List<Person> distinctPeople = allPeople
  .GroupBy(p => p.PersonId)
  .Select(g => g.First())
  .ToList();

If you want to define groups on multiple properties, here's how:

List<Person> distinctPeople = allPeople
  .GroupBy(p => new {p.PersonId, p.FavoriteColor} )
  .Select(g => g.First())
  .ToList();

Note: Certain query providers are unable to resolve that each group must have at least one element, and that First is the appropriate method to call in that situation. If you find yourself working with such a query provider, FirstOrDefault may help get your query through the query provider.

Note2: Consider this answer for an EF Core (prior to EF Core 6) compatible approach. https://mcmap.net/q/30318/-error-while-flattening-the-iqueryable-lt-t-gt-after-groupby
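"Picking a winner" does not have to mean taking whichever element came first. As a rough sketch for LINQ to Objects, the group can be ordered before First() is called. The Person class below, the GroupWinnerSketch name and the choice of ordering by FavoriteColor are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model type using the property names from the answer.
public class Person
{
    public int PersonId { get; set; }
    public string FavoriteColor { get; set; }
}

public static class GroupWinnerSketch
{
    public static List<string> Demo()
    {
        var allPeople = new List<Person>
        {
            new Person { PersonId = 1, FavoriteColor = "red" },
            new Person { PersonId = 1, FavoriteColor = "blue" },
            new Person { PersonId = 2, FavoriteColor = "green" },
        };

        return allPeople
            .GroupBy(p => p.PersonId)
            // Each group is non-empty, so First() is safe in LINQ to Objects.
            // The winner is the element whose color sorts first (ordinal order).
            .Select(g => g.OrderBy(p => p.FavoriteColor, StringComparer.Ordinal).First())
            .Select(p => p.FavoriteColor)
            .ToList();
    }
}
```

With this input, the winner for PersonId 1 is the "blue" entry rather than the "red" one that appears first in the list, and groups come back in the order their keys were first seen.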

Alexandrina answered 29/1, 2009 at 14:39 Comment(13)
@ErenErsonmez sure. With my posted code, if deferred execution is desired, leave off the ToList call.Alexandrina
Very nice answer! Realllllly helped me in Linq-to-Entities driven from a sql view where I couldn't modify the view. I needed to use FirstOrDefault() rather than First() - all is good.Ayers
How about calling IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source, IEqualityComparer<TSource> comparer)?Impudicity
I tried it and it should change to Select(g => g.FirstOrDefault())Lawrenson
I tried that approach but it's not working for me. I have a class with multiple properties, I want to return a list of different type (3 properties only) with distinct values. I tried var possibleValues = _matrixTemplateValuesAdapter.GetAll().GroupBy(v => new ValueAttribute { ValueName = v.ValueName, AttributeId = v.AtnameId, Hidden = v.Hidden }).Select(g => g.First()) .ToList(); but it doesn't select distinct new valuesArgus
This should be the real accepted answer IMO. In most cases you would be good with the groupBy. No need for another lib for doing that.Saintmihiel
Wondering if shouldn't have used SingleOrDefault() instead of First().Sarre
@ChocapicSz Nope. Both Single() and SingleOrDefault() each throw when the source has more than one item. In this operation, we expect the possibility that each group may have more than one item. For that matter, First() is preferred over FirstOrDefault() because each group must have at least one member.... unless you're using EntityFramework, which can't figure out that each group has at least one member and demands FirstOrDefault().Alexandrina
I have rolled back the edit which changes from First to FirstOrDefault. LinqToEntities is not the only use of Linq queries and we should not limit our understanding based on its quirks.Alexandrina
Worth to mention asymptotic cost of this approach is O(n). Based on: #4890169Subassembly
Seems to not be currently supported in EF Core, even using FirstOrDefault() github.com/dotnet/efcore/issues/12088 I am on 3.1, and I get "unable to translate" errors.Molokai
If you need an EF compatible translation, try the one in this answer. https://mcmap.net/q/30318/-error-while-flattening-the-iqueryable-lt-t-gt-after-groupby Don't forget to give Marta an upvote for asking the question that caused this answer to exist.Alexandrina
EFCore6 will translate this.Alexandrina
M
128

Use:

List<Person> pList = new List<Person>();
/* Fill list */

var result = pList.Where(p => p.Name != null).GroupBy(p => p.Id)
    .Select(grp => grp.FirstOrDefault());

The where helps you filter the entries (could be more complex) and the groupby and select perform the distinct function.

Malvina answered 14/2, 2012 at 12:52 Comment(1)
Perfect, and works without extending Linq or using another dependency.Coastguardsman
W
90

You could also use query syntax if you want it to look all LINQ-like:

var uniquePeople = from p in people
                   group p by new {p.ID} //or group by new {p.ID, p.Name, p.Whatever}
                   into mygroup
                   select mygroup.FirstOrDefault();
Waisted answered 6/3, 2012 at 18:28 Comment(3)
Hmm, my thoughts are that both the query syntax and the fluent API syntax are just as LINQ-like as each other, and it's just preference over which one people use. I myself prefer the fluent API, so I would consider that more LINQ-like, but then I guess that's subjective.Mcclung
LINQ-like has nothing to do with preference; being "LINQ-like" has to do with looking like a different query language being embedded into C#. I prefer the fluent interface, coming from Java streams, but it is NOT LINQ-like.Alvinia
Excellent!! You are my hero!Cappadocia
I
88

I think it is enough:

list.Select(s => s.MyField).Distinct();
Idden answered 23/1, 2015 at 14:54 Comment(3)
What if he needs back his full object, not just that particular field?Wellfound
What exactly object of the several objects that have the same property value?Chalcopyrite
This is exactly same with sql syntax for "select distinct MyField from MyTable". Perfect solution if you dont need full object which is not expected when you use distinct.Bituminize
U
83

Starting with .NET 6, there is a new solution using the new DistinctBy() extension in LINQ, so we can do:

var distinctPersonsById = personList.DistinctBy(x => x.Id);

The signature of the DistinctBy method:

// Returns distinct elements from a sequence according to a specified
// key selector function.
public static IEnumerable<TSource> DistinctBy<TSource, TKey> (
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector);
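The built-in method also has an overload that takes an IEqualityComparer<TKey>, and a tuple works as a composite key. The following is a small sketch that assumes .NET 6 or later; the Person record, the class name and the Demo helper are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model type; records require C# 9 or later.
public record Person(int Id, string Name);

public static class BuiltInDistinctBySketch
{
    public static (int byTuple, int byNameIgnoreCase) Demo()
    {
        var personList = new List<Person>
        {
            new Person(1, "Test1"),
            new Person(1, "Test1"),
            new Person(2, "test1"),
        };

        // Composite key: value tuples compare their components for equality.
        int byTuple = personList.DistinctBy(x => (x.Id, x.Name)).Count();

        // Overload with a key comparer: names that differ only by case collapse.
        int byNameIgnoreCase = personList
            .DistinctBy(x => x.Name, StringComparer.OrdinalIgnoreCase)
            .Count();

        return (byTuple, byNameIgnoreCase);
    }
}
```

In this sample the tuple key leaves two elements, while the case-insensitive name key leaves one.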
Underplay answered 1/6, 2021 at 12:6 Comment(1)
keeping up with latest .NETAllium
P
67

Solution: first group by your fields, then select the FirstOrDefault item of each group.

List<Person> distinctPeople = allPeople
.GroupBy(p => p.PersonId)
.Select(g => g.FirstOrDefault())
.ToList();
Possessive answered 13/7, 2017 at 8:33 Comment(0)
R
35

You can do this with the standard Linq.ToLookup(). This will create a collection of values for each unique key. Just select the first item in the collection

Persons.ToLookup(p => p.Id).Select(coll => coll.First());
Reflect answered 20/1, 2015 at 15:1 Comment(0)
E
21

The following code is functionally equivalent to Jon Skeet's answer.

Tested on .NET 4.5, should work on any earlier version of LINQ.

public static IEnumerable<TSource> DistinctBy<TSource, TKey>(
  this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
  HashSet<TKey> seenKeys = new HashSet<TKey>();
  return source.Where(element => seenKeys.Add(keySelector(element)));
}

Incidentally, check out Jon Skeet's latest version of DistinctBy.cs on Google Code.

Update 2022-04-03

Based on a comment by Andrew McClement, it is best to take Jon Skeet's answer over this one.

Elonore answered 6/2, 2013 at 19:56 Comment(3)
This gave me a "sequence has no values error", but Skeet's answer produced the correct result.Morea
To clarify why this is not equivalent to Jon Skeet's answer - the difference only happens if you reuse the same enumerable. If you reuse the enumerable from this answer, the HashSet is already filled, so no elements are returned (all keys have been seen). For Skeet's answer, since it uses yield return, it creates a new HashSet every time the enumerable is iterated.Okeechobee
@AndrewMcClement Agree. Updated answer.Elonore
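The difference described in the comments above can be reproduced with a short sketch that enumerates each result twice. The method names DistinctByIterator and DistinctByWhere are chosen here only to tell the two variants apart; they are not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReenumerationSketch
{
    // Iterator block: the HashSet is created each time enumeration starts.
    public static IEnumerable<T> DistinctByIterator<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var seenKeys = new HashSet<TKey>();
        foreach (T element in source)
        {
            if (seenKeys.Add(keySelector(element)))
            {
                yield return element;
            }
        }
    }

    // Where + captured HashSet: the set is created once, when the method is
    // called, and is shared by every later enumeration of the result.
    public static IEnumerable<T> DistinctByWhere<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var seenKeys = new HashSet<TKey>();
        return source.Where(element => seenKeys.Add(keySelector(element)));
    }

    // Counts each sequence twice to expose the shared state.
    public static int[] Demo()
    {
        var ids = new[] { 1, 1, 2 };
        var viaIterator = ids.DistinctByIterator(x => x);
        var viaWhere = ids.DistinctByWhere(x => x);

        return new[]
        {
            viaIterator.Count(), viaIterator.Count(),
            viaWhere.Count(), viaWhere.Count(),
        };
    }
}
```

Demo() yields 2, 2, 2, 0: the second pass over the Where-based sequence finds every key already in the set. The same effect would plausibly explain a "sequence has no values" style error when a query is first checked with Any() or Count() and then read with First().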
N
13

I've written an article that explains how to extend the Distinct function so that you can do as follows:

var people = new List<Person>();

people.Add(new Person(1, "a", "b"));
people.Add(new Person(2, "c", "d"));
people.Add(new Person(1, "a", "b"));

foreach (var person in people.Distinct(p => p.ID))
{
    // Do stuff with unique list here.
}

Here's the article (now in the Web Archive): Extending LINQ - Specifying a Property in the Distinct Function

Norvil answered 11/3, 2009 at 12:21 Comment(2)
Your article has an error, there should be a <T> after Distinct: public static IEnumerable<T> Distinct(this... Also it does not look like it will work (nicely) on more than one property, i.e. a combination of first and last names.Brunhilda
Please, don't post the relevant information in external link, an answer must stand on its own. It's ok to post the link, but please, copy the relevant info to the answer itself. You only posted an usage example, but without the external resource it's useless.Underplay
S
10

Personally I use the following class:

public class LambdaEqualityComparer<TSource, TDest> : 
    IEqualityComparer<TSource>
{
    private Func<TSource, TDest> _selector;

    public LambdaEqualityComparer(Func<TSource, TDest> selector)
    {
        _selector = selector;
    }

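    public bool Equals(TSource obj, TSource other)
    {
        // Use the default comparer so that a null key does not throw.
        return EqualityComparer<TDest>.Default.Equals(_selector(obj), _selector(other));
    }

    public int GetHashCode(TSource obj)
    {
        return EqualityComparer<TDest>.Default.GetHashCode(_selector(obj));
    }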
}

Then, an extension method:

public static IEnumerable<TSource> Distinct<TSource, TCompare>(
    this IEnumerable<TSource> source, Func<TSource, TCompare> selector)
{
    return source.Distinct(new LambdaEqualityComparer<TSource, TCompare>(selector));
}

Finally, the intended usage:

var dates = new List<DateTime>() { /* ... */ };
var distinctYears = dates.Distinct(date => date.Year);

The advantage I found using this approach is the re-usage of LambdaEqualityComparer class for other methods that accept an IEqualityComparer. (Oh, and I leave the yield stuff to the original LINQ implementation...)
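To illustrate the reuse point, here is a sketch in which one comparer instance serves both Distinct and Intersect. The class is restated so the block compiles on its own; it delegates to EqualityComparer<TDest>.Default, which is a choice made for this sketch so that null keys are tolerated, and the ComparerReuseSketch name and the sample dates are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Restated from the answer so this block is self-contained.
public class LambdaEqualityComparer<TSource, TDest> : IEqualityComparer<TSource>
{
    private readonly Func<TSource, TDest> _selector;

    public LambdaEqualityComparer(Func<TSource, TDest> selector)
    {
        _selector = selector;
    }

    public bool Equals(TSource obj, TSource other)
    {
        return EqualityComparer<TDest>.Default.Equals(_selector(obj), _selector(other));
    }

    public int GetHashCode(TSource obj)
    {
        return EqualityComparer<TDest>.Default.GetHashCode(_selector(obj));
    }
}

public static class ComparerReuseSketch
{
    public static (int distinctYears, int sharedYears) Demo()
    {
        var byYear = new LambdaEqualityComparer<DateTime, int>(d => d.Year);

        var first = new[]
        {
            new DateTime(2020, 1, 1), new DateTime(2020, 6, 1), new DateTime(2021, 1, 1),
        };
        var second = new[]
        {
            new DateTime(2021, 12, 31), new DateTime(2022, 3, 3),
        };

        // The same comparer works with any LINQ operator that accepts one.
        int distinctYears = first.Distinct(byYear).Count();         // years 2020 and 2021
        int sharedYears = first.Intersect(second, byYear).Count();  // year 2021 only

        return (distinctYears, sharedYears);
    }
}
```

The same instance could also be handed to a HashSet<DateTime> or Dictionary constructor, which is where a comparer class has an edge over a one-off extension method.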

Sepalous answered 30/10, 2015 at 18:59 Comment(0)
C
10

You can use DistinctBy() for getting Distinct records by an object property. Just add the following statement before using it:

using Microsoft.Ajax.Utilities;

and then use it like following:

var listToReturn = responseList.DistinctBy(x => x.Index).ToList();

where 'Index' is the property on which I want the data to be distinct.

Cacoepy answered 27/3, 2019 at 6:4 Comment(1)
See this answer for where to find the dll.Sanctimony
C
5

You can do it (albeit not lightning-quickly) like so:

people.Where(p => !people.Any(q => (p != q && p.Id == q.Id)));

That is, "select all people where there isn't another different person in the list with the same ID."

Mind you, in your example, that would just select person 3. I'm not sure how to tell which you want, out of the previous two.
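A quick sketch with the three people from the question shows that behaviour. The Person class and the UniqueOnlySketch name are assumptions for illustration; p != q is a reference comparison here because Person is a class that does not overload the operator.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model type, shaped like the Person in the question.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class UniqueOnlySketch
{
    public static List<string> Demo()
    {
        var people = new List<Person>
        {
            new Person { Id = 1, Name = "Test1" },
            new Person { Id = 1, Name = "Test1" },
            new Person { Id = 2, Name = "Test2" },
        };

        // Keeps only people whose Id is not shared with any other instance.
        // Every element is compared against the whole list, so this is O(n^2).
        return people
            .Where(p => !people.Any(q => p != q && p.Id == q.Id))
            .Select(p => p.Name)
            .ToList();
    }
}
```

Only "Test2" survives, so this is a "remove every duplicated Id" filter rather than a distinct-by.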

Coronado answered 28/1, 2009 at 20:47 Comment(0)
S
5

In case you need a Distinct method on multiple properties, you can check out my PowerfulExtensions library. It is currently at a very early stage, but you can already use methods like Distinct, Union, Intersect, Except on any number of properties.

This is how you use it:

using PowerfulExtensions.Linq;
...
var distinct = myArray.Distinct(x => x.A, x => x.B);
Sibyl answered 15/8, 2013 at 20:20 Comment(0)
U
5

When we faced such a task in our project we defined a small API to compose comparators.

So, the use case was like this:

var wordComparer = KeyEqualityComparer.Null<Word>().
    ThenBy(item => item.Text).
    ThenBy(item => item.LangID);
...
source.Select(...).Distinct(wordComparer);

And the API itself looks like this:

using System;
using System.Collections;
using System.Collections.Generic;

public static class KeyEqualityComparer
{
    public static IEqualityComparer<T> Null<T>()
    {
        return null;
    }

    public static IEqualityComparer<T> EqualityComparerBy<T, K>(
        this IEnumerable<T> source,
        Func<T, K> keyFunc)
    {
        return new KeyEqualityComparer<T, K>(keyFunc);
    }

    public static KeyEqualityComparer<T, K> ThenBy<T, K>(
        this IEqualityComparer<T> equalityComparer,
        Func<T, K> keyFunc)
    {
        return new KeyEqualityComparer<T, K>(keyFunc, equalityComparer);
    }
}

public struct KeyEqualityComparer<T, K>: IEqualityComparer<T>
{
    public KeyEqualityComparer(
        Func<T, K> keyFunc,
        IEqualityComparer<T> equalityComparer = null)
    {
        KeyFunc = keyFunc;
        EqualityComparer = equalityComparer;
    }

    public bool Equals(T x, T y)
    {
        return ((EqualityComparer == null) || EqualityComparer.Equals(x, y)) &&
                EqualityComparer<K>.Default.Equals(KeyFunc(x), KeyFunc(y));
    }

    public int GetHashCode(T obj)
    {
        var hash = EqualityComparer<K>.Default.GetHashCode(KeyFunc(obj));

        if (EqualityComparer != null)
        {
            var hash2 = EqualityComparer.GetHashCode(obj);

            hash ^= (hash2 << 5) + hash2;
        }

        return hash;
    }

    public readonly Func<T, K> KeyFunc;
    public readonly IEqualityComparer<T> EqualityComparer;
}

More details are on our site: IEqualityComparer in LINQ.

Untruthful answered 10/7, 2014 at 21:0 Comment(0)
M
4

If you don't want to add the MoreLinq library to your project just to get the DistinctBy functionality then you can get the same end result using the overload of Linq's Distinct method that takes in an IEqualityComparer argument.

You begin by creating a generic custom equality comparer class that uses lambda syntax to perform custom comparison of two instances of a generic class:

public class CustomEqualityComparer<T> : IEqualityComparer<T>
{
    Func<T, T, bool> _comparison;
    Func<T, int> _hashCodeFactory;

    public CustomEqualityComparer(Func<T, T, bool> comparison, Func<T, int> hashCodeFactory)
    {
        _comparison = comparison;
        _hashCodeFactory = hashCodeFactory;
    }

    public bool Equals(T x, T y)
    {
        return _comparison(x, y);
    }

    public int GetHashCode(T obj)
    {
        return _hashCodeFactory(obj);
    }
}

Then in your main code you use it like so:

Func<Person, Person, bool> areEqual = (p1, p2) => int.Equals(p1.Id, p2.Id);

Func<Person, int> getHashCode = (p) => p.Id.GetHashCode();

var query = people.Distinct(new CustomEqualityComparer<Person>(areEqual, getHashCode));

Voila! :)

The above assumes the following:

  • Property Person.Id is of type int
  • The people collection does not contain any null elements

If the collection could contain nulls then simply rewrite the lambdas to check for null, e.g.:

Func<Person, Person, bool> areEqual = (p1, p2) => 
{
    return (p1 != null && p2 != null) ? int.Equals(p1.Id, p2.Id) : false;
};

EDIT

This approach is similar to the one in Vladimir Nesterovsky's answer but simpler.

It is also similar to the one in Joel's answer but allows for complex comparison logic involving multiple properties.

However, if your objects can only ever differ by Id then another user gave the correct answer that all you need to do is override the default implementations of GetHashCode() and Equals() in your Person class and then just use the out-of-the-box Distinct() method of Linq to filter out any duplicates.

Motivate answered 22/8, 2016 at 17:45 Comment(1)
I want to get only unique items in dictonary, Can you please help, I am using this code If TempDT IsNot Nothing Then m_ConcurrentScriptDictionary = TempDT.AsEnumerable.ToDictionary(Function(x) x.SafeField(fldClusterId, NULL_ID_VALUE), Function(y) y.SafeField(fldParamValue11, NULL_ID_VALUE))Odeen
A
4

Override Equals(object obj) and GetHashCode() methods:

class Person
{
    public int Id { get; set; }
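    public string Name { get; set; }

    public override bool Equals(object obj)
    {
        // Pattern match so that null or a non-Person argument returns false
        // instead of throwing.
        return obj is Person other && other.Id == Id;
        // or:
        // return obj is Person o && o.Id == Id && o.Name == Name;
    }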
    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }
}

and then just call:

List<Person> distinctList = new[] { person1, person2, person3 }.Distinct().ToList();
Argyle answered 27/9, 2018 at 20:31 Comment(3)
However GetHashCode() should be more advanced (to count also the Name), this answer is probably best by my opinion. Actually, to archive the target logic, there no need to override the GetHashCode(), Equals() is enough, but if we need performance, we have to override it. All comparison algs, first check hash, and if they are equal then call Equals().Talbott
Also, there in Equals() the first line should be "if (!(obj is Person)) return false". But best practice is to use separate object casted to a type, like "var o = obj as Person;if (o==null) return false;" then check equality with o without castingTalbott
Overriding Equals like this is not a good idea as it could have unintended consequences for other programmers expecting the Person's Equality to be determined on more than a single property.Convertiplane
F
2

The best way to do this that will be compatible with other .NET versions is to override Equals and GetHashCode to handle this (see the Stack Overflow question "This code returns distinct values. However, what I want is to return a strongly typed collection as opposed to an anonymous type"), but if you need something that is generic throughout your code, the solutions in this article are great.

Farquhar answered 21/10, 2013 at 0:47 Comment(0)
S
1
List<Person> lst = new List<Person>();
var result1 = lst.OrderByDescending(a => a.ID).Select(a => new Player { ID = a.ID, Name = a.Name }).Distinct();
Subjugate answered 16/5, 2016 at 10:42 Comment(1)
Did you mean to Select() new Person instead of new Player? The fact that you are ordering by ID doesn't somehow inform Distinct() to use that property in determining uniqueness, though, so this won't work.Molini
F
0

You should be able to override Equals on person to actually do Equals on Person.id. This ought to result in the behavior you're after.

Fidget answered 28/1, 2009 at 20:49 Comment(1)
I wouldn't recommend this approach. While it might work in this specific case, it's simply bad practice. What if he wants to distinct by a different property somewhere else? For sure he can't override Equals twice, can he? :-) Apart from that, it's fundamentally wrong to override equals for this purpose, since it's meant to tell whether two objects are equal or not. If the classes condition for equality changes for any reason, you will burn your fingers for sure...Charlatanism
M
0

If you use an older .NET version, where the extension method is not built in, then you may define your own extension method:

public static class EnumerableExtensions
{
    public static IEnumerable<T> DistinctBy<T, TKey>(this IEnumerable<T> enumerable, Func<T, TKey> keySelector)
    {
        return enumerable.GroupBy(keySelector).Select(grp => grp.First());
    }
}

Example of usage:

var personsDist = persons.DistinctBy(item => item.Name);
Manhole answered 28/7, 2021 at 20:6 Comment(3)
How does this improve the accepted answer that offers the same extension method, slightly differently implemented?Sackcloth
It's shorter at least. And it's not slightly, it's differently implemented.Manhole
And not better. The accepted answer is much better. Why offer an inferior solution? New answers to old questions are supposed to be significant improvements to what's already there.Sackcloth
O
0

Definitely not the most efficient, but for those who are looking for a short and simple answer:

list.Select(x => x.Id).Distinct().Select(x => list.First(y => x == y.Id)).ToList();
Oskar answered 19/2, 2023 at 20:21 Comment(0)
D
-4

Please try the code below.

var Item = GetAll().GroupBy(x => x.Id).ToList();
Dyadic answered 16/7, 2018 at 5:26 Comment(1)
A short answer is welcome, however it won't provide much value to the latter users who are trying to understand what's going on behind the problem. Please spare some time to explain what's the real issue to cause the problem and how to solve. Thank you ~Venus
