Sort a list from another list IDs
Asked Answered
M

5

225

I have a list with some identifiers like this:

List<long> docIds = new List<long>() { 6, 1, 4, 7, 2 };

Morover, I have another list of <T> items, which are represented by the ids described above.

List<T> docs = GetDocsFromDb(...)

I need to keep the same order in both collections, so that the items in List<T> must be in the same position than in the first one (due to search engine scoring reasons). And this process cannot be done in the GetDocsFromDb() function.

If necessary, it's possible to change the second list into some other structure (Dictionary<long, T> for example), but I'd prefer not to change it.

Is there any simple and efficient way to do this "ordenation depending on some IDs" with LINQ?

Money answered 7/3, 2013 at 15:36 Comment(2)
are you assured that every docId occurs exactly once in docs, what property will hold the Id or will a selector Func<T, long> be required?Maneating
Does the first list represent a "master list"? Another words, will the second list be a subset representing a portion (or the entirety) of the first list?Noninterference
W
481
docs = docs.OrderBy(d => docsIds.IndexOf(d.Id)).ToList();
Whisler answered 7/3, 2013 at 15:42 Comment(7)
@Kaf thats why I upvoted too, does rely on knowing the document ID property is called Id. Its not specified in the question.Maneating
@BorjaLópez, a quick note. You mention efficiency in your question. IndexOf is perfectly acceptable for your example and nice and simple. If you had a lot of data my answer might be better suited. #3663514Maneating
I've benchmarked this method and the one with the Dictionary (see below) and it's almost twice as fast.Antimagnetic
@DenysDenysenko Fantastic. Thanks so much; exactly what I was looking for.Arista
does not work if you have elements in docs that do not have ids in the ordering listOctavo
Quite inefficient - IndexOf is being called for every element in the source collection and OrderBy has to order the elements. The solution by @Maneating is much faster.Scifi
@DanHunex if you need the non existing ids to appear at the end of the list you can do: .OrderBy(d=> { var index = docIds.IndexOf(d.Id); if (index == -1) index = docIds.Count; return index; })Late
M
52

Since you don't specify T,

public static IEnumerable<T> OrderBySequence<T, TId>(
       this IEnumerable<T> source,
       IEnumerable<TId> order,
       Func<T, TId> idSelector)
{
    var lookup = source.ToDictionary(idSelector, t => t);
    foreach (var id in order)
    {
        yield return lookup[id];
    }
}

Is a generic extension for what you want.

You could use the extension like this perhaps,

var orderDocs = docs.OrderBySequence(docIds, doc => doc.Id);

A safer version might be

public static IEnumerable<T> OrderBySequence<T, TId>(
       this IEnumerable<T> source,
       IEnumerable<TId> order,
       Func<T, TId> idSelector)
{
    var lookup = source.ToLookup(idSelector, t => t);
    foreach (var id in order)
    {
        foreach (var t in lookup[id])
        {
           yield return t;
        }
    }
}

which will work if source does not zip exactly with order.

Maneating answered 7/3, 2013 at 15:53 Comment(4)
I used this solution and it worked. Just that, i had to make the method static and the class static.Fifine
10 years old (almost) and still a wonderful solution. tnx (likewise had to embed in a static helper class but does offer great reuse)Shend
@jimtollan, Nice of you to say so. I guess it is also a good reflection on Stack Overflow. You should bear in mind that Join works too. Sort a list from another list IDsManeating
+1 more versatile, e.g : EmployeeList.OrderBySequence(HospitalList.OrderBy(h => h.Position).Select(h => h.Id), e => e.HospitalId) gives : employees order by their hospital itself order by its positionHeaps
T
32

Jodrell's answer is best, but actually he reimplemented System.Linq.Enumerable.Join. Join also uses Lookup and keeps ordering of source.

    docIds.Join(
      docs,
      i => i,
      d => d.Id,
      (i, d) => d);
Trevar answered 12/4, 2019 at 11:11 Comment(2)
That is the answer we are looking forSurge
This just proves that Join is too difficult to understand since everyone agreed that rewriting it was easier.Tomlinson
A
0

I think the given solution is incorrect because it loses all items that are not included in the order collection. The correct solution would be:

public static IEnumerable<T> OrderBySequence<T, TId>(this IEnumerable<T> source, IEnumerable<TId> order, Func<T, TId> idSelector)
{
    var lookup = source.ToLookup(idSelector, t => t);
    var included = new HashSet<TId>();

    foreach (var id in order)
    {
        if (!lookup.Contains(id)) continue;

        foreach (var t in lookup[id])
        {
            yield return t;
            included.Add(id);
        }
    }

    foreach (var item in source)
    {
        if (!included.Contains(idSelector(item)))
        {
            yield return item;
        }
    }
}
Annadiane answered 4/4, 2024 at 8:56 Comment(0)
C
-4

One simple approach is to zip with the ordering sequence:

List<T> docs = GetDocsFromDb(...).Zip(docIds, Tuple.Create)
               .OrderBy(x => x.Item2).Select(x => x.Item1).ToList();
Concave answered 7/3, 2013 at 15:44 Comment(3)
Because Zip combines each index (into a Tuple) with the document in the same position in the corresponding list. Then the OrderBy sorts the Tuples by the index part and then the select digs our just the docs from the now ordered list.Concave
but, the result of GetDocsFromDb is unordered so you'll be creating Tuples where Item1 is unrelated to Item2.Maneating
I think this will produce incorrect results since ordering performing uppon id and not the index.Antimagnetic

© 2022 - 2025 — McMap. All rights reserved.