Sort nested dictionary by value, and remainder by another value, in Python
W

5

53

Consider this dictionary format.

{'KEY1':{'name':'google','date':20100701,'downloads':0},
 'KEY2':{'name':'chrome','date':20071010,'downloads':0},
 'KEY3':{'name':'python','date':20100710,'downloads':100}}

I'd like the dictionary sorted by downloads first, and then all items with no downloads sorted by date. Obviously a dictionary cannot be sorted; I just need a sorted list of keys I can iterate over.

['KEY3','KEY1','KEY2']

I can already sort the list by either value using sorted, but how do I sort by second value too?

Whoopee answered 5/11, 2010 at 22:35 Comment(0)
D
74

Use the key argument for sorted(). It lets you specify a function that, given the item being sorted, returns the value to sort by. If that value is a tuple, the items are ordered the way tuples compare: by the first element, and then by the second.

sorted(your_list, key=lambda x: (your_dict[x]['downloads'], your_dict[x]['date']))
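As a minimal sketch of how the tuple key plays out on the question's data: the name `your_dict` follows the answer, `sort_key` is just an illustrative helper name, and iterating the dictionary directly yields its keys, so no separate `your_list` is needed here.

```python
your_dict = {
    'KEY1': {'name': 'google', 'date': 20100701, 'downloads': 0},
    'KEY2': {'name': 'chrome', 'date': 20071010, 'downloads': 0},
    'KEY3': {'name': 'python', 'date': 20100710, 'downloads': 100},
}

def sort_key(k):
    # Tuples compare element by element: downloads first, date breaks ties.
    return (your_dict[k]['downloads'], your_dict[k]['date'])

ascending = sorted(your_dict, key=sort_key)
descending = sorted(your_dict, key=sort_key, reverse=True)

print(ascending)   # ['KEY2', 'KEY1', 'KEY3']
print(descending)  # ['KEY3', 'KEY1', 'KEY2']
```

With reverse=True the result happens to match the order the question asks for, because both the download count and the date are then sorted from highest to lowest.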
Diplostemonous answered 5/11, 2010 at 22:43 Comment(6)
Very nice line! Here is how you can reverse the order of the sorting: just add reverse=True as a parameter to the sorted function, at the end. Thanks - Fantastic
Change to the above one. your_list = sorted(your_dict, key=lambda x: (your_dict[x]['downloads'], your_dict[x]['date'])) - Blimey
Hi, I am getting an error TypeError: unhashable type: 'dict' when calling like: sorted(d.items(), key=lambda x: (d[x]['downloads'], d[x]['date'])), where d is equal to the dictionary in the OP. Please if anyone knows the problem, let me know. Thanks - Henhouse
@Henhouse use d.keys() not d.items() - Diplostemonous
This just returns keys, is there a way to get keys and the items you're sorting by? For example, is there a way to do sorted((d.keys(), lambda y: d[y]['downloads']), key=lambda x: (d[x]['downloads']) that's actually correct? That code as is is throwing me the same TypeError @Henhouse got - Sexennial
sorted out. Thx. - Hokku
T
21

You can pass a key function to sorted which returns a tuple containing the two things you wish to sort on. Assuming that your big dictionary is called d:

def keyfunc(tup):
    key, d = tup
    return d["downloads"], d["date"]

items = sorted(d.items(), key = keyfunc)

You can do this with a lambda if you prefer, but this is probably more clear. Here's the equivalent lambda-based code:

items = sorted(d.items(), key = lambda tup: (tup[1]["downloads"], tup[1]["date"]))

Incidentally, since you mentioned that you wanted to sort by "downloads" first, the above two examples sort according to download counts in ascending order. However, from context it sounds like you might want to sort in decreasing order of downloads, in which case you'd say

return -d["downloads"], d["date"]

in your keyfunc. If you wanted something like sorting in descending order for non-zero download numbers, then having all zero-download records after that, you could say something like the following (using sys.maxsize, since Python 2's sys.maxint was removed in Python 3)

return (-d["downloads"] or sys.maxsize), d["date"]
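A rough sketch of that last variant on the question's data, assuming Python 3, where `sys.maxsize` stands in for the older `sys.maxint`. The inner variable is called `record` here only to avoid shadowing the outer dictionary `d`.

```python
import sys

d = {
    'KEY1': {'name': 'google', 'date': 20100701, 'downloads': 0},
    'KEY2': {'name': 'chrome', 'date': 20071010, 'downloads': 0},
    'KEY3': {'name': 'python', 'date': 20100710, 'downloads': 100},
}

def keyfunc(tup):
    key, record = tup
    # Negating downloads orders non-zero counts from highest to lowest.
    # -0 is falsy, so zero-download records fall back to sys.maxsize
    # and land after everything else, ordered by ascending date.
    return (-record["downloads"] or sys.maxsize), record["date"]

items = sorted(d.items(), key=keyfunc)
keys = [key for key, record in items]
print(keys)  # ['KEY3', 'KEY2', 'KEY1']
```

Here the zero-download records come out oldest first. To list them newest first, as in the question's expected output, one option would be to negate the date in the returned tuple as well.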
Theatricalize answered 5/11, 2010 at 22:42 Comment(2)
I tried both, but instead of a list of keys I got a list of tuples. I can use this too but it's unnecessary complexity. [('KEY3',{'name':'python','date':20100710,'downloads':100})] - Whoopee
@Whoopee You can get the keys from the tuples with [x[0] for x in items]. - Conservation
P
3
your_dict = dict(sorted(your_dict.items(), key = lambda x: (x[1]["downloads"], x[1]["date"])))
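A brief sketch of why this one-liner works, assuming Python 3.7 or later, where `dict` preserves insertion order: `sorted` returns a list of `(key, value)` pairs, and `dict` rebuilds a dictionary from them in that order.

```python
your_dict = {
    'KEY1': {'name': 'google', 'date': 20100701, 'downloads': 0},
    'KEY2': {'name': 'chrome', 'date': 20071010, 'downloads': 0},
    'KEY3': {'name': 'python', 'date': 20100710, 'downloads': 100},
}

# x is a (key, value) pair, so x[1] is the inner dictionary.
your_dict = dict(sorted(your_dict.items(),
                        key=lambda x: (x[1]["downloads"], x[1]["date"])))

# Iterating the rebuilt dictionary now follows the sorted order.
print(list(your_dict))  # ['KEY2', 'KEY1', 'KEY3']
```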
Posset answered 21/5, 2021 at 3:15 Comment(1)
While this answer may answer the question it would be good if you explained the code a bit, e.g. why it is built that way, how it works etc. - Chaille
T
2

My other answer was wrong (as are most of the answers here)

# Inside each lambda, use its own argument x (not the generator's variable key).
sorted_keys = sorted((key for key in outer_dict if outer_dict[key]['downloads']),
                     key=lambda x: (outer_dict[x]['downloads'],
                                    outer_dict[x]['downloads']),
                     reverse=True)

sorted_keys += sorted((key for key in outer_dict if not outer_dict[key]['downloads']),
                      key=lambda x: outer_dict[x]['date'])

This will create a list with the items that have been downloaded sorted in descending order at the front of it and the rest of the items that have not been downloaded sorted by date after those that have.

But actually, the last part of Eli Courtwright's answer is the best.

Theriot answered 5/11, 2010 at 22:42 Comment(3)
Er, why are you import-ing operator and then never using it? - Diplostemonous
@Diplostemonous because I thought I was going to do something else and then it started raining. I was in a rush. Inside now. - Theriot
Without having tested it. Why a tuple of two outer_dict[key]['downloads'] after the first lambda x:? - Conservation
M
1
a = {'KEY1':{'name':'google','date':20100701,'downloads':0},
 'KEY2':{'name':'chrome','date':20071010,'downloads':0},
 'KEY3':{'name':'python','date':20100710,'downloads':100}}


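# In Python 3, items() returns a view with no sort() method, so copy it into a list.
z = list(a.items())

# list.sort() sorts z in place and returns None; use z itself afterwards.
z.sort(key=lambda x: (x[1]['downloads'], x[1]['date']))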
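A small sketch of the difference between sorting in place and building a new list, under the assumption of Python 3 (where `dict.items()` is a view, so it is copied into a list first). The names `result`, `in_place_keys` and `new_list` are only illustrative.

```python
a = {
    'KEY1': {'name': 'google', 'date': 20100701, 'downloads': 0},
    'KEY2': {'name': 'chrome', 'date': 20071010, 'downloads': 0},
    'KEY3': {'name': 'python', 'date': 20100710, 'downloads': 100},
}

def by_downloads_then_date(pair):
    # pair is a (key, record) tuple taken from a.items()
    return (pair[1]['downloads'], pair[1]['date'])

z = list(a.items())
result = z.sort(key=by_downloads_then_date)  # sorts z itself
print(result)  # None

in_place_keys = [key for key, record in z]
print(in_place_keys)  # ['KEY2', 'KEY1', 'KEY3']

# sorted() leaves its input alone and returns a new list instead.
new_list = sorted(a.items(), key=by_downloads_then_date)
print(new_list == z)  # True
```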
Marcellus answered 5/11, 2010 at 22:42 Comment(1)
z.sort(key=lambda x: (x[1]['downloads'], x[1]['date'])) returns None. Any idea why? - Henhouse