Sort a list of tuples by 2nd item (integer value) [duplicate]
C

9

549

I have a list of tuples that looks something like this:

[('abc', 121),('abc', 231),('abc', 148), ('abc',221)]

I want to sort this list in ascending order by the integer value inside the tuples. Is it possible?

Chilton answered 22/5, 2012 at 2:48 Comment(1)
I just found that if no key is given, Python's sorted() function sorts tuples by the first value, then by the second value for ties. For example, data = [('zbc', 121),('abc', 231),('gbc', 148), ('abc',221)] print(sorted(data)) produces this result; note that the second value is sorted as well among tuples that share a first value: [('abc', 221), ('abc', 231), ('gbc', 148), ('zbc', 121)]Galvan
K
853

Try using the key keyword argument of sorted(), which sorts in increasing order by default:

sorted(
    [('abc', 121), ('abc', 231), ('abc', 148), ('abc', 221)], 
    key=lambda x: x[1]
)

key should be a function that identifies how to retrieve the comparable element from your data structure. In your case, it is the second element of the tuple, so we access [1].

For optimization, see jamylak's response using operator.itemgetter(1), which is essentially a faster version of lambda x: x[1].

Kiri answered 22/5, 2012 at 2:51 Comment(6)
Perhaps obvious, but sorted() does not sort in place, so assign the result: sorted_list = sorted([('abc', 121),('abc', 231),('abc', 148), ('abc',221)], key=lambda x: x[1])Hadj
Add reverse=True to sort from biggest to smallest.Iamb
This still works well with Python 3.7.Orel
You can also sort on multiple keys by returning a tuple from the key function. If you want one of them reversed, negate it (this works for numeric values). This sorts by the first element ascending, then by the second element descending: sorted(some_list, key=lambda x: (x[0], -x[1]))Laryssa
What's gonna happen in above case if we don't provide any key?Marketable
I just wanted to say this is my most visited stackoverflow page of all time; i've been here like easily 500 times by now. Thank you cheeken, if only i could memorize this one line of code.Apologete
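The multi-key idea from the comments above can be sketched as follows. The sample data and the names `records` and `ordered` are hypothetical, chosen only so that the first elements differ and the effect of the negated second key is visible.

```python
# Hypothetical sample data: (name, score) pairs.
records = [('bob', 10), ('alice', 30), ('bob', 25), ('alice', 5)]

# Ascending by name, then descending by score within each name.
# Negation only works for numeric fields.
ordered = sorted(records, key=lambda r: (r[0], -r[1]))

print(ordered)
# [('alice', 30), ('alice', 5), ('bob', 25), ('bob', 10)]
```

For non-numeric fields, negation is not available; one common alternative is to sort twice, relying on the stability of Python's sort.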
A
237
>>> from operator import itemgetter
>>> data = [('abc', 121),('abc', 231),('abc', 148), ('abc',221)]
>>> sorted(data,key=itemgetter(1))
[('abc', 121), ('abc', 148), ('abc', 221), ('abc', 231)]

IMO using itemgetter is more readable in this case than the solution by @cheeken. It is also faster, since almost all of the computation is done on the C side (no pun intended) rather than through a Python-level lambda call.

>python -m timeit -s "from operator import itemgetter; data = [('abc', 121),('abc', 231),('abc', 148), ('abc',221)]" "sorted(data,key=itemgetter(1))"
1000000 loops, best of 3: 1.22 usec per loop

>python -m timeit -s "data = [('abc', 121),('abc', 231),('abc', 148), ('abc',221)]" "sorted(data,key=lambda x: x[1])"
1000000 loops, best of 3: 1.4 usec per loop
Astonishing answered 22/5, 2012 at 2:51 Comment(9)
+1 I agree that itemgetter() is a better solution. However, I thought a lambda expression would make it clearer how key functions.Kiri
+1 However, when I ran your speed test, I noticed by 'human eye' that the one that is supposed to be faster, and measured faster, was actually noticeably slower. I scratched my head on this for a bit, then took the Python timeit module out of play and just used Linux time, i.e. time `python -c "the code"`. Then I got the 'human-eye' results that you spell out, as well as sys clock times that were faster. Still not sure why this is, but it was reproducible. I gather it has something to do with the overhead of loading the modules, but it still does not quite make sense to me just yet.Confectioner
@JeffSheffield: Notice that jamylak is doing the import in the setup code (outside the timing), not the tested code. That's perfectly reasonable, because most programs will need to sort more than once, or need to sort much larger collections, but they'll only do the import once. (And for those programs that only need to do one smallish sort ever… well, you're talking about a difference of under a microsecond, so who cares either way?)Bondage
@Bondage FYI: jamylak is doing the import inside of the python -m timeit -s, but yes, I think you are right that in a production scenario you only pay that library load penalty once. As for who cares about that microsecond: you care because the assumption is that your data is going to get quite large, and that microsecond is going to turn into real seconds once the data set grows.Confectioner
@JeffSheffield: That's exactly the point: the cost of the import will not grow with the data, so even if it seems like a large part of the 1us you're paying for one smallish sort, it's going to be an irrelevant part of the 500ms you pay for a big sort, or a bunch of small sorts.Bondage
x = [[[5,3],1.0345],[[5,6],5.098],[[5,4],4.89],[[5,1],5.97]] With a list like this, can we sort using itemgetter() with respect to the elements at x[0][1]?Eaves
@Eaves I'm not sure if that can be done but I'm pretty sure that even if itemgetter could, lambda (solution above) would be clearer and hence more pythonic in that case. In your case though, (x[0] being the same for x's all elements) a simple sorted(x) will give you desired order. So, that, probably with a comment would be the most pythonic statement.Sartor
I extended the data to 48 elements and timed it in Jupyter. The results are 5.19 µs ± 42.6 ns per loop for %timeit sorted(data, key=itemgetter(1)) and 6.7 µs ± 63.6 ns per loop for %timeit sorted(data, key=lambda x: x[1]). So itemgetter is still faster.Abysm
@LouisYang Thanks for sharing those results. It confirms what I expectedAstonishing
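For the nested-list question raised in the comments above, a lambda that indexes one level deeper is probably the simplest option. This is only a sketch; the name `by_inner` is illustrative and not from the thread.

```python
x = [[[5, 3], 1.0345], [[5, 6], 5.098], [[5, 4], 4.89], [[5, 1], 5.97]]

# itemgetter(0) would return the whole inner list; the lambda picks
# the second value of that inner list as the sort key.
by_inner = sorted(x, key=lambda item: item[0][1])

print(by_inner)
# [[[5, 1], 5.97], [[5, 3], 1.0345], [[5, 4], 4.89], [[5, 6], 5.098]]
```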
G
51

Adding to Cheeken's answer, this is how you sort a list of tuples by the 2nd item in descending order:

sorted([('abc', 121),('abc', 231),('abc', 148), ('abc',221)],key=lambda x: x[1], reverse=True)
Goodin answered 29/3, 2015 at 17:50 Comment(1)
Note that the original list will not be changed. The sorted() function just produces a new, sorted list.Oxidase
H
45

As a python neophyte, I just wanted to mention that if the data did actually look like this:

data = [('abc', 121),('abc', 231),('abc', 148), ('abc',221)]

then sorted() with no key would effectively sort by the second element in the tuple, because tuples are compared element by element and the first elements are all identical.
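A small sketch of that default behaviour, using the question's data plus a second list with differing first elements (the variable names are illustrative):

```python
# Tuples compare element by element, so ties on the first item
# fall through to the second item.
same_first = [('abc', 121), ('abc', 231), ('abc', 148), ('abc', 221)]
sorted_same = sorted(same_first)
print(sorted_same)
# [('abc', 121), ('abc', 148), ('abc', 221), ('abc', 231)]

# When the first items differ, they decide the order and the
# numbers are no longer in ascending order overall.
mixed_first = [('zbc', 121), ('abc', 231), ('gbc', 148), ('abc', 221)]
sorted_mixed = sorted(mixed_first)
print(sorted_mixed)
# [('abc', 221), ('abc', 231), ('gbc', 148), ('zbc', 121)]
```

So relying on the default ordering only gives a sort by the number when every first element is the same; otherwise an explicit key is needed.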

Hertz answered 20/11, 2013 at 22:49 Comment(0)
D
29

For an in-place sort, use

foo = [('abc', 121), ('abc', 231), ('abc', 148), ('abc', 221)]  # example list of tuples
foo.sort(key=lambda x: x[0])  # sorts foo in place by the first element; use x[1] for the second
Daytime answered 30/6, 2017 at 18:13 Comment(3)
Although this answer may be correct, it is better to explain why this answer is correct instead of providing code only. Additionally, this is almost an exact answer of one that already exists and was accepted 5 years ago, so this doesn't really add anything to the site. Take a look at newer questions to help people!Chandless
actually this helps people looking for an in-place sortStour
While this is helpful it would likely be more appropriate as a comment to the suggested answer indicating how one would use the same method as the one provided in that answer to accomplish the same task in-place.Hindgut
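One detail worth illustrating for the in-place approach: list.sort() modifies the list and returns None, so its result should not be assigned. A minimal sketch using the question's data (the name `result` is only there to show the return value):

```python
foo = [('abc', 121), ('abc', 231), ('abc', 148), ('abc', 221)]

# list.sort() reorders foo itself and returns None.
result = foo.sort(key=lambda x: x[1])

print(result)  # None
print(foo)     # [('abc', 121), ('abc', 148), ('abc', 221), ('abc', 231)]
```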
T
15

From python wiki:

>>> from operator import itemgetter, attrgetter    
>>> sorted(student_tuples, key=itemgetter(2))
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]    
>>> sorted(student_objects, key=attrgetter('age'))
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
Topdress answered 22/5, 2012 at 2:54 Comment(1)
x = [[[5,3],1.0345],[[5,6],5.098],[[5,4],4.89],[[5,1],5.97]] With a list like this, can we sort using itemgetter() with respect to the elements at x[0][1]?Eaves
H
8

For a lambda-avoiding method, first define your own function:

def MyFn(a):
    return a[1]

then:

sorted([('abc', 121),('abc', 231),('abc', 148), ('abc',221)], key=MyFn)
Heathenism answered 16/2, 2016 at 2:58 Comment(3)
What are the benefits of this?Dionysian
One benefit would be to have a defined function that you could use anywhere without having to put lambda x: x[1] in multiple areas of code.Bahrain
Another benefit is that you can document / comment better if it is a separate function.Handlebar
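The two benefits mentioned in the comments above (reuse and documentation) might look like this in practice. The function name `second_item` is hypothetical; it plays the same role as MyFn in the answer.

```python
def second_item(pair):
    """Return the second element of a pair, for use as a sort key."""
    return pair[1]


data = [('abc', 121), ('abc', 231), ('abc', 148), ('abc', 221)]

# The same documented key function is reused for both orderings.
ascending = sorted(data, key=second_item)
descending = sorted(data, key=second_item, reverse=True)

print(ascending)   # [('abc', 121), ('abc', 148), ('abc', 221), ('abc', 231)]
print(descending)  # [('abc', 231), ('abc', 221), ('abc', 148), ('abc', 121)]
```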
J
5

For Python 2 only (tuple parameter unpacking in function signatures was removed in Python 3 by PEP 3113), this works and makes the accepted answer slightly more readable:

sorted([('abc', 121),('abc', 231),('abc', 148), ('abc',221)], key=lambda (k, val): val)
Jameyjami answered 5/2, 2017 at 11:1 Comment(0)
M
0

The fact that the sort values in the OP are integers isn't relevant to the question per se. In other words, the accepted answer would also work if the sort value were text. I bring this up to point out that the key function can also transform the value before it is compared (for example, to ignore differences between upper and lower case).

>>> sorted([(121, 'abc'), (231, 'def'), (148, 'ABC'), (221, 'DEF')], key=lambda x: x[1])
[(148, 'ABC'), (221, 'DEF'), (121, 'abc'), (231, 'def')]
>>> sorted([(121, 'abc'), (231, 'def'), (148, 'ABC'), (221, 'DEF')], key=lambda x: str.lower(x[1]))
[(121, 'abc'), (148, 'ABC'), (231, 'def'), (221, 'DEF')]
Manque answered 1/6, 2017 at 15:15 Comment(0)
