Sort a list of tuples by 2nd item (integer value) [duplicate]

Asked 22/5, 2012 at 2:48 Answered 30/6, 2017 at 18:13

549

I have a list of tuples that looks something like this:

[('abc', 121),('abc', 231),('abc', 148), ('abc',221)]

I want to sort this list in ascending order by the integer value inside the tuples. Is it possible?

Chilton answered 22/5, 2012 at 2:48 Comment(1)

I just found that if no key is given, Python's sorted() function sorts tuples by their first value, then by the second. For example, data = [('zbc', 121),('abc', 231),('gbc', 148), ('abc',221)] print(sorted(data)) produces this result (note that the second value is sorted as well): [('abc', 221), ('abc', 231), ('gbc', 148), ('zbc', 121)] – Galvan 26/11, 2023 at 22:48

853

Try using the key keyword argument of sorted(), which sorts in increasing order by default:

sorted(
    [('abc', 121), ('abc', 231), ('abc', 148), ('abc', 221)], 
    key=lambda x: x[1]
)

key should be a function that identifies how to retrieve the comparable element from your data structure. In your case, it is the second element of the tuple, so we access [1].
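As a rough illustration of what key does (the helper name second_item is just made up for this sketch): sorted() calls the key function once per element, compares the returned values, and gives back the original tuples in that order.

```python
data = [('abc', 121), ('abc', 231), ('abc', 148), ('abc', 221)]

def second_item(pair):
    """Return the value sorted() should compare: the tuple's second item."""
    return pair[1]

# The values sorted() actually compares, one per tuple.
keys = [second_item(pair) for pair in data]

# The result contains the original tuples, ordered by those values.
by_second = sorted(data, key=second_item)

print(keys)
print(by_second)
```

The input list itself is left unchanged; sorted() returns a new list.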

For optimization, see jamylak's response using operator.itemgetter(1), which is essentially a faster version of lambda x: x[1].

Kiri answered 22/5, 2012 at 2:51 Comment(6)

It may be obvious, but sorted() does not sort in place, so assign the result: sorted_list = sorted([('abc', 121),('abc', 231),('abc', 148), ('abc',221)], key=lambda x: x[1]) – Hadj 18/4, 2018 at 16:46

,reverse=True for biggest to smallest. – Iamb 30/9, 2018 at 15:1

This still works well with Python 3.7. – Orel 30/10, 2018 at 20:44

You can also sort by multiple keys by returning a tuple; if you want one of them reversed, add a negative sign (this works for numeric values). This sorts by the first element ascending, then by the second element descending: sorted(some_list, key=lambda x: (x[0], -x[1])) – Laryssa 10/3, 2019 at 17:34
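A small sketch of the tuple-key idea, with the function passed as the key keyword argument. The contents of some_list are made-up sample data, and negating only works when the second element is numeric.

```python
# Hypothetical sample data with repeated first elements.
some_list = [('b', 2), ('a', 1), ('a', 3), ('b', 5)]

# Sort by the first element ascending, then by the second element descending.
ordered = sorted(some_list, key=lambda x: (x[0], -x[1]))

print(ordered)  # [('a', 3), ('a', 1), ('b', 5), ('b', 2)]
```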

What's gonna happen in above case if we don't provide any key? – Marketable 27/3, 2020 at 12:12

I just wanted to say this is my most visited Stack Overflow page of all time; I've been here easily 500 times by now. Thank you cheeken, if only I could memorize this one line of code. – Apologete 19/8, 2020 at 3:8

237

>>> from operator import itemgetter
>>> data = [('abc', 121),('abc', 231),('abc', 148), ('abc',221)]
>>> sorted(data,key=itemgetter(1))
[('abc', 121), ('abc', 148), ('abc', 221), ('abc', 231)]

IMO using itemgetter is more readable in this case than the solution by @cheeken. It is also faster, since almost all of the computation is done on the C side (no pun intended) rather than through a Python-level lambda call.

>python -m timeit -s "from operator import itemgetter; data = [('abc', 121),('abc', 231),('abc', 148), ('abc',221)]" "sorted(data,key=itemgetter(1))"
1000000 loops, best of 3: 1.22 usec per loop

>python -m timeit -s "data = [('abc', 121),('abc', 231),('abc', 148), ('abc',221)]" "sorted(data,key=lambda x: x[1])"
1000000 loops, best of 3: 1.4 usec per loop
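If you would rather run the comparison from a script than from the command line, something like the following should work. It is only a sketch: the helper name time_sort and the repeat/number values are arbitrary choices, and the absolute numbers will differ by machine and Python version.

```python
import timeit
from operator import itemgetter

data = [('abc', 121), ('abc', 231), ('abc', 148), ('abc', 221)]

def time_sort(key_func, repeat=5, number=20_000):
    """Return the best per-call time in microseconds for sorted(data, key=key_func)."""
    timer = timeit.Timer(lambda: sorted(data, key=key_func))
    # Take the minimum of several runs to reduce noise from other processes.
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e6

itemgetter_us = time_sort(itemgetter(1))
lambda_us = time_sort(lambda x: x[1])

print(f"itemgetter: {itemgetter_us:.3f} usec per call")
print(f"lambda:     {lambda_us:.3f} usec per call")
```

Wrapping the statement in a callable adds a little call overhead to both measurements equally, so only the relative difference is meaningful here.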

Astonishing answered 22/5, 2012 at 2:51 Comment(9)

+1 I agree that itemgetter() is a better solution. However, I thought a lambda expression would make it clearer how key functions. – Kiri 22/5, 2012 at 4:45

+1 However, when I ran your speed test I noticed by 'human eye' that the one that is supposed to be faster, and measured faster, was actually noticeably slower. I scratched my head on this for a bit, then took the Python timeit module out of play and just used Linux time, i.e. time `python -c "the code"`. Then I got the 'human-eye' results that you spell out, as well as sys clock times that were faster. Still not sure why this is, but it was reproducible. I gather it has something to do with the overhead of loading the modules, but it still does not quite make sense to me just yet. – Confectioner 23/7, 2014 at 17:38

@JeffSheffield: Notice that jamylak is doing the import in the setup code (outside the timing), not the tested code. That's perfectly reasonable, because most programs will need to sort more than once, or need to sort much larger collections, but they'll only do the import once. (And for those programs that only need to do one smallish sort ever… well, you're talking about a difference of under a microsecond, so who cares either way?) – Bondage 4/9, 2014 at 2:24

@Bondage FYI: jamylak is doing the import inside of the python -m timeit -s but yea I think you are on point to say that in a production scenario you only pay that lib load penalty once. and... as for who cares about that microsecond... you care because the assumption is that your sorting data is going to get quite large and that microsecond is going to turn into real seconds once the data set grows. – Confectioner 4/9, 2014 at 14:5

@JeffSheffield: That's exactly the point: the cost of the import will not grow with the data, so even if it seems like a large part of the 1us you're paying for one smallish sort, it's going to be an irrelevant part of the 500ms you pay for a big sort, or a bunch of small sorts. – Bondage 4/9, 2014 at 17:37

x = [[[5,3],1.0345],[[5,6],5.098],[[5,4],4.89],[[5,1],5.97]] With a list like this, can we sort using itemgetter() with respect to the elements in x[0][1]? – Eaves 2/12, 2016 at 9:49

@Eaves I'm not sure if that can be done, but I'm pretty sure that even if itemgetter could, lambda (solution above) would be clearer and hence more pythonic in that case. In your case though (the first value of x[0] being the same for all of x's elements), a simple sorted(x) will give you the desired order. So that, probably with a comment, would be the most pythonic statement. – Sartor 22/3, 2017 at 4:49
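For what it's worth, here is a minimal sketch of the lambda approach for that nested list. The variable name by_inner is just for illustration.

```python
x = [[[5, 3], 1.0345], [[5, 6], 5.098], [[5, 4], 4.89], [[5, 1], 5.97]]

# Sort by the second value of each element's inner list, i.e. item[0][1].
by_inner = sorted(x, key=lambda item: item[0][1])

print(by_inner)
# [[[5, 1], 5.97], [[5, 3], 1.0345], [[5, 4], 4.89], [[5, 6], 5.098]]

# For this particular data a plain sorted(x) gives the same order,
# because every inner list starts with the same value (5).
print(by_inner == sorted(x))  # True
```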

I extended the data to 48 elements and timed it in Jupyter. The results are 5.19 µs ± 42.6 ns per loop for %timeit sorted(data, key=itemgetter(1)) and 6.7 µs ± 63.6 ns per loop for %timeit sorted(data, key=lambda x: x[1]). So itemgetter is still faster. – Abysm 10/2, 2019 at 20:55

@LouisYang Thanks for sharing those results. It confirms what I expected – Astonishing 10/2, 2019 at 23:28

Adding to Cheeken's answer, this is how you sort a list of tuples by the 2nd item in descending order:

sorted([('abc', 121),('abc', 231),('abc', 148), ('abc',221)],key=lambda x: x[1], reverse=True)

Goodin answered 29/3, 2015 at 17:50 Comment(1)

Note that the original list will not be changed. The sorted function just produces a new list which is sorted for you. – Oxidase 30/10, 2019 at 2:50

As a python neophyte, I just wanted to mention that if the data did actually look like this:

data = [('abc', 121),('abc', 231),('abc', 148), ('abc',221)]

then sorted() would automatically sort by the second element in the tuple, as the first elements are all identical.
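A quick sketch of that default behaviour: tuples compare element by element, so the second item only decides the order when the first items tie. The mixed list is made-up data to show the difference from an explicit key.

```python
data = [('abc', 121), ('abc', 231), ('abc', 148), ('abc', 221)]

# All first elements are equal, so the second element decides the order.
print(sorted(data))
# [('abc', 121), ('abc', 148), ('abc', 221), ('abc', 231)]

# Hypothetical data with different first elements.
mixed = [('zbc', 121), ('abc', 231), ('gbc', 148), ('abc', 221)]

# Without a key, the first element dominates.
print(sorted(mixed))
# [('abc', 221), ('abc', 231), ('gbc', 148), ('zbc', 121)]

# With an explicit key, only the second element is compared.
print(sorted(mixed, key=lambda x: x[1]))
# [('zbc', 121), ('gbc', 148), ('abc', 221), ('abc', 231)]
```

So relying on the default only matches the question's goal when the first elements are all identical.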

Hertz answered 20/11, 2013 at 22:49 Comment(0)

For an in-place sort, use

foo = [('abc', 121), ('abc', 231), ('abc', 148), ('abc', 221)]  # any list of tuples
foo.sort(key=lambda x: x[0])  # sorts in place by the first element of the tuple; use x[1] for the second
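To make the difference from sorted() concrete, here is a small sketch that sorts by the second element, as in the question. The names original, new_list and result are only for illustration.

```python
foo = [('abc', 121), ('abc', 231), ('abc', 148), ('abc', 221)]
original = list(foo)  # keep a copy so the before/after can be compared

# sorted() builds a new list and leaves foo as it was.
new_list = sorted(foo, key=lambda x: x[1])
print(foo == original)  # True

# list.sort() reorders foo itself and returns None.
result = foo.sort(key=lambda x: x[1])
print(result)           # None
print(foo == new_list)  # True
```

A common mistake is writing foo = foo.sort(...), which replaces the list with None.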

Daytime answered 30/6, 2017 at 18:13 Comment(3)

Although this answer may be correct, it is better to explain why this answer is correct instead of providing code only. Additionally, this is almost an exact answer of one that already exists and was accepted 5 years ago, so this doesn't really add anything to the site. Take a look at newer questions to help people! – Chandless 30/6, 2017 at 18:49

actually this helps people looking for an in-place sort – Stour 19/5, 2018 at 0:16

While this is helpful it would likely be more appropriate as a comment to the suggested answer indicating how one would use the same method as the one provided in that answer to accomplish the same task in-place. – Hindgut 14/3, 2019 at 23:6

From python wiki:

>>> from operator import itemgetter, attrgetter    
>>> sorted(student_tuples, key=itemgetter(2))
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]    
>>> sorted(student_objects, key=attrgetter('age'))
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
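The snippet above does not define student_tuples or student_objects. The definitions below are an assumption, chosen to be consistent with the output shown (in particular, a __repr__ that prints each object like a tuple); treat them as a sketch rather than the wiki's exact code.

```python
from operator import itemgetter, attrgetter

class Student:
    """Assumed record type with name, grade and age attributes."""

    def __init__(self, name, grade, age):
        self.name = name
        self.grade = grade
        self.age = age

    def __repr__(self):
        # Print like a tuple so the output matches the tuple version.
        return repr((self.name, self.grade, self.age))

# Assumed sample data: (name, grade, age).
student_tuples = [('john', 'A', 15), ('jane', 'B', 12), ('dave', 'B', 10)]
student_objects = [Student(*t) for t in student_tuples]

# itemgetter(2) picks index 2 (age) from each tuple.
by_age_tuples = sorted(student_tuples, key=itemgetter(2))

# attrgetter('age') reads the .age attribute from each object.
by_age_objects = sorted(student_objects, key=attrgetter('age'))

print(by_age_tuples)
print(by_age_objects)
```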

Topdress answered 22/5, 2012 at 2:54 Comment(1)

x = [[[5,3],1.0345],[[5,6],5.098],[[5,4],4.89],[[5,1],5.97]] With a list like this, can we sort using itemgetter() with respect to the elements in x[0][1]? – Eaves 2/12, 2016 at 9:50

For a lambda-avoiding method, first define your own function:

def MyFn(a):
    return a[1]

then:

sorted([('abc', 121),('abc', 231),('abc', 148), ('abc',221)], key=MyFn)

Heathenism answered 16/2, 2016 at 2:58 Comment(3)

What are the benefits of this? – Dionysian 3/5, 2016 at 8:42

One benefit would be to have a defined function that you could use anywhere without having to put lambda x: x[1] in multiple areas of code. – Bahrain 13/7, 2016 at 14:59

Another benefit is that you can document / comment better if it is a separate function. – Handlebar 7/12, 2017 at 11:26

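In Python 2 only (tuple parameter unpacking in function signatures was removed in Python 3 by PEP 3113, so this is a SyntaxError there), this works and makes the accepted answer slightly more readable: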

sorted([('abc', 121),('abc', 231),('abc', 148), ('abc',221)], key=lambda (k, val): val)
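Python 3 does not accept tuple parameters in a lambda signature, so there the same readability can be approximated by unpacking inside a small named function. The name by_value is just an example.

```python
data = [('abc', 121), ('abc', 231), ('abc', 148), ('abc', 221)]

def by_value(pair):
    """Unpack the tuple in the body and return the part to sort by."""
    k, val = pair
    return val

result = sorted(data, key=by_value)
print(result)  # [('abc', 121), ('abc', 148), ('abc', 221), ('abc', 231)]
```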

Jameyjami answered 5/2, 2017 at 11:1 Comment(0)

The fact that the sort values in the OP are integers isn't relevant to the question per se. In other words, the accepted answer would work just as well if the sort value were text. I bring this up to also point out that the key function can transform the value before it is compared (for example, to ignore differences between upper and lower case).

>>> sorted([(121, 'abc'), (231, 'def'), (148, 'ABC'), (221, 'DEF')], key=lambda x: x[1])
[(148, 'ABC'), (221, 'DEF'), (121, 'abc'), (231, 'def')]
>>> sorted([(121, 'abc'), (231, 'def'), (148, 'ABC'), (221, 'DEF')], key=lambda x: str.lower(x[1]))
[(121, 'abc'), (148, 'ABC'), (231, 'def'), (221, 'DEF')]

Manque answered 1/6, 2017 at 15:15 Comment(0)
