I have the following list created from a sorted csv
list1 = sorted(csv1, key=operator.itemgetter(1))
I would actually like to sort the list by two criteria: first by the value in field 1 and then by the value in field 2. How do I do this?
like this:
import operator
list1 = sorted(csv1, key=operator.itemgetter(1, 2))
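As a small illustration, assuming csv1 is a list of rows read with csv.reader (the sample content below is made up for the example), itemgetter(1, 2) builds the tuple (row[1], row[2]) as the sort key:

```python
import csv
import io
import operator

# Hypothetical CSV content standing in for the asker's file.
raw = "id,city,name\n1,Paris,Zoe\n2,Berlin,Max\n3,Paris,Ann\n4,Berlin,Eva\n"
reader = csv.reader(io.StringIO(raw))
header = next(reader)  # skip the header row
csv1 = list(reader)

# itemgetter(1, 2) returns (row[1], row[2]) for each row, so rows are ordered
# by field 1 first, and ties are broken by field 2.
list1 = sorted(csv1, key=operator.itemgetter(1, 2))
for row in list1:
    print(row)
```

Keep in mind that csv.reader yields every field as a string, so a numeric column sorts lexicographically unless the key converts it.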
operator is a module that needs to be imported. – Charlsiecharlton
No need to import anything when using lambda functions.
The following sorts list by the first element, then by the second element. You can also sort by one field ascending and another descending (negating the key like this only works for numeric fields), for example:
sorted_list = sorted(list, key=lambda x: (x[0], -x[1]))
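A minimal sketch of the same idea on made-up rows, assuming the second field arrives as a string (as it would from a CSV file) and therefore needs int() before it can be negated:

```python
# Made-up (name, score) rows; the score is a string, as csv.reader would return it.
rows = [("bob", "9"), ("alice", "10"), ("alice", "9"), ("bob", "10")]

# Name ascending, score descending. int() makes "10" compare as a number
# (as strings, "10" sorts before "9"), and the unary minus flips the order
# of that one field only.
sorted_rows = sorted(rows, key=lambda x: (x[0], -int(x[1])))
print(sorted_rows)
```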
lambda x: (x[0],int(x[1])). +1 – Thorwald
x[1] is date? Should I convert it to integer also? @Thorwald does conversion of string to int preserve the alphabetical ordering of the string? – Motto
print("a"<"b") will print True. – Silma
- in -x[1] stand for? – Collbaith
Python has a stable sort, so provided that performance isn't an issue, the simplest way is to sort it by field 2 and then sort it again by field 1.
That will give you the result you want. The only catch is that if it is a big list (or you want to sort it often), calling sort twice might be an unacceptable overhead.
list1 = sorted(csv1, key=operator.itemgetter(2))
list1 = sorted(list1, key=operator.itemgetter(1))
Doing it this way also makes it easy to handle the situation where you want some of the columns reverse sorted, just include the 'reverse=True' parameter when necessary.
Otherwise you can pass multiple parameters to itemgetter or manually build a tuple. That is probably going to be faster, but has the problem that it doesn't generalise well if some of the columns want to be reverse sorted (numeric columns can still be reversed by negating them, but that doesn't work for strings and other non-numeric values).
So if you don't need any columns reverse sorted, go for multiple arguments to itemgetter; if you might, and the columns aren't numeric, go for multiple consecutive sorts.
Edit: For the commenters who have problems understanding how this answers the original question, here is an example that shows exactly how the stable nature of the sorting ensures we can do separate sorts on each key and end up with data sorted on multiple criteria:
DATA = [
('Jones', 'Jane', 58),
('Smith', 'Anne', 30),
('Jones', 'Fred', 30),
('Smith', 'John', 60),
('Smith', 'Fred', 30),
('Jones', 'Anne', 30),
('Smith', 'Jane', 58),
('Smith', 'Twin2', 3),
('Jones', 'John', 60),
('Smith', 'Twin1', 3),
('Jones', 'Twin1', 3),
('Jones', 'Twin2', 3)
]
# Sort by Surname, Age DESCENDING, Firstname
print("Initial data in random order")
for d in DATA:
    print("{:10s} {:10s} {}".format(*d))
print('''
First we sort by first name, after this pass all
Twin1 come before Twin2 and Anne comes before Fred''')
DATA.sort(key=lambda row: row[1])
for d in DATA:
    print("{:10s} {:10s} {}".format(*d))
print('''
Second pass: sort by age in descending order.
Note that after this pass rows are sorted by age but
Twin1/Twin2 and Anne/Fred pairs are still in correct
firstname order.''')
DATA.sort(key=lambda row: row[2], reverse=True)
for d in DATA:
    print("{:10s} {:10s} {}".format(*d))
print('''
Final pass sorts the Jones from the Smiths.
Within each family members are sorted by age but equal
age members are sorted by first name.
''')
DATA.sort(key=lambda row: row[0])
for d in DATA:
    print("{:10s} {:10s} {}".format(*d))
This is a runnable example, but to save people running it the output is:
Initial data in random order
Jones Jane 58
Smith Anne 30
Jones Fred 30
Smith John 60
Smith Fred 30
Jones Anne 30
Smith Jane 58
Smith Twin2 3
Jones John 60
Smith Twin1 3
Jones Twin1 3
Jones Twin2 3
First we sort by first name, after this pass all
Twin1 come before Twin2 and Anne comes before Fred
Smith Anne 30
Jones Anne 30
Jones Fred 30
Smith Fred 30
Jones Jane 58
Smith Jane 58
Smith John 60
Jones John 60
Smith Twin1 3
Jones Twin1 3
Smith Twin2 3
Jones Twin2 3
Second pass: sort by age in descending order.
Note that after this pass rows are sorted by age but
Twin1/Twin2 and Anne/Fred pairs are still in correct
firstname order.
Smith John 60
Jones John 60
Jones Jane 58
Smith Jane 58
Smith Anne 30
Jones Anne 30
Jones Fred 30
Smith Fred 30
Smith Twin1 3
Jones Twin1 3
Smith Twin2 3
Jones Twin2 3
Final pass sorts the Jones from the Smiths.
Within each family members are sorted by age but equal
age members are sorted by first name.
Jones John 60
Jones Jane 58
Jones Anne 30
Jones Fred 30
Jones Twin1 3
Jones Twin2 3
Smith John 60
Smith Jane 58
Smith Anne 30
Smith Fred 30
Smith Twin1 3
Smith Twin2 3
Note in particular how in the second step the reverse=True parameter keeps the firstnames in order whereas simply sorting then reversing the list would lose the desired order for the third sort key.
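To make that last point concrete, here is a small sketch on three made-up rows that compares reverse=True with sorting and then reversing:

```python
# Made-up rows, already in first-name order.
people = [("Anne", 30), ("Fred", 30), ("John", 60)]

# reverse=True keeps rows with equal keys in their original relative order.
by_age_desc = sorted(people, key=lambda row: row[1], reverse=True)

# Sorting ascending and then reversing the whole list also reverses the ties.
sorted_then_reversed = sorted(people, key=lambda row: row[1])
sorted_then_reversed.reverse()

print(by_age_desc)           # Anne is still before Fred
print(sorted_then_reversed)  # Fred is now before Anne
```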
list1 = sorted(csv1, key=lambda x: (x[1], x[2]) )
tuple() can receive two arguments (or rather, three, if you count with self) – Reinforcement
return statement should be return tuple((x[1], x[2])) or simply return x[1], x[2]. Refer @Silma answer below if you're looking for sorting in different directions – Miguel
tuple(x[1:3]), if you want to use the tuple constructor for some reason instead of just a tuple display list x[1], x[2]. Or keyfunc = operator.itemgetter(1, 2) and don't even write a function yourself. – Parthenos
list1 = sorted(csv1, key=lambda x: x[1] and x[2] )? If not what would be the behaviour in this case? – Akela
employees.sort(key = lambda x:x[1])
employees.sort(key = lambda x:x[0])
We can also use .sort with lambda 2 times because Python's sort is in place and stable. This will first sort the list according to the second element, x[1]. Then, it will sort by the first element, x[0] (highest priority).
employees[0] = "Employee's Name"
employees[1] = "Employee's Salary"
This is equivalent to doing the following:
employees.sort(key = lambda x:(x[0], x[1]))
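A quick check of that equivalence, using made-up (name, salary) records:

```python
# Hypothetical (name, salary) records.
employees = [("Bob", 300), ("Alice", 200), ("Bob", 100), ("Alice", 50)]

two_pass = list(employees)
two_pass.sort(key=lambda x: x[1])  # lowest-priority key first
two_pass.sort(key=lambda x: x[0])  # highest-priority key last

one_pass = sorted(employees, key=lambda x: (x[0], x[1]))

print(two_pass == one_pass)
```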
Sorting a list of dicts as below sorts it in descending order, first by salary and then by age:
d=[{'salary':123,'age':23},{'salary':123,'age':25}]
d=sorted(d, key=lambda i: (i['salary'], i['age']),reverse=True)
Output: [{'salary': 123, 'age': 25}, {'salary': 123, 'age': 23}]
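If the two keys need different directions, reverse=True is not enough on its own because it applies to the whole key. One possible sketch, assuming both values are numeric (the extra record is made up):

```python
d = [
    {'salary': 123, 'age': 25},
    {'salary': 200, 'age': 40},
    {'salary': 123, 'age': 23},
]

# Salary descending, age ascending: negate only the salary.
result = sorted(d, key=lambda i: (-i['salary'], i['age']))
print(result)
```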
If you want to sort the array on one column in descending order and on another column in ascending order, follow the method below. For that, you can use a lambda function as the key.
let us consider below example,
input: [[1,2],[3,3],[2,1],[1,1],[4,1],[3,1]]
expected output: [[4, 1], [3, 1], [3, 3], [2, 1], [1, 1], [1, 2]]
code used:
arr = [[1,2],[3,3],[2,1],[1,1],[4,1],[3,1]]
arr.sort(key=lambda ele: (ele[0], -ele[1]), reverse=True)
# output [[4, 1], [3, 1], [3, 3], [2, 1], [1, 1], [1, 2]]
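The negative sign is what produces the expected result: combined with reverse=True, it makes the second element sort in ascending order while the first element sorts in descending order.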
In ascending order you can use:
sorted_data= sorted(non_sorted_data, key=lambda k: (k[1],k[0]))
or in descending order you can use:
sorted_data= sorted(non_sorted_data, key=lambda k: (k[1],k[0]),reverse=True)
After reading the answers in this thread, I wrote a general solution that will work for an arbitrary number of columns:
def sort_array(array, *columns):
    for col in columns:
        array.sort(key = lambda x:x[col])
The OP would call it like this:
sort_array(list1, 2, 1)
Which sorts first by column 2, then by column 1.
(Most important column goes last)
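A possible variation on the same idea that also takes a direction per column. The name sort_by_columns and the (column, reverse) pairs are made up for this sketch, and here the most important column goes first:

```python
def sort_by_columns(rows, *specs):
    """Sort rows in place; each spec is (column, reverse), most important first."""
    # Apply the passes from least to most important; stability does the rest.
    for col, reverse in reversed(specs):
        rows.sort(key=lambda row: row[col], reverse=reverse)
    return rows

data = [
    ("Smith", "Anne", 30),
    ("Jones", "Fred", 30),
    ("Jones", "Jane", 58),
    ("Smith", "John", 60),
    ("Jones", "Anne", 30),
]

# Surname ascending, age descending, first name ascending.
sort_by_columns(data, (0, False), (2, True), (1, False))
for row in data:
    print(row)
```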
python 3 https://docs.python.org/3.5/howto/sorting.html#the-old-way-using-the-cmp-parameter
from functools import cmp_to_key
def custom_compare(x, y):
    # custom comparison of x and y; return a negative number, zero,
    # or a positive number, like the old cmp functions
    return 0
sorted(entries, key=lambda e: (cmp_to_key(custom_compare)(e[0]), e[1]))
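For a version that actually runs, here is a sketch with a concrete comparison function. The case-insensitive comparison and the sample entries are assumptions chosen for illustration:

```python
from functools import cmp_to_key

def compare_ignoring_case(x, y):
    """Old-style comparison: negative, zero, or positive."""
    a, b = x.lower(), y.lower()
    return (a > b) - (a < b)

# Wraps each value in an object whose comparisons call compare_ignoring_case.
name_key = cmp_to_key(compare_ignoring_case)

entries = [("bob", 2), ("Alice", 3), ("alice", 1), ("Bob", 1)]

# First field compared with the custom function, second field compared normally.
result = sorted(entries, key=lambda e: (name_key(e[0]), e[1]))
print(result)
```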
__lt__() method on your class or inherit from some class that does"? That would make it a far better canonical. – Otilia