Sorting a mixed list of ints and strings

I am trying to sort the following mixed list of ints and strings, but I get a TypeError instead. My desired output order is sorted integers, then sorted strings.

>>> x = [4, 6, 9, 'ashley', 'drooks', 'chay', 'poo', 'may']
>>> x.sort()
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    x.sort()
TypeError: '<' not supported between instances of 'str' and 'int'
Drumstick answered 14/4, 2018 at 9:4 Comment(6)
What does it mean to sort numbers and strings together? - Sallysallyann
It's telling you exactly what the problem is. But you, on the other hand, haven't told us what the sorted list would look like. Would numbers be sorted before strings? Or after strings? We can't fix your code without knowing what you want it to do. - Chism
Ideally, what should happen? I wanted to sort the list items here, but I am not sure how Python handles this. I thought integers would be sorted first and strings at the end. - Drumstick
Ideally you get told it doesn't make any natural sense to sort such things together (which you have). You then decide what the rules are depending on what output you want... :) - Quote
How should I modify the code for the following cases? 1. Sorted integers come before sorted strings. 2. Sorted strings come before sorted integers. - Drumstick
Also have a look at the answers about 'sort mixed ints and strings' in other languages. The keyword for asking your question was 'mixed', as in 'mixed datatypes'. - Pappose
25

You can pass a custom key function to list.sort:

x = [4,6,9,'ashley','drooks','chay','poo','may']
x.sort(key=lambda v: (isinstance(v, str), v))

# result:
# [4, 6, 9, 'ashley', 'chay', 'drooks', 'may', 'poo']

This key function maps each element in the list to a tuple in which the first value is a boolean (True for strings and False for numbers) and the second value is the element itself, like this:

>>> [(isinstance(v, str), v) for v in x]
[(False, 4), (False, 6), (False, 9), (True, 'ashley'), (True, 'chay'),
 (True, 'drooks'), (True, 'may'), (True, 'poo')]

These tuples are then used to sort the list. Because False < True, integers are sorted before strings. Elements with the same boolean value are then sorted by the second value in the tuple, so an integer is never compared with a string.
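
For the two orderings asked about in the comments on the question, here is a minimal sketch of the same tuple-key idea. The variable names ints_first and strs_first are only illustrative; negating the flag is one of several ways to put strings first.

```python
x = [4, 6, 9, 'ashley', 'drooks', 'chay', 'poo', 'may']

# Case 1: integers first. Non-strings map to False, which sorts before True.
ints_first = sorted(x, key=lambda v: (isinstance(v, str), v))

# Case 2: strings first. Negating the flag makes strings map to False instead.
strs_first = sorted(x, key=lambda v: (not isinstance(v, str), v))

print(ints_first)  # [4, 6, 9, 'ashley', 'chay', 'drooks', 'may', 'poo']
print(strs_first)  # ['ashley', 'chay', 'drooks', 'may', 'poo', 4, 6, 9]
```

sorted returns a new list and leaves x untouched; x.sort(key=...) with the same key sorts in place.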

Chism answered 14/4, 2018 at 9:23 Comment(8)
I have never seen the expression [(isinstance(v, str), v) for v in x] before. Can you please elaborate on it with an example? I am not able to visualize what it does. - Drumstick
In x.sort(key=lambda v: (isinstance(v, str), v)), what does key=lambda mean and why do we use it? What is isinstance used for? What operation does (isinstance(v, str), v) perform? - Drumstick
@Aayush That's too much to explain here. Take a look at list comprehensions and lambda functions. The code (isinstance(v, str), v) creates a tuple where the first element is a boolean (the result of isinstance(v, str)) and the second element is v. - Chism
@Chism I understand that key=lambda v: (isinstance(v, str), v) returns (False, 4) and so on. What I still don't get is how the value v is passed to the lambda. - Aggressive
@Aggressive The lambda is called by list.sort. It passes each value in the list to the key function as an argument, and v takes the value of that argument. - Chism
Since your desired sort order is 'integers first, strings last', you first need to look at the object's type (integers: 0, strings: 1), then the value. Hence your custom sort key needs to be a tuple (type, value). Take a read of the doc for list.sort. - Pappose
This is brilliant. - Smarmy
Does this sort lexicographically? I want to make sure that it works with any data types. - Furey
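
On the last few comments: the sketch below spells out, by hand, roughly what list.sort does with a key function, and shows where the approach stops working. The helper name type_then_value and the sample lists are made up for illustration; the manual "decorate, sort, undecorate" steps are a simplified model, not CPython's actual implementation.

```python
def type_then_value(v):
    # Same logic as the lambda in the answer, written as a named function.
    return (isinstance(v, str), v)

x = [9, 'may', 4, 'chay']

# Model of sorting with a key: compute one key per element,
# order the elements by their keys, then throw the keys away.
decorated = [(type_then_value(v), v) for v in x]
decorated.sort(key=lambda pair: pair[0])
result = [v for _, v in decorated]
print(result)  # [4, 9, 'chay', 'may']

# Tuples compare element by element, so the second items are only
# compared when the first items are equal.
print((False, 9) < (True, 'chay'))  # True, 9 and 'chay' are never compared

# The key does not work for arbitrary types: two non-string values that
# cannot be ordered against each other still raise TypeError.
try:
    sorted([1, None], key=type_then_value)
    mixed_failed = False
except TypeError:
    mixed_failed = True
print(mixed_failed)  # True
```

So the ordering is lexicographic on the (flag, value) tuples, and it works as long as all strings are mutually comparable and all non-strings are mutually comparable.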
2

I can see from your comment that you want integers to be sorted first then strings.

So we could sort two separate lists and join them as follows:

x = [4, 6, 9, 'ashley', 'drooks', 'chay', 'poo', 'may']
intList = sorted([i for i in x if type(i) is int])
strList = sorted([i for i in x if type(i) is str])
print(intList + strList)

Output:

[4, 6, 9, 'ashley', 'chay', 'drooks', 'may', 'poo']
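
One thing to watch with this approach, sketched below: an element that is neither exactly int nor exactly str matches neither filter and is silently left out of the result. The function name sort_ints_then_strs and the second sample list are hypothetical, added only to show the behaviour.

```python
def sort_ints_then_strs(items):
    # Hypothetical wrapper around the split-and-join approach above.
    ints = sorted(i for i in items if type(i) is int)
    strs = sorted(i for i in items if type(i) is str)
    return ints + strs

full = sort_ints_then_strs([4, 6, 9, 'ashley', 'drooks', 'chay', 'poo', 'may'])
print(full)  # [4, 6, 9, 'ashley', 'chay', 'drooks', 'may', 'poo']

# The float 2.5 is neither int nor str, so it does not appear in the output.
with_float = sort_ints_then_strs([4, 2.5, 'b', 'a'])
print(with_float)  # [4, 'a', 'b']
```

If the list may contain other numeric types, the first filter would need to be widened, for example with isinstance(i, (int, float)); note that isinstance also treats bool values as int.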

Lamdin answered 14/4, 2018 at 9:11 Comment(9)
Can't we do it without splitting the list? - Drumstick
I don't think so, because you cannot compare integers with strings. - Lamdin
I mean just sorting the integer part by itself without touching the strings, then sorting the string part. And according to your code, if I want strings first, would I have to do it like this: print(strList + intList)? - Drumstick
@Lamdin You can... you just need to make a key that has a consistently sortable field, e.g. sorted(data, key=lambda L: (isinstance(L, str), L)) to put non-strs first, which will work as long as they remain orderable among themselves... - Quote
Oh, I didn't know that. - Lamdin
Ordinarily with a custom sort I'd want to define a key function, but in this case it's difficult to see how it could be done. Do you make the key an int and give all strings a 'lexicographic value' plus some offset that you don't expect the numbers in your list to be greater than? This is where Python falls down for not providing the option of passing a comparison function to sort instead. Writing a comparison function would be trivial. - Guano
Huh, just as I was writing that, @JonClements provided a key function! I never considered using tuples. Will bear this in mind, thanks. - Guano
@JonClements I'd write this up as a separate answer if I were you. - Guano
@Guano It's okay - Aran-Fey already has. Also, FYI, if you really, really find a use case where you can't use a key, you can indeed use a comparison function as a key using functools.cmp_to_key. - Quote
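
For reference, a minimal sketch of the functools.cmp_to_key route mentioned in the last comment. The comparison function compare_mixed is a made-up example, not something from the answers; it follows the usual convention of returning a negative number, zero, or a positive number.

```python
from functools import cmp_to_key

def compare_mixed(a, b):
    # Hypothetical comparison function: non-strings before strings,
    # then ordinary ordering within each group.
    a_is_str = isinstance(a, str)
    b_is_str = isinstance(b, str)
    if a_is_str != b_is_str:
        return 1 if a_is_str else -1
    return (a > b) - (a < b)

x = [4, 6, 9, 'ashley', 'drooks', 'chay', 'poo', 'may']
by_cmp = sorted(x, key=cmp_to_key(compare_mixed))
print(by_cmp)  # [4, 6, 9, 'ashley', 'chay', 'drooks', 'may', 'poo']
```

For this particular problem the tuple key is shorter, so the comparison function is mainly useful when the ordering rule is hard to express as a key.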
1

With a named key function:

def func(i):
    return isinstance(i, str), i

stuff = ['Tractor', 184, 'Lada', 11, 'Ferrari', 5, 'Chicken', 68]
stuff.sort(key=func)

for x in stuff:
    print(x)

Change str to int in the isinstance check to get strings first.

Diabolism answered 9/8, 2020 at 0:36 Comment(0)
