How to sort a list of strings?
M

11

493

What is the best way of creating an alphabetically sorted list in Python?

Mccusker answered 30/8, 2008 at 17:3 Comment(1)
Use locale and its string collation methods to sort naturally according to the current locale.Clarino
B
582

Basic answer:

mylist = ["b", "C", "A"]
mylist.sort()

This modifies your original list (i.e. sorts in-place). To get a sorted copy of the list, without changing the original, use the sorted() function:

for x in sorted(mylist):
    print(x)
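To make the difference between the two calls concrete, here is a minimal sketch (Python 3 syntax assumed; the variable names `copy` and `result` are only for illustration):

```python
mylist = ["b", "C", "A"]

# sorted() builds a new list and leaves the original untouched.
copy = sorted(mylist)
print(copy)    # ['A', 'C', 'b']
print(mylist)  # ['b', 'C', 'A']

# list.sort() reorders the list in place and returns None.
result = mylist.sort()
print(result)  # None
print(mylist)  # ['A', 'C', 'b']
```

Note that uppercase letters come before lowercase ones, because the default comparison is by code point.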

However, the examples above are a bit naive: they ignore the locale and sort case-sensitively. You can use the optional key parameter to specify a custom sort order (the alternative, cmp, is deprecated and was removed in Python 3; it has to be evaluated once per comparison, whereas key is computed only once per element).
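The difference in call counts can be observed directly. The following is only a sketch: `lower_key`, `lower_cmp` and the `calls` counter are hypothetical names introduced for the demonstration, and the exact number of comparisons depends on the input.

```python
from functools import cmp_to_key

words = ["pear", "Apple", "fig", "Banana", "cherry"]
calls = {"key": 0, "cmp": 0}  # counters for the demonstration only

def lower_key(s):
    """Key function: called once per element."""
    calls["key"] += 1
    return s.lower()

def lower_cmp(a, b):
    """Old-style comparison function: called once per comparison."""
    calls["cmp"] += 1
    a, b = a.lower(), b.lower()
    return (a > b) - (a < b)

by_key = sorted(words, key=lower_key)
by_cmp = sorted(words, key=cmp_to_key(lower_cmp))

print(by_key)  # ['Apple', 'Banana', 'cherry', 'fig', 'pear']
print(calls)   # the key count equals len(words); the cmp count is at least len(words) - 1
```

Both calls produce the same order here; only the amount of work done per element differs.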

So, to sort according to the current locale, taking language-specific rules into account (cmp_to_key is a helper function from functools):

sorted(mylist, key=cmp_to_key(locale.strcoll))

And finally, if you need, you can specify a custom locale for sorting:

import locale
from functools import cmp_to_key
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8') # vary depending on your lang/locale
assert sorted((u'Ab', u'ad', u'aa'),
  key=cmp_to_key(locale.strcoll)) == [u'aa', u'Ab', u'ad']
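A possible alternative spelling, not taken from the answer itself: the standard library also provides locale.strxfrm, which transforms a string so that ordinary comparison of the results follows the locale's collation rules, so it can be passed as the key directly. The sketch below assumes the desired locale has already been selected with locale.setlocale() as in the example above; without that call the default "C" locale applies and both variants fall back to plain code point order.

```python
import locale
from functools import cmp_to_key

# Assumption: locale.setlocale() has been called beforehand if a specific
# locale is wanted; otherwise the default "C" locale is used.
words = ["ad", "Ab", "aa"]

by_strcoll = sorted(words, key=cmp_to_key(locale.strcoll))
by_strxfrm = sorted(words, key=locale.strxfrm)

# Both approaches use the same collation rules, so the orders should agree.
print(by_strcoll == by_strxfrm)
```

Using strxfrm keeps the "computed once per element" advantage of key, whereas strcoll wrapped in cmp_to_key is called once per comparison.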

Last note: you will see examples of case-insensitive sorting which use the lower() method - those are incorrect, because they work only for the ASCII subset of characters. Those two are wrong for any non-English data:

# this is incorrect!
mylist.sort(key=lambda x: x.lower())
# alternative notation, a bit faster, but still wrong
mylist.sort(key=str.lower)
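A short illustration of why this goes wrong, using the letters from one of the comments on this answer (Python 3 assumed): lower() only folds case, so an accented letter is still ordered by its code point and ends up after every unaccented one.

```python
words = ["â", "b", "a"]

# str.lower does not change "â", so it is compared by code point.
result = sorted(words, key=str.lower)
print(result)                # ['a', 'b', 'â']

# "â" is U+00E2 (226), far above "b" (98), hence it sorts last.
print(ord("b"), ord("â"))    # 98 226
```

In most languages that use this letter, a reader would expect "â" next to "a", which is what a locale-aware collation is meant to provide.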
Beria answered 30/8, 2008 at 17:10 Comment(8)
mylist.sort(key=str.lower) is faster.Baldhead
Good point. I'll leave my current example as-is, since it's probably easier for a beginner to see what's happening, but I'll keep that in mind in the future.Beria
If anyone is curious, performance of list.sort() can be found hereHodess
How does this mylist.sort(key=str.lower) work? What is the name of this function passing construct?Divergence
@J.F.Sebastian str.lower will not sort correctly for non-ASCII characters; take this example: ['a', 'b', 'â']Vesperal
@BornToCode : 1- I know. Look at the revision (2008) my comment replies to (my comment is about the unnecessary use of lambda). 2- sorting non-ASCII characters is a big separate topic. PyICU could be used instead of the locale-based solution.Baldhead
I can't believe it but it doesn't work: print([1, 2, 3].sort()) returns None!Hygrothermograph
@Hygrothermograph This is because you are printing the return value of the sort method called in [1, 2, 3].sort(). As sort() sorts the list in place (i.e., changes the list directly), it doesn't return the sorted list; it returns None, so your print statement prints None. If you saved your list to a variable, say x, called x.sort(), and then print(x), you would see the sorted list.Interdenominational
L
62

It is also worth noting the sorted() function:

for x in sorted(mylist):
    print(x)

This returns a new, sorted version of a list without changing the original list.

Loftus answered 30/8, 2008 at 22:14 Comment(0)
A
40
list.sort()

It really is that simple :)

Amaryllidaceous answered 30/8, 2008 at 17:4 Comment(1)
In general, it is surprisingly far from simple. But ok, the simplest case is actually simple.Jumbala
H
19

The proper way to sort strings is:

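import locale
from functools import cmp_to_key
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8') # vary depending on your lang/locale
# Python 3 has no cmp argument, so wrap locale.strcoll with cmp_to_key
assert sorted((u'Ab', u'ad', u'aa'), key=cmp_to_key(locale.strcoll)) == [u'aa', u'Ab', u'ad']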

# Without using locale.strcoll you get:
assert sorted((u'Ab', u'ad', u'aa')) == [u'Ab', u'aa', u'ad']

The previous example of mylist.sort(key=lambda x: x.lower()) will work fine for ASCII-only contexts.

Huldahuldah answered 28/10, 2009 at 22:45 Comment(0)
A
18

Use the sorted() function in Python 3:

items = ["love", "like", "play", "cool", "my"]
sorted(items)
Abridgment answered 27/12, 2017 at 13:24 Comment(0)
B
8

But how does this handle language-specific sorting rules? Does it take locale into account?

No, list.sort() is a generic sorting function. If you want to sort according to the Unicode rules, you'll have to define a custom sort key function. You can try using the pyuca module, but I don't know how complete it is.

Belabor answered 30/8, 2008 at 18:10 Comment(0)
P
2
l = ['abc', 'cd', 'xy', 'ba', 'dc']
l.sort()
print(l)

Result

['abc', 'ba', 'cd', 'dc', 'xy']

Passional answered 7/12, 2019 at 7:21 Comment(0)
S
1

Old question, but if you want to do locale-aware sorting without setting locale.LC_ALL you can do so by using the PyICU library as suggested by this answer:

import icu # PyICU

def sorted_strings(strings, locale=None):
    if locale is None:
        return sorted(strings)
    collator = icu.Collator.createInstance(icu.Locale(locale))
    return sorted(strings, key=collator.getSortKey)

Then call with e.g.:

new_list = sorted_strings(list_of_strings, "de_DE.utf8")

This worked for me without installing any locales or changing other system settings.

(This was already suggested in a comment above, but I wanted to give it more prominence, because I missed it myself at first.)

Scotsman answered 28/8, 2019 at 12:39 Comment(0)
I
0

Suppose s = "ZWzaAd"

To sort the characters of the string above, a simple solution is:

print(''.join(sorted(s)))
Iberian answered 12/5, 2017 at 6:16 Comment(1)
that is not a list of strings you are sorting hereScream
S
0

Or maybe:

names = ['Jasmine', 'Alberto', 'Ross', 'dig-dog']
print("The names, sorted case-insensitively:", sorted(names, key=lambda name: name.lower()))
Schmid answered 13/8, 2018 at 15:46 Comment(0)
C
0

It is simple: https://trinket.io/library/trinkets/5db81676e4

scores = '54 - Alice,35 - Bob,27 - Carol,27 - Chuck,05 - Craig,30 - Dan,27 - Erin,77 - Eve,14 - Fay,20 - Frank,48 - Grace,61 - Heidi,03 - Judy,28 - Mallory,05 - Olivia,44 - Oscar,34 - Peggy,30 - Sybil,82 - Trent,75 - Trudy,92 - Victor,37 - Walter'

scores = scores.split(',')
for x in sorted(scores):
    print(x)
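A plain sorted() call compares the entries character by character, which matches numeric order here only because every score is zero-padded to two digits. If the goal is to order by the number itself, a key function can parse it. This is a rough sketch on a shortened sample of the data, assuming every entry has the form "NN - Name"; `score_of` is a hypothetical helper, not part of the linked example.

```python
# Shortened sample; entries are assumed to look like "NN - Name".
scores = "54 - Alice,35 - Bob,05 - Craig,92 - Victor,27 - Carol"

def score_of(entry):
    """Hypothetical helper: return the integer in front of ' - '."""
    number, _, _name = entry.partition(" - ")
    return int(number)

entries = scores.split(",")

# Highest score first.
ranked = sorted(entries, key=score_of, reverse=True)
for entry in ranked:
    print(entry)
```

With this key, an unpadded entry such as "5 - Craig" would still be ordered by its numeric value rather than by its first character.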

Cockcroft answered 3/6, 2020 at 2:16 Comment(0)
