Sorting a list in Python
F

5

8

If I have a list of strings, e.g. ["a143.txt", "a9.txt"], how can I sort it in ascending order by the numbers embedded in the strings rather than by plain string comparison? I.e. I want "a9.txt" to appear before "a143.txt", since 9 < 143.

thanks.

Folk answered 30/3, 2011 at 20:25 Comment(4)
This question does not appear to have anything to do with scipy or numpy. If this is the case, please remove those tags.Helton
Edited tags. Now it's more clear.Dna
possible duplicate of How do you sort files numerically?Roye
Possible duplicate of Python analog of natsort function (sort a list using a "natural order" algorithm)Theona
C
14

It's called "natural sort order". From http://www.codinghorror.com/blog/2007/12/sorting-for-humans-natural-sort-order.html:

Try this:

import re 

def sort_nicely( l ): 
  """ Sort the given list in the way that humans expect. 
  """ 
  convert = lambda text: int(text) if text.isdigit() else text 
  alphanum_key = lambda key: [ convert(c) for c in re.split('([0-9]+)', key) ] 
  l.sort( key=alphanum_key ) 
Candelabra answered 30/3, 2011 at 20:28 Comment(4)
I'd use text.lower() at the end of the convert = line to make it case-insensitive.Ursas
+1. You might want to replace the lambda with a proper function definition, for readability. Incidentally, Debian package version numbers are compared more or less like this. debian.org/doc/debian-policy/ch-controlfields.html#s-f-VersionViridis
+1 Nice answer. The only thing I didn't like is the extra whitespace, i.e. here: [ convert(c) for c in re.split('([0-9]+)', key) ] and l.sort( key=alphanum_key ) and sort_nicely( l )Dna
+1, nicely done! I redid alphanum_key as alphanum_key = lambda key: map(convert, re.split('([0-9]+)', key)).Marni
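Picking up the suggestions in the comments above, here is a minimal sketch of the same natural-sort key written with named functions and lower-cased so the ordering is case-insensitive. The names `natural_key`, `_convert` and `_DIGITS` are made up for illustration and are not part of the original answer.

```python
import re

# Split on runs of digits; the capture group keeps the digit runs in the result.
_DIGITS = re.compile(r'([0-9]+)')

def _convert(text):
    """Turn digit runs into ints; lower-case everything else."""
    return int(text) if text.isdigit() else text.lower()

def natural_key(s):
    """Key for natural ordering, e.g. 'a9.txt' sorts before 'a143.txt'."""
    return [_convert(chunk) for chunk in _DIGITS.split(s)]

names = ["a143.txt", "B2.txt", "a9.txt", "b10.txt"]
result = sorted(names, key=natural_key)
print(result)
```

The key is built as a real list on purpose: on Python 3 a bare map(...) returns an iterator, and iterators cannot be compared with each other, so the map variant from the last comment would need to be wrapped in list(...).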
P
0

Use list.sort() and provide your own function for the key argument. Your function is called once for each item in the list (and passed the item), and is expected to return the value that should be compared in place of that item when sorting.

See http://wiki.python.org/moin/HowTo/Sorting/#Key_Functions for more information.
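To make that concrete, here is a small sketch of a key function. `number_in_name` is a hypothetical helper, and it assumes every name contains at least one run of digits.

```python
import re

def number_in_name(name):
    # Called once per item; the returned int is what actually gets compared.
    return int(re.search(r'\d+', name).group())

files = ["a143.txt", "a9.txt"]
files.sort(key=number_in_name)  # sorts the list in place
print(files)
```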

Piraeus answered 30/3, 2011 at 20:29 Comment(0)
L
0

If you want to ignore the non-numeric parts entirely and sort only by the first number in each string, then you can do:

import re
numre = re.compile('[0-9]+')
def extractNum(s):
    return int(numre.search(s).group())

myList = ["a143.txt", "a9.txt", ]
myList.sort(key=extractNum)
Livy answered 30/3, 2011 at 20:32 Comment(0)
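One caveat with the snippet above: search() returns None for a name without any digits, so .group() would raise AttributeError. A possible variant, assuming you want such names sorted after all the numbered ones, is sketched below; `extract_num_or_last` is a hypothetical name.

```python
import re

numre = re.compile('[0-9]+')

def extract_num_or_last(s):
    m = numre.search(s)
    # (0, n) for names containing a number, (1, 0) for names without one,
    # so digit-free names come after every numbered name.
    return (0, int(m.group())) if m else (1, 0)

my_list = ["a143.txt", "readme.txt", "a9.txt"]
my_list.sort(key=extract_num_or_last)
print(my_list)
```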
C
0

list.sort() is deprecated (see Python.org How-To) . sorted(list, key=keyfunc) is better.

import re

def sortFunc(item):
  return int(re.search(r'[a-zA-Z](\d+)', item).group(1))

myList = ["a143.txt", "a9.txt"]

print(sorted(myList, key=sortFunc))
Clemmie answered 30/3, 2011 at 20:50 Comment(5)
list.sort() is deprecated? "Usually it's less convenient than sorted()" is the only thing in this direction I found. I have to say, though, that I'd be more than happy to see the in-place sorting go away, but it seems unlikely.Hovis
It is not deprecated. docs.python.org/library/stdtypes.html#mutable-sequence-typesDna
It may not be technically deprecated, but it is considered the "old" method and is labeled as such on Python.org.Clemmie
True, list.sort() is certainly slightly more memory efficient, but the difference is negligible as far as sensibly sized lists are concerned. I can't find a good explanation for why, but as far as I am aware the preferred and more 'pythonic' way of sorting is to use sorted().Clemmie
It says sorted() returns a NEW list, not that it's NEW. It's definitely less efficient if you're really interested in modifying your list. sorted() is good for tuples.Dna
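For reference, list.sort() is not deprecated in current Python; the practical difference discussed above is in-place sorting versus getting a new list back. A minimal sketch (the variable names are illustrative only):

```python
paths = ["b", "c", "a"]

new_list = sorted(paths)        # returns a new list; paths is left unchanged
unchanged = list(paths)

ret = paths.sort()              # sorts in place and returns None
print(new_list, paths, ret)

# sorted() accepts any iterable, e.g. a tuple, and always returns a list.
from_tuple = sorted(("b", "a"))
print(from_tuple)
```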
H
0
>>> import re
>>> paths = ["a143.txt", "a9.txt"]
>>> sorted(paths, key=lambda s: int(re.search(r"\d+", s).group()))
['a9.txt', 'a143.txt']

More generally, if you also want it to work for names with several numbers, like a100_32_12 (sorting by each numeric group in turn):

>>> paths = ["a143_2.txt", "a143_1.txt"]
>>> sorted(paths, key=lambda s: [int(n) for n in re.findall(r"\d+", s)])
['a143_1.txt', 'a143_2.txt']
Hovis answered 30/3, 2011 at 20:53 Comment(0)
