Python: Find index of minimum item in list of floats [duplicate]
Asked Answered
H

4

67

How can I find the index of the minimum item in a Python list of floats? If they were integers, I would simply do:

minIndex = myList.index(min(myList))

However, with a list of floats I get the following error, I assume because float equality comparison is rather iffy.

ValueError: 0.13417985135 is not in list

Now, I know that I could simply scroll through the list and compare each item to see whether it is < (min + 0.0000000000001) and > (min - 0.0000000000001), but that is kinda messy. Is there a more elegant (preferably built-in) way to find the index of the smallest item in a list of floats?

Hildebrand answered 9/11, 2012 at 1:45 Comment(4)
That should work with integers and floats ... Can you show us a specific example with floats where it doesn't work?Geibel
There are finitely many bits in a float, so the comparison won't be a problem.Terbium
Floating-point comparison is not iffy. Comparing two floating-point numbers for equality returns true if and only if the numbers are equal, in any floating-point implementation that is not horribly broken. One potential problem is NaNs in the list. In such case, a minimum operator could return a NaN, but an equality comparison would report the NaN is not equal to anything (including itself), and that could cause problems in a routine to return the index of the minimum value. If there is no NaN in the list, that suggests some other problem which is not addressed in any of the answers yet.Disquietude
this question is marked duplicate, but this has the better answer than the one that is linked on the banner on the top imo, ymmv.Dockhand
J
76

You're effectively scanning the list once to find the min value, then scanning it again to find the index, you can do both in one go:

from operator import itemgetter
min(enumerate(a), key=itemgetter(1))[0] 
Josphinejoss answered 9/11, 2012 at 1:56 Comment(7)
Ah, neat. I didn't realize that min (and presumably max) accept a key argument. +1!Scallop
@DavidWolever Yup min and max (like sort and sorted) take a key comparison function - very useful - just wish set took it for instanceJosphinejoss
Still not as fast as OP's code for a list of size 1000 :)Geibel
How would this handle duplicate min values?Trimming
@Trimming Why would that matter? If there's identical minimum values, then the minimum is still the minimum...Josphinejoss
For example mylist = [1,2,1]. Using your method would return (0,1) (i.e. the first index in the list), when, in fact, there are two minimum values. Ideally, a solution would take this into accountTrimming
@Trimming why? It emulates the behaviour of list.index (which is what the OP is trying to do) which finds the index of the first minimum value... Could you describe what you reckon this answer needs to take into account?Josphinejoss
S
116

I would use:

val, idx = min((val, idx) for (idx, val) in enumerate(my_list))

Then val will be the minimum value and idx will be its index.

Scallop answered 9/11, 2012 at 1:54 Comment(10)
+1. This is both more efficient, and simpler (once you understand generator expressions), than first getting the minimum, and then finding it. Plus it automatically works for values that can be ordered with < but not compared with == (which I don't think is actually the OP's problem, but he seems to think so).Martine
@Martine -- Careful making grand claims about efficiency. That is a size dependent problem. Consider: python -m timeit -s 'my_list = range(1000)[::-1]' 'my_list.index(min(my_list))' -- 96.8usec per loop (and carefully designed to be worst case scenario). python -m timeit -s 'my_list = range(1000)' 'min((val, idx) for (idx, val) in enumerate(my_list))' -- 345 usec per loop -- 3.5x slower for a list of size 1000. It does have other advantages though -- It works for any iterable for instance.Geibel
@mgilson: When I run the exact same test, I get 333us vs. 187us, or 2x faster rather than 3.5x slower… (Python 3.3 vs. 2.7? 64-bit vs. 32-bit? who knows?) Meanwhile, the index solution works for any iterable too; you just have to throw in a list(myiter). The space penalty might be unacceptable in some cases, but the time penalty probably won't be (since you're already doing N walks through the list).Martine
@Martine -- I'm using py27, on OS-X 10.5.8 (therefore 32 bit). On python3, range doesn't give a list ... Perhaps that could be an issue? (Although it runs faster for me :) -- 83.4 usecGeibel
@mgilson: Yes, but you can still slice a range, and index it, in 3.3 (and neither one requires internally creating a list). So, it could be an issue, but I wouldn't expect it to be. If I had to guess, I'd guess that something has been done to improve generator expression performance between 2.7 and 3.3, or enumerate's performance with iterables… but if I really cared, I'd do some research instead of guessing.Martine
@Martine -- Yeah, I was surprised all those various operations worked on range when I tried it. That's neat. I'm just saying that we're comparing apples and oranges. But, FWIW, for me, I get more extreme results using py32 -- not in favor of the generator. So if generators got an overhaul, it was between 3.2 and 3.3 :). Otherwise, it might be our different OS's or ... I really enjoy these little python performance comparisons ... maybe it's because python makes it so easy to benchmark different things ...Geibel
@mgilson: I'm on OS X 10.8.2. From what I've seen, at least on Mac, 3.x has had a lot of optimizations that help 64-bit but not 32-bit. And indeed, when I run 3.3 in 32-bit mode, the generator solution goes from 187us to 255us. But the big question here is why something that takes 83us on your presumably older and slower Mac takes 333 on my nearly-top-of-the-line mid-2012 model…Martine
@David Wolever what if there are two or more minimum values in the same list. how do you get the rest, how do you print them?Meaganmeager
@fractal_7 you could do some tricks with itertools.groupby: _, minimums = next(groupby(sorted((val, idx) for (idx, val) in enumerate(my_list)), key=lambda x: x[0]), []). But this isn't nearly as pretty, so at that point I might write a function which does that in a nicer way.Scallop
@David Wolever thanks for the response. i did it that way minval = min(mylist) for idx, val in enumerate(mylist): if val == minval: print idx but i'm going to try it with your way too.Meaganmeager
J
76

You're effectively scanning the list once to find the min value, then scanning it again to find the index, you can do both in one go:

from operator import itemgetter
min(enumerate(a), key=itemgetter(1))[0] 
Josphinejoss answered 9/11, 2012 at 1:56 Comment(7)
Ah, neat. I didn't realize that min (and presumably max) accept a key argument. +1!Scallop
@DavidWolever Yup min and max (like sort and sorted) take a key comparison function - very useful - just wish set took it for instanceJosphinejoss
Still not as fast as OP's code for a list of size 1000 :)Geibel
How would this handle duplicate min values?Trimming
@Trimming Why would that matter? If there's identical minimum values, then the minimum is still the minimum...Josphinejoss
For example mylist = [1,2,1]. Using your method would return (0,1) (i.e. the first index in the list), when, in fact, there are two minimum values. Ideally, a solution would take this into accountTrimming
@Trimming why? It emulates the behaviour of list.index (which is what the OP is trying to do) which finds the index of the first minimum value... Could you describe what you reckon this answer needs to take into account?Josphinejoss
E
46

Use of the argmin method for numpy arrays.

import numpy as np
np.argmin(myList)

However, it is not the fastest method: it is 3 times slower than OP's answer on my computer. It may be the most concise one though.

Extraneous answered 11/7, 2014 at 11:40 Comment(0)
G
18

I think it's worth putting a few timings up here for some perspective.

All timings done on OS-X 10.5.8 with python2.7

John Clement's answer:

python -m timeit -s 'my_list = range(1000)[::-1]; from operator import itemgetter' 'min(enumerate(my_list),key=itemgetter(1))'
1000 loops, best of 3: 239 usec per loop    

David Wolever's answer:

python -m timeit -s 'my_list = range(1000)[::-1]' 'min((val, idx) for (idx, val) in enumerate(my_list))
1000 loops, best of 3: 345 usec per loop

OP's answer:

python -m timeit -s 'my_list = range(1000)[::-1]' 'my_list.index(min(my_list))'
10000 loops, best of 3: 96.8 usec per loop

Note that I'm purposefully putting the smallest item last in the list to make .index as slow as it could possibly be. It would be interesting to see at what N the iterate once answers would become competitive with the iterate twice answer we have here.

Of course, speed isn't everything and most of the time, it's not even worth worrying about ... choose the one that is easiest to read unless this is a performance bottleneck in your code (and then profile on your typical real-world data -- preferably on your target machines).

Geibel answered 9/11, 2012 at 2:9 Comment(1)
+1 for the last paragraph. Especially because the OP specifically asked for the most elegant way, not the fastest. It's also worth mentioning that, if the OP were right about floats not being self-equal (even though I don't think that's an actual problem…), the first two solutions would still work.Martine

© 2022 - 2024 — McMap. All rights reserved.