Efficient way to get index of minimum value in long vector, python
Asked Answered
G

6

9

I have a long list of longitude values (len(Lon) = 420481), and another one of latitude values. I want to find the corresponding latitude to the minimum of the longitude.

I tried:

SE_Lat = [Lat[x] for x,y in enumerate(Lon) if y == min(Lon)]

but this takes ages to finish.

Does anyone know a more efficient way?

Maybe you also have a suggestions for this: I now try to find the closest corresponding latitude to a new longitude, which is not in the original longitude vector. I tried this:

minDiff = [min(abs(x - lon_new) for x in lons)] # not very quick, but works
[(lat,lon) for lat,lon in izip(lats,lons) if abs(lon-lon_new)==minDiff]

The last line throws an error, because there are multiple matches. I don't know at the moment how to find only one value, lets say the first. Any help is greatly appreciated!

Genitor answered 18/5, 2011 at 12:26 Comment(0)
C
7

May I recommend numpy?

import numpy
nplats = numpy.array(lats)
nplons = numpy.array(lons)

# this part is 20x faster than using the built-in python functions
index = numpy.argmin(nplats)

print nplats[index], nplons[index]

this is way faster than the min(izip()) solution (~20x using my setup when using 420481 randomly created records), although of course you'd need to store your data values in numpy to take advantage of this speed-up.

Coated answered 18/5, 2011 at 13:34 Comment(0)
F
6
min(itertools.izip(Lat, Lon), key=operator.itemgetter(1))[0]
Fitzsimmons answered 18/5, 2011 at 12:31 Comment(3)
importing itertools's lazy-zip are necessary at all, since finding the min must necessarily look at every single element and thus will expand every element in the iterator (additionally in python3, zip is lazy be default)Wirewove
That's still a lot of elements, and generating the list in the first place will be slow.Fitzsimmons
This is not an issue in python3, but after testing, you are correct for python2. +1 =) -- for the record, just do x=min(zip(range(10**6))) with both zip and izip in python and python3; zip is fast in python3, izip is as fast in python2, and zip is very slow in python2.Wirewove
N
4

Rather than jumping right in with one of the many alternatives for solving this (which can be seen in the other answers), it's worth enumerating why the code in the original example is so slow.

SE_Lat = [Lat[x] for x,y in enumerate(Lon) if y == min(Lon)]

We know from the OP that len(Lon) == 420481. Now, finding the minimum value is an O(N) operation (you have to look at every value at least once). In a list comprehension, the condition is reevaluated on every iteration. The above code recalculates the minimum value on every pass through the loop, blowing what should be an O(N) operation out to be O(N^2) (A mere 177 billion iterations in this case).

Simply caching the result of min(Lon) in a local variable and using that in the loop condition instead of recalculating it every iteration would likely bring the runtime down to an acceptable level.

However, the way I would personally go about it (assuming I wanted all of the latitude, longitude and index later on):

min_longitude, min_index = min(longitude, index for index, longitude in enumerate(Lon))
min_latitude = Lat[min_index]

There are plenty of possibilities though, and which one is best will vary based on the exact use case.

Nominee answered 18/5, 2011 at 13:39 Comment(0)
W
0
pairs = zip(latitudes, longitudes)
minLonPair = min(pairs, key=lambda p:p[1])
print(minLonPair[0])

As per Ignacio's solution, if you are using python2, you will want to use izip rather than zip. This is, however, true for everything you do in python2.

Wirewove answered 18/5, 2011 at 12:30 Comment(0)
K
0

Here was my original answer:

>>> lats = [1,2,3,4]
>>> lons = [5,4,8,9]
>>> from itertools import izip
>>> min(izip(lats,lons), key=lambda x:x[1])
(2, 4)

But I see that the OP seemed to be allowing for there being multiple matches at the minimum lon value, and for this, I don't think there is a one-liner. The trick is, you only want to find min(lons) once, not once for every lat,lon pair:

>>> lats = [1,2,3,4]
>>> lons = [5,4,8,4]
>>> minlon = min(lons)
>>> [(lat,lon) for lat,lon in izip(lats,lons) if lon==minlon]
[(2, 4), (4, 4)]

This one-liner might work for you, since the lambda argument minlon should only be computed once:

>>> filter(lambda latlon,minlon=min(lons):latlon[1]==minlon, izip(lats,lons))
[(2, 4), (4, 4)]

Not sure how well it will work on 420481-element lists though. And for readability and long-term support, I would probably choose the more explicit 2-liner solution.

Last point: Sometimes you only get one pass through a sequence, such as when it is an iterator, or the output of a generator. To support multiple matches and take only one pass through the two lists, this was the best I could do:

from itertools import izip

def get_lats_at_min_lon(lats, lons):
    minlon = 200
    minlats = []
    for lat,lon in izip(lats, lons):
        if lon < minlon:
            minlats = [lat]
            minlon = lon
        elif lon == minlon:
            minlats.append(lat)
    return minlon, minlats

lats = iter([1,2,3,4])
lons = iter([5,4,8,4])

print get_lats_at_min_lon(lats,lons)

Prints:

(4, [2, 4])
Keavy answered 18/5, 2011 at 12:32 Comment(1)
Thanks for all the answers folks! Pretty much everything you suggested worked well and quick. I used the one-liner with filter, which works great.Genitor
S
0

Just first find the index:

index = min(enumerate(Lon), key=operator.itemgetter(1))[1] 
Lat[index]
Sumy answered 18/5, 2011 at 12:36 Comment(1)
Are you sure for the final [1]? I believe it should be [0], since it's the index you want.Rizzo

© 2022 - 2024 — McMap. All rights reserved.