How can I geolocate a bunch of IP addresses with Python?
R

4

5

I have a list of ~300 IP addresses that I would like to plot on a map of the world. Can you explain roughly how I could do that with Python?

EDIT: I'm also interested in the visualization part of the question

Resiniferous answered 26/4, 2012 at 18:33 Comment(0)
L
7

You can use the hostip.info API. For example:

http://api.hostip.info/get_html.php?ip=64.233.160.0

So your Python code using urllib2 would be:

import urllib2
f = urllib2.urlopen("http://api.hostip.info/get_html.php?ip=64.233.160.0")
data = f.read()
f.close()

Then parse the fields you need out of the returned text.

If you require longitude and latitude, use the position=true flag:

http://api.hostip.info/get_html.php?ip=64.233.160.0&position=true
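Since urllib2 exists only in Python 2, here is a rough Python 3 sketch of the same request. It assumes the service still answers at this URL and returns plain "Key: value" lines (for example "Latitude: 37.402"); the helper names build_url, parse_response and lookup are made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://api.hostip.info/get_html.php"


def build_url(ip, position=True):
    """Build the query URL; position=True asks for latitude and longitude."""
    params = {"ip": ip}
    if position:
        params["position"] = "true"
    return BASE_URL + "?" + urlencode(params)


def parse_response(text):
    """Turn 'Key: value' lines into a dict (assumed response format)."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            result[key.strip()] = value.strip()
    return result


def lookup(ip, timeout=10):
    """Fetch and parse one address; this performs a network request."""
    with urlopen(build_url(ip), timeout=timeout) as f:
        return parse_response(f.read().decode("utf-8", errors="replace"))
```

Calling lookup("64.233.160.0") would then return a dict whose keys depend on what the service actually sends back.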
Lashawnda answered 26/4, 2012 at 18:39 Comment(1)
Pymaps (a wrapper for the Google Maps API) looks to be your solution for creating the actual maps: code.google.com/p/pymaps/wiki/PymapsHowto (Lashawnda)
E
3

Here is my solution in Python 3.x, which returns geolocation info for a DataFrame containing IP addresses. Applying a lookup function across the pd.Series with apply is the way to go.

To plot the records on a map, extract the latitude and longitude and pass them to a suitable mapping tool such as the Google Maps API or Tableau.

Below I compare the performance of two popular libraries for returning the location.

TLDR: use the geolite2 method.

1. geolite2 module from the maxminddb-geolite2 package

Input

# !pip install maxminddb-geolite2
import time
import numpy as np
from geolite2 import geolite2

geo = geolite2.reader()
# train_data is an existing DataFrame with an 'IP_Address' column
df_1 = train_data.loc[:50, ['IP_Address']].copy()

def IP_info_1(ip):
    try:
        x = geo.get(ip)
    except ValueError:   # malformed IP value
        return np.nan
    try:
        return x['country']['names']['en'] if x is not None else np.nan
    except KeyError:   # record has no country entry
        return np.nan

s_time = time.time()
# map IP --> country
# apply(fn) calls fn on every element of the Series
df_1['country'] = df_1.loc[:, 'IP_Address'].apply(IP_info_1)
print(df_1.head(), '\n')
print('Time:', str(time.time() - s_time) + 's \n')

print(type(geo.get('48.151.136.76')))

Output

       IP_Address         country
0   48.151.136.76   United States
1    94.9.145.169  United Kingdom
2   58.94.157.121           Japan
3  193.187.41.186         Austria
4   125.96.20.172           China 

Time: 0.09906983375549316s 

<class 'dict'>

2. DbIpCity class from the ip2geotools library

Input

# !pip install ip2geotools
import time
import numpy as np
s_time = time.time()
from ip2geotools.databases.noncommercial import DbIpCity
# train_data is an existing DataFrame with an 'IP_Address' column
df_2 = train_data.loc[:50, ['IP_Address']].copy()

def IP_info_2(ip):
    try:
        return DbIpCity.get(ip, api_key='free').country
    except Exception:   # lookup failed or the IP value is invalid
        return np.nan

df_2['country'] = df_2.loc[:, 'IP_Address'].apply(IP_info_2)
print(df_2.head())
print('Time:', str(time.time() - s_time) + 's')

print(type(DbIpCity.get('48.151.136.76', api_key='free')))

Output

       IP_Address country
0   48.151.136.76      US
1    94.9.145.169      GB
2   58.94.157.121      JP
3  193.187.41.186      AT
4   125.96.20.172      CN

Time: 80.53318452835083s 

<class 'ip2geotools.models.IpLocation'>

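The most likely reason for the huge time difference is that geolite2 reads a local database file bundled with the package, whereas DbIpCity.get sends a request to the db-ip.com web API for every address. The type of the returned object (a plain dictionary versus the specialized ip2geotools.models.IpLocation object) probably matters far less.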
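If the per-address lookups turn out to be dominated by waiting on a remote service (an assumption worth checking by timing a single call), running them concurrently in a thread pool is one way to cut the wall-clock time. A minimal sketch using only the standard library follows; lookup_all is a hypothetical helper, and lookup stands for any function that takes one IP string.

```python
from concurrent.futures import ThreadPoolExecutor


def lookup_all(ips, lookup, max_workers=8):
    """Apply lookup to each address concurrently, preserving input order.

    A lookup that raises is recorded as None instead of aborting the batch.
    """
    def safe(ip):
        try:
            return lookup(ip)
        except Exception:
            return None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(safe, ips))
```

With the second method this could be called as lookup_all(df_2['IP_Address'], IP_info_2). Keep max_workers small, since free services commonly limit request rates.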

Also, the output of the first method is a dictionary containing the geolocation data; index into it to obtain the fields you need:

x = geolite2.reader().get('48.151.136.76')
print(x)

>>>
    {'city': {'geoname_id': 5101798, 'names': {'de': 'Newark', 'en': 'Newark', 'es': 'Newark', 'fr': 'Newark', 'ja': 'ニューアーク', 'pt-BR': 'Newark', 'ru': 'Ньюарк'}},

 'continent': {'code': 'NA', 'geoname_id': 6255149, 'names': {'de': 'Nordamerika', 'en': 'North America', 'es': 'Norteamérica', 'fr': 'Amérique du Nord', 'ja': '北アメリカ', 'pt-BR': 'América do Norte', 'ru': 'Северная Америка', 'zh-CN': '北美洲'}}, 

'country': {'geoname_id': 6252001, 'iso_code': 'US', 'names': {'de': 'USA', 'en': 'United States', 'es': 'Estados Unidos', 'fr': 'États-Unis', 'ja': 'アメリカ合衆国', 'pt-BR': 'Estados Unidos', 'ru': 'США', 'zh-CN': '美国'}}, 

'location': {'accuracy_radius': 1000, 'latitude': 40.7355, 'longitude': -74.1741, 'metro_code': 501, 'time_zone': 'America/New_York'}, 

'postal': {'code': '07102'}, 

'registered_country': {'geoname_id': 6252001, 'iso_code': 'US', 'names': {'de': 'USA', 'en': 'United States', 'es': 'Estados Unidos', 'fr': 'États-Unis', 'ja': 'アメリカ合衆国', 'pt-BR': 'Estados Unidos', 'ru': 'США', 'zh-CN': '美国'}}, 

'subdivisions': [{'geoname_id': 5101760, 'iso_code': 'NJ', 'names': {'en': 'New Jersey', 'es': 'Nueva Jersey', 'fr': 'New Jersey', 'ja': 'ニュージャージー州', 'pt-BR': 'Nova Jérsia', 'ru': 'Нью-Джерси', 'zh-CN': '新泽西州'}}]}
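For plotting, the fields of interest are the latitude and longitude under the 'location' key. A small sketch of pulling them out safely is given below; extract_lat_lon is a hypothetical helper, and it assumes records shaped like the dictionary printed above (a lookup can also return None, or a record without coordinates).

```python
def extract_lat_lon(record):
    """Return (latitude, longitude) from a geolite2-style record, or None."""
    if not record:
        return None
    location = record.get("location") or {}
    lat = location.get("latitude")
    lon = location.get("longitude")
    if lat is None or lon is None:
        return None
    return (lat, lon)
```

For the record printed above, extract_lat_lon(x) gives (40.7355, -74.1741).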
Evocation answered 12/7, 2019 at 12:20 Comment(0)
D
1

You can use GeoIP, which has both a free and a paid version. There is also a convenient Python API.

Denizen answered 26/4, 2012 at 18:41 Comment(0)
T
0

Use the ipapi API. In my experience it is more convenient than ip2geotools (no requirement to install Visual C++ 14.0), more accurate than the hostip.info API, and more detailed than other options mentioned above that may not go beyond the country.

import requests

arr1=["ipaddress1","ipaddress2",...,"ipaddress300"]
    
def get_location(ip):
    ip_address = ip
    response = requests.get(f'https://ipapi.co/{ip_address}/json/').json()
    location_data = {
        "ip": ip_address,
        "city": response.get("city"),
        "region": response.get("region"),
        "country": response.get("country_name")
    }
    return location_data

for ip in arr1:
    print(get_location(ip))

For visualization you have many options such as basemap, folium, geopandas and plotly.

Taken with thanks from https://www.freecodecamp.org/news/how-to-get-location-information-of-ip-address-using-python/
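Whichever of the visualization libraries mentioned above you pick, one library-neutral intermediate step is to write the coordinates out as GeoJSON, which geopandas and folium can both read. A minimal sketch using only the standard library follows; the names to_geojson and save_geojson and the (ip, latitude, longitude) tuple layout are assumptions for illustration, and the coordinates must come from a lookup that actually returns them.

```python
import json


def to_geojson(points):
    """Build a GeoJSON FeatureCollection from (ip, latitude, longitude) tuples."""
    features = []
    for ip, lat, lon in points:
        if lat is None or lon is None:
            continue  # skip addresses that could not be located
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude]
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"ip": ip},
        })
    return {"type": "FeatureCollection", "features": features}


def save_geojson(points, path):
    """Write the FeatureCollection to a file that mapping tools can load."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(to_geojson(points), f)
```

The resulting file can then be loaded by the mapping tool of your choice to draw one marker per address.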

Thermophone answered 16/11, 2023 at 19:50 Comment(0)
