Getting the population of a city given its name

What is a good Python API I can use to get the population of a city? I have tried using geocoder, but it is not working, and I am not sure why.

geocoder.population('San Francisco, California')

returns

'module' object has no attribute 'population'

Why is this happening, and how can I fix it?

Alternatively, is there a different Python API I can use for this?

Mississippian answered 20/2, 2017 at 1:52 Comment(2)
Not sure why you want to build an API for this... a dict is all you need to map strings to numbers. - Cernuous
Not trying to build an API, I want to find one that will get the populations for me... - Mississippian

Certainly you can get the population of a city using geocoder and Google, but it requires an API key.

Here are two quite different alternative solutions:

OpenDataSoft

The first solution uses the OpenDataSoft API and basic Python 3.

The country must be specified as a two-letter country code (see the examples below).

import requests

def get_city_opendata(city, country):
    # Search the worldcitiespop dataset, sorted by population and
    # restricted to a two-letter country code such as 'de' or 'us'.
    url = 'https://public.opendatasoft.com/api/records/1.0/search/'
    params = {'dataset': 'worldcitiespop', 'q': city, 'sort': 'population',
              'facet': 'country', 'refine.country': country}
    res = requests.get(url, params=params)  # requests URL-encodes the values
    dct = res.json()
    out = dct['records'][0]['fields']
    return out

get_city_opendata('Berlin', 'de')

#{'city': 'berlin',
# 'country': 'de',
# 'region': '16',
# 'geopoint': [52.516667, 13.4],
# 'longitude': 13.4,
# 'latitude': 52.516667,
# 'accentcity': 'Berlin',
# 'population': 3398362}

get_city_opendata('San Francisco', 'us')

#{'city': 'san francisco',
# 'country': 'us',
# 'region': 'CA',
# 'geopoint': [37.775, -122.4183333],
# 'longitude': -122.4183333,
# 'latitude': 37.775,
# 'accentcity': 'San Francisco',
# 'population': 732072}
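
The request URL used by get_city_opendata can also be assembled from its parts with the standard library, which percent-encodes names that contain spaces. This is only a sketch: build_opendata_url is a hypothetical helper name, and the endpoint and parameters are simply the ones from the code above. It builds the URL without sending a request.

```python
from urllib.parse import urlencode

# Endpoint and parameter names copied from get_city_opendata above.
BASE_URL = 'https://public.opendatasoft.com/api/records/1.0/search/'

def build_opendata_url(city, country):
    """Hypothetical helper: compose the search URL without sending a request."""
    params = {
        'dataset': 'worldcitiespop',
        'q': city,                  # full-text search term
        'sort': 'population',
        'facet': 'country',
        'refine.country': country,  # two-letter country code, e.g. 'us'
    }
    return BASE_URL + '?' + urlencode(params)

print(build_opendata_url('San Francisco', 'us'))
```

Pasting the printed URL into a browser is a quick way to inspect the raw JSON before writing any parsing code.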

WikiData

The second solution uses the WikiData API and the qwikidata package.

Here, the country is given by its English name, or a part of it (see the examples below).

I'm sure the SPARQL command can be written much more efficiently and elegantly (feel free to edit), but it does the job.

import qwikidata
import qwikidata.sparql

def get_city_wikidata(city, country):
    query = """
    SELECT ?city ?cityLabel ?country ?countryLabel ?population
    WHERE
    {
      ?city rdfs:label '%s'@en.
      ?city wdt:P1082 ?population.  # P1082 = population
      ?city wdt:P17 ?country.       # P17 = country
      ?city rdfs:label ?cityLabel.
      ?country rdfs:label ?countryLabel.
      FILTER(LANG(?cityLabel) = "en").
      FILTER(LANG(?countryLabel) = "en").
      FILTER(CONTAINS(?countryLabel, "%s")).
    }
    """ % (city, country)

    res = qwikidata.sparql.return_sparql_query_results(query)
    out = res['results']['bindings'][0]
    return out

get_city_wikidata('Berlin', 'Germany')

#{'city': {'type': 'uri', 'value': 'http://www.wikidata.org/entity/Q64'},
# 'population': {'datatype': 'http://www.w3.org/2001/XMLSchema#decimal',
#  'type': 'literal',
#  'value': '3613495'},
# 'country': {'type': 'uri', 'value': 'http://www.wikidata.org/entity/Q183'},
# 'cityLabel': {'xml:lang': 'en', 'type': 'literal', 'value': 'Berlin'},
# 'countryLabel': {'xml:lang': 'en', 'type': 'literal', 'value': 'Germany'}}

get_city_wikidata('San Francisco', 'America')

#{'city': {'type': 'uri', 'value': 'http://www.wikidata.org/entity/Q62'},
# 'population': {'datatype': 'http://www.w3.org/2001/XMLSchema#decimal',
#  'type': 'literal',
#  'value': '805235'},
# 'country': {'type': 'uri', 'value': 'http://www.wikidata.org/entity/Q30'},
# 'cityLabel': {'xml:lang': 'en', 'type': 'literal', 'value': 'San Francisco'},
# 'countryLabel': {'xml:lang': 'en',
#  'type': 'literal',
#  'value': 'United States of America'}}
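
Because the query is built with plain string formatting, a city name containing an apostrophe would end the single-quoted literal early. Below is a minimal sketch of one way to guard against that, assuming the same query shape as above and relying on the backslash escapes that SPARQL allows inside string literals. The names escape_sparql_literal and build_city_query are hypothetical helpers, not part of qwikidata.

```python
QUERY_TEMPLATE = """
SELECT ?city ?cityLabel ?country ?countryLabel ?population
WHERE
{
  ?city rdfs:label '%s'@en.
  ?city wdt:P1082 ?population.
  ?city wdt:P17 ?country.
  ?city rdfs:label ?cityLabel.
  ?country rdfs:label ?countryLabel.
  FILTER(LANG(?cityLabel) = "en").
  FILTER(LANG(?countryLabel) = "en").
  FILTER(CONTAINS(?countryLabel, "%s")).
}
"""

def escape_sparql_literal(text):
    """Hypothetical helper: backslash-escape characters that would break a quoted literal."""
    return (text.replace('\\', '\\\\')   # escape backslashes first
                .replace("'", "\\'")
                .replace('"', '\\"'))

def build_city_query(city, country):
    """Return the query text only; sending it is left to get_city_wikidata."""
    return QUERY_TEMPLATE % (escape_sparql_literal(city),
                             escape_sparql_literal(country))

print(build_city_query("Coeur d'Alene", 'America'))
```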

Both approaches return dictionaries from which you can extract the information you need using basic Python.
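
As a small illustration, here is how the population could be pulled out of each result shape. The two dictionaries are abridged copies of the Berlin outputs shown above, and the sketch assumes the WikiData value is a whole-number string, as it is in those outputs.

```python
# Sample results copied (abridged) from the Berlin outputs shown above.
opendata_result = {'city': 'berlin', 'country': 'de', 'population': 3398362}
wikidata_result = {
    'population': {'datatype': 'http://www.w3.org/2001/XMLSchema#decimal',
                   'type': 'literal',
                   'value': '3613495'},
    'cityLabel': {'xml:lang': 'en', 'type': 'literal', 'value': 'Berlin'},
}

# OpenDataSoft: the population is already a number under a top-level key.
opendata_population = opendata_result['population']

# WikiData: each binding is a nested dict and the value arrives as a string.
wikidata_population = int(wikidata_result['population']['value'])

print(opendata_population, wikidata_population)
```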

Hope that helps!

Jess answered 28/11, 2019 at 18:16 Comment(3)
Is it possible to use full country names instead of the abbreviations ('de', 'en', ...) for solution 1 (OpenDataSoft)? I only have the cities and the countries as full names. - Gullet
Could you also give me a hint on how you built the dynamic link? I could not find any information on that website about how to define your own request URL. They always refer to their API. - Gullet
I don't know much about OpenDataSoft; the API is explained (a little) here: help.opendatasoft.com/apis/ods-search-v1/#dataset-search-api My example query returns only "de" and not "Germany" or "Deutschland", so apparently those cannot be used in the request. Open in browser: public.opendatasoft.com/api/records/1.0/search/… Some useful components of the API/address string: &q=berlin (full-text search), facet=country&refine.country=de (country filter), &facet=city&refine.city=berlin (city filter) - Jess
from urllib.request import urlopen
import json
import pycountry
import requests
from geopy.geocoders import Nominatim


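def get_city_opendata(city, country):
    # Search the worldcitiespop dataset, sorted by population and
    # restricted to a two-letter country code such as 'de' or 'us'.
    url = 'https://public.opendatasoft.com/api/records/1.0/search/'
    params = {'dataset': 'worldcitiespop', 'q': city, 'sort': 'population',
              'facet': 'country', 'refine.country': country}
    res = requests.get(url, params=params)  # requests URL-encodes the values
    dct = res.json()
    out = dct['records'][0]['fields']
    return out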


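def getcode(cc):
    # Map full country names (e.g. 'Germany') to two-letter codes (e.g. 'DE')
    countries = {country.name: country.alpha_2 for country in pycountry.countries}

    # Returns None when cc is not an exact pycountry country name
    return countries.get(cc)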


def getplace(lat, lon):
    key = "PUT YOUR OWN GOOGLE API KEY HERE"  # replace with your own key
    url = "https://maps.googleapis.com/maps/api/geocode/json?"
    url += "latlng=%s,%s&sensor=false&key=%s" % (lat, lon, key)
    v = urlopen(url).read()
    j = json.loads(v)
    components = j['results'][0]['address_components']
    country = town = None
    for c in components:
        if "country" in c['types']:
            country = c['long_name']
        if "postal_town" in c['types']:
            town = c['long_name']

    return town, country


address = input('Input an address or town name\t')
geolocator = Nominatim(user_agent="Your_Name")
# Step 1: place name -> coordinates (geocode returns None if nothing is found)
location = geolocator.geocode(address)


towncountry = getplace(location.latitude, location.longitude)
mycity = towncountry[0]
mycountry = towncountry[1]


print(towncountry)
print(mycountry)
print(mycity)
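# getcode returns None when the country name has no exact pycountry match,
# in which case the .lower() call below raises an AttributeError
mycccode = getcode(mycountry)
mycccode = mycccode.lower()
print(mycccode)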

populationdict = get_city_opendata(address, mycccode)


population = populationdict.get('population')
print('population',population)

print(location.address)
print((location.latitude, location.longitude))

I am very grateful for the previous answers; I had to solve this problem too. My code above builds on David's answer, which recommends the OpenDataSoft API. At the time of writing, the Google API does not appear to provide population results.

The code above can get the population of a city, but OpenDataSoft doesn't always return populations for towns.

My code combines code from a few answers to different questions that I found on Stack Overflow.

You will need to get a Google Maps developer API key and install the imported packages (pycountry, requests, geopy) with pip.

  1. First, the code gets the latitude and longitude of any place name
    based on user input.
  2. Then it uses those coordinates to get the country name from Google Maps.
  3. Then it uses the country name to look up the two-letter
    country code.
  4. Finally, it sends the place name and the two-letter code to OpenDataSoft to get the population.
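
The four steps can be sketched as a single function. To keep the sketch runnable without network access or API keys, each step is passed in as a callable. The name population_for_place and the stand-in lambdas are hypothetical; in real use you would pass thin wrappers around geolocator.geocode, getplace, getcode and get_city_opendata. The stand-in values are taken from the Berlin example earlier on this page.

```python
def population_for_place(place, geocode, reverse_geocode, country_code, fetch_city):
    """Hypothetical wrapper chaining the four steps; returns None if a lookup fails."""
    coords = geocode(place)                    # step 1: name -> (lat, lon)
    if coords is None:
        return None
    town, country = reverse_geocode(*coords)   # step 2: (lat, lon) -> (town, country)
    code = country_code(country)               # step 3: country name -> two-letter code
    if code is None:
        return None
    record = fetch_city(place, code.lower())   # step 4: name + code -> city record
    return record.get('population')

# Stand-in callables so the sketch runs offline.
population = population_for_place(
    'Berlin',
    geocode=lambda place: (52.516667, 13.4),
    reverse_geocode=lambda lat, lon: (None, 'Germany'),
    country_code=lambda name: {'Germany': 'DE'}.get(name),
    fetch_city=lambda city, code: {'city': 'berlin', 'country': code,
                                   'population': 3398362},
)
print(population)
```

Returning None early avoids the AttributeError that the script above would raise when the country name cannot be mapped to a code.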
Aton answered 1/4, 2021 at 23:47 Comment(0)
