US Census API - Get The Population of Every City in a State Using Python

I'm having trouble getting the population of every city in a specific state. I do get populations for cities, but when I sum them, the total doesn't match the population of the state.

I got an API key, used the P0010001 variable for total population and FIPS code 25 for the state of Massachusetts, and requested the population at the geography level "place", which I understand to mean city.

Here is the Python 3 code I used:

import urllib.request
import ast


class Census:
    def __init__(self, key):
        self.key = key

    def get(self, fields, geo, year=2010, dataset='sf1'):
        fields = [','.join(fields)]
        base_url = 'http://api.census.gov/data/%s/%s?key=%s&get=' % (str(year), dataset, self.key)
        query = fields
        for item in geo:
            query.append(item)
        add_url = '&'.join(query)
        url = base_url + add_url
        print(url)
        req = urllib.request.Request(url)
        response = urllib.request.urlopen(req)
        return response.read()

c = Census('<mykey>')
state = c.get(['P0010001'], ['for=state:25'])
# url: http://api.census.gov/data/2010/sf1?key=<mykey>&get=P0010001&for=state:25
county = c.get(['P0010001'], ['in=state:25', 'for=county:*'])
# url: http://api.census.gov/data/2010/sf1?key=<mykey>&get=P0010001&in=state:25&for=county:*
city = c.get(['P0010001'], ['in=state:25', 'for=place:*'])
# url: http://api.census.gov/data/2010/sf1?key=<mykey>&get=P0010001&in=state:25&for=place:*

# Cast result to list type
state_result = ast.literal_eval(state.decode('utf8'))
county_result = ast.literal_eval(county.decode('utf8'))
city_result = ast.literal_eval(city.decode('utf8'))

def count_pop_county():
    count = 0
    for item in county_result[1:]:
        count += int(item[0])
    return count

def count_pop_city():
    count = 0
    for item in city_result[1:]:
        count += int(item[0])
    return count

And here are the results:

print(state)
# b'[["P0010001","state"],\n["6547629","25"]]'

print('Total state population:', state_result[1][0])
# Total state population: 6547629

print('Population in all counties', count_pop_county())
# Population in all counties 6547629

print('Population in all cities', count_pop_city())
# Population in all cities 4615402

I'm reasonably sure that 'place' means city, e.g.:

# Get population of Boston (FIPS is 07000)
boston = c.get(['P0010001'], ['in=state:25', 'for=place:07000'])
print(boston)
# b'[["P0010001","state","place"],\n["617594","25","07000"]]'

What am I doing wrong or misunderstanding? Why is the sum of populations by place not equal to the population of the state?

List of example API calls

Dmso answered 8/3, 2015 at 23:38 Comment(1)
Some people live outside cities... - Patron

"if I sum the population in every city I don't get the same number as the population of the state."

That's because not everybody lives in a city -- there are rural "unincorporated areas" in many counties that are not part of any city, and people do live there.

So, this is not a programming problem!-)
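
The gap can be measured directly from the two responses in the question. The following is a minimal sketch, assuming each response has already been parsed into a list whose first row is the header, as in the question's output. The helper names `total_population` and `population_outside_places` are illustrative and not part of the Census API.

```python
def total_population(rows, field="P0010001"):
    """Sum one field over all data rows of a parsed API response.

    `rows` is a list of lists whose first row is the header,
    e.g. [["P0010001", "state"], ["6547629", "25"]].
    """
    header, data = rows[0], rows[1:]
    col = header.index(field)  # look the column up instead of assuming index 0
    return sum(int(row[col]) for row in data)


def population_outside_places(state_rows, place_rows, field="P0010001"):
    """Residents counted in the state total but in none of the 'place' rows."""
    return total_population(state_rows, field) - total_population(place_rows, field)


# Rows copied from the question's output. Only Boston is listed here,
# so the result is the population outside Boston, not outside all places.
state_rows = [["P0010001", "state"], ["6547629", "25"]]
place_rows = [["P0010001", "state", "place"], ["617594", "25", "07000"]]
print(population_outside_places(state_rows, place_rows))
```

With the full place list from the question, the same subtraction gives 6547629 - 4615402 = 1932227 people who are counted in the state but in no place.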

Esteban answered 8/3, 2015 at 23:43 Comment(5)
So this is an API question. Do you know what geography parameter would yield populations for these unincorporated areas? - Dmso
@Delicious, I believe you need to get the county pop then subtract the pop of cities within the county. At least, that's what I read between the lines at census.gov/population/www/documentation/twps0082/twps0082.html -- however it's not a brand-new study, so for all I know the API might have added the functionality you seek (but if they have, I can't find it in their docs). - Esteban
This capability isn't available as of today. Even the available data are outdated (2010). - Swearword
@BastienBastiens, 2010 was the latest US census, and the next one is scheduled for 2020, so how is the 2010 census data "outdated"?! It is the official set of values used e.g. for Congressional districting purposes, etc., until some time after 2020. - Esteban
@AlexMartelli The data are by definition outdated because they are from 2010. I understand the last census was in 2010, but it would be really easy to create a more accurate estimate by extrapolating from 2010 and the previous census. If your city had 100K residents in 2000 and 200K in 2010, chances are that your city has around 250K residents in 2015. In that case, 250K in 2015 would be more accurate than using the 200K from the latest census. - Swearword
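
The extrapolation described in the last comment can be written out as a short calculation. This is a sketch that assumes growth continues along the straight line through the two census counts. `extrapolate_linear` is a hypothetical helper, and the figures are the commenter's illustrative example, not census data.

```python
def extrapolate_linear(year_a, pop_a, year_b, pop_b, target_year):
    """Extend the straight line through two (year, population) points."""
    per_year = (pop_b - pop_a) / (year_b - year_a)  # average yearly change
    return pop_b + per_year * (target_year - year_b)


# The commenter's example: 100K residents in 2000 and 200K in 2010.
estimate_2015 = extrapolate_linear(2000, 100_000, 2010, 200_000, 2015)
print(round(estimate_2015))
```

A straight line is only one possible model. A city that doubled in a decade may not keep adding the same number of residents each year, so the result is an estimate rather than a count.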

@Delicious -- the census has several levels of geographic division available. I'm not immediately sure where the data API stops (Census goes down to individual blocks, but I believe the API does not, for Human Subjects reasons), but census tracts, county subdivisions, and ZCTAs (ZIP Code Tabulation Areas -- basically ZIP codes for the map) would all cover geographic ranges and include unincorporated population at the sub-county level.

You can play with these various levels (and with a mapping tool) at the census data website: factfinder.census.gov --> Advanced Search.
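
To try another geography level with the question's approach, only the `for=` clause needs to change. The sketch below builds the request URL without sending it. The base URL and the `place` clause come from the question. The geography name `county subdivision`, and the assumption that `in=state:25` alone is a sufficient parent for it, are not confirmed anywhere in this thread and should be checked against the dataset's list of supported geographies. `build_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

BASE_URL = "http://api.census.gov/data/2010/sf1"  # endpoint used in the question


def build_url(key, fields, for_clause, in_clauses=()):
    """Build a query URL in the same parameter order as the question's code."""
    params = [("key", key), ("get", ",".join(fields))]
    params += [("in", clause) for clause in in_clauses]
    params.append(("for", for_clause))
    # Keep ':', '*' and ',' readable; spaces are encoded as '+'.
    return BASE_URL + "?" + urlencode(params, safe=":*,")


# Same request as the question's "place" query ('KEY' is a placeholder).
print(build_url("KEY", ["P0010001"], "place:*", ["state:25"]))
# Assumed geography name; verify it before relying on it.
print(build_url("KEY", ["P0010001"], "county subdivision:*", ["state:25"]))
```

If a level covers the whole state, its rows should sum to the state total, as the county rows do in the question.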

Misshapen answered 23/11, 2015 at 8:24 Comment(0)
