Format/round numerical legend labels in GeoPandas

I'm looking for a way to format or round the numerical legend labels in maps produced by the .plot() method in GeoPandas. For example:

gdf.plot(column='pop2010', scheme='QUANTILES', k=4, legend=True)

This gives me a legend with many decimal places:

[image: legend with many decimal places]

I want the legend label to be integers.

Fourchette answered 25/9, 2018 at 17:37 Comment(0)

As I recently encountered the same issue, and a solution does not appear to be readily available on Stack Overflow or other sites, I thought I would post the approach I took in case it is useful.

First, a basic plot using the geopandas world map:

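import geopandas
import matplotlib.pyplot as plt

# load world data set
# (geopandas.datasets was removed in GeoPandas 1.0; on newer versions, read a local copy of the data instead)
world_orig = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
world = world_orig[(world_orig['pop_est'] > 0) & (world_orig['name'] != "Antarctica")].copy()
world['gdp_per_cap'] = world['gdp_md_est'] / world['pop_est']

# basic plot; world.plot() returns a matplotlib Axes, not a Figure
# legend_kwds={'loc': 'lower left'} replaces setting the private leg._loc = 3
ax = world.plot(column='pop_est', figsize=(12,8), scheme='fisher_jenks',
                cmap='YlGnBu', legend=True, legend_kwds={'loc': 'lower left'})
plt.show()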

world map v1

The method relies on the get_texts() method of the matplotlib.legend.Legend object. I iterate over the items in leg.get_texts(), split each label's text into its lower and upper bounds, build a new string with the formatting applied, and set it with the set_text() method.

# formatted legend
# legend_kwds={'loc': 'lower left'} replaces setting the private leg._loc = 3
ax = world.plot(column='pop_est', figsize=(12,8), scheme='fisher_jenks',
                cmap='YlGnBu', legend=True, legend_kwds={'loc': 'lower left'})
leg = ax.get_legend()

for lbl in leg.get_texts():
    # assumes label text of the form 'lower - upper'
    parts = lbl.get_text().split()
    lower, upper = parts[0], parts[2]
    new_text = f'{float(lower):,.0f} - {float(upper):,.0f}'
    lbl.set_text(new_text)

plt.show()

This is very much a 'trial and error' approach, so I wouldn't be surprised if there were a better way. Still, perhaps this will be helpful.

world map v2
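The splitting step above depends on the exact label text, which I believe differs between GeoPandas versions (for example '140.00 - 5000.00' versus '140.00, 5000.00', or '[ 74.47,  91.51]' when intervals are shown). As a rough sketch only, the parsing could be made less sensitive to the separator by pulling out the first two numbers with a regular expression. The helper name format_legend_label and its fmt and sep parameters are hypothetical, not part of GeoPandas or matplotlib.

```python
import re

# Matches plain or scientific-notation numbers such as 140.00 or 1.4e+09.
_NUMBER = re.compile(r'-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?')


def format_legend_label(text, fmt='{:,.0f}', sep=' - '):
    """Reformat the first two numbers found in a legend label.

    Hypothetical helper: the text is returned unchanged when it holds
    fewer than two numbers (for example a missing-data category).
    """
    numbers = _NUMBER.findall(text)
    if len(numbers) < 2:
        return text
    lower, upper = (float(n) for n in numbers[:2])
    return f'{fmt.format(lower)}{sep}{fmt.format(upper)}'


# Sample label strings in the formats assumed above.
samples = ['140.00 - 5000.00', ' 140.00, 5000.00', '[ 74.47,  91.51]']
formatted = [format_legend_label(s) for s in samples]
print(formatted)
```

Inside the loop, this would be used as lbl.set_text(format_legend_label(lbl.get_text())). It assumes the bounds are separated by spaces or punctuation, so a label written as '140-5000' with no spaces would be misread.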

Elsewhere answered 14/6, 2019 at 3:43 Comment(4)
Thanks buddy. btw, could you please use f-strings? - Fourchette
I finally got a chance to read the PySAL docs and post an updated solution. Please take a look. - Fourchette
The solution from steven below is perhaps more systematic, but I liked your solution as a small "fixup" that only modifies the final plot. In case anyone is trying this with subplots, e.g. fig, ax = plt.subplots(1, 1, figsize=(10,12)), use leg = ax.get_legend() to get the legend, not leg = fig.get_legend(). - Discounter
Forgot to say: I also had to remove leg._loc = 3, otherwise I would get a ValueError: too many values to unpack (expected 2). However, with fig = world.plot(..., legend=True, legend_kwds={'loc': 'lower right'}) it works. - Discounter

Method 1

According to the GeoPandas changelog, since version 0.8.0 (June 24, 2020) you can pass fmt in legend_kwds to format the legend labels. For example, if you want no decimal places, set fmt='{:.0f}', just as you would format a number in an f-string. Here's an example for a quantiles map:

import matplotlib.pyplot as plt
import numpy as np
import mapclassify
import geopandas as gpd

gdf = gpd.read_file(
    gpd.datasets.get_path('naturalearth_lowres')
)
np.random.seed(0)
gdf = gdf.assign(
    random_col=np.random.normal(100, 10, len(gdf))
)

# plot quantiles map
fig, ax = plt.subplots(figsize=(10, 10))
gdf.plot(
    column='random_col',
    scheme='quantiles', k=5, cmap='Blues',
    legend=True,
    legend_kwds=dict(fmt='{:.0f}', interval=True),
    ax=ax
)

This gives us: [image: quantiles map with integer legend labels]
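To pick a format string before plotting, it can help to preview it on a few values in plain Python. This is only an illustration of str.format templates like the one passed as fmt above; the preview helper is hypothetical, and the sample values are the first three upper bounds from the quantiles example plus one large population-sized number.

```python
# First three upper class bounds from the quantiles example.
bounds = [91.51435701, 97.92957441, 103.83406507]


def preview(fmt, values):
    """Apply a str.format-style template to each value (hypothetical helper)."""
    return [fmt.format(v) for v in values]


no_decimals = preview('{:.0f}', bounds)         # no decimal places
one_decimal = preview('{:.1f}', bounds)         # one decimal place
thousands = preview('{:,.0f}', [1379302771.0])  # thousands separators

print(no_decimals)
print(one_decimal)
print(thousands)
```

Whichever template looks right can then be tried as legend_kwds=dict(fmt=...) in the plot call.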


Method 2

In fact, GeoPandas uses PySAL's mapclassify to compute the classes and generate map legends. For the quantiles map (k=5) above, we can get the classification via mapclassify.Quantiles().

mapclassify.Quantiles(gdf.random_col, k=5)

The call returns a mapclassify.classifiers.Quantiles object:

Quantiles               

    Interval       Count
------------------------
[ 74.47,  91.51] |    36
( 91.51,  97.93] |    35
( 97.93, 103.83] |    35
(103.83, 109.50] |    35
(109.50, 123.83] |    36

The object has an attribute bins, a NumPy array containing the upper bounds of all classes.

array([ 91.51435701,  97.92957441, 103.83406507, 109.49954895,
       123.83144775])

Thus, we can use this attribute to get all the class bounds, since the upper bound of a lower class equals the lower bound of the next higher class. The only one missing is the lower bound of the lowest class, which equals the minimum value of the column you are classifying in your DataFrame. Here's an example that rounds all numbers to integers:

# get all upper bounds
upper_bounds = mapclassify.Quantiles(gdf.random_col, k=5).bins
# insert minimal value in front to get all bounds
bounds = np.insert(upper_bounds, 0, gdf.random_col.min())
# format the numerical legend here
intervals = [
    f'{bounds[i]:.0f}-{bounds[i+1]:.0f}' for i in range(len(bounds)-1)
]

# get all the legend labels
legend_labels = ax.get_legend().get_texts()
# replace the legend labels
for interval, legend_label in zip(intervals, legend_labels):
    legend_label.set_text(interval)

We will eventually get: [image: quantiles map with custom legend labels]
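If the same labelling is needed for several maps, the bounds-to-labels step could be wrapped in a small function. This is a sketch under assumptions: interval_labels and its fmt and sep parameters are made-up names, and the numbers are the bins printed above with the minimum taken from the rounded table (74.47).

```python
def interval_labels(minimum, upper_bounds, fmt='{:.0f}', sep='-'):
    """Build one label per class from the minimum and the upper bounds."""
    bounds = [minimum, *upper_bounds]
    return [
        f'{fmt.format(lo)}{sep}{fmt.format(hi)}'
        for lo, hi in zip(bounds[:-1], bounds[1:])
    ]


# Upper bounds as printed by .bins above; the minimum is from the table.
bins = [91.51435701, 97.92957441, 103.83406507, 109.49954895, 123.83144775]
labels = interval_labels(74.47, bins)
print(labels)
```

The resulting list can be applied with set_text() exactly as in the loop above, and a different fmt or sep changes every label in one place.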

As you can see, since we are working at a lower level, we can customize how the legend labels look, such as removing the brackets and using a - in the middle.


Method 3

In addition to GeoPandas' .plot() method, you can also consider the .choropleth() function offered by geoplot, in which you can easily use different schemes and numbers of classes while passing a legend_labels argument to modify the legend labels. For example,

import geopandas as gpd
import geoplot as gplt

gdf = gpd.read_file(
    gpd.datasets.get_path('naturalearth_lowres')
)

legend_labels = [
    '< 2.4', '2.4 - 6', '6 - 15', '15 - 38', '38 - 140 M'
]
gplt.choropleth(
    gdf, hue='pop_est', cmap='Blues', scheme='quantiles',
    legend=True, legend_labels=legend_labels
)

which gives you

[image: geoplot choropleth with custom legend labels]

Fourchette answered 26/6, 2019 at 14:7 Comment(3)
Indeed. You need to change to the corresponding classification function in pysal. This geopandas doc explains. - Fourchette
Method 1 worked for me - thanks! I converted it to "percentiles", which meant removing the k parameter, and it works perfectly. It's one of those problems you think should be much easier to solve! Great workaround though. - Humperdinck
@Humperdinck please see the updated answer; I hadn't noticed the updates in geopandas before. - Fourchette
