ggplot centered names on a map
R

4

19

I'm attempting to use ggplot2 and maps to plot the names of the counties in NY state. My approach was to find the means of latitude and longitude by county (I assume this is the center of the county but this may be faulty thinking) and then use geom_text to plot the names on the map. It's not behaving as I anticipated as it's plotting multiple names per county.

The outcome I'm looking for is that each county name is centered on its respective county.

In addition to solving the problem, I'd appreciate help understanding what's wrong with my thinking about ggplot.

Thank you in advance.

library(ggplot2); library(maps)

county_df <- map_data('county')  #mappings of counties by state
ny <- subset(county_df, region=="new york")   #subset just for NYS
ny$county <- ny$subregion
cnames <- aggregate(cbind(long, lat) ~ subregion, data=ny, FUN=mean)

p <- ggplot(ny, aes(long, lat, group=group)) +  geom_polygon(colour='black', fill=NA) 
p #p of course plots as expected

#now add some county names (3 wrong attempts)
p + geom_text(aes(long, lat, data = cnames, label = subregion, size=.5)) #not correct

#I said maybe I'm confusing it with the same names for different data sets
names(cnames) <-c('sr', 'Lo', 'La')
p + geom_text(Lo, La, data = cnames, label = sr, aes(size=.5)) #attempt 2
p + geom_text(aes(Lo, La, data = cnames, label = sr, size=.5)) #attempt 3
Robbierobbin answered 25/2, 2012 at 5:6 Comment(0)
B
33

Since you are creating two layers (one for the polygons and the second for the labels), you need to specify the data source and mapping correctly for each layer:

ggplot(ny, aes(long, lat)) +  
    geom_polygon(aes(group=group), colour='black', fill=NA) +
    geom_text(data=cnames, aes(long, lat, label = subregion), size=2)

Note:

  • Since long and lat occur in both data frames, you can use aes(long, lat) in the first call to ggplot. Any mapping you declare here is available to all layers.
  • By contrast, group exists only in ny and not in cnames, so you need to declare aes(group=group) inside the polygon layer rather than in the call to ggplot.
  • In the text layer, you need to move the data source outside the aes.

Once you've done that and the map plots, you'll realize that the midpoint is better approximated by the mean of the range (the center of each county's bounding box) than by the mean of the vertices, and that you should use a map coordinate system that respects the aspect ratio and projection:

cnames <- aggregate(cbind(long, lat) ~ subregion, data=ny, 
                    FUN=function(x)mean(range(x)))

ggplot(ny, aes(long, lat)) +  
    geom_polygon(aes(group=group), colour='black', fill=NA) +
    geom_text(data=cnames, aes(long, lat, label = subregion), size=2) +
    coord_map()
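
To see why the mean of the range can do better than the mean of the vertices, here is a small base-R sketch on a made-up polygon (the coordinates are invented for illustration and are not taken from the map data). A border that is sampled densely along one edge pulls the vertex mean toward that edge, while the bounding-box midpoint is unaffected:

```r
# Hypothetical unit square whose left edge is sampled much more densely,
# as can happen when a county border follows a winding river.
left   <- data.frame(long = 0, lat = seq(0, 1, length.out = 21))
others <- data.frame(long = c(1, 1), lat = c(1, 0))
poly   <- rbind(left, others)

# Mean of all vertices: dragged toward the densely sampled edge
vertex_mean <- colMeans(poly)

# Mean of the range: the center of the bounding box
bbox_midpoint <- sapply(poly, function(x) mean(range(x)))

vertex_mean
bbox_midpoint
```

The bounding-box midpoint is still only an approximation; for irregular shapes it can land near or even outside the border.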


Badmouth answered 25/2, 2012 at 5:25 Comment(2)
I think you've done what I asked and more. So I'm marking this thread as solved. Thank you. I'm still not happy with the placement of the names and now realize that I need a better approach to centering. Justin's approach looks interesting. I'm going to post another question on some improved centering techniques. - Robbierobbin
Even better for the midpoint is the centroid function in the geosphere package. Here's what I did instead of the aggregate function in this answer: cnames <- ddply(ia_pop, .(County, group), summarize, Centroid=centroid(cbind(long, lat))) and then split out the Centroid column like this: cnames$long <- cnames$Centroid[,1] and cnames$lat <- cnames$Centroid[,2] - Donelson
D
6

I know this is an old question that's been answered, but I wanted to add this in case anyone looks here for future help.

The maps package has the map.text function, which uses polygon centroids to place labels. Looking at its code, one can see that it uses the apply.polygon and centroid.polygon functions to find the centroids. These functions aren't visible when the package is loaded, but can still be accessed:

library(ggplot2); library(maps)

county_df <- map_data('county')  #mappings of counties by state
ny <- subset(county_df, region=="new york")   #subset just for NYS
ny$county <- ny$subregion
cnames <- aggregate(cbind(long, lat) ~ subregion, data=ny, FUN=mean)

# Use the map function to get the polygon data, then find the centroids
county_poly <- map("county", "new york", plot=FALSE, fill = TRUE)
county_centroids <- maps:::apply.polygon(county_poly, maps:::centroid.polygon)

# Create a data frame for graphing out of the centroids of each polygon
# with a non-missing name, since these are the major county polygons.
county_centroids <- county_centroids[!is.na(names(county_centroids))]
centroid_array <- Reduce(rbind, county_centroids)
dimnames(centroid_array) <- list(gsub("[^,]*,", "", names(county_centroids)),
                                 c("long", "lat"))
label_df <- as.data.frame(centroid_array)
label_df$county <- rownames(label_df)

p <- ggplot(ny, aes(long, lat, group=group)) + geom_polygon(colour='black', fill=NA) 

plabels <- geom_text(data=label_df, aes(label=county, group=county))
p + plabels
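
For anyone curious what a polygon centroid involves, here is a minimal base-R sketch of the standard area-weighted (shoelace) centroid for a single ring. The helper name polygon_centroid and the L-shaped test polygon are my own inventions, the sketch treats long/lat as planar coordinates, and I'm not claiming it matches the internals of maps:::centroid.polygon:

```r
# Area-weighted centroid of one simple polygon ring (hypothetical helper).
# x, y: vertex coordinates in order around the ring.
polygon_centroid <- function(x, y) {
  # Next vertex for each vertex, wrapping around to close the ring
  xn <- c(x[-1], x[1])
  yn <- c(y[-1], y[1])
  cross <- x * yn - xn * y   # shoelace terms
  area  <- sum(cross) / 2    # signed area of the polygon
  c(long = sum((x + xn) * cross) / (6 * area),
    lat  = sum((y + yn) * cross) / (6 * area))
}

# Invented L-shaped polygon
lx <- c(0, 2, 2, 1, 1, 0)
ly <- c(0, 0, 1, 1, 2, 2)

polygon_centroid(lx, ly)   # area-weighted centroid
c(mean(lx), mean(ly))      # plain vertex mean, for comparison
```

For this shape the vertex mean is (1, 1), while the area-weighted centroid is (5/6, 5/6), which sits closer to where most of the area is.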
Drabbet answered 18/5, 2015 at 14:24 Comment(2)
It would be slightly more helpful if the code included the line for what p is. Also I'm getting an error when it tries to add p + plabels: Error in eval(expr, envir, enclos) : object 'group' not found - Trefor
@henry-e Changes made, though long overdue on my part. - Drabbet
T
4

While I was trying to make a new stat, @tjebo pointed out to me that it would be an appropriate solution for this question. It's not on CRAN (yet) but lives on GitHub. (Disclaimer: I wrote ggh4x.)

For other people dealing with a similar problem, here is how that would work:

library(ggh4x)
#> Loading required package: ggplot2
#> Warning: package 'ggplot2' was built under R version 4.0.2
library(maps)

county_df <- map_data('county')
ny <- subset(county_df, region=="new york")
ny$county <- ny$subregion


ggplot(ny, aes(x = long, y = lat, group = group)) +  
  geom_polygon(colour='black', fill=NA) +
  stat_midpoint(aes(label = subregion), geom = "text",size=3) +
  coord_map()

Created on 2020-07-06 by the reprex package (v0.3.0)

Turcotte answered 6/7, 2020 at 21:32 Comment(0)
G
0

It sorta seems like kmeans centers would be useful... Here is a poor start... it's late!

library(plyr)  # provides ddply
center.points <- ddply(ny, .(group, county), function(df) {
  ctr <- kmeans(df[, c("long", "lat")], centers = 1)$centers  # a single center is just the vertex mean
  data.frame(long = ctr[1, 1], lat = ctr[1, 2])
})
p + geom_text(data = center.points, aes(label = county))
Glace answered 25/2, 2012 at 5:24 Comment(1)
Don't look at it... it's hideous! - Glace
