Merge data frame with SpatialPolygonsDataFrame

I want to merge a SpatialPolygonsDataFrame:

# From https://www.census.gov/geo/maps-data/data/cbf/cbf_state.html
library(rgdal)  # provides readOGR()
states <- readOGR(dsn = "./cb_2014_us_state_20m.shp",
                  layer = "cb_2014_us_state_20m", verbose = FALSE)

with a normal data frame:

my_counts <- data.frame(
  State = c(
    "CA", "TX", "IL", "FL", "NY", "OH",
    "NJ", "GA", "MI", "PA", "MA", "CO", "AZ", "NC", "VA", "WA", "IN",
    "MD", "MN", "WI", "MO", "TN", "IA", "KY", "LA", "SC", "CT", "AL",
    "KS", "OR", "OK", "AR", "NV", "UT", "NE", "ID", "MS", "DC", "NM",
    "NH", "ME", "AK", "RI", "MT", "HI", "WV", "SD", "ND", "DE", "VT",
    "WY", "PR", "GU", "VI", "MP", "AS", "na", "MH", "FM", "PW"
  ),
  count = c(
    1590533L, 1016328L, 754535L, 742603L, 714205L,
    538719L, 477278L, 452064L, 437162L, 428616L, 420332L, 391084L,
    380853L, 354601L, 342533L, 335505L, 294670L, 286026L, 273427L,
    246172L, 238968L, 236037L, 235030L, 209514L, 199013L, 191707L,
    185521L, 179931L, 163477L, 159862L, 142610L, 136006L, 120111L,
    117338L, 112671L, 106176L, 102564L, 100168L, 97496L, 69881L,
    69508L, 68684L, 65631L, 62109L, 61123L, 57300L, 57254L, 56091L,
    51696L, 33944L, 32136L, 4822L, 598L, 468L, 49L, 19L, 17L,
    11L, 2L, 1L
  )
)

The goal is to use the result to make a map with leaflet.

I tried sp::merge:

df1 <- sp::merge(x = states, y = my_counts)

but I get an error:

Error in table(y[, by.y]) : attempt to set an attribute on NULL
Trickery answered 25/8, 2015 at 22:19 Comment(1)
One more tip (since @bondeddust nailed the answer): use stringsAsFactors = FALSE in both the readOGR call and the data.frame creation to avoid potential factor/character issues as you manipulate the data. - Fedora

Caveat: I've never done this before, so I'm "feeling my way around". First, look at the structure of the states object.

Note: this was with rgdal_0.9-3 and sp_1.1-1 loaded under R 3.2.1 (and with GDAL installed on my OS X system, from KyngChaos, IIRC):

> str(states)
Formal class 'SpatialPolygonsDataFrame' [package "sp"] with 5 slots
  ..@ data       :'data.frame': 52 obs. of  9 variables:
  .. ..$ STATEFP : Factor w/ 52 levels "01","02","04",..: 5 9 10 11 13 14 16 18 19 21 ...
  .. ..$ STATENS : Factor w/ 52 levels "00068085","00294478",..: 22 17 2 18 27 28 29 30 16 19 ...
  .. ..$ AFFGEOID: Factor w/ 52 levels "0400000US01",..: 5 9 10 11 13 14 16 18 19 21 ...
  .. ..$ GEOID   : Factor w/ 52 levels "01","02","04",..: 5 9 10 11 13 14 16 18 19 21 ...
  .. ..$ STUSPS  : Factor w/ 52 levels "AK","AL","AR",..: 5 8 10 11 14 15 13 18 19 21 ...
  .. ..$ NAME    : Factor w/ 52 levels "Alabama","Alaska",..: 5 9 10 11 13 14 16 18 19 21 ...
  .. ..$ LSAD    : Factor w/ 1 level "00": 1 1 1 1 1 1 1 1 1 1 ...
  .. ..$ ALAND   : num [1:52] 4.03e+11 1.58e+08 1.39e+11 1.49e+11 2.14e+11 ...
  .. ..$ AWATER  : num [1:52] 2.05e+10 1.86e+07 3.14e+10 4.95e+09 2.40e+09 ...
  ..@ polygons   :List of 52
  .. ..$ :Formal class 'Polygons' [package "sp"] with 5 slots
  .. .. .. ..@ Polygons :List of 6
  .. .. .. .. ..$ :Formal class 'Polygon' [package "sp"] with 5 slots
  .. .. .. .. .. .. ..@ labpt  : num [1:2] -118.4 33.4
  .. .. .. .. .. .. ..@ area   : num 0.0259
  .. .. .. .. .. .. ..@ hole   : logi FALSE
#####   Snipped rest of output ............................

So after looking for help on merge and reading:

 ?merge   # and choosing the option for:

Merge a Spatial* object having attributes with a data.frame
(in package sp in library /Library/Frameworks/R.framework/Versions/3.2/Resources/library)

I decided to try the following (and it appears to have succeeded):

> newobj <- merge(states, my_counts, by.x="STUSPS", by.y="State")
Warning message:
In .local(x, y, ...) : 8 records in y cannot be matched to x

> names(newobj@data)
 [1] "STUSPS"   "STATEFP"  "STATENS"  "AFFGEOID" "GEOID"    "NAME"    
 [7] "LSAD"     "ALAND"    "AWATER"   "count"   

The warning makes sense: my_counts contains some extra "State" codes that were not anticipated by the authors of that "states" shapefile:

> length( table(my_counts$State))
[1] 60
> length( unique(states@data$STUSPS) )
[1] 52
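To see exactly which codes trigger that warning, one option is setdiff(). This is only a minimal sketch using small stand-in vectors rather than the real shapefile; the names shape_codes and count_codes and their contents are hypothetical:

```r
# Stand-ins for states$STUSPS and my_counts$State (hypothetical subsets)
shape_codes <- c("CA", "TX", "NY", "PR", "DC")
count_codes <- c("CA", "TX", "NY", "PR", "DC", "GU", "VI", "na")

# Codes present in the counts but absent from the shapefile attributes
unmatched <- setdiff(count_codes, shape_codes)
print(unmatched)
```

With the real objects, something along the lines of setdiff(as.character(my_counts$State), as.character(states$STUSPS)) should list the records that could not be matched.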

The moral

Check the column names of the two objects before merging, so that you know which columns to pass as by.x and by.y:

> names(states)
[1] "STATEFP"  "STATENS"  "AFFGEOID" "GEOID"    "STUSPS"   "NAME"     "LSAD"    
[8] "ALAND"    "AWATER"  

> names(my_counts)
[1] "State" "count"
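The two name listings share no column, which is why the default by = intersect(names(x), names(y)) had nothing to join on. A small sketch of that check, with plain data frames standing in for the real objects (states_df, counts_df and their few rows are made up for illustration; base merge() on data frames is used here, not the sp method):

```r
# Hypothetical stand-ins for states@data and my_counts
states_df <- data.frame(STUSPS = c("CA", "TX", "NY"),
                        NAME = c("California", "Texas", "New York"),
                        stringsAsFactors = FALSE)
counts_df <- data.frame(State = c("CA", "TX", "GU"),
                        count = c(1590533L, 1016328L, 598L),
                        stringsAsFactors = FALSE)

# No shared column names, so the default `by` would be empty
common <- intersect(names(states_df), names(counts_df))
print(length(common))

# Name the key columns explicitly; all.x = TRUE keeps every row of states_df
merged <- merge(states_df, counts_df,
                by.x = "STUSPS", by.y = "State", all.x = TRUE)
print(merged)
```

Rows of states_df without a matching code (NY in this toy data) end up with NA in count, and codes that exist only in counts_df (GU here) are dropped.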
Druci answered 26/8, 2015 at 2:33 Comment(2)
You can also work with the @data slot directly (not recommended unless you know what you're doing); the real key to that procedure is not to change the order of the rows or the rownames. - Fedora
Thanks for this answer! I was expecting this to be much more complicated. - Gorgoneion

Maybe you should add the argument "incomparables", as in the usage example:

merge(x, y, by = intersect(names(x), names(y)), by.x = by, by.y = by, all.x = TRUE, suffixes = c(".x", ".y"), incomparables = NULL, ...)

Devy answered 26/8, 2015 at 4:41 Comment(1)
You are missing the essential point: the default for by will NOT succeed. - Druci
