K-means clustering of spatially constrained data - skater in spdep package
I want to cluster the codebook from a self-organizing map using k-means clustering. However, given the 'spatial' nature of the data, I want to constrain the clustering so that only contiguous nodes are clustered together. After looking around, I decided to try the skater function in the spdep package.

Here's an example of what I've been doing.

# the 'codebook' data obtained from the self-organizing map. 
# My grid is 15 by 15 nodes. 
data <- data.frame(var1=rnorm(15*15, mean = 0, sd = 1), var2=rnorm(15*15, mean = 5, sd = 2))

# creating a matrix with all edges listed 
# (so basically one row to show a connection between each pair of adjacent nodes) 
require(spdep)
nbs <- cell2nb(nrow=15, ncol=15)

n_nodes <- 15 * 15  # total number of nodes in the 15 by 15 grid
edges <- data.frame(node=rep(1:n_nodes, each=4))
edges$nb <- NA
for (i in 1:n_nodes) {
   # a rook neighbourhood has at most 4 neighbours; missing ones become NA
   vals <- nbs[[i]][1:4]
   edges$nb[(i-1)*4+1] <- vals[1]
   edges$nb[(i-1)*4+2] <- vals[2]
   edges$nb[(i-1)*4+3] <- vals[3]
   edges$nb[(i-1)*4+4] <- vals[4]
}
edges <- edges[which(!is.na(edges$nb)),]
# order each pair so that duplicated links can be dropped
edges$from <- apply(edges[c("node", "nb")], 1, min)
edges$to <- apply(edges[c("node", "nb")], 1, max)
edges <- edges[c("to", "from")]
edges <- edges[!duplicated(edges),]
edges <- as.matrix(edges)

I know the code above is clumsy (please bear with me). I tried using mstree(nb2listw(nbs))[,1:2], but it didn't list all the links. I'm not sure I fully understood what mstree was doing, so I created my matrix of edges manually.
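For reference, the same full edge list can be built more compactly from the nb object itself. This is only a sketch: the names pairs and edges_all are illustrative and not part of spdep. It also shows why mstree returns fewer rows: a minimum spanning tree over 225 nodes has 224 edges by definition, whereas the rook grid has many more links.

```r
library(spdep)

# 15 x 15 rook-contiguity grid, as in the question
nbs <- cell2nb(nrow = 15, ncol = 15)

# One row per (node, neighbour) pair; each link appears twice at this point
pairs <- do.call(rbind, lapply(seq_along(nbs), function(i) cbind(i, nbs[[i]])))

# Keep each undirected link once by requiring from < to
edges_all <- pairs[pairs[, 1] < pairs[, 2], , drop = FALSE]
colnames(edges_all) <- c("from", "to")

# 15 * 14 horizontal links plus 14 * 15 vertical links
nrow(edges_all)  # 420
```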

Then I tried passing this matrix to the skater function:

test <- skater(edges=edges, data=data, ncuts=5)

but I get the following error message:

Error in colMeans(data[id, , drop = FALSE]) :
  error in evaluating the argument 'x' in selecting a method for function 'colMeans': Error in data[id, , drop = FALSE] : subscript out of bounds

However, if I use the mstree edges, I don't get an error message but the results don't make sense at all.

test <- skater(edges=mstree(nb2listw(nbs))[,1:2], data=data, ncuts=5)
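A hedged sketch of the workflow shown in the spdep help page for skater, adapted to the grid above: the spanning tree is built from edge costs computed on the data with nbcosts, rather than from the default weights of nb2listw(nbs). One likely reason the call above gives odd results is that its tree ignores the data entirely. The names dat, lcosts, nb_w, mst and res are illustrative, and the random data here has no real cluster structure, so the groups are only a demonstration of the mechanics.

```r
library(spdep)

set.seed(1)
# Stand-in for the SOM codebook: 225 nodes, two variables
dat <- data.frame(var1 = rnorm(15 * 15, mean = 0, sd = 1),
                  var2 = rnorm(15 * 15, mean = 5, sd = 2))
nbs <- cell2nb(nrow = 15, ncol = 15)

# Cost of each link = dissimilarity between neighbouring nodes in the data
lcosts <- nbcosts(nbs, dat)

# Attach the costs as binary-style weights so they are used unscaled
nb_w <- nb2listw(nbs, glist = lcosts, style = "B")

# Minimum spanning tree over the contiguity graph (n - 1 edges)
mst <- mstree(nb_w)

# Prune the tree; the first two columns hold the edge endpoints
res <- skater(edges = mst[, 1:2], data = dat, ncuts = 5)

# Group label for each node
table(res$groups)
```

Because skater prunes a tree, the edges argument is expected to be a spanning tree rather than the full list of neighbour links, which would be consistent with the error obtained from the manually built matrix.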

Any help on this error message (or alternative suggestions as to how to do the spatially constrained clustering I would like to do) is much appreciated.

Mixie answered 20/7, 2015 at 17:51 Comment(1)
Your code doesn't run. Have you solved this problem? I'm also struggling with error messages and thin documentation. - Eliciaelicit
