Avoiding multiple for-loops in R to calculate a matrix

5

15

So in the course of generating some fake data to answer a map question, I found myself writing the following:

# Generate some fake data
lat <- seq(-90, 90, by = 5)
lon <- seq(-180, 180, by = 10)
phi <- matrix(0, nrow = length(lat), ncol = length(lon))
i <- 1
for (l1 in lat) {
    j <- 1
    for (l2 in lon) {
        phi[i, j] <- (sin(pi * l1 / 180) * cos(pi * l2 / 180))^2
        j <- j+1
    }
    i <- i+1
}
phi <- 1500*phi + 4500  # scale it properly

Now obviously those two central for-loops are not as R'ish as I would like. It seems like I should be able to get mapply or something similar to do the job, but sadly that returns a list and does not really do what I want. The other apply functions don't seem to do the right thing either.

What am I missing here?

Mensural answered 24/2, 2016 at 9:48 Comment(0)
17

You should try to use matrix algebra. No need to use any functions from the apply family:

lat <- seq(-90, 90, by = 5)
lon <- seq(-180, 180, by = 10)
1500 * tcrossprod(sin(pi * lat / 180), cos(pi * lon / 180))^2 + 4500
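
As a rough sketch of why this works, assuming only base R: tcrossprod(x, y) computes x %*% t(y), which for two plain vectors is their outer product, so entry [i, j] is sin(lat[i]) * cos(lon[j]) before squaring and scaling. The names phi_mat and phi_loop below are introduced here only for the comparison and are not from the original posts.

```r
lat <- seq(-90, 90, by = 5)
lon <- seq(-180, 180, by = 10)

# Matrix-algebra version, as in the answer above
phi_mat <- 1500 * tcrossprod(sin(pi * lat / 180), cos(pi * lon / 180))^2 + 4500

# Reference version with explicit loops (phi_loop is a name used only here)
phi_loop <- matrix(0, nrow = length(lat), ncol = length(lon))
for (i in seq_along(lat)) {
  for (j in seq_along(lon)) {
    phi_loop[i, j] <- 1500 * (sin(pi * lat[i] / 180) * cos(pi * lon[j] / 180))^2 + 4500
  }
}

dim(phi_mat)                  # one row per latitude, one column per longitude
all.equal(phi_mat, phi_loop)  # compares the two results with a numeric tolerance
```

The comparison uses all.equal() rather than identical() because the two versions may order the floating-point operations differently.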
Locomobile answered 24/2, 2016 at 10:2 Comment(4)
The original was here: #35592766 - I gave you a byline. – Mensural
Cheers! Must say the speed tests below were quite surprising. – Locomobile
As well as the interest in the question. Sure floored me. – Mensural
While I still like this answer a lot, and used it in my problem, I do now note that the outer solution below is actually more general. – Mensural
10

You can use outer:

x <- outer(lat, lon, FUN = function(x, y) (sin(pi * x / 180) * cos(pi * y / 180))^2)
identical(x * 1500 + 4500, phi)
# [1] TRUE

NBATrends's answer (tcrossprod) seems to be faster than the other solutions. Here is a benchmark:

library(microbenchmark)
# df comes from the expand.grid answer; l1 and l2 come from the %*% answer
microbenchmark(
  within(df, {
    phi <- (sin(pi * lat / 180) * cos(pi * lon / 180))^2
    phi <- 1500*phi + 4500
  }),
  1500 * tcrossprod(sin(pi * lat / 180), cos(pi * lon / 180))^2 + 4500,
  outer(lat, lon, FUN = function(x,y) {(sin(pi * x/180) * cos(pi * y /180))^2}),
  ((as.matrix(l1)%*%t(as.matrix(l2)))^2) * 1500 + 4500
)
Unit: microseconds
                                                                                              expr     min       lq      mean   median       uq     max neval
 within(df, {     phi <- (sin(pi * lat/180) * cos(pi * lon/180))^2     phi <- 1500 * phi + 4500 }) 255.670 262.0095 270.50948 266.6880 277.7060 385.467   100
                                  1500 * tcrossprod(sin(pi * lat/180), cos(pi * lon/180))^2 + 4500  11.471  12.3770  22.30177  12.9805  13.5850 868.130   100
               outer(lat, lon, FUN = function(x, y) {     (sin(pi * x/180) * cos(pi * y/180))^2 }) 137.645 139.7590 144.39520 141.5700 145.1925 179.905   100
                                            ((as.matrix(l1) %*% t(as.matrix(l2)))^2) * 1500 + 4500  16.301  17.6595  20.20390  19.6215  20.5270  80.294   100
Luddite answered 24/2, 2016 at 9:59 Comment(2)
By a lot too. Interesting. – Mensural
Now that I have had time to mull things over, this answer (using outer) is in many ways the better and more general answer, since we can put in an arbitrary function of x and y, whereas crossprod only handles functions that are a multiplicative product. crossprod is a lot faster where it applies though, and for my particular problem crossprod is a great fit, so I won't change the accepted answer, and will leave it at this note. – Mensural
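
To illustrate the generality point, here is a small sketch using a function that cannot be written as a product of a latitude term and a longitude term. The function dist_deg is purely hypothetical and chosen only for illustration. Note that outer() calls FUN once on expanded vectors, so FUN has to be vectorised; Vectorize() can wrap a function that is not.

```r
lat <- seq(-90, 90, by = 5)
lon <- seq(-180, 180, by = 10)

# Hypothetical example function: not a product of a lat term and a lon term
dist_deg <- function(x, y) sqrt(x^2 + y^2)

# Entry [i, j] of d is dist_deg(lat[i], lon[j])
d <- outer(lat, lon, FUN = dist_deg)

dim(d)                                     # length(lat) rows, length(lon) columns
all.equal(d[3, 5], dist_deg(lat[3], lon[5]))  # spot-check a single entry
```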
7

Linear algebra might be simpler for your application, because you are just forming the outer product of two vectors, v u^T, whose (i, j) entry is v[i] * u[j]. In R, matrix multiplication is %*%.

lat <- seq(-90, 90, by = 5)
lon <- seq(-180, 180, by = 10)

l1 <- sin(pi * lat / 180)
l2 <- cos(pi * lon / 180)

# compute the matrix
phi <- as.matrix(l1)%*%t(as.matrix(l2))
# square each element of the matrix
phi <- phi^2
# scale properly
phi <- 1500*phi + 4500
Scimitar answered 24/2, 2016 at 10:8 Comment(0)
5

Why be attached to the matrix structure and use apply when you can vectorise?

df <- expand.grid(lat = seq(-90, 90, by = 5),
                 lon = seq(-180, 180, by = 10))
df <- within(df, {
  phi <- (sin(pi * lat / 180) * cos(pi * lon / 180))^2
  phi <- 1500*phi + 4500
  })

You can always convert back using the instructions here.
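
The linked instructions are not reproduced here. One possible way to get back to a matrix, assuming the row order produced by expand.grid() (its first argument varies fastest), is to refill a matrix column by column. The name phi_mat is introduced only for this sketch.

```r
lat <- seq(-90, 90, by = 5)
lon <- seq(-180, 180, by = 10)

df <- expand.grid(lat = lat, lon = lon)
df <- within(df, {
  phi <- (sin(pi * lat / 180) * cos(pi * lon / 180))^2
  phi <- 1500 * phi + 4500
})

# expand.grid() varies lat fastest and matrix() fills column by column,
# so rows correspond to lat and columns to lon
phi_mat <- matrix(df$phi, nrow = length(lat), ncol = length(lon))

dim(phi_mat)
```

If the data frame has been reordered in the meantime, it would need to be sorted by lon and then lat before reshaping.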

Wooley answered 24/2, 2016 at 9:59 Comment(0)
4

Using sapply(), though I would prefer the outer() solution:

#using sapply
phi_1 <- 
  t(
    sapply(lat, function(l1)
      sapply(lon, function(l2)(sin(pi * l1 / 180) * cos(pi * l2 / 180))^2))
  ) * 1500 + 4500

#compare result
identical(phi_1, phi)
# [1] TRUE
Palliate answered 24/2, 2016 at 10:11 Comment(0)
