I wrote a basic package for Automatic Differentiation in R, called madness. While it is primarily designed for the multivariate delta method, it should also be usable for computing gradients automatically. An example usage:
require(madness)
# the 'fit' is the Frobenius norm of Y - L*R
# with a penalty for strongly negative R.
compute_fit <- function(R, L, Y) {
  Rmad <- madness(R)
  Err <- Y - L %*% Rmad
  penalty <- sum(exp(-0.1 * Rmad))
  norm(Err, 'f') + penalty
}
set.seed(1234)
R <- array(runif(5*20),dim=c(5,20))
L <- array(runif(1000*5),dim=c(1000,5))
Y <- array(runif(1000*20),dim=c(1000,20))
ftv <- compute_fit(R,L,Y)
show(ftv)
show(val(ftv))
show(dvdx(ftv))
I get back the following:
class: madness
        d (norm((numeric - (numeric %*% R)), 'f') + (sum(exp((numeric * R)), na.rm=FALSE) + numeric))
 calc: ------------------------------------------------------------------------------------------------
                                                      d R
  val: 207.6 ...
 dvdx: 4.214 4.293 4.493 4.422 4.672 2.899 2.222 2.565 2.854 2.718 4.499 4.055 4.161 4.473 4.069 2.467 1.918 2.008 1.874 1.942 0.2713 0.2199 0.135 0.03017 0.1877 5.442 4.811
       5.472 5.251 4.674 1.933 1.62 1.79 1.902 1.665 5.232 4.435 4.789 5.183 5.084 3.602 3.477 3.419 3.592 3.376 4.109 3.937 4.017 3.816 4.2 1.776 1.784 2.17 1.975
       1.699 4.365 4.09 4.475 3.964 4.506 1.745 1.042 1.349 1.084 1.237 3.1 2.575 2.887 2.524 2.902 2.055 2.441 1.959 2.467 1.985 2.494 2.223 2.124 2.275 2.546 3.497
       2.961 2.897 3.111 2.9 4.442 3.752 3.939 3.694 4.326 0.9582 1.4 0.8971 0.8756 0.9019 2.399 2.084 2.005 2.154 2.491 ...
 varx: ...
[,1]
[1,] 207.6
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26] [,27]
[1,] 4.214 4.293 4.493 4.422 4.672 2.899 2.222 2.565 2.854 2.718 4.499 4.055 4.161 4.473 4.069 2.467 1.918 2.008 1.874 1.942 0.2713 0.2199 0.135 0.03017 0.1877 5.442 4.811
[,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35] [,36] [,37] [,38] [,39] [,40] [,41] [,42] [,43] [,44] [,45] [,46] [,47] [,48] [,49] [,50] [,51] [,52] [,53] [,54]
[1,] 5.472 5.251 4.674 1.933 1.62 1.79 1.902 1.665 5.232 4.435 4.789 5.183 5.084 3.602 3.477 3.419 3.592 3.376 4.109 3.937 4.017 3.816 4.2 1.776 1.784 2.17 1.975
[,55] [,56] [,57] [,58] [,59] [,60] [,61] [,62] [,63] [,64] [,65] [,66] [,67] [,68] [,69] [,70] [,71] [,72] [,73] [,74] [,75] [,76] [,77] [,78] [,79] [,80] [,81]
[1,] 1.699 4.365 4.09 4.475 3.964 4.506 1.745 1.042 1.349 1.084 1.237 3.1 2.575 2.887 2.524 2.902 2.055 2.441 1.959 2.467 1.985 2.494 2.223 2.124 2.275 2.546 3.497
[,82] [,83] [,84] [,85] [,86] [,87] [,88] [,89] [,90] [,91] [,92] [,93] [,94] [,95] [,96] [,97] [,98] [,99] [,100]
[1,] 2.961 2.897 3.111 2.9 4.442 3.752 3.939 3.694 4.326 0.9582 1.4 0.8971 0.8756 0.9019 2.399 2.084 2.005 2.154 2.491
Note that this is the derivative of a scalar with respect to the 5 x 20 matrix R, but it is flattened here into a vector (unfortunately a row vector).
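If you want the gradient back in the shape of R, you can reshape it yourself. A minimal sketch, assuming the flattened derivative is in column-major (vec) order, which matches base R's default:

# reshape the 1 x 100 gradient back into the 5 x 20 shape of R
# (assumes column-major / vec ordering of the flattened derivative)
grad_R <- array(dvdx(ftv), dim = dim(R))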
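As a sanity check that the gradient is correct, you can compare it against a numerical gradient. A quick sketch, assuming the numDeriv package is installed and the same column-major flattening; fit_fn is simply a plain (non-madness) restatement of the objective:

require(numDeriv)
# plain version of the objective, taking R as a flat vector
fit_fn <- function(r) {
  Rmat <- array(r, dim = c(5, 20))
  Err <- Y - L %*% Rmat
  norm(Err, 'f') + sum(exp(-0.1 * Rmat))
}
num_grad <- grad(fit_fn, as.numeric(R))
# should be close to zero
max(abs(num_grad - as.numeric(dvdx(ftv))))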