I need to calculate f(x) = exp(A*x) repeatedly for a tiny, variable column vector x and a huge, constant matrix A (many rows, few columns). In other words, the x are few, but the A*x are many. My problem dimensions are such that A*x takes about as much runtime as the exp() part.
Apart from a Taylor expansion and pre-calculating a range of values exp(y) (assuming the range y of values of A*x is known), neither of which I have managed to speed up considerably (while maintaining accuracy) compared to what MATLAB does on its own, I am thinking about restating the problem analytically so that some values can be precalculated.
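For reference, a minimal sketch of the lookup-table idea (my own illustration rather than the exact code I benchmarked; the bounds ymin/ymax and the table size n are assumed for the example, and the nearest-neighbour lookup trades accuracy for speed):

% Precompute once (ymin, ymax, n are illustrative assumptions)
ymin = -10; ymax = 10; n = 1e5;
ygrid  = linspace(ymin, ymax, n).';
expTab = exp(ygrid);
% Inside the loop: nearest-neighbour table lookup instead of exp()
y   = full(A * x);
idx = round((y - ymin) / (ymax - ymin) * (n - 1)) + 1;
idx = min(max(idx, 1), n);          % clamp out-of-range values
fy  = expTab(idx);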
For example, I find that exp(A*x)_i = exp(\sum_j A_{ij} x_j) = \prod_j exp(A_{ij} x_j) = \prod_j exp(A_{ij})^{x_j}.
This would allow me to precalculate exp(A) once, but the required exponentiation inside the loop is as costly as the original exp() call, and the multiplications (\prod) have to be carried out in addition.
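A sketch of what this restatement looks like in MATLAB (assuming exp(A) fits in memory as a full matrix, which at 26873856 by 81 doubles is itself a big ask; as noted in the comments below, this variant turned out to be much slower in practice):

expA = exp(full(A));                          % precompute once
% Inside the loop: column-wise powers, then row-wise product,
% mathematically equal to exp(A*x)
fy = prod(bsxfun(@power, expA, x'), 2);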
Is there any other idea that I could follow, or solutions within MATLAB that I may have missed?
Edit: some more details
A is 26873856 by 81 in size (yes, it's that huge), so x is 81 by 1. nnz(A) / numel(A) is 0.0012, and nnz(A*x) / numel(A*x) is 0.0075.
I already use a sparse matrix to represent A; however, exp() of a sparse matrix is no longer sparse. So in fact I store x non-sparse and calculate exp(full(A*x)), which turned out to be as fast/slow as full(exp(A*x)) (I think A*x is non-sparse anyway, since x is non-sparse). exp(full(A*sparse(x))) is a way to get a sparse A*x, but it is slower. Even slower variants are exp(A*sparse(x)) (with doubled memory impact, since the result is a non-sparse matrix stored as type sparse) and full(exp(A*sparse(x))) (which again yields a non-sparse result).
sx = sparse(x);
tic, for i = 1 : 10, exp(full(A*x)); end, toc
tic, for i = 1 : 10, full(exp(A*x)); end, toc
tic, for i = 1 : 10, exp(full(A*sx)); end, toc
tic, for i = 1 : 10, exp(A*sx); end, toc
tic, for i = 1 : 10, full(exp(A*sx)); end, toc
Elapsed time is 1.485935 seconds.
Elapsed time is 1.511304 seconds.
Elapsed time is 2.060104 seconds.
Elapsed time is 3.194711 seconds.
Elapsed time is 4.534749 seconds.
Yes, I do calculate the element-wise exp(); I have updated the above equation to reflect that.
One more edit: I tried to be smart by exploiting the sparsity of A*x (since exp(0) = 1, only the nonzero entries need an actual exp() call), but with little success:
tic, for i = 1 : 10, B = exp(A*x); end, toc
tic, for i = 1 : 10, C = 1 + full(spfun(@(x) exp(x) - 1, A * sx)); end, toc
tic, for i = 1 : 10, D = 1 + full(spfun(@(x) exp(x) - 1, A * x)); end, toc
tic, for i = 1 : 10, E = 1 + full(spfun(@(x) exp(x) - 1, sparse(A * x))); end, toc
tic, for i = 1 : 10, F = 1 + spfun(@(x) exp(x) - 1, A * sx); end, toc
tic, for i = 1 : 10, G = 1 + spfun(@(x) exp(x) - 1, A * x); end, toc
tic, for i = 1 : 10, H = 1 + spfun(@(x) exp(x) - 1, sparse(A * x)); end, toc
Elapsed time is 1.490776 seconds.
Elapsed time is 2.031305 seconds.
Elapsed time is 2.743365 seconds.
Elapsed time is 2.818630 seconds.
Elapsed time is 2.176082 seconds.
Elapsed time is 2.779800 seconds.
Elapsed time is 2.900107 seconds.
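For completeness, a sketch of the same exp(0) = 1 idea without spfun (not among the timings above; it assumes A is sparse and x is the dense 81-by-1 vector):

Ax = A * sparse(x);            % sparse product, ~0.75 % nonzeros
y  = ones(size(Ax, 1), 1);     % exp(0) = 1 for all zero entries
nz = find(Ax);                 % positions of the nonzeros
y(nz) = exp(nonzeros(Ax));     % exp() only where A*x is nonzero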
Comments:
A = rand(10000, 10). Using expA = exp(A); prod(bsxfun(@power, expA, x'), 2) is indeed much slower. Doing exp(A*x) in a mex function doesn't help much either, even using single precision. What is the size of A for your usage? – Fortepiano
x is a scalar and A a vector. I do not see any way how that can be sped up. – Leontine
x changes but A doesn't? Is this going in the steps of compressive sensing? – Calvo
y = exp(A*x) .* x. Right or misunderstood? – Hexapla