It seems as though MathWorks has special-cased squares in its power function (unfortunately, it's all built-in closed source that we cannot see). In my testing on R2013b, it appears that .^, power, and realpow use the same algorithm. For squares, I believe they have special-cased it to be x.*x.
1.0x (4.4ms): @()x.^2
1.0x (4.4ms): @()power(x,2)
1.0x (4.5ms): @()x.*x
1.0x (4.5ms): @()realpow(x,2)
6.1x (27.1ms): @()exp(2*log(x))
For cubes, the story is different. They're no longer special-cased. Again, .^, power, and realpow all perform similarly, but much slower this time:
1.0x (4.5ms): @()x.*x.*x
1.0x (4.6ms): @()x.*x.^2
5.9x (26.9ms): @()exp(3*log(x))
13.8x (62.3ms): @()power(x,3)
14.0x (63.2ms): @()x.^3
14.1x (63.7ms): @()realpow(x,3)
Let's jump up to the 16th power to see how these algorithms scale:
1.0x (8.1ms): @()x.*x.*x.*x.*x.*x.*x.*x.*x.*x.*x.*x.*x.*x.*x.*x
2.2x (17.4ms): @()x.^2.^2.^2.^2
3.5x (27.9ms): @()exp(16*log(x))
7.9x (63.8ms): @()power(x,16)
7.9x (63.9ms): @()realpow(x,16)
8.3x (66.9ms): @()x.^16
So: .^, power, and realpow all run in constant time with regard to the exponent, unless the exponent is special-cased (-1 also appears to have been special-cased). Using the exp(n*log(x)) trick is also constant time with regard to the exponent, and faster. The only result I don't quite understand is why the repeated squaring is slower than the multiplication.
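One caveat about the exp(n*log(x)) trick is worth checking before relying on it: it only reproduces the real power for positive x, and even there it agrees with the multiplication only up to rounding. The snippet below is a minimal sketch of that check; the test values are arbitrary choices of mine, not the data used for the timings above.

```matlab
% Compare exp(n*log(x)) with explicit multiplication for a cube.
x = [0.5 1.5 2 10];          % arbitrary positive test values (assumption)
viaMul = x.*x.*x;            % explicit multiplication
viaLog = exp(3*log(x));      % the exp/log trick

% Largest relative difference: small, but generally not exactly zero.
relErr = max(abs(viaLog - viaMul) ./ abs(viaMul));

% For a negative input, log returns a complex number, so the result
% comes back complex (close to -8, with a tiny imaginary part).
y = exp(3*log(-2));

fprintf('max relative error: %g, isreal(y): %d\n', relErr, isreal(y));
```

So if x can contain negative values, or if bit-for-bit agreement with the multiplication matters, the trick is not a drop-in replacement.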
As expected, increasing the size of x by a factor of 100 increases the time similarly for all algorithms.
So, moral of the story? When using scalar integer exponents, always do the multiplication yourself. There's a whole lot of smarts in power and friends (the exponent can be floating point, a vector, etc.). The only exceptions are where MathWorks has done the optimization for you. In R2013b, these seem to be x.^2 and x.^(-1). Hopefully they'll add more as time goes on. But, in general, exponentiation is hard and multiplication is easy. In performance-sensitive code, I don't think you can go wrong by always typing x.*x.*x.*x. (Of course, in your case, follow Luis's advice and make use of the intermediate results within each term!)
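To illustrate what "make use of the intermediate results" can look like, here is a minimal sketch that builds the 16th power from stored squares, using 4 elementwise multiplications instead of 15. The variable names and the test input are my own assumptions, and I have not timed this variant, so measure it with timeit before assuming it wins.

```matlab
% Build x.^16 from intermediate results using only .* (4 multiplications).
x = linspace(0.5, 2, 1000).';   % hypothetical test input

x2  = x  .* x;     % x.^2
x4  = x2 .* x2;    % x.^4
x8  = x4 .* x4;    % x.^8
x16 = x8 .* x8;    % x.^16

% Sanity check against the built-in power; expect agreement up to rounding.
ref = x.^16;
maxRelErr = max(abs(x16 - ref) ./ abs(ref));
fprintf('max relative error vs x.^16: %g\n', maxRelErr);
```

The same idea carries over when several different powers of x appear in one expression: compute x2, x4, and so on once, then combine them in each term.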
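function powerTest(x)
% Time several ways of raising x to the 16th power and print them,
% fastest first, with each time shown relative to the fastest.
f{1} = @() x.*x.*x.*x.*x.*x.*x.*x.*x.*x.*x.*x.*x.*x.*x.*x;
f{2} = @() x.^2.^2.^2.^2;
f{3} = @() exp(16.*log(x));
f{4} = @() x.^16;
f{5} = @() power(x,16);
f{6} = @() realpow(x,16);

t = zeros(1, numel(f));      % preallocate the timing results
for i = 1:numel(f)
    t(i) = timeit(f{i});     % run time in seconds
end

[t, idxs] = sort(t);         % fastest first
fcns = f(idxs);
for i = 1:numel(fcns)
    fprintf('%.1fx (%.1fms):\t%s\n', t(i)/t(1), t(i)*1e3, func2str(fcns{i}));
end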
x.*x.*x.*x behaves strangely. I have tried x.*x.* ... .*x with varying numbers of "x" from 2 to 8, and the time increases more or less linearly. I would have expected bumps; for example, the "8" case (=> x.^2.^2.^2: three power operations) should take less time than "7" (=> more power operations) – Greeting