Vectorization is the key difference between them. I will try to explain this point. R is a high-level, interpreted language: it takes care of many basic computer tasks for you. When you write
x <- 2.0
you don’t have to tell your computer that
- “2.0” is a floating-point number;
- “x” should store numeric-type data;
- it has to find a place in memory to put “2.0”;
- it has to register “x” as a pointer to a certain place in memory.
R figures all of this out by itself.
But this convenience comes at a price: R is slower than low-level languages.
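A minimal R session illustrating this run-time typing (variable names are just for illustration):

```r
# R infers the storage type at run time; no declarations are needed.
x <- 2.0
typeof(x)   # "double"

# The same name can later be rebound to a completely different type.
x <- "hello"
typeof(x)   # "character"
```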
In C or FORTRAN, most of this "test if" checking is done during compilation, not during program execution. These languages are translated into machine code after they are written but before they are run, which lets the compiler organize the machine code in a way that is optimal for the processor to execute.
What does this have to do with vectorization in R? Well, many R functions are actually written in a compiled language, such as C, C++, or FORTRAN, and have only a thin R "wrapper". This is the difference between your two approaches: a for loop adds further "test if" operations that the machine has to perform on the data at every iteration, making it slower.
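A quick, self-contained sketch of the difference (exact timings vary by machine, but the vectorized call is typically much faster because its loop runs in compiled C code; loop_sum is just an illustrative name):

```r
x <- runif(1e6)

# Explicit R-level loop: the interpreter re-checks types on every iteration.
loop_sum <- function(v) {
  s <- 0
  for (i in seq_along(v)) s <- s + v[i]
  s
}

system.time(loop_sum(x))  # interpreted loop
system.time(sum(x))       # vectorized; the loop lives in compiled C

all.equal(loop_sum(x), sum(x))  # same result, very different speed
```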
The same seems to hold for replicate. I think that is because the major part of the work is still done in R. Reimplementing the whole loop around mean in, e.g., C++ would probably speed things up quite a bit. See the benchmarks in my answer. – Alesandrini
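To make the comment about replicate concrete, here is a hedged sketch in plain base R (using colMeans as one way to keep the inner loop in compiled code; the sizes and names are illustrative, not from the original benchmarks):

```r
n_rep <- 1000  # number of simulated samples
n     <- 100   # size of each sample

# replicate() is a wrapper around sapply(), so the outer loop runs in R.
means_replicate <- replicate(n_rep, mean(rnorm(n)))

# Drawing all numbers at once and calling colMeans() keeps both loops
# in compiled code, which is usually noticeably faster.
means_vectorized <- colMeans(matrix(rnorm(n_rep * n), nrow = n))

length(means_replicate)   # 1000 sample means
length(means_vectorized)  # 1000 sample means
```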