Here's the naive way,
all0 <- function(x, FUN)
    all(vapply(x, FUN, logical(1)))
and an R loop...
all1 <- function(x, FUN) {
    for (xi in x)
        if (!FUN(xi))
            return(FALSE)
    TRUE
}
...which can be compiled
library(compiler)
all1c <- cmpfun(all1)
...or written in C
library(inline)
allc <- cfunction(signature(x="list", fun="function"), "
    /* reusable call object fun(<element>) */
    SEXP call = PROTECT(lang2(fun, R_NilValue));
    int len = Rf_length(x);
    for (int i = 0; i < len; ++i) {
        /* slot the current element into the call and evaluate it */
        SETCADR(call, VECTOR_ELT(x, i));
        if (!LOGICAL(eval(call, R_GlobalEnv))[0]) {
            UNPROTECT(1);
            return Rf_ScalarLogical(FALSE);  /* exit at the first failure */
        }
    }
    UNPROTECT(1);
    return Rf_ScalarLogical(TRUE);")
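A quick sanity check that the four versions agree (just an illustrative example):
x <- list(NULL, NULL, 1)
stopifnot(!all0(x, is.null), !all1(x, is.null),
          !all1c(x, is.null), !allc(x, is.null),
          all0(list(NULL), is.null), allc(list(NULL), is.null))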
We need to measure the performance, so
library(microbenchmark)
The worst case would seem to be when the condition holds for every element, so the loop never exits early
n <- 100000
x0 <- x <- vector("list", n)
microbenchmark(all0(x, is.null), all1(x, is.null), all1c(x, is.null),
               allc(x, is.null))
## Unit: milliseconds
##              expr      min       lq   median       uq      max neval
##  all0(x, is.null) 47.48038 50.58960 52.34946 54.10116 61.94736   100
##  all1(x, is.null) 41.52370 44.40024 45.25135 46.68218 53.22317   100
## all1c(x, is.null) 33.76666 35.03008 35.71738 36.41944 45.37174   100
##  allc(x, is.null) 13.95340 14.43153 14.78244 15.94688 19.41072   100
so the C version is only about 2x as fast as the compiled R version -- there is still an R function call on each test, so we're only saving on the looping per se. The best case is when we exit immediately, which clearly shows the advantage of the loop, but then neither compilation nor C code helps us
x[[1]] <- FALSE
microbenchmark(all0(x, is.null), all1(x, is.null), all1c(x, is.null),
               allc(x, is.null))
## Unit: microseconds
##              expr       min         lq     median        uq       max neval
##  all0(x, is.null) 45376.760 45772.5020 46108.5795 46655.005 54242.687   100
##  all1(x, is.null)     1.566     1.9550     2.6335    12.015    14.177   100
## all1c(x, is.null)     1.367     1.7340     2.0345     9.359    17.438   100
##  allc(x, is.null)     1.229     1.6925     4.6955    11.628    23.378   100
Here's an intermediate case, which does not really contain any surprises -- the C loop is about 2x faster than the compiled R loop, so it gets there about 2x as quickly.
x <- x0
x[[length(x)/2]] <- FALSE
microbenchmark(all0(x, is.null), all1(x, is.null), all1c(x, is.null),
               allc(x, is.null))
## Unit: milliseconds
##              expr      min       lq    median        uq       max neval
##  all0(x, is.null) 46.85690 49.92969 51.045519 52.653137 59.445611   100
##  all1(x, is.null) 20.90066 21.92357 22.582636 23.077863 25.974395   100
## all1c(x, is.null) 16.51897 17.44539 17.825551 18.119202 20.535709   100
##  allc(x, is.null)  6.98468  7.18392  7.312575  8.290859  9.460558   100
Explicitly testing for NULL at the C level (VECTOR_ELT(x, i) == R_NilValue) is very fast, so C code that compares the value to NULL is about 100x faster than the corresponding R code. It would seem that an allNULL function might be worthwhile if speed were of the essence, but the case for a general-purpose C-level all doesn't seem so compelling. And of course the C code doesn't deal with NA or error conditions.
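As a rough sketch of what such an allNULL might look like (again using inline; the name is just illustrative, and error handling is omitted), the loop compares each element to R_NilValue directly:
allNULL <- cfunction(signature(x="list"), "
    int len = Rf_length(x);
    for (int i = 0; i < len; ++i)
        /* pointer comparison against R's NULL -- no R-level function call */
        if (VECTOR_ELT(x, i) != R_NilValue)
            return Rf_ScalarLogical(FALSE);
    return Rf_ScalarLogical(TRUE);")
Because the loop body never evaluates an R closure, this avoids the per-element call overhead that dominates allc.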
Comments:

An all_fast function in Rcpp? – Manheim

...an lapply that breaks its loop when the evaluated function on an element gives "TRUE" (or "FALSE" for the case of all)? Perhaps you could get some ideas from do_lapply and write something similar? – Genitive

...PROTECT(ans = allocVector(LGLSXP, 1)), save eval(R_fcall, rho) in a SEXP tmp, and if (LOGICAL(AS_LOGICAL(tmp))[0] == 0) { LOGICAL(ans)[0] = 0; break; }. In this case, though, I guess you can't handle ... arguments. – Genitive
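Picking up the Rcpp suggestion from the comments, a minimal sketch (the name all_fast is just the one mentioned above) might look like this:
library(Rcpp)
cppFunction('
bool all_fast(List x, Function fun) {
    for (R_xlen_t i = 0; i < x.size(); ++i)
        // evaluate fun on each element and stop at the first FALSE
        if (!as<bool>(fun(x[i])))
            return false;
    return true;
}')
This is more convenient to write than the inline version, but like allc it still calls back into R for every element, so it saves on the looping rather than on the per-element call.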