I have a vector and want to find the position of the first value that is greater than 100.
# Randomly generate a suitable vector
set.seed(0)
v <- sample(50:150, size = 50, replace = TRUE)
min(which(v > 100))
Most answers based on which() and max() are slow (especially for long vectors) because they iterate through the entire vector:
1. x > 100 evaluates every value in the vector to see if it matches the condition.
2. which() and max()/min() search all the indexes returned at step 1 and find the maximum/minimum.
Position() will only evaluate the condition until it encounters the first TRUE value and immediately return the corresponding index, without continuing through the rest of the vector.
# Randomly generate a suitable vector
v <- sample(50:150, size = 50, replace = TRUE)
Position(function(x) x > 100, v)
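The early exit can be observed by counting how often the predicate is called. This is a minimal sketch; the counter calls, the helper counting_pred and the vector v_demo are illustrative names, not part of the original answer.

```r
# Count how many elements the predicate actually inspects.
calls <- 0L
counting_pred <- function(x) {
  calls <<- calls + 1L  # illustrative counter, updated on every call
  x > 100
}

v_demo <- c(10, 50, 120, 30, 200)
first_idx <- Position(counting_pred, v_demo)

first_idx  # 3: the first element greater than 100
calls      # 3: elements 4 and 5 were never tested
```

Stopping early only limits the number of predicate calls. Each call is still an R-level function call, so this says nothing by itself about wall-clock speed.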
?Position says: "The current implementation is not optimized for performance." So I guess it also evaluates the whole vector. – Mccowyn
Check out which.max:
x <- seq(1, 150, 3)
which.max(x > 100)
# [1] 35
x[35]
# [1] 103
?which.max: 'However, match(FALSE, x) or match(TRUE, x) are typically preferred, as they do indicate mismatches.' => match(TRUE, x > 100) – Periodic
Just to mention, Hadley Wickham has implemented a function, detect_index, to do exactly this task in his purrr package for functional programming.
I recently used detect_index myself and would recommend it to anyone else with the same problem.
Documentation for detect_index can be found here: https://rdrr.io/cran/purrr/man/detect.html
purrr::detect_index(seq(1, 150, 3), function(x) x > 100)
Hadley's packages are optimized for readability, but certainly not for speed. – Flaming
As I need to perform a similar calculation many times within a loop, I was interested in which of the many answers provided in this thread would be most efficient.
TLDR:
Whether the first value appears early or late in a vector, which.max(v > 100) is the fastest solution to this problem.
Note, however, that if no entry in v exceeds 100, it will return 1; thus there may be cause for a wrapper that checks for this case:
SafeWhichMax <- function (v) {
  first <- which.max(v > 100)
  if (first == 1L && v[1] <= 100) NA else first
}
SafeWhichMax(100) # NA
SafeWhichMax(101) # 1
If a vector is very long and is not guaranteed to contain any TRUE results, match(TRUE, v > 100) may be quicker than which.max() with checks.
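The difference in no-match behaviour can be checked directly. A minimal sketch, using two hypothetical vectors v_none and v_hit that are not part of the benchmark:

```r
v_none <- c(10, 20, 30)       # hypothetical input with no value above 100

which.max(v_none > 100)       # 1, indistinguishable from a real hit at position 1
match(TRUE, v_none > 100)     # NA, which signals that nothing matched

v_hit <- c(10, 120, 30)       # hypothetical input with a hit at position 2
which.max(v_hit > 100)        # 2
match(TRUE, v_hit > 100)      # 2
```

Because match() already returns NA when nothing matches, it needs no extra check of the first element.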
library(microbenchmark)  # provides microbenchmark()

# Short vector:
v <- 0:105
microbenchmark(
  which.max(v > 100),
  SafeWhichMax(v),
  match(TRUE, v > 100),
  min(which(v > 100)),
  which(v > 100)[1],
  Position(function(x) x, v > 100),
  Position(function(x) x > 100, v),
  purrr::detect_index(v, function (x) x > 100)
)
Unit: microseconds
                                       mean    median
which.max(v > 100)                   24.112     23.80
SafeWhichMax(v)                      24.889     24.25
match(TRUE, v > 100)                 34.752     33.20
min(which(v > 100))                  25.506     25.20
which(v > 100)[1]                    25.320     24.90
Position(function(x) x, v > 100)   3231.783   3043.50
Position(function(x) x > 100, v)   3487.805   3314.75
purrr::detect_index               16436.579  16064.90
# Long vector, with late first occurrence of v > 100
v <- -10000:105
Unit: microseconds
                                       mean    median
which.max(v > 100)                   24.958     24.30
SafeWhichMax(v)                      25.456     24.90
match(TRUE, v > 100)                 37.680     37.85
min(which(v > 100))                  26.439     26.00
which(v > 100)[1]                    25.724     25.55
Position(function(x) x, v > 100)   3224.240   3036.50
Position(function(x) x > 100, v)   3389.538   3287.05
purrr::detect_index               17344.706  15283.35
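Timings only matter if the candidates return the same answer. A minimal sketch that checks this for the two benchmark vectors; first_over_100 is a hypothetical helper name introduced only for this check, and purrr is left out so the snippet needs no extra packages:

```r
# Collect the result of each base R approach for one input vector.
first_over_100 <- function(v) {
  c(
    which_max   = which.max(v > 100),
    match_true  = match(TRUE, v > 100),
    min_which   = min(which(v > 100)),
    which_first = which(v > 100)[1],
    position    = Position(function(x) x > 100, v)
  )
}

short_res <- first_over_100(0:105)
long_res  <- first_over_100(-10000:105)

# Every approach should report the position of the value 101.
stopifnot(all(short_res == match(101L, 0:105)))
stopifnot(all(long_res == match(101L, -10000:105)))
```

The check assumes at least one value exceeds 100. With no match the approaches diverge: which.max() gives 1, min(which()) gives Inf with a warning, and the others give NA.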
There are many solutions; another is:
x <- 90:110
which(x > 100)[1]
Assuming values is your vector.
firstGreaterThan <- NULL
for (i in seq_along(values)) {
  # Stop at the first element above the threshold
  if (values[i] > 100) {
    firstGreaterThan <- i
    break
  }
}
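The same loop can be wrapped in a reusable function. This is a sketch only: the name first_greater_than, its threshold argument, and the choices to return NA when nothing matches and to skip NA elements are assumptions, not part of the original answer.

```r
first_greater_than <- function(values, threshold = 100) {
  for (i in seq_along(values)) {
    # isTRUE() treats an NA comparison as "no match" instead of raising an error
    if (isTRUE(values[i] > threshold)) {
      return(i)  # stop at the first hit
    }
  }
  NA_integer_    # no element exceeded the threshold
}

first_greater_than(90:110)          # 12
first_greater_than(c(NA, 50, 150))  # 3
first_greater_than(c(1, 2, 3))      # NA
```

Returning NA_integer_ rather than NULL keeps the result the same length and type in every case, which is easier to handle when the function is called repeatedly in a loop.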
break – Immigration