Find position of first value greater than X in a vector
B

7

33

I have a vector and want to find the position of the first value that is greater than 100.

Boice answered 1/4, 2015 at 10:23 Comment(0)
E
55
# Randomly generate a suitable vector
set.seed(0)
v <- sample(50:150, size = 50, replace = TRUE)

min(which(v > 100))
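
One edge case worth keeping in mind: if no element exceeds the threshold, which() returns an empty vector and min() of an empty vector gives Inf with a warning. The following is only a sketch; v_none, first_unsafe, hits and first_safe are illustrative names that are not part of the original answer.

```r
# Hypothetical vector with no value above 100
v_none <- c(50, 60, 70)

# min() of an empty index vector yields Inf (and emits a warning)
first_unsafe <- suppressWarnings(min(which(v_none > 100)))

# Guarded variant: only take the minimum when there is at least one hit
hits <- which(v_none > 100)
first_safe <- if (length(hits) > 0L) min(hits) else NA_integer_
```

Returning NA_integer_ for "no match" is one possible convention; callers may prefer a different sentinel.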
Externalize answered 1/4, 2015 at 10:34 Comment(0)
C
34

Most answers based on which and max/min can be slow (especially for long vectors) because they process the entire vector:

  1. v > 100 evaluates every value in the vector to see if it matches the condition
  2. which and max/min then search all the indexes returned at step 1 to find the maximum/minimum

Position will only evaluate the condition until it encounters the first TRUE value and immediately return the corresponding index, without continuing through the rest of the vector.

# Randomly generate a suitable vector
v <- sample(50:150, size = 50, replace = TRUE)

Position(function(x) x > 100, v)
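
To see the early exit in action, one can count how often the predicate is called. This is a small sketch under assumed names (v_demo, calls, counting_pred and first are made up for illustration); it only demonstrates the stopping behaviour and says nothing about overall speed.

```r
# Small hypothetical vector: the first value above 100 is at position 2
v_demo <- c(50, 120, 80, 130)

# Counter incremented every time the predicate is evaluated
calls <- 0L
counting_pred <- function(x) {
  calls <<- calls + 1L
  x > 100
}

# Position() stops as soon as the predicate returns TRUE
first <- Position(counting_pred, v_demo)
```

After running this, first is 2 and calls is 2, so the last two elements were never tested.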
Cursed answered 23/3, 2016 at 23:13 Comment(4)
^ for functional programming - Open
Side note: ?Position says: "The current implementation is not optimized for performance." So I guess it also evaluates the whole vector. - Mccowyn
@Mccowyn - it uses a for loop. If you just run "Position" (the bare name) it will print out the implementation. - Hircine
@Hircine and evaluates the function every time, both slow operations in R. Unless you specifically expect the match to occur at the beginning of the vector, I'd guess this is likely going to be slower than the vectorized version. - Threewheeler
M
16

Check out which.max:

x <- seq(1, 150, 3)
which.max(x > 100)
# [1] 35
x[35]
# [1] 103
Mccowyn answered 1/4, 2015 at 10:37 Comment(1)
?which.max: 'However, match(FALSE, x) or match(TRUE, x) are typically preferred, as they do indicate mismatches.' => match(TRUE, x > 100) - Periodic
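
The difference only shows up when nothing matches. A minimal sketch, with x_none, idx_which_max and idx_match as illustrative names:

```r
# Hypothetical vector in which no value exceeds 100
x_none <- c(10, 20, 30)

# All comparisons are FALSE, so which.max() reports the first element
idx_which_max <- which.max(x_none > 100)

# match() finds no TRUE and returns NA instead
idx_match <- match(TRUE, x_none > 100)
```

Here idx_which_max is 1 even though x_none[1] is not greater than 100, while idx_match is NA.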
C
6

Just to mention, Hadley Wickham has implemented a function, detect_index, to do exactly this task in his purrr package for functional programming.

I recently used detect_index myself and would recommend it to anyone else with the same problem.

Documentation for detect_index can be found here: https://rdrr.io/cran/purrr/man/detect.html

Crudden answered 28/9, 2017 at 22:12 Comment(2)
Can you make an example? - Orthopter
For example, purrr::detect_index(seq(1, 150, 3), function(x) x > 100). Hadley's packages are optimized for readability, but certainly not for speed. - Flaming
K
3

As I need to perform a similar calculation many times within a loop, I was interested in which of the many answers provided in this thread would be most efficient.

TLDR: Whether the first value appears early or late in a vector, which.max(v > 100) is the fastest solution to this problem.

Note, however, that if no entry in v exceeds 100, it will return 1; thus there may be cause for a wrapper that checks for this case:

SafeWhichMax <- function (v) {
  first <- which.max(v > 100)
  # which.max() returns 1 when nothing matches, so verify the first element
  if (first == 1L && v[1] <= 100) NA else first
}
SafeWhichMax(100) # NA
SafeWhichMax(101) # 1

If a vector is very long and is not guaranteed to contain any TRUE results, match(TRUE, v > 100) may be quicker than which.max() with checks.
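
For use inside a loop, the match() variant can be wrapped so that the threshold is a parameter. This is just a sketch: first_greater and its argument names are hypothetical and were not part of the benchmarks in this answer.

```r
# Hypothetical helper: index of the first element of v greater than
# threshold, or NA if there is none
first_greater <- function(v, threshold) {
  match(TRUE, v > threshold)
}

idx_found   <- first_greater(c(50, 120, 80, 130), 100)
idx_missing <- first_greater(c(50, 60, 70), 100)
```

idx_found is 2 and idx_missing is NA, so no separate check of the first element is needed.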

library(microbenchmark)

# Short vector:
v <- 0:105

microbenchmark(
  which.max(v > 100),
  SafeWhichMax(v),
  match(TRUE, v > 100),
  min(which(v > 100)),
  which(v > 100)[1],
  Position(function(x) x, v > 100),
  Position(function(x) x > 100, v),
  purrr::detect_index(v, function (x) x > 100)
)
Unit: microseconds
                                  mean      median
which.max(v > 100)                24.112    23.80
SafeWhichMax(v)                   24.889    24.25
match(TRUE, v > 100)              34.752    33.20
min(which(v > 100))               25.506    25.20
which(v > 100)[1]                 25.320    24.90
Position(function(x) x, v > 100)  3231.783  3043.50
Position(function(x) x > 100, v)  3487.805  3314.75
purrr::detect_index               16436.579 16064.90
# Long vector, with late first occurrence of v > 100
v <- -10000:105
Unit: microseconds
                                  mean      median
which.max(v > 100)                24.958    24.30
SafeWhichMax(v)                   25.456    24.90
match(TRUE, v > 100)              37.680    37.85
min(which(v > 100))               26.439    26.00
which(v > 100)[1]                 25.724    25.55
Position(function(x) x, v > 100)  3224.240  3036.50
Position(function(x) x > 100, v)  3389.538  3287.05
purrr::detect_index               17344.706 15283.35
Karakalpak answered 9/2, 2022 at 16:7 Comment(0)
I
1

There are many solutions; another is:

x <- 90:110
which(x > 100)[1]
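
A brief sketch of how this form behaves with and without a match (x_demo, x_none, idx_hit and idx_none are illustrative names): indexing an empty result with [1] yields NA rather than an error.

```r
# With a match: 101 is the 12th element of 90:110
x_demo <- 90:110
idx_hit <- which(x_demo > 100)[1]

# Without a match: which() is empty, and [1] of an empty vector is NA
x_none <- 1:10
idx_none <- which(x_none > 100)[1]
```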
Immigration answered 1/4, 2015 at 10:39 Comment(0)
A
-2

Assuming values is your vector:

firstGreaterThan <- NULL
for (i in seq_along(values)) {
  if (values[i] > 100) {
    firstGreaterThan <- i
    break  # stop at the first match
  }
}
Angry answered 1/4, 2015 at 10:34 Comment(3)
I don't think that would give you the first value unless you added a break - Immigration
and we don't need a loop here - Civil
Yeah, right. The point is that this is such a simple question that I just wrote a quick answer - Angry
