Finding Sequences [gap or difference] between two vectors

Suppose I have two vectors:

a <- c(1,3,5,7,9, 23,35,36,43)
b <- c(2,4,6,8,10,24, 37, 45)

Note that the two vectors have different lengths.

I want to pair each element of a with the element of b closest to it, leaving NA wherever an element of a has no partner.

Expected Output

a     b
1     2
3     4
5     6
7     8
9     10
23    24
35    NA
36    37
43    45

Note that 35 is paired with NA because 37 is closer to 36 than to 35, so 36 takes 37 as its match.

Electrophoresis asked 10/4, 2018 at 17:33

You can use findInterval:

df <- data.frame(a)
# findInterval(b, a) gives, for each b, the index of the last a that is <= b
df$b[findInterval(b, a)] <- b
df
   a  b
1  1  2
2  3  4
3  5  6
4  7  8
5  9 10
6 23 24
7 35 NA
8 36 37
9 43 45
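To make the mechanism easier to see, here is a minimal sketch that splits the one-liner into steps. It assumes a is sorted in increasing order and that every element of b is at least a[1]; the variable idx and the pre-allocated NA column are additions for illustration, not part of the original answer. Note that findInterval pairs each b with the largest a that does not exceed it, which matches "closest" for this data but is not a general nearest-neighbour search.

```r
a <- c(1, 3, 5, 7, 9, 23, 35, 36, 43)
b <- c(2, 4, 6, 8, 10, 24, 37, 45)

# For each element of b: index of the last element of a that is <= it
idx <- findInterval(b, a)
idx
# 1 2 3 4 5 6 8 9   (index 7, i.e. a = 35, is never hit)

# Pre-allocate b as NA so rows of a that are never hit stay NA,
# including any unmatched rows at the very end of a
df <- data.frame(a, b = NA_real_)
df$b[idx] <- b
df
```

If two elements of b fall into the same interval of a, the later one overwrites the earlier one, so this only suits data where each a has at most one partner.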
Vickievicksburg answered 10/4, 2018 at 17:59 Comment(3)
Thank you for your answer! I think this scales well. - Electrophoresis
What the... first time seeing findInterval after years of using R. Time to read every base R function one of these weekends. - Campfire
@SarwatAshraf You're welcome :-) Happy coding. If this is what you need, please consider accepting the answer (the check mark on the left). - Vickievicksburg

This approach handles exactly one NA: it tries every possible insertion slot for the NA in b and keeps the slot that minimises mean(abs(a - b)). For N NAs, you would have to try every combination of N insertion slots.

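  library(magrittr)  # provides %>%

  # Try insertion: NA goes after position i of b (i = 0 means at the front)
  sapply(0:length(b),
         function(i) mean(abs(append(b, NA, i) - a), na.rm = TRUE)) %>%
  # Find the best insertion spot (subtract 1 because the slots start at 0)
  {which.min(.) - 1} %>%
  # Actually insert
  {append(b, NA, .)} %>%
  # Display data
  {cbind(a, b = .)}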

       a  b
 [1,]  1  2
 [2,]  3  4
 [3,]  5  6
 [4,]  7  8
 [5,]  9 10
 [6,] 23 24
 [7,] 35 NA
 [8,] 36 37
 [9,] 43 45
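The generalisation to N NAs mentioned above could look roughly like the following sketch. The helper name align_with_gaps is hypothetical, and the sketch assumes length(a) >= length(b) and that b keeps its order. It enumerates every set of N positions in a that are left unmatched using combn, so the cost grows combinatorially and it is only practical for small N.

```r
# Hypothetical helper: leave N = length(a) - length(b) positions of a unmatched
align_with_gaps <- function(a, b) {
  n_gaps <- length(a) - length(b)
  stopifnot(n_gaps >= 0)
  if (n_gaps == 0) return(cbind(a, b))

  # Place b, in order, into every position of a except those in na_pos
  pad <- function(na_pos) {
    padded <- rep(NA_real_, length(a))
    padded[-na_pos] <- b
    padded
  }

  # Each column of `slots` is one candidate set of NA positions
  slots <- combn(length(a), n_gaps)
  cost <- apply(slots, 2, function(na_pos) {
    mean(abs(pad(na_pos) - a), na.rm = TRUE)
  })

  # Keep the candidate with the smallest mean absolute gap (first one on ties)
  best <- slots[, which.min(cost)]
  cbind(a, b = pad(best))
}

a <- c(1, 3, 5, 7, 9, 23, 35, 36, 43)
b <- c(2, 4, 6, 8, 10, 24, 37, 45)
res <- align_with_gaps(a, b)
res
```

With the vectors from the question there is a single gap, and the NA ends up next to 35 as in the output above.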
Campfire answered 10/4, 2018 at 17:53
