I'm looking for something similar to bedtools subtract but with dataframes.
For example, say I have a range stored as a dataframe:
Start End Value
0 100 P
And I have another dataframe, which is sorted:
Start End Value
10 25 A
50 63 B
Is there a way to fill in the gaps like so:
Start End Value
0 9 P1
10 25 A
26 49 P2
50 63 B
64 100 P3
Here P1, P2 and P3 are labels filled in to pad the 2nd dataframe so that the entire range is covered.
I tried using dplyr's lag() function and adding the padding values manually. However, the range changes with the length of the genomic feature (including its start and end co-ordinates), so I would like the gap filling to be automatic.
Thank you!
For example, this is a small subset of the data:
data_range <- data.frame(start = 0, end = 100, value = "P")
tofill_range <- data.frame(start = c(15, 51, 70), end = c(39, 62, 79), value = c("A", "B", "C"))
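To make the intended behavior concrete, here is a rough base-R sketch of the logic described above. It is only an illustration under some assumptions: fill_gaps is a hypothetical helper name (not a function from dplyr or any other package), coordinates are integers, intervals are closed on both ends, and the features do not overlap and lie entirely inside the range.

```r
# Hypothetical helper (not from any package): pad the gaps between the
# features in `fill` so that the whole interval in `rng` is covered.
# Assumes integer, closed, non-overlapping intervals that lie inside `rng`.
fill_gaps <- function(rng, fill) {
  fill <- fill[order(fill$start), ]

  # Each gap starts one past the previous feature's end (or at the range start)
  gap_start <- c(rng$start, fill$end + 1)
  # ... and ends one before the next feature's start (or at the range end)
  gap_end <- c(fill$start - 1, rng$end)

  # Drop empty gaps, e.g. where two features are directly adjacent
  keep <- gap_start <= gap_end
  n_gaps <- sum(keep)

  gaps <- data.frame(
    start = gap_start[keep],
    end   = gap_end[keep],
    # Label the gaps P1, P2, ... using the range's own value as the prefix
    value = sprintf("%s%d", as.character(rng$value), seq_len(n_gaps))
  )

  # Combine features and padding rows, then restore coordinate order
  out <- rbind(fill, gaps)
  out <- out[order(out$start), ]
  rownames(out) <- NULL
  out
}

data_range <- data.frame(start = 0, end = 100, value = "P")
tofill_range <- data.frame(start = c(15, 51, 70), end = c(39, 62, 79),
                           value = c("A", "B", "C"))

filled <- fill_gaps(data_range, tofill_range)
print(filled)
```

With the subset above, this should give the rows P1 (0 to 14), A, P2 (40 to 50), B, P3 (63 to 69), C and P4 (80 to 100). Overlapping features, or features that extend beyond the range, are not handled by this sketch.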