R overlap multiple GRanges with findOverlaps()

I have three tables with differing genomic intervals. Here is an example:

> a
   chr interval.start interval.end names
1 chr1              5           10     a
2 chr1              6           10     b
3 chr2              7           10     c
4 chr3              8           10     d

> b
   chr interval.start interval.end names
1 chr1              6           15     e
2 chr1              7           15     f
3 chr1              8           15     g

> c
    chr interval.start interval.end names
1  chr1              7           12     h
2  chr1              8           12     i
3  chr5              9           12     j
4 chr10             10           12     k
5 chr20             11           12     l

I am trying to find the intervals common to all three tables after converting them to GRanges objects. Essentially I want to do something like intersect(c, intersect(a, b)). However, because I am working with genomic coordinates, I have to do this with GRanges objects from the GenomicRanges package, which I am not familiar with.

I can do findOverlaps(gr, gr1) or findOverlaps(gr1, gr2), but is there an easy way to overlap multiple GRanges at once like findOverlaps(gr, gr1, gr2)?

Any help would be appreciated. If this question was asked elsewhere, I apologize in advance.

Thanks

Thorvaldsen answered 28/4, 2014 at 2:12 Comment(0)
C
11

You can run subsetByOverlaps() on one pair, then compare the resulting subset against the third set:

sub1 <- subsetByOverlaps(gr, gr1)
sub2 <- subsetByOverlaps(sub1, gr2)

Or directly

Reduce(subsetByOverlaps, list(gr, gr1, gr2))

This returns the ranges of the first GRanges object (gr) that overlap at least one range in each of the other two GRanges objects.

Depending on the type of overlap you want and which has the largest ranges, you should consider which to use as the query and which the subject.
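As a minimal sketch of the chained approach on the question's data: the helper `to_gr()` and the table names `tab_a`, `tab_b`, `tab_c` are hypothetical names introduced here for illustration, not part of the original post. The last line swaps query and subject to show that the result depends on which object comes first.

```r
library(GenomicRanges)

# Hypothetical helper (an assumption, not from the original post):
# convert one of the question's tables into a named GRanges object.
to_gr <- function(df) {
  GRanges(
    seqnames = df$chr,
    ranges = IRanges(start = df$interval.start,
                     end = df$interval.end,
                     names = as.character(df$names))
  )
}

# The three tables from the question.
tab_a <- data.frame(chr = c("chr1", "chr1", "chr2", "chr3"),
                    interval.start = 5:8, interval.end = 10L,
                    names = c("a", "b", "c", "d"))
tab_b <- data.frame(chr = rep("chr1", 3),
                    interval.start = 6:8, interval.end = 15L,
                    names = c("e", "f", "g"))
tab_c <- data.frame(chr = c("chr1", "chr1", "chr5", "chr10", "chr20"),
                    interval.start = 7:11, interval.end = 12L,
                    names = c("h", "i", "j", "k", "l"))

gr  <- to_gr(tab_a)
gr1 <- to_gr(tab_b)
gr2 <- to_gr(tab_c)

# Keep the ranges of gr that overlap gr1, then those that also overlap gr2.
# (This step may warn that the objects do not share all sequence levels.)
sub1 <- subsetByOverlaps(gr, gr1)
sub2 <- subsetByOverlaps(sub1, gr2)
names(sub2)                        # "a" "b"

# Swapping query and subject returns ranges from the other object.
names(subsetByOverlaps(gr1, gr))   # "e" "f" "g"
```

Note that the ranges in `sub2` still carry their original coordinates from `gr`; they are not trimmed to the shared region.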

Colver answered 28/4, 2014 at 2:30 Comment(0)
W
1

The following works for getting the exact intersection of all the ranges:

Reduce(intersect, list(gr, gr1, gr2))

In:

Reduce(subsetByOverlaps, list(gr, gr1, gr2))

subsetByOverlaps takes the first GRanges object as the query (the first argument, here gr) and returns the ranges of the query that overlap at least one element of the subject, applied in turn to gr1 and gr2. The returned ranges keep their original coordinates from gr. So to find the common intervals (the regions of intersection), intersect is the appropriate function.
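To make the difference concrete, here is a small sketch that rebuilds the question's three interval sets directly as GRanges objects (the construction code is an assumption for illustration) and compares the two calls:

```r
library(GenomicRanges)

# The intervals from the question, built directly as GRanges objects.
gr  <- GRanges(seqnames = c("chr1", "chr1", "chr2", "chr3"),
               ranges = IRanges(start = 5:8, end = 10L))
gr1 <- GRanges(seqnames = rep("chr1", 3),
               ranges = IRanges(start = 6:8, end = 15L))
gr2 <- GRanges(seqnames = c("chr1", "chr1", "chr5", "chr10", "chr20"),
               ranges = IRanges(start = 7:11, end = 12L))

# subsetByOverlaps: whole ranges of gr that overlap both gr1 and gr2.
overlapping <- Reduce(subsetByOverlaps, list(gr, gr1, gr2))
start(overlapping)   # 5 6
end(overlapping)     # 10 10

# intersect: only the region covered by all three sets.
common <- Reduce(intersect, list(gr, gr1, gr2))
common               # a single range, chr1:7-10
```

The `intersect` method merges overlapping ranges within each object before intersecting, so the result has no per-row names or metadata from the inputs.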

Wearproof answered 16/3, 2016 at 17:28 Comment(0)
