Multiple intersection of lists

I have 4 lists

a <- list(1,2,3,4)
b <- list(5,6,7,8)
c <- list(7,9,0)
d <- list(12,14)

I would like to know which of the lists have elements in common. In this example, lists b and c have the element 7 in common.

A brute-force approach would be to take every pair of lists and find their intersection. Is there a more efficient way to do it in R?

Another approach would be to combine all the lists into a single list and find the duplicates. Then a mapping function could indicate which original lists each duplicate came from. But I am not sure how to do it. I came across this post:

Find indices of duplicated rows

I was wondering whether this could be modified to find out which lists actually contain the duplicates.

I have to repeat this process for many groups of lists. Any suggestions or ideas are greatly appreciated. Thanks in advance!

Silvey answered 22/5, 2015 at 22:9 Comment(2)
Are you only interested in checking whether there are values in common, or do you also want to know which values are in common?Angeli
@Angeli I just want to know whether there are common values.Silvey

What about using this double sapply?

l <- list(a,b,c,d)

sapply(seq_along(l), function(x)
  sapply(seq_along(l), function(y) length(intersect(unlist(l[[x]]), unlist(l[[y]])))))
     [,1] [,2] [,3] [,4]
[1,]    4    0    0    0
[2,]    0    4    1    0
[3,]    0    1    3    0
[4,]    0    0    0    2

Interpretation: element [1,2] of the matrix, for example, shows how many values the first element of the list l (here the sublist a) has in common with the second element (the sublist b). The diagonal simply holds the length of each sublist.
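To see which values are shared rather than only how many, the same pairwise idea can be applied with intersect() directly. This is a minimal sketch, assuming every sublist holds only scalar values so that unlist() flattens it cleanly; the names counts, pairs and shared are illustrative and not part of the original answer.

```r
a <- list(1, 2, 3, 4)
b <- list(5, 6, 7, 8)
c <- list(7, 9, 0)
d <- list(12, 14)

# Flatten each sublist once and keep the list names
l <- lapply(list(a = a, b = b, c = c, d = d), unlist)

# Pairwise counts: same numbers as the matrix above, but with row/column names
counts <- sapply(l, function(x) sapply(l, function(y) length(intersect(x, y))))

# Shared values for every unordered pair of lists
pairs  <- combn(names(l), 2, simplify = FALSE)
shared <- lapply(pairs, function(p) intersect(l[[p[1]]], l[[p[2]]]))
names(shared) <- sapply(pairs, paste, collapse = "-")

# Keep only the pairs that actually overlap
shared <- shared[lengths(shared) > 0]
shared
```

With the example data, only the pair b-c survives the filter, holding the value 7.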

Or alternatively just to see the indices of the sublists which have a common value with some other sublist:

which(sapply(seq_along(l), function(x) length(intersect(unlist(l[[x]]), unlist(l[-x])))) >= 1)
[1] 2 3
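The other idea from the question (pool everything into one collection, then trace duplicates back to their source lists) avoids comparing every pair. A rough sketch follows, assuming each sublist element is a single scalar so that lengths() matches the number of flattened values; flat, by_value and overlaps are made-up names for illustration.

```r
l <- list(a = list(1, 2, 3, 4), b = list(5, 6, 7, 8),
          c = list(7, 9, 0),    d = list(12, 14))

# One long table: each value next to the name of the list it came from
flat <- data.frame(
  value  = unlist(l, use.names = FALSE),
  source = rep(names(l), lengths(l)),
  stringsAsFactors = FALSE
)

# Ignore a value that is merely repeated inside a single list
flat <- unique(flat)

# Group the source names by value; values seen in 2+ lists are the overlaps
by_value <- split(flat$source, flat$value)
overlaps <- by_value[lengths(by_value) > 1]
overlaps

# Names of every list that shares at least one value with another list
sort(unique(unlist(overlaps, use.names = FALSE)))
```

Here overlaps has one entry, named "7", containing "b" and "c". Since the work is a single pass plus a grouping step, this should scale better than all-pairs intersection when there are many lists, though that has not been benchmarked here.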
Angeli answered 22/5, 2015 at 22:52 Comment(13)
Thanks for the idea. I have a question: if d <- list(8,14), then lists b, c, d have elements in common. I would like to get the output as lists b,c,d or 2,3,4. So should I search the matrix and concatenate?Silvey
Look at the alternativeAngeli
if d <- list(8,14), then the second alternative gives only 3,4 instead of 2,3,4.Silvey
@Silvey Switch to >=1 and you get 2,3,4. It sounds like you're describing a connected component of a graph. Maybe a specialized tool like the igraph package would serve you better. mathworld.wolfram.com/ConnectedComponent.htmlPontine
Thanks @Pontine and DatamineR, I have one more concern. If d <- list(1,14), then I need to know that lists a and d have common elements and that c and d have common elements. I am interested in knowing which groups of lists have elements in common. Let me know if I am not clear.Silvey
You can read this from the first solution. In which form do you want to have the final result?Angeli
Each list containing the names of lists which are in a group. From @Frank's link, I realised that I am indeed finding the connected components.Silvey
You could save the result of the first alternative as res and then run diag(res) <- 0; apply(res,1, function(x) which(x!=0))Angeli
@Angeli That finds neighbors, but not neighbors' neighbors, etc. (which all belong in a single connected component). I think it can be left to the OP for a separate question or a review of graph theory lit.Pontine
@Frank, DatamineR - Thanks a lot for your help. I have posted another question regarding the connected components here - #30408269Silvey
Any extension to three and four way intersections?Leviathan
Could be simplified as: sapply(l, function(x) sapply(l, function(y) length(intersect(x,y))))Cercaria
@Cercaria Your solution gives me a nice table with all my list names. But it would be really helpful if you could tell me how to get the elements that are shared between two sets.Downpour
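On the connected-components point raised in the comments: a direct neighbour lookup misses lists that are linked only through a third list. Below is a base-R sketch of one way to group them, using simple label propagation over the "shares a value" relation. It uses the d <- list(8,14) variant from the comments; adj, label and groups are illustrative names, and a dedicated graph package such as igraph may well be the better tool for large inputs.

```r
l <- lapply(list(a = list(1, 2, 3, 4), b = list(5, 6, 7, 8),
                 c = list(7, 9, 0),    d = list(8, 14)), unlist)

# Adjacency matrix: TRUE where two lists share at least one value
adj <- sapply(l, function(x) sapply(l, function(y) length(intersect(x, y)) > 0))
diag(adj) <- TRUE  # every list is linked to itself, even an empty one

# Label propagation: each list starts with its own label, then repeatedly
# takes the smallest label among its neighbours until nothing changes
label <- seq_along(l)
repeat {
  new_label <- sapply(seq_along(l), function(i) min(label[adj[i, ]]))
  if (identical(new_label, label)) break
  label <- new_label
}

# Lists with the same final label form one group
groups <- unname(split(names(l), label))
groups
```

For this input the result is two groups: "a" on its own, and "b", "c", "d" together, even though c and d share no value directly.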
