If I have a variable number of sets (let's call the number n), which have at most m elements each, what's the most efficient way to calculate the pairwise intersections for all pairs of sets? Note that this is different from the intersection of all n sets.
For example, if I have the following sets:
A={"a","b","c"}
B={"c","d","e"}
C={"a","c","e"}
I want to be able to find:
intersect_AB={"c"}
intersect_BC={"c", "e"}
intersect_AC={"a", "c"}
Another acceptable format (if it makes things easier) would be a map of items in a given set to the sets that contain that same item. For example:
intersections_C={"a": {"A", "C"},
"c": {"A", "B", "C"}
"e": {"B", "C"}}
I know that one way to do so would be to create a dictionary mapping each value in the union of all n sets to a list of the sets in which it occurs and then iterate through all of those values to create lists such as intersections_C
above, but I'm not sure how that scales as n increases and the sizes of the set become too large.
Some additional background information:
- Each of the sets are of roughly the same length, but are also very large (large enough that storing them all in memory is a realistic concern, and an algorithm which avoids that would be preferred though is not necessary)
- The size of the intersections between any two sets is very small compared to the size of the sets themselves
- If it helps, we can assume anything we need to about the ordering of the input sets.
intersections_C
in my post? The only information that I care about at the end are the intersections of the sets. Unfortunately, I don't know too much about what my constraints are because I haven't been able to test this to scale just yet. Thanks for your help! – Amickintersections_C
dictionary contains only the elements that belong in set C. I suggested one dictionary for the elements contained in the union of all your sets. – Feltingintersections_C
. – Amickn
in practice? 2. If you were to take all of your sets and partition them randomly into two groupsA
andB
, could(⋃A) ⋂ (⋃B)
be large? In Python syntax, that'sset().union(*A) & set().union(*B)
. – Nevillenevin