Consider each sequence a set. Now we simply discard all subsets.
Given
import itertools as it
expected = {("A", "B", "C", ""), ("A", "", "", "D")}
data = [
("A", "B", "", ""),
("A", "B", "C", ""),
("", "", "", "D"),
("A", "", "", "D"),
("", "B", "", "")
]
Code
An iterative solution that converts and compares sets.
def discard_subsets(pool: list) -> set:
"""Return a set without subsets."""
discarded = set()
for n, k in it.product(pool, repeat=2): # 1
if set(k) < set(n)): # 2
discarded.add(k)
return set(pool) - discarded # 3
A similar one-line solution
set(data) - {k for n, k in it.product(data, repeat=2) if set(k) < set(n)}
Demo
discard_subsets(data)
# {('A', '', '', 'D'), ('A', 'B', 'C', '')}
Details
The latter function is annotated to help explain each part:
- Compare all elements with each other. (Or use nested loops).
- If an element is a proper subset (see below), discard it.
- Remove discarded elements from the pool.
Why use sets?
Each element of the pool can be a set since the pertinent sub-elements are unique, i.e. "A", "B", "C", "D", ""
.
Sets have membership properties. So saying, as an example,
("A", "B", "", "")
has all values in ("A", "B", "C", "")
can also be stated
the set {"A", "B", "", ""}
is a subset of {"A", "B", "C", ""}
All that remains is to compare all elements and reject all proper subsets.
a, a_, ac = {"a"}, {"a"}, {"a", "c"}
# Subsets
assert a.issubset(a_)
assert a <= a_
assert a <= ac
# Proper subsets
assert not a < a_
assert a < ac
Complexity
Since we basically have nested loops, at best we get O(n^2) complexity. It may not be the most efficient approach, but it should hopefully be clear enough to follow.
Tests
f = discard_subsets
assert {("A", "B", "C", "")} == f([("A", "B", "", ""), ("A", "B", "C", "")])
assert {("A", "B", "C", "")} == f([("A", "B", "C", ""), ("A", "B", "", "")])
assert {("A", "B", "C", ""), ("", "", "", "D")} == f([("A", "B", "", ""), ("A", "B", "C", ""), ("", "", "", "D")])
assert {("A", "B", "C", ""), ("", "", "", "D")} == f([("", "", "", "D"), ("A", "B", "", ""), ("A", "B", "C", "")])
assert {("A", "B", "C", ""), ("", "", "", "D")} == f([("A", "B", "C", ""), ("", "", "", "D"), ("A", "B", "", "")])
assert {("A", "B", "C", ""), ("", "", "", "D")} == f([("A", "B", "C", ""), ("A", "B", "", ""), ("", "", "", "D")])
assert {("A","","C"), ("","B","C"), ("A","B","")} == f([("A","","C"),("","B","C"),("","","C"),("A","",""),("","",""),("A","B",""),("","B","")])
assert set(expected) == f(data)