Detailed explanation of how to come to the best solution step by step
Let's invent the solution.
Now, if we create pairs
that contain sums of pairs, for example,
arr = [10, 20, 30, 40]
pairs = [10+20, 10+30, 10+40, 20+30, 20+40, 30+40]
There is a pattern. We have three pairs for 10+x, two pairs for 20+x, one pair for 30+x, and zero pairs for 40+x.
[10+20, 10+30, 10+40, 20+30, 20+40, 30+40]
# ------------------- ------------ -----
[30, 40, 50, 50, 60, 70]
# ---------- ------ --
So, the total pairs are
3 + 2 + 1
= sum of first (n-1) natural numbers
= (n - 1) * (n - 1 + 1) / 2
= (n - 1) * n / 2
= (n^2 - n) / 2
It looks like the whole pairs
array will be sorted, but it is not true. Those sub-arrays in pairs
should be sorted because the initial arr
is sorted. For example,
arr = [10, 20, 30, 90]
pairs = [10+20, 10+30, 10+90, 20+30, 20+90, 30+90]
# Those sub-arrays are sorted
[30, 40, 100, 50, 110, 120]
# ----------- ------- ---
Now, let’s write the pairs
with origin arr
indices
pairs = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
(0, 1)
and (0, 2)
are not valid quadruples, because we are having 0
in both pairs. So, how can we logically find valid pairs?
We only have one valid pair for (0, 1)
which is (2, 3)
which does not have 0
or 1
[(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
# x x x x x x ----
One fact is, we can always write quadruple in such a way that one pair is next to the other pair. For example,
x = 100
arr = [10, 20, 30, 40]
pairs = [30, 40, 50, 50, 60, 70]
[10, 20, 30, 40]
# -- ------ --
quadruple = (10 + 40) + (20 + 30)
# Which can we rewrite as
[10, 20, 30, 40]
# ------ ------
quadruple = (10 + 20) + (30 + 40) = 30 + 70
# Which is as follows
pairs = [30, 40, 50, 50, 60, 70]
# -- --
So, we can do as follows to solve the problem
for pair0 in pairs:
valid_pairs_for_pair0 = # Somehow get the valid pairs
for pair1 in valid_pairs_for_pair0:
if pair0 + pair1 == x:
ans += 1
But the above solution is O(n^4)
, because pairs
is of length (n^2 - n) / 2
.
We can do better as we know those sub-arrays in the pairs are sorted.
arr = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] # n = 10
pairs = [
(0,1),(0,2),(0,3),(0,4),(0,5),(0,6),(0,7),(0,8),(0,9),# (0,x) -> 9 pairs -> 10 - 0 - 1
(1,2),(1,3),(1,4),(1,5),(1,6),(1,7),(1,8),(1,9),# (1,x) -> 8 pairs -> 10 - 1 - 1
(2,3),(2,4),(2,5),(2,6),(2,7),(2,8),(2,9),# (2,x) -> 7 pairs -> 10 - 2 - 1
(3,4),(3,5),(3,6),(3,7),(3,8),(3,9),# (3,x) -> 6 pairs -> 10 - 3 - 1
(4,5),(4,6),(4,7),(4,8),(4,9),# (4,x) -> 5 pairs -> 10 - 4 - 1
(5,6),(5,7),(5,8),(5,9),# (5,x) -> 4 pairs -> 10 - 5 - 1
(6,7),(6,8),(6,9),# (6,x) -> 3 pairs -> 10 - 6 - 1
(7,8),(7,9),# (7,x) -> 2 pairs -> 10 - 7 - 1
(8,9),# (8,x) -> 1 pair -> 10 - 8 - 1
]
# We need to find the first valid pair and all of the pairs after that will be valid.
first valid pair index for (0, 1) => first (2,x) pair => (2,3) => pairs[9 + 8]
first valid pair index for (0, 2) => first (3,x) pair => (3,4) => pairs[9 + 8 + 7]
first valid pair index for (0, 3) => first (4,x) pair => (4,5) => pairs[9 + 8 + 7 + 6]
# There is a pattern
pairs[9 + 8] => pairs[sum(9 to 1) - sum(7 to 1)]
pairs[9 + 8 + 7] => pairs[sum(9 to 1) - sum(6 to 1)]
pairs[9 + 8 + 7 + 6] => pairs[sum(9 to 1) - sum(5 to 1)]
# That’s how we get started and for binary search
start = firstNSum(n - 1) - firstNSum(n - i1 - 2)
end = start + n - (i1 + 1) - 1 # n - (i1 + 1) - 1 is the number of pairs for (i1,x) pairs
Now, we can solve the problem as follows.
# For pair0 in pairs:
# Binary search for all valid sub-arrays of pairs for pair0
Solution 1: Binary search
Time complexity: O(n^3.log(n))
log(n) + log(n-1) ... log(1) = log(n!) = n.log(n)
Space complexity: O(n^2)
def firstNSum(n):
return n * (n + 1) // 2
def binary_search(pairs, x, start, end):
while start < end:
mid = (start + end) // 2
if pairs[mid][1] < x:
start = mid + 1
else:
end = mid
return start
def count_four_pairs_with_sum(arr, x):
n = len(arr)
ans = 0
pairs = []
for i0 in range(n - 1):
for i1 in range(i0 + 1, n):
curr_sum = arr[i0] + arr[i1]
pairs.append([(i0, i1), curr_sum])
for [(i0, i1), curr_sum] in pairs:
start = firstNSum(n - 1) - firstNSum(n - i1 - 2)
end = start + n - (i1 + 1) - 1
while start < len(pairs):
x_start = binary_search(pairs, x - curr_sum, start, end)
x_end = binary_search(pairs, x - curr_sum + 1, start, end)
ans += x_end - x_start
i1 += 1
start += n - i1 - 1
end = start + n - (i1 + 1) - 1
return ans
arr = [10, 20, 30, 40]
x = 100
print(count_four_pairs_with_sum(arr, x))
We can do better. If we store the number of pairs with sum alongside with that also storing how many pairs are from each (i, x) pair group from pairs
# Loop for i0
# Loop for i1
# ans += valid pairs for i0 and i1, which is sum of i1 to n excluding i0 to i1
Solution 2: Using hashmap
Time complexity: O(n^3)
Space complexity: O(n^3)
from collections import defaultdict
def count_four_pairs_with_sum(arr, x):
n = len(arr)
ans = 0
sum_freq = defaultdict(lambda: defaultdict(int))
for i0 in range(n - 1):
for i1 in range(i0 + 1, n):
curr_sum = arr[i0] + arr[i1]
sum_freq[curr_sum][i0] += 1
for i0 in range(n - 1):
for i1 in range(i0 + 1, n):
curr_sum = arr[i0] + arr[i1]
needed_sum = x - curr_sum
valid_needed_sum_count = sum([sum_freq[needed_sum][i] for i in range(i1+1, n)])
ans += valid_needed_sum_count
return ans
arr = [0, 0, 0, 0, 0]
x = 0
print(count_four_pairs_with_sum(arr, x))
We can do better (as this answer showed) if we have frequencies of all possible pairs, and we look for all valid pair1
for each pair0
.
Let a + b + c + d = x
a
can be any number on the left of b
c
and d
can be any pair right of b
Because we know, we can rewrite any quadruple in such a way that a < b < c < d
, for example,
[0, 1, 2, 3, 4, 5, ...., n-1, n]
# a b c d
So, for any b
we only need to count the valid (c,d)
to the right of it, which means we don't need to consider any pair containing any number that is left to b
for example (c,d)=(2,5)
is invalid if b=4
because 2
is left of 4
Now, we can solve that as follows.
# For every b
# Remove all pairs for b
# For every valid a, a < b
# ans += number of valid pairs in remaining pairs
The first loop for b
will keep removing pairs for current b
that means when b=4
we already removed all pairs from previous values of b=1,2,3
Final solution: Using hashmap
Time complexity: O(n^2)
Space complexity: O(n^2)
from collections import defaultdict
def count_four_pairs_with_sum(arr, x):
n = len(arr)
sum_freq = defaultdict(int)
for i0 in range(n - 1):
for i1 in range(i0 + 1, n):
curr_sum = arr[i0] + arr[i1]
sum_freq[curr_sum] += 1
ans = 0
for i, b in enumerate(arr):
for j in arr[i+1:]:
sum_freq[b+j] -= 1
for a in arr[:i]:
c_plus_d = x - (a+b)
ans += sum_freq[c_plus_d]
return ans
arr = [0, 0, 0, 0, 0]
x = 0
print(count_four_pairs_with_sum(arr, x))
X - current pair sum
. The result should be the sum of the other two numbers. Test, whether those are in the hashtable. Average complexity is only slightly above O(n²). The hashtable should be able to store multiple entries (possible pairs) for each number (sum). Otherwise create a short linked list for each entry. – Arbitrary