I am trying to grasp the idea behind the prefix sum concept by looking at the example presented in Codility's Prefix Sums lesson (the mushroom picker problem).
My understanding is that the whole concept rests on a simple property: to find the sum of all elements of an array A between two positions pos_left and pos_right (inclusive), a second array P is used in which the elements of A are summed cumulatively, and the sum in question is calculated as
P[pos_right + 1] - P[pos_left].
A = [1, 2, 3, 4, 5, 6]
P = [0, 1, 3, 6, 10, 15, 21]
Sum of all elements from A[2] to A[4] (0-based, inclusive) = 3 + 4 + 5 = 12
or, using the prefix sums: P[4 + 1] - P[2] = 15 - 3 = 12
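To make the property concrete, here is a minimal sketch of it as I understand it. It is written for Python 3, and the names `build_prefix` and `range_sum` are illustrative only, not taken from the lesson. Positions are 0-based and both ends of the range are inclusive.

```python
def build_prefix(values):
    # prefix[i] holds the sum of the first i elements, so prefix[0] == 0.
    prefix = [0] * (len(values) + 1)
    for i, value in enumerate(values):
        prefix[i + 1] = prefix[i] + value
    return prefix


def range_sum(prefix, left, right):
    # Sum of values[left..right], both ends inclusive.
    return prefix[right + 1] - prefix[left]


A = [1, 2, 3, 4, 5, 6]
P = build_prefix(A)
print(P)                   # [0, 1, 3, 6, 10, 15, 21]
print(range_sum(P, 2, 4))  # 12, i.e. 3 + 4 + 5
```

Building P takes one pass over A, after which every range sum costs a single subtraction.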
The problem
There is a street with mushrooms at every spot, represented by a non-empty vector. Given the picker's initial position and the number of moves it may make, the task is to find the maximum number of mushrooms it can collect.
Looking at the example, I don't understand the logic behind the construction of the loops. Can anybody clarify the mechanics of this algorithm?
Secondly, I found the position indexing in this example very confusing and cumbersome. Is it common practice to "shift" the vector of prefix sums by putting a zero at the beginning? (The fact that indexing of vectors starts from 0 by default in Python already causes some confusion.)
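For comparison, here is a small sketch of the two layouts side by side. This is my own illustration rather than anything stated in the lesson. It is written for Python 3.8+ (the `initial` argument of `itertools.accumulate`), and `sum_shifted` and `sum_unshifted` are hypothetical helper names. With the leading zero, one formula seems to cover every range; without it, a range starting at position 0 needs a special case.

```python
from itertools import accumulate

A = [1, 2, 3, 4, 5, 6]

# Shifted layout: the leading 0 stands for the empty prefix.
shifted = list(accumulate(A, initial=0))  # [0, 1, 3, 6, 10, 15, 21]

# Unshifted layout: unshifted[i] is the sum of A[0..i].
unshifted = list(accumulate(A))           # [1, 3, 6, 10, 15, 21]


def sum_shifted(x, y):
    # One formula covers every inclusive range, including x == 0.
    return shifted[y + 1] - shifted[x]


def sum_unshifted(x, y):
    # Without the leading zero, x == 0 has to be special-cased.
    return unshifted[y] - (unshifted[x - 1] if x > 0 else 0)


print(sum_shifted(0, 2), sum_unshifted(0, 2))  # 6 6
print(sum_shifted(2, 4), sum_unshifted(2, 4))  # 12 12
```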
The solution
def prefix_sums(A):
    n = len(A)
    P = [0] * (n + 1)
    for k in xrange(1, n + 1):
        P[k] = P[k - 1] + A[k - 1]
    return P

def count_total(P, x, y):
    return P[y + 1] - P[x]

# A mushroom picker is at spot number k on the road and should perform m moves
def mushrooms(A, k, m):
    n = len(A)
    result = 0
    pref = prefix_sums(A)
    for p in xrange(min(m, k) + 1):  # going left
        left_pos = k - p
        right_pos = min(n - 1, max(k, k + m - 2 * p))
        result = max(result, count_total(pref, left_pos, right_pos))
    for p in xrange(min(m + 1, n - k)):  # going right
        right_pos = k + p
        left_pos = max(0, min(k, k - (m - 2 * p)))
        result = max(result, count_total(pref, left_pos, right_pos))
    return result
I have run an example for a small array A = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20], choosing the position k = 5 and the range m = 3. I don't understand the logic by which the two loops create the ranges to check.
I get the following parameters for the loops:
(p, left_pos, right_pos)
loop 1: (0, 5, 8), (1, 4, 6), (2, 3, 5), (3, 2, 5)
loop 2: (0, 2, 5), (1, 4, 6), (2, 5, 7), (3, 5, 8)
The ranges vary. Why?
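The tuples can be reproduced without the summing by copying only the index arithmetic of the two loops. This is a sketch for Python 3, and `candidate_ranges` is a name I made up for it. As far as I can tell from the formulas, p is the number of steps taken in the first direction, and k + m - 2 * p (or k - (m - 2 * p) in the second loop) is how far the picker can still get on the other side of k after walking back, clamped to k and to the ends of the array.

```python
def candidate_ranges(n, k, m):
    # Mirrors the index arithmetic of the two loops in mushrooms();
    # returns lists of (p, left_pos, right_pos) tuples, one list per loop.
    loop1 = []
    for p in range(min(m, k) + 1):       # p steps to the left first
        left_pos = k - p
        right_pos = min(n - 1, max(k, k + m - 2 * p))
        loop1.append((p, left_pos, right_pos))
    loop2 = []
    for p in range(min(m + 1, n - k)):   # p steps to the right first
        right_pos = k + p
        left_pos = max(0, min(k, k - (m - 2 * p)))
        loop2.append((p, left_pos, right_pos))
    return loop1, loop2


loop1, loop2 = candidate_ranges(n=20, k=5, m=3)
print(loop1)  # [(0, 5, 8), (1, 4, 6), (2, 3, 5), (3, 2, 5)]
print(loop2)  # [(0, 2, 5), (1, 4, 6), (2, 5, 7), (3, 5, 8)]
```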
Version for debugging
def mushrooms2(A, k, m):
    n = len(A)
    result = 0
    pref = prefix_sums(A)
    l1 = min(m, k) + 1
    print 'loop p in xrange(min(m, k) + 1): %d' % l1
    for p in xrange(min(m, k) + 1):
        print 'p %d' % p
        print 'A= %r' % A
        print 'pref= %r' % pref
        left_pos = k - p
        right_pos = min(n - 1, max(k, k + m - 2 * p))
        result = max(result, count_total(pref, left_pos, right_pos))
        print 'left_pos = k - p= %d' % left_pos
        print 'right_pos= min(n-1,max(k,k+m-2*p))= %d' % right_pos
        print 'max'
        print '(result %d' % result
        print 'count_total(pref, left_pos, right_pos)) %r, %r, %r, %r' % (pref, left_pos, right_pos, count_total(pref, left_pos, right_pos))
        print 'result= %d' % result
        print 'next p'
    l2 = min(m + 1, n - k)
    print 'loop xrange(min(m + 1, n - k)): %d' % l2
    for p in xrange(min(m + 1, n - k)):
        print 'p %d' % p
        print 'A= %r' % A
        print 'pref= %r' % pref
        right_pos = k + p
        left_pos = max(0, min(k, k - (m - 2 * p)))
        result = max(result, count_total(pref, left_pos, right_pos))
        print 'right_pos = k + p= %d' % right_pos
        print 'left_pos = max(0, min(k, k - (m - 2 * p)))= %d' % left_pos
        print 'max'
        print '(result %d' % result
        print 'count_total(pref, left_pos, right_pos)) %r, %r, %r, %r' % (pref, left_pos, right_pos, count_total(pref, left_pos, right_pos))
        print 'result= %d' % result
        print 'next p'
    print 'result %d' % result
    return result