Kth maximum sum of a contiguous subarray of +ve integers in O(nlogS)

Asked 27/1, 2016 at 8:59 Answered 15/10, 2020 at 19:18

I was reading this editorial and got confused with this statement:

If the array elements are all non-negative, we can use binary search to find the answer in O(n log S) time, where S is the maximum sum of a subarray."

Can anyone explain the above statement.

Calzada answered 27/1, 2016 at 8:59 Comment(1)

Good question. The statement is far from obvious. Apparently the solution is supposed to be doing a binary over the answer. However I am still trying to think of a way to count the bigger sums than a given value X in linear time. – Shawnshawna 27/1, 2016 at 9:54

Assume that we have an array sum, which at index ith store the sum of all element from 0 to ith, so, if all element are non-negative, so

 sum[0] <= sum[1] <= sum[2] ... <= sum[i] ... <= sum[n - 1]

We notice that, the sum of a sub array (i, j) of array A is sum[j] - sum[i - 1]

So, Given a number X, we can easily calculate the rank of this number from all sum of sub array of A as follow:

int rank = 0;
for(int i = 0; i < n; i++){
    int index = minimum index which sum[i] - sum[index] >= X;
    //As sum[0] <= sum[1] <=... , we can use binary search to find index
    rank += index;
}

Finally, to find which number is the Kth number, we can use binary search in range O to S and use the above algorithm to calculate the rank, with S is the maximum sum of a subarray.

int start = 0;
int end = S;
while(start <= end){
   int mid = (start + end) >> 1;
   int rank = calRank(mid , sum)
   if(rank < mid)
      end = mid - 1;
   else if(rank > mid)
      start = mid + 1;
   else
      break;
}

So, time complexity is O(nlogS log n).

Hyetograph answered 27/1, 2016 at 12:23 Comment(2)

Great answer but, Isn't the complexity of this solution O(nlognlogS)...? – Calzada 27/1, 2016 at 23:57

@GauravArora May be, you can use a "2-pointer" technique for reducing the time complexity to an amortized $O(1)$, which makes the time complexity $O(nlogS)$? – Storybook 15/11, 2018 at 12:44

None of the existing answers are correct, so here is a correct approach.

First of all as @PhamTrung pointed out, we can in O(n) generate the cumulative sums of the array, and by subtracting two cumulative sums we can calculate the cumulative sum of any contiguous subarray in O(1). At that point our answer is bounded between 0 and the sum S of everything.

Next, we know how many contiguous subarrays there are. Just choose the endpoints, there are n*(n-1)/2) such pairs.

The heart of the problem is that given X, we need to in O(n), count how many pairs are less than, m. To do that we use a pair of pointers, i and j. We run them up in parallel, keeping the sum from i to j below X but keeping them as far apart as possible given that. And then we keep adding how many pairs there were between them that would also be below X. In pseudocode that looks like this:

count_below = 0
i = 0
j = -1
while i < n:
    while j+1 < n or sum(from i to j+1) < X:
        count_below += 1 # Add the (i, j+1) interval
        j += 1
    if j+1 == n:
        count_below += (n-i-1) * (n-i-2) / 2 # Add all pairs with larger i
        i = n
    else:
        while X <= sum(from i+1 to j+1):
            i += 1
            count_below += j - i # Add a block of (i, ?) pairs

I can't swear that I got the indexing right, but that's the idea. The tricky bit is that every time we advance j we only add one, but every time we advance i we include every (i, mid) with i < mid <= j.

And now we do binary search on the value.

lower = 0
upper = S
while lower < upper:
   mid = floor((upper + lower)/2)
   if count below mid < count_intervals - k:
       lower = mid+1
   else:
       upper = mid

Assuming that the sums are integers, this will find the correct answer in O(log(S)) searches. Each of which is O(n). For a total time of O(n log(S)).

Note that if we're clever about the binary searching and keep track of both the count and the closest two sums, we can improve the time to O(n log(min(n, S))) by dropping upper to the max sum <= mid and raising lower to the next higher sum. Drop the floor and that approach also will work with floating point numbers to produce an answer in O(n log(n)). (With S being effectively infinite.)

Litta answered 15/10, 2020 at 19:18 Comment(0)

I think it can be done in O(Klogn). sum[i] is defined as a prefix sum of the given array up to index i. It can sometimes be faster than the previous solution.

Construct a max queue Q containing pairs of indices (i, j). The order is (a, b) < (c, d) iff sum[b] - sum[a - 1] < sum[d] - sum[d - 1] else if b - a < d - c else if b < d.
Insert the following pairs into Q: (0, 0), (0, 1) ... (0, N - 1)
Iterate K - 1 times and in every iteration pop the top element e = (i, j) off the Q and if i < j then insert (i + 1, j).
Your element is on the top of the queue.

Tynan answered 28/1, 2016 at 11:52 Comment(0)

Recommended topics

Hot tags