Space-efficient algorithm for finding the largest balanced subarray?
S

10

33

Given an array of 0s and 1s, find the longest subarray in which the number of 0s and 1s is equal. This needs to be done in O(n) time and O(1) space.

I have an algorithm which does it in O(n) time and O(n) space. It uses a prefix sum array and exploits the fact that if the numbers of 0s and 1s are the same, then sumOfSubarray = lengthOfSubarray/2.

#include<iostream>
#define M 15

using namespace std;

void getSum(int arr[],int prefixsum[],int size) {
    int i;
    prefixsum[0]=arr[0]=0;
    prefixsum[1]=arr[1];
    for (i=2;i<=size;i++) {
        prefixsum[i]=prefixsum[i-1]+arr[i];
    }
}

void find(int a[],int &start,int &end) {
    while(start < end) {
        int mid = (start +end )/2;
        if((end-start+1) == 2 * (a[end] - a[start-1]))
                break;
        if((end-start+1) > 2 * (a[end] - a[start-1])) {
            if(a[start]==0 && a[end]==1)
                    start++; else
                    end--;
        } else {
            if(a[start]==1 && a[end]==0)
                    start++; else
                    end--;
        }
    }
}

int main() {
    int size,arr[M],ps[M],start=1,end,width;
    ;
    cin>>size;
    arr[0]=0;
    end=size;
    for (int i=1;i<=size;i++)
            cin>>arr[i];
    getSum(arr,ps,size);
    find(ps,start,end);
    if(start!=end)
            cout<<(start-1)<<" "<<(end-1)<<endl; else cout<<"No soln\n";
    return 0;
}
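As a reference point for checking any of the approaches in this thread, the balance test described above (a subarray is balanced exactly when twice its sum equals its length) can be sketched as a brute-force search. This is only a baseline under assumptions: it runs in O(n^2) time, not the O(n) asked for, and the function name `longest_balanced_bruteforce` and the half-open `(start, end)` return convention are hypothetical choices, not part of the original post.

```python
def longest_balanced_bruteforce(bits):
    """Return (start, end) so that bits[start:end] is a longest balanced run.

    Hypothetical helper: O(n^2) time, O(1) extra space, half-open bounds.
    Returns (0, 0) when no non-empty balanced run exists.
    """
    best = (0, 0)
    n = len(bits)
    for start in range(n):
        ones = 0  # number of 1s in bits[start:end]
        for end in range(start + 1, n + 1):
            ones += bits[end - 1]
            length = end - start
            # Balanced exactly when twice the sum equals the length.
            if 2 * ones == length and length > best[1] - best[0]:
                best = (start, end)
    return best


print(longest_balanced_bruteforce([1, 1, 1, 0]))  # (2, 4)
```

For the input 1, 1, 1, 0 it reports the slice covering the last two elements, which is a convenient small case to test a faster implementation against.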
Several answered 11/9, 2011 at 20:33 Comment(7)
I wonder whether it's easier to do this if you imagine the zeros have been replaced with minus one. That way the sum of the subarray is zero.Towne
Can we modify the original array?Bamby
that is an option to get O(1). but is there better way?Several
Since this is a homework can we assume that there is a O(1) storage solution?Intricacy
I doubt this is possible in O(1) space.Ringo
@templ, I suppose no, because then it wouldn't be O(1) space.. but the OP should respond to this.Ringo
The algorithm you have posted seems to be incorrect. Input 4 for the size and 1, 1, 1, 0 as the array data and you'll notice the answer comes out to be no solution. The reason for this is due to the lines if(a[start]==0 && a[end]==1) and if(a[start]==1 && a[end]==0). a here references the prefix sum array; I believe you intended for it to reference the original data. Indeed, the condition a[start]==1 && a[end]==0 is impossible; because start < end and no elements in the array are negative, the prefix sum until start cannot be greater than that until end.Fredi
Y
5

Now my algorithm is O(n) time and O(Dn) space, where Dn is the total imbalance in the list.

This solution doesn't modify the list.

Let D be the running difference between the number of 1s and 0s found so far in the list.

First, let's step linearly through the list and calculate D, just to see how it works:

I'm going to use this list as an example: l=1100111100001110

Element   D
null      0
1         1
1         2   <-
0         1
0         0
1         1
1         2
1         3
1         4
0         3
0         2
0         1
0         0
1         1
1         2
1         3
0         2   <-

Finding the longest balanced subarray is equivalent to finding the 2 equal elements in D that are the farthest apart (in this example, the two 2s marked with arrows).

The longest balanced subarray lies between the first occurrence of that value +1 and its last occurrence (first arrow +1 and last arrow: 00111100001110).
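A minimal sketch of the walk above, assuming the list is held as Python integers. The helper name `running_difference` is made up for illustration; it stores all of D only so the two arrow positions can be looked up, which is not how the space-saving solution below works.

```python
from itertools import accumulate


def running_difference(bits):
    """D[k] = (#1s - #0s) among the first k elements, with D[0] = 0."""
    return [0] + list(accumulate(1 if b else -1 for b in bits))


l = [int(c) for c in "1100111100001110"]
D = running_difference(l)

# Positions of the two arrows: first and last occurrence of the value 2 in D.
first = D.index(2)
last = len(D) - 1 - D[::-1].index(2)

print(D)  # [0, 1, 2, 1, 0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3, 2]
print(first, last, "".join(map(str, l[first:last])))  # 2 16 00111100001110
```

With 0-based Python slicing, "first arrow +1 up to last arrow" becomes `l[first:last]`.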

Remark:

The longest subarray will always be delimited by 2 elements of D whose value lies in [0,Dn], where Dn is the last element of D (Dn = 2 in the previous example). Dn is the total imbalance between 1s and 0s in the list (the interval is [Dn,0] if Dn is negative).

In this example it means that I don't need to "look" at 3s or 4s

Proof:

Let Dn > 0.

Suppose there is a subarray delimited by two elements of D equal to P (P > Dn). Since 0 < Dn < P, before reaching the first element of D equal to P we reach an element equal to Dn. Thus, since the last element of D is equal to Dn, there is a longer subarray delimited by Dns than the one delimited by Ps. Therefore we don't need to look at Ps.

P cannot be less than 0 for the same reasons.

The proof is the same for Dn < 0.

Now let's work on D. D isn't random: the difference between 2 consecutive elements is always 1 or -1, and there is an easy bijection between D and the initial list. Therefore I have 2 solutions for this problem:

  • the first one is to keep track of first and last appearance of each element in D that are between 0 and Dn (cf remark).
  • second is to transform the list into D, and then work on D.

FIRST SOLUTION

For the time being I cannot find a better approach than the first one:

First, calculate Dn (in O(n)). Here Dn=2.

Second, instead of creating D, create a dictionary where the keys are the values of D (within [0, Dn]) and the value for each key is a pair (a,b), where a is the first occurrence of the key and b the last.

Element   D   DICTIONARY
null      0   {0:(0,0)}
1         1   {0:(0,0) 1:(1,1)}
1         2   {0:(0,0) 1:(1,1) 2:(2,2)}
0         1   {0:(0,0) 1:(1,3) 2:(2,2)}
0         0   {0:(0,4) 1:(1,3) 2:(2,2)}
1         1   {0:(0,4) 1:(1,5) 2:(2,2)}
1         2   {0:(0,4) 1:(1,5) 2:(2,6)}
1         3   {0:(0,4) 1:(1,5) 2:(2,6)}
1         4   {0:(0,4) 1:(1,5) 2:(2,6)}
0         3   {0:(0,4) 1:(1,5) 2:(2,6)}
0         2   {0:(0,4) 1:(1,5) 2:(2,10)}
0         1   {0:(0,4) 1:(1,11) 2:(2,10)}
0         0   {0:(0,12) 1:(1,11) 2:(2,10)}
1         1   {0:(0,12) 1:(1,13) 2:(2,10)}
1         2   {0:(0,12) 1:(1,13) 2:(2,14)}
1         3   {0:(0,12) 1:(1,13) 2:(2,14)}
0         2   {0:(0,12) 1:(1,13) 2:(2,16)}

Then you choose the entry with the largest difference: 2:(2,16), which corresponds to elements 3 through 16 of the list (counting from 1), i.e. 00111100001110 (with l=1100111100001110).
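The two passes described above might look roughly like this in Python. It is a sketch under assumptions: the name `longest_balanced` is hypothetical, positions are counted as in the D column (0 for the empty prefix), and the result is returned as Python slice bounds rather than in the notation used in the table.

```python
def longest_balanced(bits):
    """Return (start, end) so that bits[start:end] is a longest balanced run.

    Only prefix differences between 0 and Dn are recorded, following the
    remark above, so the two dictionaries hold at most |Dn| + 1 keys each.
    """
    # Pass 1: total imbalance Dn.
    dn = sum(1 if b else -1 for b in bits)
    lo, hi = min(0, dn), max(0, dn)

    # Pass 2: first and last position of every value of D inside [lo, hi].
    first, last = {}, {}
    d = 0
    for pos in range(len(bits) + 1):
        if pos > 0:
            d += 1 if bits[pos - 1] else -1
        if lo <= d <= hi:
            first.setdefault(d, pos)
            last[d] = pos

    # Pick the value of D whose two occurrences are farthest apart.
    best = max(first, key=lambda v: last[v] - first[v])
    return first[best], last[best]


l = [int(c) for c in "1100111100001110"]
start, end = longest_balanced(l)
print(start, end, "".join(map(str, l[start:end])))  # 2 16 00111100001110
```

Two plain dictionaries are used here for readability; since the keys are consecutive integers, two arrays of length |Dn| + 1 would serve equally well.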

Time complexity :

Two passes: the first one to calculate Dn, the second one to build the dictionary. Then find the max in the dictionary.

Total is O(n)

Space complexity:

The current element in D: O(1). The dictionary: O(Dn).

I don't store 3 and 4 in the dictionary because of the remark.

The complexity is O(n) time and O(Dn) space (in the average case Dn << n).

I guess there may be a better way than a dictionary for this approach.

Any suggestion is welcome.

Hope it helps


SECOND SOLUTION (JUST AN IDEA NOT THE REAL SOLUTION)

The second way to proceed would be to transform your list into D (since it's easy to go back from D to the list, it's OK). This takes O(n) time and O(1) space, since I transform the list in place, even though it might not be a "valid" O(1).

Then from D you need to find the 2 equal elements that are the farthest apart.

It looks like finding the longest cycle in a linked list. A modification of Richard Brent's algorithm might return the longest cycle, but I don't know how to do it, and it would take O(n) time and O(1) space.

Once you find the longest cycle, go back to the first list and print it.

This algorithm would take O(n) time and O(1) space complexity.

Yammer answered 13/9, 2011 at 19:27 Comment(9)
if you need to transform the list then it is NOT O(1) space.Ringo
@Ringo T. why so? if I modify it in place (that's what i'm doing)? Anyway, my second solution is just a idea, because I don't have an algorithm to find the longest cycle in a linked list yet. For the time being only the first one works, in O(n) time and O(M) space where M << n and is the maximum imbalance in the list.Yammer
@Ringo T. and honestly I doubt this is possible in O(1) space too.Yammer
I think modifying the input is not allowed, i.e. transforming the input list is not O(1) space.Ringo
I think modifying the input is allowed, but just within the space of the input. And as you replace a 1-bit entry (or whatever constant you want here, 8 bits for chars, doesn't matter) with a number that can be as large as n, which takes log n bits, the required space is n times log n, i.e. O(n log n).Cryptoclastic
@flolo, You're right, but in this case I cannot even use the index of any element of the list :D... Anyway (again) my second solution isn't complete and I know it, I just wrote it because it might guide someone to the right solution. But the first solution works, and takes O(n) time and O(M) space (or O(M*ln(M)) if I take your remark into account) (where M is the max imbalance and M << n on average).Yammer
@Ricky: Both presented algorithms require O(n) space. (1) The algorithm with the differences Dn requires O(n) space because the differences have to be stored. (2) The algorithm with the map also requires O(n) space because the keys range from 0 to max imbalance M which is n/2 (or up to n in a simple-minded implementation). I don't see why M << n. Also the entries for the imbalance 3 and 4 are missing in the example map.Lueck
@Jiri Nope, (1) I just need to keep in memory the CURRENT difference not the whole list. (2) I don't need to keep the 3 and 4 in memory because I know The solution won't start and end with 3 or 4 (cf the remark + proof) . My explanation might not be clear enough.Yammer
@Jiri I edited my answer hope it's clearer now. About M (or Dn) if you generate random lists of 0 and 1, in average case Dn will be much smaller than n . Dn=abs(2*sum - n) . In average case sum = n/2. I didn't make the whole calculation but Dn should be much smaller than n.Yammer
J
4

Different approach but still O(n) time and memory. Start with Neil's suggestion, treat 0 as -1.

Notation: A[0, …, N-1] - your array of size N, f(0)=0, f(x)=A[x-1]+f(x-1) - a function

If you plot f, you'll see that what you are looking for are points for which f(m)=f(n), m=n-2k, where k is a positive integer. More precisely, only for x such that A[x]!=A[x+1] (and the last element in the array) must you check whether f(x) has already occurred. Unfortunately, I currently see no improvement over having an array B[-N+1…N-1] where such information would be stored.

To complete my thought: B[x]=-1 initially, B[x]=p where p = min k: f(k)=x. And the algorithm is (double-check it, as I'm very tired):

fx = 0
B = new array[-N+1, …, N-1]
maxlen = 0
B[0]=0
for i=1…N-1 :
    fx = fx + A[i-1]
    if B[fx]==-1 :
        B[fx]=i
    else if ((i==N-1) or (A[i-1]!=A[i])) and (maxlen < i-B[fx]):
        We found that A[B[fx], …, i] is best than what we found so far
        maxlen = i-B[fx]
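For comparison, here is a hedged Python sketch of the first-occurrence table B described above. It is deliberately simpler than the pseudocode: it checks every index rather than only the positions where A[x]!=A[x+1], it sizes the table for every prefix value from -N to N, and it takes 0/1 input and maps 0 to -1 internally. The name `longest_balanced_first_seen` and the (start, length) return value are assumptions made for illustration.

```python
def longest_balanced_first_seen(bits):
    """Return (start, length) of a longest balanced run, O(n) time and space."""
    n = len(bits)
    # first_seen[v + n] is the smallest k with f(k) == v, or -1 if unseen.
    first_seen = [-1] * (2 * n + 1)
    first_seen[n] = 0  # f(0) = 0
    f = 0
    best_start, best_len = 0, 0
    for i in range(1, n + 1):
        f += 1 if bits[i - 1] else -1  # f(i) = f(i-1) + A[i-1], with 0 as -1
        if first_seen[f + n] == -1:
            first_seen[f + n] = i
        elif i - first_seen[f + n] > best_len:
            # bits[first_seen : i] sums to zero and beats the best so far.
            best_start = first_seen[f + n]
            best_len = i - best_start
    return best_start, best_len


print(longest_balanced_first_seen([int(c) for c in "1100111100001110"]))  # (2, 14)
```

Small inputs such as 0, 1, 1 (expected result: start 0, length 2) are useful when double-checking any variant that skips indices.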

Edit: Two bed-thoughts (= figured out while lying in bed :P ):

1) You could binary search the result by the length of subarray, which would give O(n log n) time and O(1) memory algorithm. Let's use function g(x)=x - x mod 2 (because subarrays which sum to 0 are always of even length). Start by checking, if the whole array sums to 0. If yes -- we're done, otherwise continue. We now assume 0 as starting point (we know there's subarray of such length and "summing-to-zero property") and g(N-1) as ending point (we know there's no such subarray). Let's do

    a = 0
    b = g(N-1)
    while a<b : 
        c = g((a+b)/2)
        check if there is such subarray in O(n) time
        if yes:
            a = c
        if no:
            b = c
    return the result: a (length of maximum subarray)

Checking for subarray with "summing-to-zero property" of some given length L is simple:

    a = 0
    b = L
    fa = fb = 0
    for i=0…L-1:
        fb = fb + A[i]
    while (fa != fb) and (b<N) :
        fa = fa + A[a]
        fb = fb + A[b]
        a = a + 1
        b = b + 1
    if fa != fb:
        not found
    else:
        found, the subarray is A[a, …, b-1]
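The fixed-length check on its own can be sketched in Python as a sliding window. This is an illustration under assumptions: the function name `find_balanced_window` is hypothetical, the input is taken as 0/1 values (so the test is "twice the number of 1s equals L" instead of "sum is zero"), and -1 signals that no window of that length is balanced.

```python
def find_balanced_window(bits, length):
    """Return the start of a balanced window of exactly `length`, or -1.

    O(n) time and O(1) extra space: only the count of 1s in the current
    window is kept and updated as the window slides by one position.
    """
    n = len(bits)
    if length % 2 != 0 or length > n:
        return -1
    ones = sum(bits[:length])  # 1s in bits[0:length]
    for start in range(n - length + 1):
        if 2 * ones == length:
            return start
        if start + length < n:
            # Slide right: add the entering element, drop the leaving one.
            ones += bits[start + length] - bits[start]
    return -1


bits = [1, 1, 0, 0, 0, 0, 1, 1]
print(find_balanced_window(bits, 8))  # 0
print(find_balanced_window(bits, 6))  # -1
print(find_balanced_window(bits, 4))  # 0
```

One caveat worth keeping in mind when building a search over L on top of this check: existence is not monotone in the length. The array 11000011 has balanced windows of length 8 and 4 but none of length 6, so a plain binary search over L can skip the true maximum.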

2) …can you modify the input array? If yes, and if O(1) memory means exactly that you use no additional space (except for a constant number of elements), then just store your prefix table values in your input array. No more space used (except for some variables) :D

And again, double check my algorithms as I'm veeery tired and could've made off-by-one errors.

Jo answered 11/9, 2011 at 23:32 Comment(0)
K
2

Like Neil, I find it useful to consider the alphabet {±1} instead of {0, 1}. Assume without loss of generality that there are at least as many +1s as -1s. The following algorithm, which uses O(sqrt(n log n)) bits and runs in time O(n), is due to "A.F."

Note: this solution does not cheat by assuming the input is modifiable and/or has wasted bits. As of this edit, this solution is the only one posted that is both O(n) time and o(n) space.

An easier version, which uses O(n) bits, streams the array of prefix sums and marks the first occurrence of each value. It then scans backward, considering for each height between 0 and sum(arr) the maximal subarray at that height. Some thought reveals that the optimum is among these (remember the assumption). In Python:

sum = 0
min_so_far = 0
max_so_far = 0
is_first = [True] * (1 + len(arr))
for i, x in enumerate(arr):
    sum += x
    if sum < min_so_far:
        min_so_far = sum
    elif sum > max_so_far:
        max_so_far = sum
    else:
        is_first[1 + i] = False

sum_i = 0
i = 0
while sum_i != sum:
    sum_i += arr[i]
    i += 1
sum_j = sum
j = len(arr)
longest = j - i
for h in range(sum - 1, -1, -1):
    while sum_i != h or not is_first[i]:
        i -= 1
        sum_i -= arr[i]
    while sum_j != h:
        j -= 1
        sum_j -= arr[j]
    longest = max(longest, j - i)

The trick to get the space down comes from noticing that we're scanning is_first sequentially, albeit in reverse order relative to its construction. Since the loop variables fit in O(log n) bits, we'll compute, instead of is_first, a checkpoint of the loop variables after each O(√(n log n)) steps. This is O(n/√(n log n)) = O(√(n/log n)) checkpoints, for a total of O(√(n log n)) bits. By restarting the loop from a checkpoint, we compute on demand each O(√(n log n))-bit section of is_first.

(P.S.: it may or may not be my fault that the problem statement asks for O(1) space. I sincerely apologize if it was I who pulled a Fermat and suggested that I had a solution to a problem much harder than I thought it was.)

Kirkman answered 14/9, 2011 at 0:55 Comment(0)
F
2

If indeed your algorithm is valid in all cases (see my comment to your question noting some corrections to it), notice that the prefix array is the only obstruction to your constant memory goal.

Examining the find function reveals that this array can be replaced with two integers, thereby eliminating the dependence on the length of the input and solving your problem. Consider the following:

  • You only depend on two values in the prefix array in the find function. These are a[start - 1] and a[end]. Yes, start and end change, but does this merit the array?
  • Look at the progression of your loop. At the end, start is incremented or end is decremented only by one.
  • Considering the previous statement, if you were to replace the value of a[start - 1] by an integer, how would you update its value? Put another way, for each transition in the loop that changes the value of start, what could you do to update the integer accordingly to reflect the new value of a[start - 1]?
  • Can this process be repeated with a[end]?
  • If, in fact, the values of a[start - 1] and a[end] can be reflected with two integers, doesn't the whole prefix array no longer serve a purpose? Can't it therefore be removed?

With no need for the prefix array and all storage dependencies on the length of the input removed, your algorithm will use a constant amount of memory to achieve its goal, thereby making it O(n) time and O(1) space.

I would prefer you solve this yourself based on the insights above, as this is homework. Nevertheless, I have included a solution below for reference:

#include <cstdio>    // puts, printf
#include <iostream>
using namespace std;

void find( int *data, int &start, int &end )
{
    // reflects the prefix sum until start - 1
    int sumStart = 0;

    // reflects the prefix sum until end
    int sumEnd = 0;
    for( int i = start; i <= end; i++ )
        sumEnd += data[i];

    while( start < end )
    {
        int length = end - start + 1;
        int sum = 2 * ( sumEnd - sumStart );

        if( sum == length )
            break;
        else if( sum < length )
        {
            // sum needs to increase; get rid of the lower endpoint
            if( data[ start ] == 0 && data[ end ] == 1 )
            {
                // sumStart must be updated to reflect the new prefix sum
                sumStart += data[ start ];
                start++;
            }
            else
            {
                // sumEnd must be updated to reflect the new prefix sum
                sumEnd -= data[ end ];
                end--;
            }
        }
        else
        {
            // sum needs to decrease; get rid of the higher endpoint
            if( data[ start ] == 1 && data[ end ] == 0 )
            {
                // sumStart must be updated to reflect the new prefix sum
                sumStart += data[ start ];
                start++;
            }
            else
            {
                // sumEnd must be updated to reflect the new prefix sum
                sumEnd -= data[ end ];
                end--;
            }
        }
    }
}

int main() {
    int length;
    cin >> length;

    // get the data
    int data[length];
    for( int i = 0; i < length; i++ )
        cin >> data[i];

    // solve and print the solution
    int start = 0, end = length - 1;
    find( data, start, end );

    if( start == end )
        puts( "No soln" );
    else
        printf( "%d %d\n", start, end );

    return 0;
}
Fredi answered 19/9, 2011 at 8:42 Comment(0)
R
1

This algorithm is O(n) time and O(1) space. It may modify the source array, but it restores all the information afterwards, so it does not work with const arrays. If this puzzle has several solutions, this algorithm picks the solution nearest to the beginning of the array. It could also be modified to provide all solutions.

Algorithm

Variables:

  • p1 - subarray start
  • p2 - subarray end
  • d - difference of 1s and 0s in the subarray

    1. Calculate d. If d==0, stop. If d<0, invert the array, and after the balanced subarray is found, invert it back.
    2. While d > 0, advance p2: if the array element is 1, just decrement both p2 and d. Otherwise p2 should pass a subarray of the form 11*0, where * is some balanced subarray. To make backtracking possible, 11*0? is changed to 0?*00 (where ? is the value next to the subarray). Then d is decremented.
    3. Store p1 and p2.
    4. Backtrack p2: if the array element is 1, just increment p2. Otherwise we have found an element changed in step 2. Revert the changes and pass a subarray of the form 11*0.
    5. Advance p1: if the array element is 1, just increment p1. Otherwise p1 should pass a subarray of the form 0*11.
    6. Store p1 and p2, if p2 - p1 improved.
    7. If p2 is at the end of the array, stop. Otherwise continue with step 4.


How does it work

The algorithm iterates through all possible positions of the balanced subarray in the input array. For each subarray position, p1 and p2 are kept as far from each other as possible, giving a locally longest subarray. The subarray with maximum length is chosen among all these subarrays.

To determine the next best position for p1, it is advanced to the first position where the balance between 1s and 0s is changed by one. (Step 5).

To determine the next best position for p2, it is advanced to the last position where the balance between 1s and 0s is changed by one. To make it possible, step 2 detects all such positions (starting from the array's end) and modifies the array in such a way, that it is possible to iterate through these positions with linear search. (Step 4).

While performing step 2, two possible conditions may be met. The simple one: when the value '1' is found, pointer p2 is just advanced to the next value and no special treatment is needed. But when the value '0' is found, the balance is going in the wrong direction, and it is necessary to pass through several bits until the correct balance is found. All these bits are of no interest to the algorithm: stopping p2 there will give either a balanced subarray that is too short, or an unbalanced subarray. As a result, p2 should pass a subarray of the form 11*0 (from right to left, * means any balanced subarray). There is no way to go the same way in the other direction. But it is possible to temporarily use some bits from the pattern 11*0 to allow backtracking. If we change the first '1' to '0', the second '1' to the value next to the rightmost '0', and clear the value next to the rightmost '0' (11*0? -> 0?*00), then we get the possibility to (first) notice the pattern on the way back, since it starts with '0', and (second) find the next good position for p2.

C++ code:

#include <cstddef>
#include <bitset>

static const size_t N = 270;

void findLargestBalanced(std::bitset<N>& a, size_t& p1s, size_t& p2s)
{
    // Step 1
    size_t p1 = 0;
    size_t p2 = N;
    int d = 2 * a.count() - N;
    bool flip = false;

    if (d == 0) {
        p1s = 0;
        p2s = N;
        return;
    }

    if (d < 0) {
        flip = true;
        d = -d;
        a.flip();
    }

    // Step 2
    bool next = true;
    while (d > 0) {
        if (p2 < N) {
            next = a[p2];
        }

        --d;
        --p2;

        if (a[p2] == false) {
            if (p2+1 < N) {
                a[p2+1] = false;
            }

            int dd = 2;
            while (dd > 0) {
                dd += (a[--p2]? -1: 1);
            }

            a[p2+1] = next;
            a[p2] = false;
        }
    }

    // Step 3
    p2s = p2;
    p1s = p1;

    do {
        // Step 4
        if (a[p2] == false) {
            a[p2++] = true;
            bool nextToRestore = a[p2];
            a[p2++] = true;

            int dd = 2;
            while (dd > 0 && p2 < N) {
                dd += (a[p2++]? 1: -1);
            }

            if (dd == 0) {
                a[--p2] = nextToRestore;
            }
        }
        else {
            ++p2;
        }

        // Step 5
        if (a[p1++] == false) {
            int dd = 2;
            while (dd > 0) {
                dd += (a[p1++]? -1: 1);
            }
        }

        // Step 6
        if (p2 - p1 > p2s - p1s) {
            p2s = p2;
            p1s = p1;
        }
    } while (p2 < N);

    if (flip) {
        a.flip();
    }
}
Randellrandene answered 29/11, 2011 at 17:12 Comment(2)
Did I get something wrong, or does your code fail on the following examples? 1100111100001110 1000110100011 It ought to produce (2, 15) (3, 12) respectively. Also, I reread your description probably a dozen times, but I can't get it. Would you describe the idea without such an amount of algorithmic detail, just a main concept? Especially step 2 is messy. As I get it, you alter the original array to create patterns 11*0 or 00*0, which denote some specific property of the array in between (*)? Doesn't look like a good idea.Sulfathiazole
@wf34: Most likely you did not take into account that bitsets used in this code snippet use right-to-left indexing. Also it reports the result as a half-open interval, typical to C++. just a main concept? - part of this post, named "How does it work", is here exactly to explain a main concept. step 2 is messy - sorry about that, but if anything is not clear in the description, you could read the code.Randellrandene
B
0

Sum all elements in the array, then diff = (array.length - sum) will be the difference in number of 0s and 1s.

  1. If diff is equal to array.length/2, then the maximum subarray = array.
  2. If diff is less than array.length/2 then there are more 1s than 0s.
  3. If diff is greater than array.length/2 then there are more 0s than 1s.

For cases 2 & 3, initialize two pointers, start & end pointing to beginning and end of array. If we have more 1s, then move the pointers inward (start++ or end--) based on whether array[start] = 1 or array[end] = 1, and update sum accordingly. At each step check if sum = (end - start) / 2. If this condition is true, then start and end represent the bounds of your maximum subarray.

Here we end up doing two passes over the array: once to calculate sum, and once while moving the pointers inward. And we are using constant space, as we just need to store sum and two index values.

If anyone wants to knock up some pseudocode, you're more than welcome :)

Bellyache answered 11/9, 2011 at 21:17 Comment(6)
Could you explain more precisely how to move pointers? I can't see how you decide which pointer to move. Maybe you could give an example for some array (001000111011000001000 or something). Actually, you made a greedy algorithm and I don't believe it works in general case.Jo
@Jo you should work to reduce the difference, if case (2) then test if either side is a 1, (if both, pick a side, say left) and increment that side, if neither side is a 1, pick a side (say, left) and increment that side; opposite behavior for case (3)Atalee
@Ricky: You need to pick a side consistently, say left, in your case 0011 and 1100 are both valid solutions, picking left will increment the left pointer to yield 1100, picking right would yield 0011.Atalee
It's more complicated than what Oceanic and Mark are suggesting. You could easily work out examples where it wont work.Parturient
@Mark : and with 00011111111100 how do I chose right or left ?Yammer
-1. 1) your solution doesn't work 2) in your 1st paragraph the formula for diff is apparently wrong, should be diff = 2*sum - length.Ringo
N
0

Here's an actionscript solution that looked like it was scaling O(n). Though it might be more like O(n log n). It definitely uses only O(1) memory.

Warning I haven't checked how complete it is. I could be missing some cases.

protected function findLongest(array:Array, start:int = 0, end:int = -1):int {
    if (end < start) {
        end = array.length-1;
    }

    var startDiff:int = 0;
    var endDiff:int = 0;
    var diff:int = 0;
    var length:int = end-start;
    for (var i:int = 0; i <= length; i++) {
        if (array[i+start] == '1') {
            startDiff++;
        } else {
            startDiff--;
        }

        if (array[end-i] == '1') {
            endDiff++;
        } else {
            endDiff--;
        }

        //We can stop when there's no chance of equalizing anymore.
        if (Math.abs(startDiff) > length - i) {
            diff = endDiff;
            start = end - i;
            break;
        } else if (Math.abs(endDiff) > length - i) {
            diff = startDiff;
            end = i+start;
            break;
        }
    }

    var bit:String = diff > 0 ? '1': '0';
    var diffAdjustment:int = diff > 0 ? -1: 1;

    //Strip off the bad vars off the ends.
    while (diff != 0 && array[start] == bit) {
        start++;
        diff += diffAdjustment;
    }

    while(diff != 0 && array[end] == bit) {
        end--;
        diff += diffAdjustment;
    }

    //If we have equalized end. Otherwise recurse within the sub-array.
    if (diff == 0)
        return end-start+1;
    else
        return findLongest(array, start, end);      

}
Nocuous answered 14/9, 2011 at 1:36 Comment(1)
Recursion needs stack - if the stack is not O(1) size, then you don't have O(1) space solution.Ringo
C
0

I would argue that an algorithm with O(1) space cannot exist, in the following way. Assume you iterate ONCE over every bit. This requires a counter, which needs O(log n) space. Possibly one could argue that n itself is part of the problem instance; then, for a binary string of length k, the input length is k + log2 k. Regardless of how you look over the bits, you need an additional variable, in case you need an index into that array, and that already makes it non-O(1).

Usually you don't have this problem, because for a problem of size n you have an input of n numbers of size log k each, which adds up to n log k. Here a variable of length log k is just O(1). But here our log k is just 1. So we can only introduce a helper variable that has constant length (and I mean really constant: it must be bounded regardless of how big n is).

Here one problem with the description of the problem becomes visible. In computer theory you have to be very careful about your encoding. E.g. you can make NP problems polynomial if you switch to unary encoding (because then the input size is exponentially bigger than in an n-ary (n>1) encoding).

As for n, the input has just the size log2 n, so one must be careful. When you speak in this case of O(n), this is really an algorithm that is O(2^n). (This is not a point we need to discuss, because one can argue whether n itself is part of the description or not.)

Cryptoclastic answered 15/9, 2011 at 14:52 Comment(2)
While I understand what you're saying, if you assume that the machine is transdichotomous (as most machines are), then you can assume that each machine word can hold Omega(log n) bits. In that case, you can use O(1) machine words to hold the size of the problem.Bamby
@templatetypedef: I am not sure what you want to say with that. Its right you can use a RAM model that allows you to formulate the problem in O(1). But that says nothing.When you now want O(1) space, this is almost always fullfilled. I would say every of the already presented algorithm has the ability, that when you input a constant sized problem, the answer can be computed in constant (i.e. O(1)) space. Btw. if I think about it, its also not quite right that the input fits into O(1) machine words - e.g. n = 256 would imply 8 bit words, but you cant store 256 elements (bits) in one 8 bit word.Cryptoclastic
F
0

I have this algorithm running in O(n) time and O(1) space.

It makes use of a simple "shrink-then-expand" trick. Comments are in the code.

public static void longestSubArrayWithSameZerosAndOnes() {
    // You are given an array of 1's and 0's only.
    // Find the longest subarray which contains equal number of 1's and 0's
    int[] A = new int[] {1, 0, 1, 1, 1, 0, 0,0,1};
    int num0 = 0, num1 = 0;

    // First, calculate how many 0s and 1s in the array
    for(int i = 0; i < A.length; i++) {
        if(A[i] == 0) {
            num0++;
        }
        else {
            num1++;
        }
    }
    if(num0 == 0 || num1 == 0) {
        System.out.println("The length of the sub-array is 0");
        return;
    }

    // Second, check the array to find a continuous "block" that has
    // the same number of 0s and 1s, starting from the HEAD and the
    // TAIL of the array, and moving the 2 "pointer" (HEAD and TAIL)
    // towards the CENTER of the array
    int start = 0, end = A.length - 1;
    while(num0 != num1 && start < end) {
        if(num1 > num0) {
            if(A[start] == 1) {
                num1--; start++;
            }
            else if(A[end] == 1) {
                num1--; end--;
            }
            else {
                num0--; start++;
                num0--; end--;
            }
        }
        else if(num1 < num0) {
            if(A[start] == 0) {
                num0--; start++;
            }
            else if(A[end] == 0) {
                num0--; end--;
            }
            else {
                num1--; start++;
                num1--; end--;
            }
        }
    }
    if(num0 == 0 || num1 == 0) {
        start = end;
        end++;
    }

    // Third, expand the continuous "block" just found at step #2 by
    // moving "HEAD" to head of the array and "TAIL" to the end of
    // the array, while still keeping the "block" balanced(containing
    // the same number of 0s and 1s
    while(0 < start && end < A.length - 1) {
        if(A[start - 1] == 0 && A[end + 1] == 0 || A[start - 1] == 1 && A[end + 1] == 1) {
            break;
        }
        start--;
        end++;
    }
    System.out.println("The length of the sub-array is " + (end - start + 1) + ", starting from #" + start + " to #" + end);

}

Featherveined answered 28/5, 2015 at 12:37 Comment(0)
R
-1

Linear time, constant space. Let me know if there is any bug I missed.
Tested in python3.

def longestBalancedSubarray(A):
    lo,hi = 0,len(A)-1
    ones = sum(A);zeros = len(A) - ones
    while lo < hi:
        if ones == zeros: break
        else:
            if ones > zeros:
                if A[lo] == 1: lo+=1; ones-=1
                elif A[hi] == 1: hi-=1; ones-=1
                else: lo+=1; zeros -=1
            else:
                if A[lo] == 0: lo+=1; zeros-=1
                elif A[hi] == 0: hi-=1; zeros-=1
                else: lo+=1; ones -=1
    return(A[lo:hi+1])
Retrogradation answered 19/7, 2013 at 19:27 Comment(0)
