Codility Peaks Complexity
Asked Answered
S

11

12

I've just done the following Codility Peaks problem. The problem is as follows:


A non-empty zero-indexed array A consisting of N integers is given. A peak is an array element which is larger than its neighbors. More precisely, it is an index P such that 0 < P < N − 1, A[P − 1] < A[P] and A[P] > A[P + 1]. For example, the following array A:

A[0] = 1
A[1] = 2
A[2] = 3
A[3] = 4
A[4] = 3
A[5] = 4
A[6] = 1
A[7] = 2
A[8] = 3
A[9] = 4
A[10] = 6
A[11] = 2

has exactly three peaks: 3, 5, 10. We want to divide this array into blocks containing the same number of elements. More precisely, we want to choose a number K that will yield the following blocks: A[0], A[1], ..., A[K − 1], A[K], A[K + 1], ..., A[2K − 1], ... A[N − K], A[N − K + 1], ..., A[N − 1]. What's more, every block should contain at least one peak. Notice that extreme elements of the blocks (for example A[K − 1] or A[K]) can also be peaks, but only if they have both neighbors (including one in an adjacent blocks). The goal is to find the maximum number of blocks into which the array A can be divided. Array A can be divided into blocks as follows:

one block (1, 2, 3, 4, 3, 4, 1, 2, 3, 4, 6, 2). This block contains three peaks.

two blocks (1, 2, 3, 4, 3, 4) and (1, 2, 3, 4, 6, 2). Every block has a peak.

three blocks (1, 2, 3, 4), (3, 4, 1, 2), (3, 4, 6, 2). Every block has a peak.

Notice in particular that the first block (1, 2, 3, 4) has a peak at A[3], because A[2] < A[3] > A[4], even though A[4] is in the adjacent block. However, array A cannot be divided into four blocks, (1, 2, 3), (4, 3, 4), (1, 2, 3) and (4, 6, 2), because the (1, 2, 3) blocks do not contain a peak. Notice in particular that the (4, 3, 4) block contains two peaks: A[3] and A[5]. The maximum number of blocks that array A can be divided into is three.

Write a function: class Solution { public int solution(int[] A); } that, given a non-empty zero-indexed array A consisting of N integers, returns the maximum number of blocks into which A can be divided. If A cannot be divided into some number of blocks, the function should return 0. For example, given:

A[0] = 1
A[1] = 2 
A[2] = 3 
A[3] = 4 
A[4] = 3 
A[5] = 4 
A[6] = 1 
A[7] = 2 
A[8] = 3 
A[9] = 4 
A[10] = 6 
A[11] = 2

the function should return 3, as explained above. Assume that:

N is an integer within the range [1..100,000]; each element of array A is an integer within the range [0..1,000,000,000].

Complexity:

expected worst-case time complexity is O(N*log(log(N)))

expected worst-case space complexity is O(N), beyond input storage (not counting the storage required for input arguments).

Elements of input arrays can be modified.


My Question

So I solve this with what to me appears to be the brute force solution – go through every group size from 1..N, and check whether every group has at least one peak. The first 15 minutes I was trying to solve this I was trying to figure out some more optimal way, since the required complexity is O(N*log(log(N))).

This is my "brute-force" code that passes all the tests, including the large ones, for a score of 100/100:

public int solution(int[] A) {
    int N = A.length;

    ArrayList<Integer> peaks = new ArrayList<Integer>();
    for(int i = 1; i < N-1; i++){
        if(A[i] > A[i-1] && A[i] > A[i+1]) peaks.add(i);
    }

    for(int size = 1; size <= N; size++){
        if(N % size != 0) continue;
        int find = 0;
        int groups = N/size;
        boolean ok = true;
        for(int peakIdx : peaks){
            if(peakIdx/size > find){
                ok = false;
                break;
            }
            if(peakIdx/size == find) find++;
        }
        if(find != groups) ok = false;
        if(ok) return groups;
    }

    return 0;
}

My question is how do I deduce that this is in fact O(N*log(log(N))), as it's not at all obvious to me, and I was surprised I pass the test cases. I'm looking for even the simplest complexity proof sketch that would convince me of this runtime. I would assume that a log(log(N)) factor means some kind of reduction of a problem by a square root on each iteration, but I have no idea how this applies to my problem. Thanks a lot for any help

Semiprofessional answered 2/1, 2014 at 15:47 Comment(1)
Dividing peakIdx by size is not so smart because division of binary numbers involves numerous add + shift operations. It's much faster to check if the peak index lies within the boundary indices of the current block as this uses 2 subtraction and 1 logical comparison operations.Grandee
L
9

You're completely right: to get the log log performance the problem needs to be reduced.

A n.log(log(n)) solution in python [below]. Codility no longer test 'performance' on this problem (!) but the python solution scores 100% for accuracy.

As you've already surmised: Outer loop will be O(n) since it is testing whether each size of block is a clean divisor Inner loop must be O(log(log(n))) to give O(n log(log(n))) overall.

We can get good inner loop performance because we only need to perform d(n), the number of divisors of n. We can store a prefix sum of peaks-so-far, which uses the O(n) space allowed by the problem specification. Checking whether a peak has occurred in each 'group' is then an O(1) lookup operation using the group start and end indices.

Following this logic, when the candidate block size is 3 the loop needs to perform n / 3 peak checks. The complexity becomes a sum: n/a + n/b + ... + n/n where the denominators (a, b, ...) are the factors of n.

Short story: The complexity of n.d(n) operations is O(n.log(log(n))).

Longer version: If you've been doing the Codility Lessons you'll remember from the Lesson 8: Prime and composite numbers that the sum of harmonic number operations will give O(log(n)) complexity. We've got a reduced set, because we're only looking at factor denominators. Lesson 9: Sieve of Eratosthenes shows how the sum of reciprocals of primes is O(log(log(n))) and claims that 'the proof is non-trivial'. In this case Wikipedia tells us that the sum of divisors sigma(n) has an upper bound (see Robin's inequality, about half way down the page).

Does that completely answer your question? Suggestions on how to improve my python code are also very welcome!

def solution(data):

    length = len(data)

    # array ends can't be peaks, len < 3 must return 0    
    if len < 3:
        return 0

    peaks = [0] * length

    # compute a list of 'peaks to the left' in O(n) time
    for index in range(2, length):
        peaks[index] = peaks[index - 1]

        # check if there was a peak to the left, add it to the count
        if data[index - 1] > data[index - 2] and data[index - 1] > data[index]:
            peaks[index] += 1

    # candidate is the block size we're going to test
    for candidate in range(3, length + 1):

        # skip if not a factor
        if length % candidate != 0:
            continue

        # test at each point n / block
        valid = True
        index = candidate
        while index != length:

            # if no peak in this block, break
            if peaks[index] == peaks[index - candidate]:
                valid = False
                break

            index += candidate

        # one additional check since peaks[length] is outside of array    
        if index == length and peaks[index - 1] == peaks[index - candidate]:
            valid = False

        if valid:
            return length / candidate

    return 0

Credits: Major kudos to @tmyklebu for his SO answer which helped me a lot.

Luca answered 6/10, 2014 at 11:33 Comment(0)
N
0

I'm don't think that the time complexity of your algorithm is O(Nlog(logN)).

However, it is certainly much lesser than O(N^2). This is because your inner loop is entered only k times where k is the number of factors of N. The number of factors of an integer can be seen in this link: http://www.cut-the-knot.org/blue/NumberOfFactors.shtml

I may be inaccurate but from the link it seems,

k ~ logN * logN * logN ...

Also, the inner loop has a complexity of O(N) since the number of peaks can be N/2 in the worst case.

Hence, in my opinion, the complexity of your algorithm is O(NlogN) at best but it must be sufficient to clear all test cases.

Nonprofessional answered 2/1, 2014 at 16:22 Comment(1)
Would you have an idea as to what the Nloglog(N) solution is that the problem is expecting? (Codility does run the program and claim that my solution is Nloglog(N), but I would guess that it's quite difficult to differentiate statistically an Nloglog(N) solution from NlogN without using extremely large datasets)Semiprofessional
G
0

@radicality

There's at least one point where you can optimize the number of passes in the second loop to O(sqrt(N)) -- collect divisors of N and iterate through them only.

That will make your algo a little less "brute force".

Problem definition allows for O(N) space complexity. You can store divisors without violating this condition.

Gongorism answered 19/1, 2014 at 18:18 Comment(0)
M
0

This is my solution based on prefix sums. Hope it helps:

class Solution {
    public int solution(int[] A) {
        int n = A.length;
        int result = 1;
        if (n < 3)
            return 0;

        int[] prefixSums = new int[n];
        for (int i = 1; i < n-1; i++)
            if (A[i] > A[i-1] && A[i] > A[i+1])
                prefixSums[i] = prefixSums[i-1] + 1;
            else 
                prefixSums[i] = prefixSums[i-1];
        prefixSums[n-1] = prefixSums[n-2];

        if (prefixSums[n-1] <= 1)
            return prefixSums[n-1];

        for (int i = 2; i <= prefixSums[n-2]; i++) {
            if (n % i != 0)
                continue;
            int prev = 0;
            boolean containsPeak = true;
            for (int j = n/i - 1; j < n; j += n/i) {
                if (prefixSums[j] == prev) {
                    containsPeak = false;
                    break;
                }
                prev = prefixSums[j];                   
            }
            if (containsPeak)
                result = i;
        }

        return result;
    }
}
Marilee answered 11/2, 2019 at 15:39 Comment(0)
O
0
def solution(A):
    length = len(A)
    if length <= 2:
        return 0

    peek_indexes = []
    for index in range(1, length-1):
        if A[index] > A[index - 1] and A[index] > A[index + 1]:
            peek_indexes.append(index)

    for block in range(3, int((length/2)+1)):
        if length % block == 0:
            index_to_check = 0
            temp_blocks = 0
            for peek_index in peek_indexes:
                if peek_index >= index_to_check and peek_index < index_to_check + block:
                    temp_blocks += 1
                    index_to_check = index_to_check + block
            if length/block == temp_blocks:
                return temp_blocks

    if len(peek_indexes) > 0:
        return 1
    else:
        return 0
print(solution([1, 2, 3, 4, 3, 4, 1, 2, 3, 4, 6, 2, 1, 2, 5, 2]))
Only answered 21/10, 2019 at 8:32 Comment(0)
N
0

I just found the factors at first, then just iterated in A and tested all number of blocks to see which is the greatest block division.

This is the code that got 100 (in java)

https://app.codility.com/demo/results/training9593YB-39H/

Nafis answered 22/12, 2019 at 6:46 Comment(0)
H
0

A javascript solution with complexity of O(N * log(log(N))).

function solution(A) {
    let N = A.length;
    if (N < 3) return 0;
    let peaks = 0;
    let peaksTillNow = [ 0 ];
    let dividers = [];
    for (let i = 1; i < N - 1; i++) {
        if (A[i - 1] < A[i] && A[i] > A[i + 1]) peaks++;
        peaksTillNow.push(peaks);
        if (N % i === 0) dividers.push(i);
    }
    peaksTillNow.push(peaks);
    if (peaks === 0) return 0;
    let blocks;
    let result = 1;
    for (blocks of dividers) {
        let K = N / blocks;
        let prevPeaks = 0;
        let OK = true;
        for (let i = 1; i <= blocks; i++) {
            if (peaksTillNow[i * K - 1] > prevPeaks) {
                prevPeaks = peaksTillNow[i * K - 1];
            } else {
                OK = false;
                break;
            }
        }
        if (OK) result = blocks;
    }
    return result;
}
Hodeida answered 3/8, 2020 at 12:51 Comment(0)
S
0

Solution with C# code

public int GetPeaks(int[] InputArray)
        { 
            List<int> lstPeaks = new List<int>();
            lstPeaks.Add(0);
            for (int Index = 1; Index < (InputArray.Length - 1); Index++)
            {
                if (InputArray[Index - 1] < InputArray[Index] && InputArray[Index] > InputArray[Index + 1])
                {
                    lstPeaks.Add(1);
                }
                else
                {
                    lstPeaks.Add(0);
                }
            }
            lstPeaks.Add(0);

 
            int totalEqBlocksWithPeaks = 0;
            for (int factor = 1; factor <= InputArray.Length; factor++)
            {
                if (InputArray.Length % factor == 0)
                {
                    int BlockLength = InputArray.Length / factor;
                    int BlockCount = factor;

                    bool isAllBlocksHasPeak = true;
                    for (int CountIndex = 1; CountIndex <= BlockCount; CountIndex++)
                    {
                        int BlockStartIndex = CountIndex == 1 ? 0 : (CountIndex - 1) * BlockLength;
                        int BlockEndIndex = (CountIndex * BlockLength) - 1;

                        if (!(lstPeaks.GetRange(BlockStartIndex, BlockLength).Sum() > 0))
                        {
                            isAllBlocksHasPeak = false;
                        }
                    }

                    if (isAllBlocksHasPeak)
                        totalEqBlocksWithPeaks++; 
                }
            }
            return totalEqBlocksWithPeaks; 
        } 
Saharanpur answered 1/7, 2021 at 10:57 Comment(0)
M
0

There is actually an O(n) runtime complexity solution for this task, so this is a humble attempt to share that.

The trick to go from the proposed O(n * loglogn) solutions to O(n) is to calculate the maximum gap between any two peaks (or a leading or trailing peak to the corresponding endpoint).

This can be done while building the peak hash in the first O(n) loop.

Then, if the gap is 'g' between two consecutive peaks, then the minimum group size must be 'g/2'. It will simply be 'g' between start and first peak, or last peak and end. Also, there will be at least one peak in any group from group size 'g', so the range to check for is: g/2, 1+g/2, 2+g/2, ... g.

Therefore, the runtime is the sum over d = g/2, g/2+1, ... g) * n/d where 'd' is the divisor'.

(sum over d = g/2, 1 + g/2, ... g) * n/d = n/(g/2) + n/(1 + g/2) + ... + (n/g)

if g = 5, this n/5 + n/6 + n/7 + n/8 + n/9 + n/10 = n(1/5+1/6+1/7+1/8+1/9+1/10)

If you replace each item with the largest element, then you get sum <= n * (1/5 + 1/5 + 1/5 + 1/5 + 1/5) = n

Now, generalising this, every element is replaced with n / (g/2).

The number of items from g/2 to g is 1 + g/2 since there are (g - g/2 + 1) items.

So, the whole sum is: n/(g/2) * (g/2 + 1) = n + 2n/g < 3n.

Therefore, the bound on the total number of operations is O(n).

The code, implementing this in C++, is here:

int solution(vector<int> &A)
{
  int sizeA = A.size();
  vector<bool> hash(sizeA, false);
  int min_group_size = 2;
 
  int pi = 0; 
  for (int i = 1, pi = 0; i < sizeA - 1; ++i) {
    const int e = A[i];
    if (e > A[i - 1] && e > A[i + 1]) {
      hash[i] = true;
      int diff = i - pi;
      if (pi) diff /= 2;
      if (diff > min_group_size) min_group_size = diff;
      pi = i;
    }
  }
  min_group_size = min(min_group_size, sizeA - pi);

  vector<int> hash_next(sizeA, 0);
  for (int i = sizeA - 2; i >= 0; --i) {
    hash_next[i] = hash[i] ? i : hash_next[i + 1];
  }

  for (int group_size = min_group_size; group_size <= sizeA; ++group_size) {
    if (sizeA % group_size != 0) continue;
    int number_of_groups = sizeA / group_size;
    int group_index = 0;
    for (int peak_index = 0; peak_index < sizeA; peak_index = group_index * group_size) {
      peak_index = hash_next[peak_index];
      if (!peak_index) break;
      int lower_range = group_index * group_size;
      int upper_range = lower_range + group_size - 1;
      if (peak_index > upper_range) {
        break;
      }
      ++group_index;
    }

    if (number_of_groups == group_index) return number_of_groups;
  }

  return 0;
}
Michigan answered 6/11, 2021 at 11:34 Comment(2)
Codility says this is O(N*log(log(N)), too...Bane
@mikhailcazi: yes, you are right that Codility cannot guess finer differences like that. It can mostly detect the gross ones reliably.Attested
A
0
var prev, curr, total = 0;

for (var i=1; i<A.length; i++) {
    if (curr == 0) {
        curr = A[i];
    } else {
        if(A[i] != curr) {
            if (prev != 0) {
                if ((prev < curr && A[i] < curr) || (prev > curr && A[i] > curr)) {
                    total += 1;
                }
            } else {
                prev = curr;
                total += 1;
            }
            prev = curr;
            curr = A[i];
        }
    }
    
}
if(prev != curr) {
    total += 1;
}

return total;
Acanthoid answered 15/5, 2022 at 8:28 Comment(0)
D
-1

I agree with GnomeDePlume answer... the piece on looking for the divisors on the proposed solution is O(N), and that could be decreased to O(sqrt(N)) by using the algorithm provided on the lesson text.

So just adding, here is my solution using Java that solves the problem on the required complexity.

Be aware, it has way more code then yours - some cleanup (debug sysouts and comments) would always be possible :-)

public int solution(int[] A) {
    int result = 0;

    int N = A.length;

    // mark accumulated peaks
    int[] peaks = new int[N];
    int count = 0;
    for (int i = 1; i < N -1; i++) {
        if (A[i-1] < A[i] && A[i+1] < A[i])
            count++;    
        peaks[i] = count;
    }
    // set peaks count on last elem as it will be needed during div checks
    peaks[N-1] = count;

    // check count
    if (count > 0) {
        // if only one peak, will need the whole array
        if (count == 1)
            result = 1;
        else {

            // at this point (peaks > 1) we know at least the single group will satisfy the criteria
            // so set result to 1, then check for bigger numbers of groups
            result = 1;

            // for each divisor of N, check if that number of groups work
            Integer[] divisors = getDivisors(N);

            // result will be at least 1 at this point
            boolean candidate;
            int divisor, startIdx, endIdx;
            // check from top value to bottom - stop when one is found
            // for div 1 we know num groups is 1, and we already know that is the minimum. No need to check.
            // for div = N we know it's impossible, as all elements would have to be peaks (impossible by definition)
            for (int i = divisors.length-2; i > 0; i--) {
                candidate = true;
                divisor = divisors[i];

                for (int j = 0; j < N; j+= N/divisor) {
                    startIdx = (j == 0 ? j : j-1);
                    endIdx = j + N/divisor-1;

                    if (peaks[startIdx] == peaks[endIdx]) {
                        candidate = false;
                        break;
                    }
                }

                // if all groups had at least 1 peak, this is the result!
                if (candidate) {
                    result = divisor;
                    break;
                }
            }

        }
    }

    return result;
}

// returns ordered array of all divisors of N
private Integer[] getDivisors(int N) {
    Set<Integer> set = new TreeSet<Integer>();

    double sqrt = Math.sqrt(N);
    int i = 1;
    for (; i < sqrt; i++) {
        if (N % i == 0) {
            set.add(i); 
            set.add(N/i);
        }
    }
    if (i * i == N)
        set.add(i);

    return set.toArray(new Integer[]{});
}

Thanks, Davi

Dosi answered 20/2, 2016 at 0:45 Comment(2)
This question asks for an explanation of the asker's code, not another implementation.Burdock
You are correct... but GnomeDePlume already answered the question on complexity. My intention was to provide a Java implementation solving the problem with the correct complexity. I'll change the answer to reflect that. Thank you for pointing that out!Dosi

© 2022 - 2024 — McMap. All rights reserved.