Can min/max of moving window achieve in O(N)?
Asked Answered
R

3

17

I have input array A

 A[0], A[1], ... , A[N-1]

I want function Max(T,A) which return B represent max value on A over previous moving window of size T where

 B[i+T] = Max(A[i], A[i+T])

By using max heap to keep track of max value on current moving windows A[i] to A[i+T], this algorithm yields O(N log(T)) worst case.

I would like to know is there any better algorithm? Maybe an O(N) algorithm

Ratchet answered 30/8, 2012 at 5:1 Comment(3)
If A is fixed and T varies, you may do a O(N*log(N)) preparation and then for every T, you can get B in O(N) time.Amylolysis
@Topro Sounds a good try! Can you put preparation step on Answer? Thanks!Ratchet
people.cs.uct.ac.za/~ksmith/articles/…Wore
P
39

O(N) is possible using Deque data structure. It holds pairs (Value; Index).

at every step:

  if (!Deque.Empty) and (Deque.Head.Index <= CurrentIndex - T) then 
     Deque.ExtractHead;
  //Head is too old, it is leaving the window

  while (!Deque.Empty) and (Deque.Tail.Value > CurrentValue) do
     Deque.ExtractTail;
  //remove elements that have no chance to become minimum in the window

  Deque.AddTail(CurrentValue, CurrentIndex); 
  CurrentMin = Deque.Head.Value
  //Head value is minimum in the current window
Pippy answered 30/8, 2012 at 10:44 Comment(6)
bravo, applies to more general casesAmylolysis
Shouldn't the top "if" be a "while", in case we can prune multiple values from the head? For example if the head has two elements with similar index values and the new index value is far enough forward that it means the two old elements are now expired.Armanda
@John Zwinck No, index is unique, it is 'age' of element. And it is enough to check for '=', not for '<=' in the second condtitionPippy
implemented in readable javascript: gist.github.com/shimondoodkin/f274d6e17c66a8b72779Houlihan
@Cris Luengo No. Every item is treated twice - pushed to tail and extracted from from head or from tail , so O(N)Pippy
@Cris Luengo sorting isn't needed. Minimum item is sitting on the queue head until better item replaces it, removing all items (2-nd operator) or it becomes too old (1-st operator)Pippy
A
6

it's called RMQ(range minimum query). Actually i once wrote an article about that(with c++ code). See http://attiix.com/2011/08/22/4-ways-to-solve-%C2%B11-rmq/

or you may prefer the wikipedia, Range Minimum Query

after the preparation, you can get the max number of any given range in O(1)

Amylolysis answered 30/8, 2012 at 9:51 Comment(2)
So it requires additional O(N log(N)) space? It took me quite while to understand the whole things for preparation steps which is essentially dynamic programming construction :) But, yes, I do need to vary T a lot. Does this approach have advantages over the other?Ratchet
@dondonchi return you the minimum value for any query(l, r), no need to fix T.Amylolysis
T
6

There is a sub-field in image processing called Mathematical Morphology. The operation you are implementing is a core concept in this field, called dilation. Obviously, this operation has been studied extensively and we know how to implement it very efficiently.

The most efficient algorithm for this problem was proposed in 1992 and 1993, independently by van Herk, and Gil and Werman. This algorithm needs exactly 3 comparisons per sample, independently of the size of T.

Some years later, Gil and Kimmel further refined the algorithm to need only 2.5 comparisons per sample. Though the increased complexity of the method might offset the fewer comparisons (I find that more complex code runs more slowly). I have never implemented this variant.

The HGW algorithm, as it's called, needs two intermediate buffers of the same size as the input. For ridiculously large inputs (billions of samples), you could split up the data into chunks and process it chunk-wise.

In sort, you walk through the data forward, computing the cumulative max over chunks of size T. You do the same walking backward. Each of these require one comparison per sample. Finally, the result is the maximum over one value in each of these two temporary arrays. For data locality, you can do the two passes over the input at the same time.

I guess you could even do a running version, where the temporary arrays are of length 2*T, but that would be more complex to implement.

  • van Herk, "A fast algorithm for local minimum and maximum filters on rectangular and octagonal kernels", Pattern Recognition Letters 13(7):517-521, 1992 (doi)

  • Gil, Werman, "Computing 2-D min, median, and max filters", IEEE Transactions on Pattern Analysis and Machine Intelligence 15(5):504-507 , 1993 (doi)

  • Gil, Kimmel, "Efficient dilation, erosion, opening, and closing algorithms", IEEE Transactions on Pattern Analysis and Machine Intelligence 24(12):1606-1617, 2002 (doi)

(Note: cross-posted from this related question on Code Review.)

Trouvaille answered 1/4, 2018 at 17:29 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.