rolling min and rolling max for C++ boost?
Asked Answered
K

3

7

I have some code that uses a Boost accumulator to track a mean across a rolling window -- "rolling mean". In addition to the rolling mean, I would like to track the minimum and maximum across this same rolling window.

Is there a way to compute a rolling min and rolling max with a Boost accumulator? I am not seeing a way...

I have tried adding min and max tags to the accumulator used for rolling_mean, but that does not give me what I want.

typedef accumulator_set<uint32_t, stats<tag::rolling_mean> > rollingMeanAcc_t;

becomes

typedef accumulator_set<uint32_t, stats<tag::rolling_mean,tag::min,tag::max> > rollingMeanAcc_t;

However, the min and max provided here are calculated over the entire accumulator, rather than limited to the same rolling window as the mean.

The Boost documentation says that min and max are computed across all samples, not limited to a rolling window. They do not appear to provide a way to restrict or weight the samples.

I would like to be able to report mean/min/max all across a rolling window.

I am currently using Boost version 1.48.0. I have looked at the documentation for the latest version (1.54.0) and do not see a rolling min/max implemented there.

I have found a non-Boost way to track a sliding window minimum, but this does not appear to be what I want either. I don't want to remove values just because they are greater/less than the previous min/max, as that would make the rolling_mean inaccurate.

Kinnikinnick answered 9/8, 2013 at 18:10 Comment(0)
W
10

I don't think an accumulator can do a rolling min/max (at least without violating many people's idea of what "accumulator" mean, though I suppose I could be mistaken about what people think it means).

The problem is pretty simple: I think most people expect that an accumulator uses only O(1) data--it doesn't store the data being processed. It can maintain a min or max with O(1) data, because it simply changes the current min/max when a number falls outside the range of the current min/max.

For a window, however, it needs to be prepared to do the opposite: when the current min goes out of the window, it needs to find the new minimum -- the next smallest number in the window. Likewise for maximum, of course.

Now, consider what happens to the minimum if (for example) the input is sorted. Every time an item is removed from the window, we get a different minimum value. In other words, the accumulator would need to store all the data in the window to maintain the current minimum properly. Likewise, for maximum with input sorted in descending order.

In short, you need to store all the data in the window.

On the other hand, if you think of that as still being an accumulator (which may be reasonable) it's fairly straightforward. Certainly you can do this job with something that at least uses the sort of interface you'd normally expect with an accumulator.

Wadsworth answered 9/8, 2013 at 18:23 Comment(3)
Thank you. This make sense. Might there be a way around this? Could I somehow "reset" or overwrite the current max and min stored in the accumulator?Kinnikinnick
It's not true that an accumulator pretty much by definition uses only O(1) data, at least for the boost accumulators library. For example, tail_variates implementation stores std::vector.Voyage
@WeiQiu: The size of that vector is constant, regardless of the size of data being processed, so it's still O(1). But it's true that (at least sort of) the same is true here. I've done a bit of editing to try to clarify. I suppose in the end it comes down to a question of "what does accumulator mean?", to which there probably isn't a precise, universal answer.Wadsworth
T
1

There might be a more clever algorithm (in fact there probably is), but off the top of my head, I would simply store the window in a circular buffer and calculate the rolling min/max on-demand. Cache the result and set a dirty flag when the window changes. That gives an O(1) accumulate operation and an amortized O(1) extract operation with a worst case of O(K), where K is the size of the window.

Tayler answered 22/8, 2013 at 0:9 Comment(1)
Just realize that you are the author of boost accumulators! Thanks for the excellent library. Could you have some comments on the implementation I posted in this thread? Is there any chance for a PR? I also have rolling median implemented.Voyage
V
0

There is nothing stops you from extending boost accumulators to implement your own rolling_min/rolling_max.

Working example for rolling_min:

#include <boost/accumulators/accumulators.hpp>
#include <boost/accumulators/statistics/stats.hpp>
#include <boost/accumulators/statistics/rolling_count.hpp>
#include <boost/accumulators/statistics/rolling_sum.hpp>
#include <iostream>
#include <queue>

namespace boost {
namespace accumulators {
namespace impl {
template <typename Sample>
struct rolling_min : accumulator_base {
    typedef Sample result_type;

    template <typename Args>
    rolling_min(Args const& args){}

    template <typename Args>
    void operator()(Args const& args) {
        if (is_rolling_window_plus1_full(args)) {
            // if full remove the front element
            auto front_value = rolling_window_plus1(args).front();
            deleted_.push(front_value);
        }
        min_queue_.push(args[sample | Sample()]);
    }
    template <typename Args>
    result_type result(Args const& args) const {
        while (!deleted_.empty() && min_queue_.top() == deleted_.top()) {
            min_queue_.pop();
            deleted_.pop();
        }

      return min_queue_.top();
    }

   private:
    // using trick presented here
    // https://mcmap.net/q/1480550/-do-we-have-priority-queue-which-supports-delete-operation-with-same-complexity-as-other-operations
    mutable std::priority_queue<Sample, std::vector<Sample>, std::greater<Sample>> min_queue_;
    mutable std::priority_queue<Sample, std::vector<Sample>, std::greater<Sample>> deleted_;
};
}  // namespace impl

namespace tag {
struct rolling_min : depends_on<rolling_window_plus1> {
    typedef accumulators::impl::rolling_min<mpl::_1> impl;
};
}  // namespace tag

namespace extract {
extractor<tag::rolling_min> const rolling_min = {};
}
using extract::rolling_min;
}  // namespace accumulators
}  // namespace boost

using namespace boost::accumulators;

template <typename Sample, typename Acc>
void demo_print(Sample s, Acc& acc) {
    std::cout << "sample = " << s << ' ' << "cnt = " << rolling_count(acc) << ' ' << "sum = " << rolling_sum(acc) << ' '
              << "rolling min = " << rolling_min(acc) << '\n';
}
int main() {
    accumulator_set<double, features<tag::rolling_min, tag::rolling_count, tag::rolling_sum>> acc(
        tag::rolling_window::window_size = 3);

    acc(1);
    demo_print(1, acc);
    acc(2);
    demo_print(2, acc);
    acc(3);
    demo_print(3, acc);
    acc(4);
    demo_print(4, acc);
    acc(0);
    demo_print(0, acc);
    acc(5);
    demo_print(5, acc);
    acc(8);
    demo_print(8, acc);
    acc(8);
    demo_print(8, acc);
}

Testd using c++17 with gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)

Voyage answered 11/10, 2023 at 6:13 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.