The trick is the following: you get updates at random times via void update(int time, float value). However, you also need to track when an update falls off the time window, so you set an "alarm" to be called at time + N, which removes the previous update from ever being considered again in the computation.
If this happens in real time, you can request the operating system to call a method void drop_off_oldest_update(int time) at time + N.
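As a rough sketch of that scheduling (not the author's method; the helper name is mine, N is the window length assumed to be in seconds, and a real implementation would use the platform's timer facility rather than a sleeping thread):

    #include <chrono>
    #include <thread>

    // Arrange for drop_off_oldest_update(time + N) to fire N seconds from now,
    // i.e. when the update made at 'time' falls out of the window.
    void schedule_alarm(int time, int N, void (*drop_off_oldest_update)(int)) {
        std::thread([=] {
            std::this_thread::sleep_for(std::chrono::seconds(N));
            drop_off_oldest_update(time + N);   // the alarm for the update made at 'time'
        }).detach();
    }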
If this is a simulation, you cannot get help from the operating system and need to do it manually. In a simulation you would call methods with the time supplied as an argument (which does not correlate with real time). However, a reasonable assumption is that the calls are guaranteed to arrive with increasing time arguments. In this case you maintain a sorted list of alarm times, and on every update and read call you check whether the time argument is greater than the head of the alarm list. While it is greater, you do the alarm-related processing (drop off the oldest update), remove the head, and check again, until all alarms prior to the given time have been processed. Then you handle the update call itself.
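A minimal sketch of that bookkeeping (the class name and members are illustrative, not from the original; it relies on the increasing-time assumption). Because update times are increasing, the alarm times time + N are generated already in order, so a plain queue serves as the "sorted list":

    #include <deque>

    class WindowedAverage {
        int window_;                 // the window length N
        std::deque<int> alarms_;     // pending alarm times, in increasing order

        void process_alarms(int now) {
            // Handle every alarm that came due before 'now'.
            while (!alarms_.empty() && alarms_.front() < now) {
                drop_off_oldest_update(alarms_.front());
                alarms_.pop_front();
            }
        }

        void drop_off_oldest_update(int time) {
            // Remove the oldest stored update from the running computation
            // (see the incremental adjustments in the appendix below).
        }

    public:
        explicit WindowedAverage(int window) : window_(window) {}

        void update(int time, float value) {
            process_alarms(time);               // retire expired updates first
            alarms_.push_back(time + window_);  // schedule this update's alarm
            // ... fold (time, value) into the precomputed running value ...
        }
    };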
I have so far assumed it is obvious what you would do for the actual computation, but I will elaborate just in case. I assume you have a method float read(int time) that you use to read the values. The goal is to make this call as efficient as possible, so you do not compute the moving average from scratch every time read is called. Instead you precompute the value as of the last update or the last alarm, and "tweak" this value with a couple of floating point operations to account for the passage of time since then (i.e. a constant number of operations, except for perhaps processing a list of piled-up alarms).
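A sketch of what that tweak can look like (all names here are illustrative, and it assumes any due alarms have already been processed): suppose that at the last update or alarm, at last_event_time, we cached the sum of value * duration over the completed segments inside the window. Since then the newest value has covered a growing tail of the window, while the value at the window's head has lost exactly the same amount of time:

    // cached_sum: sum of value*duration for completed segments, computed at the
    //             last update or alarm (last_event_time)
    // newest_*:   most recent update; oldest_value: value in effect at the window start
    float read_cached(float cached_sum,
                      float newest_value, int newest_time,
                      float oldest_value,
                      int last_event_time, int window_length, int time) {
        float tail = newest_value * (time - newest_time);      // newest value covers up to 'time'
        float head = oldest_value * (time - last_event_time);  // slid out of the window head
        return (cached_sum + tail - head) / window_length;
    }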
Hopefully this is clear -- this should be quite a simple and quite efficient algorithm.
Further optimization: one remaining problem is when a large number of updates happen within the time window, then there is a long stretch with neither reads nor updates, and then a read or update comes along. In this case the algorithm above is inefficient: it incrementally updates the value for each update that falls off. That is not necessary, because we only care about the last update before the time window, so if there is a way to efficiently drop all older updates at once, it would help.
To do this, we can modify the algorithm to binary-search the updates for the most recent one before the time window. If relatively few updates need to be "dropped", we can still incrementally update the value for each dropped update; but if many need to be dropped, it is cheaper to recompute the value from scratch after dropping off the old updates.
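A sketch of that choice (the container, the threshold, and the names are my own assumptions):

    #include <algorithm>
    #include <deque>

    struct Update { int time; float value; };

    // Drop everything that no longer affects the window starting at window_start.
    // The most recent update at or before window_start is kept: it becomes the
    // new prior_update' (with its timestamp clamped to window_start).
    void drop_expired(std::deque<Update>& updates, int window_start) {
        // First stored update strictly inside the window.
        auto first_inside = std::upper_bound(
            updates.begin(), updates.end(), window_start,
            [](int t, const Update& u) { return t < u.time; });
        if (first_inside == updates.begin()) return;   // nothing has expired
        auto new_prior = first_inside - 1;             // survives as prior_update'

        auto expired = new_prior - updates.begin();
        if (expired <= 8) {                            // small, arbitrary threshold
            // Few expired updates: adjust the running sum incrementally for
            // each one (as in the appendix below) before erasing them.
        } else {
            // Many expired updates: cheaper to erase them and recompute the
            // sum from scratch over the remaining updates.
        }
        updates.erase(updates.begin(), new_prior);
    }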
Appendix on Incremental Computation: I should clarify what I mean above by "tweak this value by a couple of floating point operations to account for the passage of time since the last update". The initial, non-incremental computation is:

    sum = 0;
    updates_in_window = /* set of all updates within the window */;
    prior_update' = /* most recent update prior to the window, with its timestamp tweaked to the window beginning */;
    relevant_updates = /* union of prior_update' and updates_in_window */;

    /* iterate over relevant_updates in order of increasing time */
    for each update EXCEPT the last {
        sum += update.value * time_to_next_update;
    }

    moving_average = (sum + last_update.value * time_since_last_update) / window_length;
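Written out as code (a direct transcription of the pseudocode above; the container choice is mine), with relevant holding prior_update' followed by the in-window updates in increasing time order:

    #include <cstddef>
    #include <vector>

    struct Update { int time; float value; };

    // relevant: prior_update' (timestamp already moved to the window beginning)
    //           followed by all in-window updates, in increasing time order.
    //           Must be non-empty.
    float average_from_scratch(const std::vector<Update>& relevant,
                               int now, int window_length) {
        float sum = 0.0f;
        for (std::size_t i = 0; i + 1 < relevant.size(); ++i) {
            // Each value is weighted by how long it was in effect.
            sum += relevant[i].value * (relevant[i + 1].time - relevant[i].time);
        }
        const Update& last = relevant.back();
        return (sum + last.value * (now - last.time)) / window_length;
    }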
Now if exactly one update falls off the window but no new update arrives, adjust sum as:

    sum -= prior_update'.value * time_to_next_update
         + first_update_in_last_window.value * time_from_first_update_to_new_window_beginning;

(note that it is prior_update' whose timestamp was modified to the beginning of the last window). And if exactly one update enters the window but no updates fall off, adjust sum as:

    sum += previously_most_recent_update.value * corresponding_time_to_next_update;
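The same two adjustments, expressed as code under my reading of the formulas above (names are illustrative; both adjustments are O(1)):

    // Exactly one update falls off, no new update arrives: the old prior_update'
    // loses its whole segment, and the first update of the last window (which
    // becomes the new prior_update', its timestamp moved to the new window
    // beginning) loses the part of its segment before the new window beginning.
    void on_update_dropped(float& sum,
                           float prior_value, int prior_time_to_next_update,
                           float first_value, int first_time, int new_window_start) {
        sum -= prior_value * prior_time_to_next_update
             + first_value * (new_window_start - first_time);
    }

    // Exactly one update arrives, nothing falls off: the previously most recent
    // update's segment is now complete, so it is added to the sum.
    void on_update_added(float& sum,
                         float previous_newest_value,
                         int previous_newest_time, int new_update_time) {
        sum += previous_newest_value * (new_update_time - previous_newest_time);
    }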
As should be obvious, this is a rough sketch, but hopefully it shows how you can maintain the average with O(1) operations per update on an amortized basis. Note, however, the further optimization in the previous paragraph. Also note the stability issue alluded to in an older answer: floating point errors may accumulate over a large number of such incremental operations until the result diverges from the full computation by an amount that is significant to the application.