Exponential Moving Average Sampled at Varying Times
I have a continuous value for which I'd like to calculate an exponential moving average. Normally I'd just use the standard formula for this:

  • S_n = α·Y + (1-α)·S_{n-1}

where S_n is the new average, α is the smoothing factor ("alpha"), Y is the new sample, and S_{n-1} is the previous average.
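In code, one step of that recurrence is just the following (a sketch; the variable names are mine):

```python
def ema_update(prev_avg, sample, alpha):
    """One step of the standard EMA: S_n = alpha*Y + (1 - alpha)*S_{n-1}."""
    return alpha * sample + (1 - alpha) * prev_avg

# Repeatedly feeding a constant input drives the average toward it.
s = 0.0
for y in [10.0] * 50:
    s = ema_update(s, y, 0.2)
```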

Unfortunately, due to various issues I don't have a consistent sample time. I may know I can sample at most, say, once per millisecond, but due to factors out of my control, I may not be able to take a sample for several milliseconds at a time. A likely more common case, however, is that I simply sample a bit early or late: instead of sampling at 0, 1 and 2 ms, I sample at 0, 0.9 and 2.1 ms. I do anticipate that, regardless of delays, my sampling frequency will be far, far above the Nyquist limit, and thus I need not worry about aliasing.

I reckon that I can deal with this in a more-or-less reasonable way by varying the alpha appropriately, based on the length of time since the last sample.

Part of my reasoning that this will work is that the EMA "interpolates linearly" between the previous data point and the current one. Consider calculating an EMA of the following list of samples at intervals t: [0,1,2,3,4]. We should get the same result if we use interval 2t, where the inputs become [0,2,4], right? If the EMA had assumed that, at t_2, the value had been 2 since t_0, that would be the same as the interval-t calculation operating on [0,2,2,4,4], which it's not doing. Or does that make sense at all?

Can someone tell me how to vary the alpha appropriately? "Please show your work." I.e., show me the math that proves that your method really is doing the right thing.

Sniffy answered 21/6, 2009 at 13:5 Comment(2)
You shouldn't get the same EMA for different input. Think of the EMA as a filter; sampling at 2t is equivalent to downsampling, and the filter is going to give a different output. This is clear to me since [0,2,4] contains higher frequency components than [0,1,2,3,4]. Unless the question is: how do I change the filter on the fly to make it give the same output? Perhaps I am missing something?Lilybelle
But the input is not different, it's just sampled less often. [0,2,4] at intervals 2t is like [0,_,2,_,4] at intervals t, where the _ indicates that the sample is ignored.Sniffy
This answer is based on my good understanding of low-pass filters (an "exponential moving average" is really just a single-pole low-pass filter), but a hazy understanding of what you're looking for. I think the following is what you want:

First, you can simplify your equation a little bit (the result looks more complicated, but it's easier in code). I'm going to use "Y" for output and "X" for input (instead of S for output and Y for input, as you have done).

Y_n = α·X + (1-α)·Y_{n-1}  →  Y_n = Y_{n-1} + α·(X - Y_{n-1})

which codes to:

 Y += alpha * (X-Y);

Second, the value of α here is "equal" to 1 - e^(-Δt/τ), where Δt is the time between samples and τ is the time constant of the low-pass filter. I say "equal" in quotes because this works well when Δt/τ is small compared to 1, so that α = 1 - e^(-Δt/τ) ≈ Δt/τ. (But not too small: you'll run into quantization issues, and unless you resort to some exotic techniques you usually need an extra N bits of resolution in your state variable S, where N = -log2(α).) For larger values of Δt/τ the filtering effect starts to disappear, until you get to the point where α is close to 1 and you're basically just assigning the input to the output.

This should work properly with varying values of Δt (the variation of Δt is not very important as long as alpha is small, otherwise you will run into some rather weird Nyquist issues / aliasing / etc.), and if you are working on a processor where multiplication is cheaper than division, or fixed-point issues are important, precalculate ω = 1/τ, and consider trying to approximate the formula for α.
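A minimal sketch of the above in Python (the names are mine; it assumes samples arrive as (time, value) pairs and tau is the desired time constant):

```python
import math

def ema_irregular(samples, tau):
    """EMA over unevenly spaced (t, x) pairs using alpha = 1 - exp(-dt/tau)."""
    y = None
    t_prev = None
    out = []
    for t, x in samples:
        if y is None:
            y = x  # seed with the first sample to avoid a startup transient
        else:
            alpha = 1.0 - math.exp(-(t - t_prev) / tau)
            y += alpha * (x - y)  # Y += alpha * (X - Y), as above
        t_prev = t
        out.append(y)
    return out
```

For small Δt/τ this reduces to α ≈ Δt/τ, and a constant input passes through unchanged regardless of how the sample times jitter.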

If you really want to know how to derive the formula

α = 1 - e^(-Δt/τ)

then consider its differential equation source:

Y + τ dY/dt = X

which, when X is a unit step function, has the solution Y = 1 - e^(-t/τ). For small values of Δt, the derivative can be approximated by ΔY/Δt, yielding

Y + τ ΔY/Δt = X

ΔY/Δt = (X-Y)/τ

ΔY = (X-Y)(Δt/τ) = α(X-Y)

and the "extrapolation" of α = 1 - e^(-Δt/τ) comes from trying to match up the behavior with the unit step function case.
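As a quick numerical check of this derivation (my addition, not part of the answer): with α = 1 - e^(-Δt/τ), the discrete update reproduces the continuous unit-step response Y(t) = 1 - e^(-t/τ) exactly at every sample time, even with uneven steps:

```python
import math

tau = 0.5
y = 0.0
t = 0.0
# Deliberately uneven time steps; the input X is a unit step (always 1).
for dt in [0.1, 0.37, 0.02, 0.9, 0.25]:
    alpha = 1.0 - math.exp(-dt / tau)
    y += alpha * (1.0 - y)
    t += dt
    # matches the analytic unit-step response at time t
    assert abs(y - (1.0 - math.exp(-t / tau))) < 1e-12
```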

Psychro answered 22/6, 2009 at 15:21 Comment(4)
Yes, this exactly solves my problem, which was basically to introduce delta-t into the equation. I greatly appreciate the extra implementation hints too, as well as the concise alternative description, "single-pole low-pass filter."Sniffy
Would you please elaborate on the "trying to match up the behavior" part? I understand your continuous-time solution Y = 1 - exp(-t/τ) and its generalization to a scaled step function with magnitude x and initial condition y(0), but I'm not seeing how to put these ideas together to achieve your result.Delete
evaluate both continuous and discrete versions at t = (delta t) = the first discrete timestep, and compute alpha so that the continuous and discrete results have the same valuePsychro
If Δt goes to 0, then α goes to 0 as well -- so Y is unchanged. This seems to be correct only if Y represents a large number of events already. If for instance the only two events are simultaneous, shouldn't the "correct" answer be the average of the two?Tyner
Have a look here: http://www.eckner.com/research.html

Look at the second link: "Algorithms for Unevenly-Spaced Time Series: Moving Averages and Other Rolling Operators".

The document describes exactly the programming algorithms you need, I think.

Repatriate answered 8/8, 2013 at 8:22 Comment(1)
Links to external resources are encouraged, but please add context around the link so your fellow users will have some idea what it is and why it’s there. Always quote the most relevant part of an important link, in case the target site is unreachable or goes permanently offline.Lookthrough
This is not a complete answer, but may be the start of one. It's as far as I got with this in an hour or so of playing; I'm posting it as an example of what I'm looking for, and perhaps an inspiration to others working on the problem.

I start with S_0, which is the average resulting from the previous average S_{-1} and the sample Y_0 taken at t_0. (t_1 - t_0) is my sample interval, and α is set to whatever is appropriate for that sample interval and the period over which I wish to average.

I considered what happens if I miss the sample at t_1 and instead have to make do with the sample Y_2 taken at t_2. Well, we can start by expanding the equation to see what would have happened if we had had Y_1:

  • S_2 = αY_2 + (1-α)S_1, where S_1 = αY_1 + (1-α)S_0

Substituting:

  • S_2 = αY_2 + (1-α)(αY_1 + (1-α)S_0)
  • S_2 = αY_2 + (1-α)αY_1 + (1-α)(1-α)S_0
  • S_2 = αY_2 + (1-α)αY_1 + (1-α)^2·S_0

I notice that the series seems to extend infinitely this way, because we can substitute the Sn in the right-hand side indefinitely:

  • S_2 = αY_2 + (1-α)αY_1 + (1-α)^2·(αY_0 + (1-α)S_{-1})
  • S_2 = αY_2 + (1-α)αY_1 + (1-α)^2·αY_0 + (1-α)^3·S_{-1}
  • etc.

Ok, so it's not really a polynomial (silly me), but if we multiply the initial term by one, we then see a pattern:

  • S_2 = (1-α)^0·αY_2 + (1-α)^1·αY_1 + (1-α)^2·αY_0 + (1-α)^3·S_{-1}

Hm: it's an exponential series. Quelle surprise! Imagine that coming out of the equation for an exponential moving average!

So anyway, I have this x^0 + x^1 + x^2 + x^3 + ... thing going, and I'm sure I'm smelling e or a natural logarithm kicking around here, but I can't remember where I was heading next before I ran out of time.
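The algebra above can be spot-checked numerically: three iterative EMA updates give exactly the expanded weighted sum with geometric weights (the values below are arbitrary):

```python
alpha = 0.3
s_prev = 5.0                  # S_{-1}
samples = [3.0, 7.0, 2.0]     # Y_0, Y_1, Y_2

# Iterative EMA
s = s_prev
for y in samples:
    s = alpha * y + (1 - alpha) * s

# Closed-form expansion from the algebra above:
# S_2 = a*Y_2 + (1-a)*a*Y_1 + (1-a)^2*a*Y_0 + (1-a)^3*S_{-1}
expanded = (alpha * samples[2]
            + (1 - alpha) * alpha * samples[1]
            + (1 - alpha) ** 2 * alpha * samples[0]
            + (1 - alpha) ** 3 * s_prev)
```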

Sniffy answered 21/6, 2009 at 14:21 Comment(0)
Any answer to this question, or any proof of correctness of such an answer, highly depends on the data you're measuring.

If your samples were taken at t_0 = 0 ms, t_1 = 0.9 ms and t_2 = 2.1 ms, but your choice of α is based on 1 ms intervals, and you therefore want a locally adjusted α_n, then a proof of correctness of that choice would require knowing the sample values at t = 1 ms and t = 2 ms.

This leads to the question: can you interpolate your data reasonably to make sane guesses of what the in-between values might have been? Or can you even interpolate the average itself?

If neither of these is possible, then as far as I can see, the logical choice of an in-between value Y(t) is the most recently calculated average, i.e. Y(t) ≈ S_n where n is maximal such that t_n < t.

This choice has a simple consequence: Leave α alone, no matter what the time difference was.

If, on the other hand, it is possible to interpolate your values, then this will give you averagable constant-interval samples. Lastly, if it's even possible to interpolate the average itself, that would render the question meaningless.
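A sketch of the interpolate-then-average option in Python (the helper names are mine: linear interpolation onto a fixed grid, then a plain constant-α EMA):

```python
def interp_linear(ts, ys, t):
    """Linearly interpolate the samples (ts, ys) at time t
    (ts must be sorted; past the end, the last value is held)."""
    for i in range(1, len(ts)):
        if t <= ts[i]:
            frac = (t - ts[i - 1]) / (ts[i] - ts[i - 1])
            return ys[i - 1] + frac * (ys[i] - ys[i - 1])
    return ys[-1]

def ema_resampled(ts, ys, interval, alpha):
    """Resample onto a fixed grid via interpolation, then run a plain EMA."""
    s = ys[0]
    t = ts[0]
    while t <= ts[-1]:
        s = alpha * interp_linear(ts, ys, t) + (1 - alpha) * s
        t += interval  # note: float accumulation; fine for a sketch
    return s
```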

Frei answered 21/6, 2009 at 15:8 Comment(7)
I would think I can interpolate my data: given that I'm sampling it at discrete intervals, I'm already doing so with a standard EMA! Anyway, assume that I need a "proof" that shows it works as well as a standard EMA, which will also produce an incorrect result if the values are not changing fairly smoothly between sample periods.Sniffy
But that's what I'm saying: If you consider the EMA an interpolation of your values, you're done if you leave alpha as it is (because inserting the most recent average as Y doesn't change the average). If you say you need something that "works as well as a standard EMA" -- what's wrong with the original? Unless you have more information about the data you're measuring, any local adjustments to alpha will be at best arbitrary.Frei
So you're saying that changing from, say, 1 to 2 over 1 second or 10 seconds should have the same effect on a 100 second moving average?Sniffy
If you fill in the missing values with the value of the current moving average, that's exactly what happens, because S_new = alpha * Y + (1-alpha) * S_old = alpha * S_old + (1-alpha) * S_old = S_old .Frei
Right, which is why I believe you don't want to do it that way. Intuitively, a moving average does not consider the signal to have been constantly the previous average from t(n) to t(n+1), with a sudden change to the new sample at t(n+1), or it would have to change the average much less than it does, because the signal was at a different level from the previous average for only an infinitesimal length of time.Sniffy
As an example, consider S0 = 1, Y0 = 2, alpha = 0.5. The new average after sample Y0, S1, is 1.5. That is a reasonable average if the signal moved steadily from 1 to 2 over the time period; it is not reasonable if the signal stayed at 1 until just before the time period finished, and then suddenly jumped to 2.Sniffy
What you're describing is linear interpolation of the measured values. If you consider that appropriate, why don't you calculate the EMA at constant intervals, taking Y(t) = Y_n + (Y_{n+1} - Y_n) * (t - t_n) / (t_{n+1} - t_n), where n and n+1 are the closest measurements before and after the time t = d*i, where i is the interval and d is a natural number?Frei
Let's say we would like to make an exponential decaying average on a continuous function. However we don't have all the values of that function, only a few samples. This formula would make a weighted average of the samples that we do have with the weights they would have in the continuous average.

Multiplier_n = Alpha^(Time_n - Time_{n-1})

Sum_n = Val_n + Sum_{n-1} · Multiplier_n

Count_n = 1 + Count_{n-1} · Multiplier_n

Avg_n = Sum_n / Count_n
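A sketch of this scheme in Python (my names; here `alpha` is the per-unit-time decay factor, 0 < alpha < 1, so a sample's weight decays as alpha raised to its age):

```python
def decayed_average(samples, alpha):
    """Weighted average of (time, value) pairs where a sample taken
    delta_t ago carries weight alpha**delta_t (0 < alpha < 1)."""
    s, count, t_prev = 0.0, 0.0, None
    for t, val in samples:
        mult = 1.0 if t_prev is None else alpha ** (t - t_prev)
        s = val + s * mult          # Sum_n   = Val_n + Sum_{n-1} * Multiplier_n
        count = 1.0 + count * mult  # Count_n = 1 + Count_{n-1} * Multiplier_n
        t_prev = t
    return s / count                # Avg_n = Sum_n / Count_n
```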

Fearful answered 21/6, 2009 at 15:53 Comment(3)
Check stackoverflow.com/editing-help, #32157Nunn
You can also have a look at the source code of one of the posts: stackoverflow.com/revisions/…Bratton
I use HTML sup and sub tags to do superscripts and subscripts, and use a * a the beginning of an equation, with a blank line above and below.Sniffy
By using a slightly different α, equal to (1 - α_original) where α_original is the one from the question, the basic formula for adding a new value Y to an existing average S_0 looks like this:

S(Y, S_0) =

(1-α)·Y + α·S_0 =

Y - α·Y + α·S_0 =

Y + α·(S_0 - Y)

If we now add the length of the time interval t, and assume that only α depends on t, the formula looks like this:

S(Y, t, S_0) = Y + α_t·(S_0 - Y)

Now assume that t = t1 + t2. If the average is created by adding the value Y twice, for the time intervals t1 and t2, the resulting average looks like this:

S(Y, t2, S(Y, t1, S_0)) =

Y + α_t2·(S(Y, t1, S_0) - Y) =

Y + α_t2·((Y + α_t1·(S_0 - Y)) - Y) =

Y + α_t2·α_t1·(S_0 - Y)

If this average is to be the same as if the whole interval t had been added at once, it follows that α_t = α_t1·α_t2. A definition of α that fulfills this requirement is:

α_x := A^x     (for some constant A)

Because:

α_t = A^t = A^(t1 + t2) = A^t1 · A^t2 = α_t1·α_t2

This results in the following averaging function:

S(Y, t, S_0) = Y + A^t·(S_0 - Y)

I haven't really tested this, but if the assumptions I made fit your scenario this looks like an averaging function that can handle variations in the sampling intervals quite well.
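The requirement α_t = α_t1·α_t2 with α_x = A^x can be checked numerically (A and the values below are arbitrary; note the composition assumes the same Y is fed in for both sub-intervals, as in the derivation above):

```python
A = 0.8  # assumed per-unit-time decay constant, 0 < A < 1

def s_update(y, t, s0):
    """S(Y, t, S_0) = Y + A**t * (S_0 - Y): one update covering interval t."""
    return y + A ** t * (s0 - y)

s0, y = 3.0, 10.0
one_shot = s_update(y, 1.5 + 0.7, s0)               # whole interval at once
two_steps = s_update(y, 0.7, s_update(y, 1.5, s0))  # same Y in two pieces
```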

Bratton answered 21/6, 2009 at 22:8 Comment(1)
This looks like more or less the solution I had in mind. Unfortunately, I can't quite follow the proof just now, but I'll sit down and look at this more closely in the next day or two.Sniffy
I would leave the alpha value alone, and fill in the missing data.

Since you don't know what happens during the time when you can't sample, you can fill those samples with zeros, or hold the previous value stable, and use those values for the EMA. Or do some backward interpolation once you have a new sample: fill in the missing values and recompute the EMA.

What I am trying to get at is that you have an input x[n] which has holes. There is no way to get around the fact that you are missing data. So you can use a zero-order hold, set the missing samples to zero, or use some kind of interpolation between x[n] and x[n+M], where M is the number of missing samples and n is the start of the gap -- possibly even using values from before n.
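A sketch of the zero-order-hold option (hold the last known sample across the gap, then run the unchanged EMA over the filled series; the names and the dict representation of the sparse input are mine):

```python
def fill_zoh(samples, n_total):
    """Fill a sparse series {index: value} out to n_total points by
    holding the most recent sample (zero-order hold)."""
    filled, last = [], None
    for n in range(n_total):
        if n in samples:
            last = samples[n]
        filled.append(last)
    return filled

def ema(xs, alpha):
    """Plain constant-alpha EMA over a fully populated series."""
    s = xs[0]
    for x in xs[1:]:
        s = alpha * x + (1 - alpha) * s
    return s

# Samples exist at n = 0, 3, 4; the gap at n = 1, 2 is filled by holding 1.0.
filled = fill_zoh({0: 1.0, 3: 2.0, 4: 2.0}, 5)
```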

Lilybelle answered 21/6, 2009 at 13:35 Comment(6)
From spending an hour or so mucking about a bit with the math for this, I think that simply varying the alpha will actually give me the proper interpolation between the two points that you talk about, but in a much simpler way. Further, I think that varying the alpha will also properly deal with samples taken between the standard sampling intervals. In other words, I'm looking for what you described, but trying to use math to figure out the simple way to do it.Sniffy
I don't think there is such a beast as "proper interpolation". You simply don't know what happened in the time you are not sampling. Good and bad interpolation implies some knowledge of what you missed, since you need to measure against that to judge whether an interpolation is good or bad. Though that said, you can place constraints, i.e. with maximum acceleration, speed, etc. I think if you do know how to model the missing data, then you would just model the missing data, then apply the EMA algorithm with no change, rather than changing alpha. Just my 2c :)Lilybelle
This is exactly what I was getting at in my edit to the question 15 minutes ago: "You simply don't know what happened in the time you are not sampling," but that's true even if you sample at every designated interval. Thus my Nyquist contemplation: so long as you know the wave form doesn't change directions more than every couple of samples, the actual sample interval shouldn't matter, and should be able to vary. The EMA equation seems to me exactly to calculate as if the waveform changed linearly from the last sample value to the current one.Sniffy
I don't think that is quite true. Nyquist's theorem requires a minimum of 2 samples per period to be able to uniquely identify the signal. If you don't do that, you get aliasing. It would be the same as sampling at f_s1 for a time, then f_s2, then back to f_s1, and you get aliasing in the data when you sample with f_s2 if f_s2 is below the Nyquist limit. I also must confess I do not understand what you mean by "waveform changes linearly from last sample to current one". Could you please explain? Cheers, Steve.Lilybelle
Right. Assume my nominal sample rate is, say, 250 samples per period, but it might go down as low as a dozen samples per period. That still leaves me with a plenty high sampling frequency, I reckon.Sniffy
I've updated the question to discuss the "linear" behaviour of an EMA.Sniffy
This is similar to an open problem on my todo list. I have one scheme worked out to some extent but do not have mathematical working to back this suggestion yet.

Update & summary: I would like to keep the smoothing factor (alpha) independent of the compensation factor (which I refer to as beta here). Jason's excellent answer, already accepted here, works great for me.

First step.

  • If you can also measure the time since the last sample was taken (in rounded multiples of your constant sampling time -- so 7.8 ms since the last sample would be 8 units), that could be used to apply the smoothing multiple times: apply the formula 8 times in this case. You have effectively biased the smoothing more towards the current value.

Second step.

  • To get a better smoothing, we need to tweak the alpha while applying the formula 8 times in the previous case.

What will this smoothing approximation miss?

  • It has already missed 7 samples in the example above
  • This was approximated in step 1 with a flattened re-application of the current value an additional 7 times
  • If we define an approximation factor beta that will be applied along with alpha (as alpha*beta instead of just alpha), we will be assuming that the 7 missed samples were changing smoothly between the previous and current sample values.
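One way to collapse the "apply it 8 times" step into a single update (my addition, not part of this answer): k applications of the standard formula with the same sample are equivalent to one application with α' = 1 - (1-α)^k:

```python
alpha, k = 0.1, 8
s0, y = 5.0, 9.0

# Apply the standard EMA update k times with the same sample y...
s_repeat = s0
for _ in range(k):
    s_repeat = alpha * y + (1 - alpha) * s_repeat

# ...which collapses to a single update with alpha' = 1 - (1 - alpha)**k
alpha_k = 1 - (1 - alpha) ** k
s_once = alpha_k * y + (1 - alpha_k) * s0
```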
Nunn answered 21/6, 2009 at 13:35 Comment(5)
I did think about this, but a bit of mucking about with the math got me to the point where I believe that, rather than applying the formula eight times with the sample value, I can do a calculation of a new alpha that will allow me to apply the formula once, and give me the same result. Further, this would automatically deal with the issue of samples offset from exact sample times.Sniffy
The single application is fine. What I am not sure about yet is how good is the approximation of the 7 missing values. If the continuous movement makes the value jitter a lot across the 8 milliseconds, the approximations may be quite off the reality. But, then if you are sampling at 1ms (highest resolution excluding the delayed samples) you have already figured the jitter within 1ms is not relevant. Does this reasoning work for you (I am still trying to convince myself).Nunn
Oh, wait, are you saying that you can compute a new alpha constant that can be used always regardless of the delay in sampling? I feel that is unlikely.Nunn
I'm saying that one can calculate a new alpha for any interval based on the reference alpha and the difference between the actual interval and the reference interval.Sniffy
Right. That is the factor beta from my description. A beta factor would be computed based on the difference interval and the current and previous samples. The new alpha will be (alpha*beta) but it will be used only for that sample. While you seem to be 'moving' the alpha in the formula, I tend towards constant alpha (smoothing factor) and an independently computed beta (a tuning factor) that compensates for samples missed just now.Nunn
