How to choose the correct arguments of statsmodels STL function?

    import matplotlib.pyplot as plt
    from statsmodels.tsa.seasonal import STL

    # seasonal=13 based on example in the statsmodels user guide
    decomp = STL(synth.value, period=160, seasonal=13).fit()

    fig, ax = plt.subplots(3, 1, figsize=(12, 6))
    decomp.trend.plot(title='Trend', ax=ax[0])
    decomp.seasonal.plot(title='Seasonal', ax=ax[1])
    decomp.resid.plot(title='Residual', ax=ax[2])
    plt.tight_layout()
    plt.show()

I had the same question. After tracing some of their codebase, I have found the following. This may help:

To work out the period for you, statsmodels needs a pandas Series or DataFrame with a DatetimeIndex.
This DatetimeIndex can have a frequency. You can either resample your data with pandas or explicitly set a frequency on your index. To check, look at df.index and its freq attribute.
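A minimal sketch of checking and setting the frequency, assuming a small daily series that I made up for illustration (the values and dates are not from the question):

```python
import pandas as pd

# An index built from a plain list of dates has no frequency attached.
idx = pd.DatetimeIndex(["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04"])
s = pd.Series([1.0, 2.0, 3.0, 4.0], index=idx)
print(s.index.freq)

# asfreq conforms the series to a fixed frequency and sets index.freq.
s_daily = s.asfreq("D")
print(s_daily.index.freq)
```

asfreq inserts NaN rows for any missing timestamps, so on gappy data you may need to fill or resample before decomposing.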

This leads to two situations:

Your index has frequency set

If you have set a frequency on your index, statsmodels will pick up this frequency and use it to determine a period automatically. Internally it uses the freq_to_period function, defined in the statsmodels.tsa.tsatools submodule.

To summarise what this does: the period is the expected periodicity of your seasonal component, expressed per year.

In other words: "how often your seasonal cycle will repeat itself in a year". For reference, read the note on the freq_to_period method definition: Annual maps to 1, quarterly maps to 4, monthly to 12, weekly to 52.
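You can call freq_to_period yourself to see the mapping. This is a small sketch; the aliases "QS", "MS" and "W" are my choice, and which alias strings are accepted depends on your pandas version:

```python
from statsmodels.tsa.tsatools import freq_to_period

# Map a few pandas frequency aliases to the period statsmodels would use.
# "QS" = quarter start, "MS" = month start, "W" = weekly.
periods = {alias: freq_to_period(alias) for alias in ["QS", "MS", "W"]}
print(periods)
```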

This is done both in seasonal_decompose and in STL.

Your index has no frequency set

It gets a bit more complicated if your index does not have a freq attribute set. seasonal_decompose checks whether your index has an inferred_freq attribute, and STL takes the same approach.

This inferred_freq is computed by the pandas function infer_freq, which is documented as "Infer the most likely frequency given the input index." Pandas exposes index.inferred_freq on any DatetimeIndex by default, but it can only infer a frequency if the index has at least 3 elements.
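A quick way to see what pandas infers, sketched with a few hand-written dates (illustrative only):

```python
import pandas as pd

# Evenly spaced daily timestamps, built without an explicit freq.
regular = pd.DatetimeIndex(["2021-03-01", "2021-03-02", "2021-03-03", "2021-03-04"])
print(regular.freq)           # no frequency was set explicitly
print(regular.inferred_freq)  # pandas infers one from the spacing

# Fewer than 3 timestamps: nothing can be inferred.
too_short = pd.DatetimeIndex(["2021-03-01", "2021-03-02"])
print(too_short.inferred_freq)

# Irregular spacing: nothing can be inferred either.
irregular = pd.DatetimeIndex(["2021-03-01", "2021-03-02", "2021-03-05", "2021-03-11"])
print(irregular.inferred_freq)
```

If inferred_freq comes back as None, statsmodels has nothing to convert, so you need to pass period yourself.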

TLDR: The period parameter should be set to the number of times you expect the seasonal cycle to recur within a year. You can set it explicitly; otherwise statsmodels will infer it from the frequency of your DatetimeIndex. If the freq attribute is None, it depends on pandas' index.inferred_freq attribute to determine the frequency, which it then converts to a pre-set periodicity.
