pandas resample documentation

So I completely understand how to use resample, but the documentation does not do a good job explaining the options.

Most options of the resample function are pretty straightforward, except for these two:

  • rule : the offset string or object representing target conversion
  • how : string, method for down- or re-sampling, default to ‘mean’

From the examples I could find online, I can see that for rule you can use 'D' for day, 'xMin' for x minutes, and 'xL' for x milliseconds, but that is all I could find.

For how I have seen the following: 'first', np.max, 'last', 'mean', and 'n1n2n3n4...nx', where nx is the first letter of each column index.

So is there somewhere in the documentation that I am missing that lists every option for the rule and how inputs of pandas.resample? If yes, where? I could not find it. If no, what are all the options for them?

Eady answered 8/6, 2013 at 16:9 Comment(3)
For Google's wanderers, for resampling using how='last' and how='first': don't forget to add closed='left', label='left'. linkChemurgy
@NasserAl-Wohaibi I am fairly confident your comment above is an indication that these options can help fully answer the following question. Have you encountered this problem before? #26247801Anemograph
how='last' is deprecated now in favor of resample(...).last()Levina
S
398
B         business day frequency
C         custom business day frequency (experimental)
D         calendar day frequency
W         weekly frequency
M         month end frequency
SM        semi-month end frequency (15th and end of month)
BM        business month end frequency
CBM       custom business month end frequency
MS        month start frequency
SMS       semi-month start frequency (1st and 15th)
BMS       business month start frequency
CBMS      custom business month start frequency
Q         quarter end frequency
BQ        business quarter end frequency
QS        quarter start frequency
BQS       business quarter start frequency
A         year end frequency
BA, BY    business year end frequency
AS, YS    year start frequency
BAS, BYS  business year start frequency
BH        business hour frequency
H         hourly frequency
T, min    minutely frequency
S         secondly frequency
L, ms     milliseconds
U, us     microseconds
N         nanoseconds

See the timeseries documentation. It includes a list of offsets (and 'anchored' offsets), and a section about resampling.
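Since rule is described as "the offset string or object", here is a minimal sketch showing that an alias string with a multiplier and the matching offset object select the same bins. The series s and its values are made up for illustration; "5min" and pd.offsets.Minute are used because they are accepted by both older and newer pandas versions.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

# Hypothetical data: one value per minute for ten minutes.
idx = pd.date_range("2013-01-01", periods=10, freq="min")
s = pd.Series(range(10), index=idx)

# rule given as an alias string with a multiplier ...
by_string = s.resample("5min").sum()

# ... and as the equivalent offset object.
by_object = s.resample(pd.offsets.Minute(5)).sum()

# The string is parsed into the same offset object.
same_offset = to_offset("5min") == pd.offsets.Minute(5)

print(by_string.tolist())            # [10, 35]
print(by_string.equals(by_object))   # True
print(same_offset)                   # True
```

Any alias from the list above can be prefixed with an integer in the same way, e.g. "2D" or "3W".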

Note that there is no list of all the possible how options, because how can be any NumPy array function, and any function that is available via groupby dispatching can be passed to how by name.
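As the comments on the question point out, the how argument was later deprecated in favor of calling a method on the result of resample(...). A rough sketch of what "by name" versus "any function" looks like in that newer style follows. The data and the helper value_spread are hypothetical; the helper stands in for any function that reduces the values of one bin to a single number.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one value per minute for six minutes.
idx = pd.date_range("2013-01-01 00:00", periods=6, freq="min")
s = pd.Series([1, 2, 3, 4, 5, 6], index=idx)

# Two bins: 00:00-00:02 and 00:03-00:05.
r = s.resample("3min")

# Built-in aggregations are available as methods ...
mean_by_method = r.mean()

# ... or by name, which is what how='first' used to do.
first_by_name = r.agg("first")


def value_spread(values):
    """Hypothetical reducer: max minus min of one bin, computed with NumPy."""
    arr = values.to_numpy()
    return np.max(arr) - np.min(arr)


# Any callable that reduces a bin to a scalar can be passed as well.
spread_by_callable = r.agg(value_spread)

print(mean_by_method.tolist())      # [2.0, 5.0]
print(first_by_name.tolist())       # [1, 4]
print(spread_by_callable.tolist())  # [2, 2]
```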

Suribachi answered 8/6, 2013 at 16:20 Comment(4)
" ... because it can be any NumPy array function and..." - yeah, I read that in the docs, but is there any documentation anywhere explaining what exactly this function is supposed to do and what it's got to do with the resampling...? I feel pretty lost here.Coulomb
This should be linked to in all relevant documentation areas, like resample. Here is the link to the abbreviations: pandas.pydata.org/pandas-docs/stable/…Mertz
Added a pull request to improve the docs github.com/pandas-dev/pandas/pull/30252Malm
@Mertz +∞ for you. I also sometimes wonder if anyone knows that ISO8601 has a way to represent durations.Aphrodite
F
69

There's more to it than this, but you're probably looking for this list:

B   business day frequency
C   custom business day frequency (experimental)
D   calendar day frequency
W   weekly frequency
M   month end frequency
BM  business month end frequency
MS  month start frequency
BMS business month start frequency
Q   quarter end frequency
BQ  business quarter end frequency
QS  quarter start frequency
BQS business quarter start frequency
A   year end frequency
BA  business year end frequency
AS  year start frequency
BAS business year start frequency
H   hourly frequency
T   minutely frequency
S   secondly frequency
L   milliseconds
U   microseconds

Source: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases

Fetter answered 6/11, 2013 at 19:43 Comment(3)
Why isn't there the 'Min' (like the '5Min' used in the documentation)?Deese
@zyuang, only short formats are displayed here : "ms" is also absent from the list for instanceWatson
It looks like they updated the documentation. It now shows "T, min: minutely frequency" and also "L, ms: milliseconds".Deboer
D
0

If you are not sure what you will get, use this function:

from pandas.tseries.frequencies import to_offset
print(to_offset("7D")) # <7 * Days>
print(to_offset("W")) # <Week: weekday=6>
print(to_offset("M")) # <MonthEnd>
print(to_offset("m")) # <MonthEnd>
print(to_offset("min")) # <Minute>

For example, "M" and "m" are parsed as the same offset here (unlike the usual convention of M = month and m = minute).

Also be aware that the following two calls are not the same and give you different results:

s.resample("7d").mean()
s.resample("W").mean() # is not the same!

You can see the reason in the documentation: "Warning: The default values for label and closed is ‘left’ for all frequency offsets except for ‘M’, ‘A’, ‘Q’, ‘BM’, ‘BA’, ‘BQ’, and ‘W’ which all have a default of ‘right’."
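To make the difference concrete, here is a small sketch on made-up data: 14 daily values starting on Tuesday 2013-01-01. It assumes the default settings, where "7D" bins start at the first day and are closed and labeled on the left, while "W" (weeks ending on Sunday) is closed and labeled on the right.

```python
import pandas as pd

# Hypothetical data: 14 daily values, 2013-01-01 (a Tuesday) to 2013-01-14.
idx = pd.date_range("2013-01-01", periods=14, freq="D")
s = pd.Series(range(14), index=idx)

# "7D": fixed 7-day bins starting at the first day, labeled by the left edge.
seven_day = s.resample("7D").sum()

# "W": calendar weeks ending on Sunday, labeled by the right edge.
weekly = s.resample("W").sum()

print(seven_day.tolist())  # [21, 70]
print(weekly.tolist())     # [15, 63, 13]
print(seven_day.index[0])  # 2013-01-01 00:00:00
print(weekly.index[0])     # 2013-01-06 00:00:00
```

The same 14 values end up in two bins with "7D" but in three bins with "W", and the bin labels differ as well.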

Deboer answered 13/10, 2022 at 15:50 Comment(0)
