I would like to construct an extension of pandas.DataFrame (let's call it SPDF) which could do things above and beyond what a simple DataFrame can:
import numpy as np
import pandas as pd


def to_spdf(func):
    """Transform generic output of `func` to SPDF.

    Returns
    -------
    wrapper : callable
    """
    def wrapper(*args, **kwargs):
        res = func(*args, **kwargs)
        return SPDF(res)

    return wrapper


class SPDF:
    """Special-purpose dataframe.

    Parameters
    ----------
    df : pandas.DataFrame
    """

    def __init__(self, df):
        self.df = df

    def __repr__(self):
        return repr(self.df)

    def __getattr__(self, item):
        # Forward unknown attributes to the wrapped object;
        # callables are wrapped so that their output becomes an SPDF.
        res = getattr(self.df, item)

        if callable(res):
            res = to_spdf(res)

        return res


if __name__ == "__main__":
    # construct a generic SPDF
    df = pd.DataFrame(np.eye(4))
    an_spdf = SPDF(df)

    # call .diff() to obtain another SPDF
    print(an_spdf.diff())
Right now, methods of DataFrame that return another DataFrame, such as .diff() in the MWE above, return another SPDF, which is great. However, I would also like to trick chained methods such as .resample('M').last() or .rolling(2).mean() into producing an SPDF at the very end. I have failed so far because .rolling() and the like are callable, so my wrapper to_spdf tries to construct an SPDF from their output without 'waiting' for .mean() or whatever the last part of the expression is. Any ideas how to tackle this problem?

Thanks.
... SPDF. What will it give you that a regular DataFrame is incapable of? – Dukey
... SPDF "in the very end" and getting the expected result (i.e. isinstance(an_spdf.rolling(2).mean(), SPDF) returns True) – Bullate