First question:
I work with pandas DataFrames and frequently run the same routines for data pre-processing and similar tasks. I'd like to write some of these routines as methods of a class called ExtendedDataframe that extends pandas.DataFrame, but I don't know how to go about it. So far, I'm not defining an __init__ in my new class, so that it is inherited from pandas.DataFrame:
    import pandas

    class ExtendedDataframe(pandas.DataFrame):
        def some_method(self):
            ...  # placeholder for the routine's body
This apparently lets me create an instance of ExtendedDataframe through the inherited constructor. But I usually load data through something like pandas.read_csv, which returns a plain DataFrame. How can I load such CSV data and at some point turn it into an ExtendedDataframe, so that I can use my own methods on top of those provided by a standard DataFrame? It's fine if the loading phase returns a standard DataFrame that I then convert into an ExtendedDataframe.
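A minimal sketch of the load-then-convert workflow described above, not a definitive implementation. It assumes that the inherited constructor accepts an existing DataFrame and that overriding the _constructor property (the hook the pandas documentation describes for subclassing) keeps derived frames typed as the subclass. The drop_empty_rows method, the column names, and the in-memory CSV text are hypothetical and only there for illustration.

```python
import io

import pandas


class ExtendedDataframe(pandas.DataFrame):
    @property
    def _constructor(self):
        # Subclassing hook: pandas calls this when an operation such as
        # column selection has to build a new frame from this one.
        return ExtendedDataframe

    def drop_empty_rows(self):
        # Hypothetical routine: drop rows in which every value is missing.
        return self.dropna(how="all")


# Stand-in for a CSV file on disk (made-up data); the middle row is empty.
csv_text = "a,b\n1,2\n,\n3,4\n"

plain = pandas.read_csv(io.StringIO(csv_text))  # plain DataFrame
extended = ExtendedDataframe(plain)             # wrap it in the subclass
cleaned = extended.drop_empty_rows()            # custom method is available
```

Here the loading step still returns a standard DataFrame, and the conversion is a single constructor call, so the custom methods can be attached at any later point.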
Second question:
Not all of the pandas functionality I use consists of DataFrame methods. Some of it comes as functions, such as pandas.merge, that take DataFrames as arguments. How can I extend the use of such functions to instances of my ExtendedDataframe class? In other words, if df1 and df2 are two instances of ExtendedDataframe, how do I make
    pandas.merge(df1, df2, ...)
work just like it would with standard instances of DataFrame?
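A minimal sketch of what this could look like, under stated assumptions. Instances of a subclass are still DataFrames, so module-level functions should accept them unchanged. Note that pandas.merge takes the left and right frames as two separate arguments rather than a list. Whether the result comes back as the subclass depends on the _constructor override and on the pandas version in use, so the sketch re-wraps the result explicitly. The column names and values are made up.

```python
import pandas


class ExtendedDataframe(pandas.DataFrame):
    @property
    def _constructor(self):
        # Subclassing hook used by pandas when it builds new frames.
        return ExtendedDataframe


df1 = ExtendedDataframe({"key": [1, 2, 3], "left_value": ["a", "b", "c"]})
df2 = ExtendedDataframe({"key": [2, 3, 4], "right_value": ["x", "y", "z"]})

# The two frames are passed as separate positional arguments.
merged = pandas.merge(df1, df2, on="key", how="inner")

# Explicit re-wrap: a fallback in case the result's type matters and the
# merge returned a plain DataFrame.
merged = ExtendedDataframe(merged)
```

The same pattern (call the pandas function, then wrap the result) would apply to other functions that take DataFrames as arguments.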