Pandas: Elementwise multiplication of two dataframes
S

5

30

I know how to do element by element multiplication between two Pandas dataframes. However, things get more complicated when the dimensions of the two dataframes are not compatible. For instance below df * df2 is straightforward, but df * df3 is a problem:

import pandas as pd

df = pd.DataFrame({'col1': [1.0] * 5,
                   'col2': [2.0] * 5,
                   'col3': [3.0] * 5}, index=range(1, 6))
df2 = pd.DataFrame({'col1': [10.0] * 5,
                    'col2': [100.0] * 5,
                    'col3': [1000.0] * 5}, index=range(1, 6))
df3 = pd.DataFrame({'col1': [0.1] * 5}, index=range(1, 6))

df.mul(df2, 1) # element by element multiplication no problems

df.mul(df3, 1) # df(row*col) is not equal to df3(row*col)
   col1  col2  col3
1   0.1   NaN   NaN
2   0.1   NaN   NaN
3   0.1   NaN   NaN
4   0.1   NaN   NaN
5   0.1   NaN   NaN

In the above situation, how can I multiply every column of df with df3.col1?

My attempt: I tried to replicate df3.col1 len(df.columns.values) times to get a dataframe that is of the same dimension as df:

df3 = pd.DataFrame([df3.col1 for n in range(len(df.columns.values)) ])
df3
        1    2    3    4    5
col1  0.1  0.1  0.1  0.1  0.1
col1  0.1  0.1  0.1  0.1  0.1
col1  0.1  0.1  0.1  0.1  0.1

But this creates a dataframe of dimensions 3 * 5, whereas I am after 5 * 3. I know I can take the transpose with df3.T to get what I need, but I suspect this is not the fastest way.

Sodomite answered 9/1, 2014 at 14:25 Comment(2)
Does this answer your question? "How to multiply multiple columns by a column in Pandas" - Amata
The answer in there ^ is much better. - Amata
K
40
In [161]: pd.DataFrame(df.values*df2.values, columns=df.columns, index=df.index)
Out[161]: 
   col1  col2  col3
1    10   200  3000
2    10   200  3000
3    10   200  3000
4    10   200  3000
5    10   200  3000
Kentkenta answered 9/1, 2014 at 14:36 Comment(1)
Thank you unutbu. pd.DataFrame(df.values*df3.values, columns=df.columns, index=df.index) preserves the index as well, right? - Sodomite
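For the df3 case raised in the comment, the same pattern can be sketched as below. This is only an illustration that rebuilds df and df3 as in the question; `result` is a name introduced here. NumPy broadcasts the (5, 1) array from df3 across the three columns of df, and the index and column labels survive because they are passed back in explicitly.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1.0] * 5,
                   'col2': [2.0] * 5,
                   'col3': [3.0] * 5}, index=range(1, 6))
df3 = pd.DataFrame({'col1': [0.1] * 5}, index=range(1, 6))

# df.values has shape (5, 3); df3.values has shape (5, 1).
# NumPy broadcasts the single column across all three columns.
result = pd.DataFrame(df.values * df3.values,
                      columns=df.columns, index=df.index)
print(result)
```

Note that working on .values matches rows by position, not by index label, so both frames need to be in the same row order.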
C
22

A simpler way to do this is to multiply the dataframe whose column names you want to keep by the values (i.e. the numpy array) of the other, like so:

In [63]: df * df2.values
Out[63]: 
   col1  col2  col3
1    10   200  3000
2    10   200  3000
3    10   200  3000
4    10   200  3000
5    10   200  3000

This way you do not have to write all that new dataframe boilerplate.
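Applied to the df3 case from the question, this might look like the sketch below. It assumes that pandas broadcasts a single-column 2-D array across the columns of df, which is my understanding of current behaviour; `scaled` is a name introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1.0] * 5,
                   'col2': [2.0] * 5,
                   'col3': [3.0] * 5}, index=range(1, 6))
df3 = pd.DataFrame({'col1': [0.1] * 5}, index=range(1, 6))

# df3.values is a (5, 1) array; each column of df is multiplied by it.
# Labels come from df, rows are matched by position.
scaled = df * df3.values
print(scaled)
```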

Certiorari answered 6/5, 2016 at 6:5 Comment(0)
N
5

To make use of pandas' broadcasting, you can use DataFrame.multiply with axis=0, which matches the Series against the row index instead of the column labels:

df.multiply(df3['col1'], axis=0)
Nationality answered 28/3, 2018 at 21:25 Comment(0)
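A small sketch of what axis=0 does here, with df and df3 rebuilt as in the question. The reversed `factors` Series is a hypothetical addition, included only to show that the match is made by index label and not by position.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1.0] * 5,
                   'col2': [2.0] * 5,
                   'col3': [3.0] * 5}, index=range(1, 6))
df3 = pd.DataFrame({'col1': [0.1] * 5}, index=range(1, 6))

# axis=0 aligns the Series index with df's row index,
# so every column of df is multiplied by df3['col1'].
by_rows = df.multiply(df3['col1'], axis=0)

# Hypothetical Series with its index in reverse order:
# the row labelled 1 is still multiplied by the value labelled 1 (5.0).
factors = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=[5, 4, 3, 2, 1])
aligned = df.multiply(factors, axis=0)
print(by_rows)
print(aligned)
```

With the default axis the Series index would be matched against the column labels col1, col2 and col3, which is why label mismatches of that kind end in NaN.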
D
3

This works for me:

mul = df.mul(df3.c, axis=0)

Or, when you want to subtract (divide) instead:

sub = df.sub(df3.c, axis=0)
div = df.div(df3.c, axis=0)

Works also with a NaN in df (e.g. if you apply this to the df first: df.loc[1, 'col2'] = np.nan)

Dorn answered 22/9, 2017 at 9:16 Comment(1)
This doesn't work. If you meant df.mul(df3.col1, axis=0), please rewrite it. - Icarus
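As the comment points out, df3 in the question has no column named c. A sketch of what the answer presumably intended, using col1, is given below. The NaN is set with df.loc, which is a substitution made here because chained assignment through df.iloc[0] may not modify df in recent pandas versions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1.0] * 5,
                   'col2': [2.0] * 5,
                   'col3': [3.0] * 5}, index=range(1, 6))
df3 = pd.DataFrame({'col1': [0.1] * 5}, index=range(1, 6))

# Put a single NaN into df to see how it propagates.
df.loc[1, 'col2'] = np.nan

# axis=0 applies df3['col1'] row by row to every column of df.
mul = df.mul(df3['col1'], axis=0)
sub = df.sub(df3['col1'], axis=0)
div = df.div(df3['col1'], axis=0)
print(mul)
```

Only the cell that was NaN in df stays NaN in the results; the other cells are computed as usual.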
R
1

Another way is to create a list of single-column dataframes and join them:

cols = [pd.DataFrame(df[col] * df3.col1, columns=[col]) for col in df]
mul = cols[0].join(cols[1:])
Rohde answered 9/1, 2014 at 14:49 Comment(0)
