Cannot set a DataFrame with multiple columns to the single column total_servings
G

2

9

I am a beginner getting familiar with pandas. It throws an error when I try to create a new column this way:

drinks['total_servings'] = drinks.loc[: ,'beer_servings':'wine_servings'].apply(calculate,axis=1)

Below is my code, and I get the following error for line number 9:

"Cannot set a DataFrame with multiple columns to the single column total_servings"

Any help or suggestion would be appreciated :)

import pandas as pd
drinks = pd.read_csv('drinks.csv')

def calculate(drinks):
    return drinks['beer_servings']+drinks['spirit_servings']+drinks['wine_servings']
print(drinks)
drinks['total_servings'] = drinks.loc[:, 'beer_servings':'wine_servings'].apply(calculate,axis=1)

drinks['beer_sales'] = drinks['beer_servings'].apply(lambda x: x*2)
drinks['spirit_sales'] = drinks['spirit_servings'].apply(lambda x: x*4)
drinks['wine_sales'] = drinks['wine_servings'].apply(lambda x: x*6)
drinks
Goulash answered 19/2, 2023 at 14:14 Comment(0)
M
3

In your code, when the function calculate is called through apply with axis=1, each row of the DataFrame is passed to it as an argument. Here, calculate is returning a Series rather than a single value for each row, so apply produces a DataFrame with multiple columns, and you are trying to assign that to a single column, which is not possible. You can try updating your code to this:

def calculate(each_row):
    return each_row['beer_servings'] + each_row['spirit_servings'] + each_row['wine_servings']

drinks['total_servings'] = drinks.apply(calculate, axis=1)
drinks['beer_sales'] = drinks['beer_servings'].apply(lambda x: x*2)
drinks['spirit_sales'] = drinks['spirit_servings'].apply(lambda x: x*4)
drinks['wine_sales'] = drinks['wine_servings'].apply(lambda x: x*6)

print(drinks)   
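
As a possible alternative to the row-wise apply above, the same columns can be computed with vectorized operations, which avoids calling a Python function once per row. This is only a minimal sketch: the small DataFrame is a made-up stand-in for drinks.csv (only the column names come from the question), and serving_cols is a helper name introduced here for illustration.

```python
import pandas as pd

# Hypothetical stand-in for drinks.csv; the values are invented for illustration.
drinks = pd.DataFrame({
    "country": ["A", "B", "C"],
    "beer_servings": [0, 89, 25],
    "spirit_servings": [0, 132, 0],
    "wine_servings": [0, 54, 14],
})

serving_cols = ["beer_servings", "spirit_servings", "wine_servings"]

# Sum across the three columns for each row; the result is a Series
# with one value per row, so it fits into a single new column.
drinks["total_servings"] = drinks[serving_cols].sum(axis=1)

# Plain column arithmetic does the same job as apply(lambda x: x*2) etc.
drinks["beer_sales"] = drinks["beer_servings"] * 2
drinks["spirit_sales"] = drinks["spirit_servings"] * 4
drinks["wine_sales"] = drinks["wine_servings"] * 6

print(drinks)
```

Because sum(axis=1) always returns one value per row, this form cannot accidentally produce a multi-column result.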
Marozas answered 19/2, 2023 at 14:46 Comment(2)
Appreciate your effort, thank you. I was able to solve it with minor modifications. Goulash
@Goulash what modifications did you make? Grabowski
I
0

I suppose the reason is the wrong argument name inside the calculate function. The argument is named drink, but drinks is used to calculate the sum of the columns.

drink is a Series object that represents one row, so the sum of its elements is a scalar. Meanwhile, drinks is a DataFrame, so the sum of its columns is a Series object.

The sample code below shows that this approach works when the function uses its own argument.

import pandas as pd

df = pd.DataFrame({
    "A":[1,1,1,1,1], 
    "B":[2,2,2,2,2], 
    "C":[3,3,3,3,3]
})

def calculate(to_calc_df):
    return to_calc_df["A"] + to_calc_df["B"] +  to_calc_df["C"]
    
df["total"] = df.loc[:, "A":"C"].apply(calculate, axis=1)

print(df)

Result

   A  B  C  total
0  1  2  3      6
1  1  2  3      6
2  1  2  3      6
3  1  2  3      6
4  1  2  3      6
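
To make the explanation above concrete, here is a small sketch that contrasts the two cases. The function names calculate_wrong and calculate_right are made up for this illustration, and the assumption is that the failing code referenced the global DataFrame instead of the row passed in by apply.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 1, 1], "B": [2, 2, 2], "C": [3, 3, 3]})

def calculate_wrong(row):
    # Bug: ignores `row` and reads the global DataFrame `df`,
    # so every call returns a whole Series (one value per row of df).
    return df["A"] + df["B"] + df["C"]

def calculate_right(row):
    # Uses the row that apply passes in, so every call returns a scalar.
    return row["A"] + row["B"] + row["C"]

wrong = df.loc[:, "A":"C"].apply(calculate_wrong, axis=1)
right = df.loc[:, "A":"C"].apply(calculate_right, axis=1)

# `wrong` is a DataFrame (one column per element of the returned Series);
# assigning it to a single column is what the error message complains about.
print(type(wrong).__name__, wrong.shape)

# `right` is a Series with one value per row and can be assigned directly.
print(type(right).__name__, right.shape)
df["total"] = right
```

Checking the type and shape of the apply result before assigning it is a quick way to see which of the two cases you are in.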
Interruption answered 19/2, 2023 at 14:33 Comment(1)
I am sorry about the "drinks" in the argument; it still shows the same error, but I will try your method. Thanks :) Goulash
