Merging rows and summing values by date in Python [duplicate]
Asked Answered
T

1

5

I have a data frame that has multiple duplicated instances, sorted by date. It looks like this:

enter image description here

And I am trying to merge the rows by date for matching "Keywords", and sum the "Views" count.

The result I am looking to obtain is the following:

enter image description here

Could anyone hint me how I could achieve that in Python? Thank you.

The data frame:

 df = pd.DataFrame([["3/8/14", "adapter", 2], ["3/8/14", "adapter", 5], ["3/8/14", "charger", 1],
                   ["13/8/14", "adapter", 1], ["13/8/14", "battery-case", 0]],
                  columns=['Date', 'Keyword', 'Views'])
Twit answered 13/2, 2018 at 17:19 Comment(1)
Simply aggregate -pandas 101: df.groupby(['Date', 'Keyword'])['Views'].sum().reset_index()Albie
R
13

You want a groupby!

import pandas as pd

df = pd.DataFrame([["3/8/14", "adapter", 2], ["3/8/14", "adapter", 5], ["3/8/14", "charger", 1],
                   ["13/8/14", "adapter", 1], ["13/8/14", "battery-case", 0]],
                  columns=['Date', 'Keyword', 'Views'])
print df

#       Date       Keyword  Views
# 0   3/8/14       adapter      2
# 1   3/8/14       adapter      5
# 2   3/8/14       charger      1
# 3  13/8/14       adapter      1
# 4  13/8/14  battery-case      0

df2 = df.groupby(['Date','Keyword'],as_index=False).agg({'Views': 'sum'})

print df2

#       Date       Keyword  Views
# 0  13/8/14       adapter      1
# 1  13/8/14  battery-case      0
# 2   3/8/14       adapter      7
# 3   3/8/14       charger      1
Radiograph answered 13/2, 2018 at 17:23 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.