Convenient way to deal with ValueError: cannot reindex from a duplicate axis

I can find posts that explain the cause of this error message, but not how to fix it.

I encounter this problem every time I try to add a new column to a pandas dataframe by concatenating string values in 2 existing columns.

For instance:

wind['timestamp'] = wind['DATE (MM/DD/YYYY)'] + ' ' + temp['stamp']

It works when the two items joined with ' ' are each a separate dataframe/series.

The goal is to merge the date and time into a single column so that pandas recognizes the values as datetime stamps.

I am not sure whether I am using the command incorrectly or whether pandas itself is limited here, since it keeps returning the duplicate axis error message. I understand the latter is highly unlikely, haha.

Is there a quick and easy way around this?

I mean, I thought sums, subtractions, and similar operations between column values in a dataframe would be quite easy. It shouldn't be too hard to have the result show up in the table either, right?

Conjurer answered 21/8, 2018 at 17:27 Comment(1)
I strongly suggest checking this answer: https://mcmap.net/q/80154/-remove-pandas-rows-with-duplicate-indices - Sharpie

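Pandas aligns series by index label, both when combining them and when assigning the result to a dataframe column. If an index contains duplicate labels, pandas cannot tell which value belongs to which row, and the assignment raises this error. At least one of the indices in your data currently contains duplicates.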
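
To confirm which frame is the culprit before changing anything, you can inspect the indices directly. The following is a minimal sketch using made-up data (the values and the repeated label 1 are assumptions for illustration, not your actual files); `Index.is_unique` and `Index.duplicated` are standard pandas attributes.

```python
import pandas as pd

# Hypothetical data: temp carries a repeated index label (1 appears twice).
wind = pd.DataFrame({'DATE (MM/DD/YYYY)': ['08/21/2018', '08/21/2018', '08/21/2018']})
temp = pd.DataFrame({'stamp': ['17:00', '17:10', '17:20']}, index=[0, 1, 1])

# Report whether each frame's index labels are all distinct.
for name, frame in [('wind', wind), ('temp', temp)]:
    print(name, 'index is unique:', frame.index.is_unique)

# Show the rows whose index label occurs more than once
# (keep=False marks every copy, not just the later ones).
dupes = temp[temp.index.duplicated(keep=False)]
print(dupes)
```

Duplicates like these often appear after concatenating several files without resetting the index, so it is worth checking every frame involved.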

If you are certain that your series are aligned by position, you can call reset_index on each dataframe:

import pandas as pd

wind = pd.DataFrame({'DATE (MM/DD/YYYY)': ['2018-01-01', '2018-02-01', '2018-03-01']})
temp = pd.DataFrame({'stamp': ['1', '2', '3']}, index=[0, 1, 1])

# ATTEMPT 1: FAIL
wind['timestamp'] = wind['DATE (MM/DD/YYYY)'] + ' ' + temp['stamp']
# ValueError: cannot reindex from a duplicate axis

# ATTEMPT 2: SUCCESS
wind = wind.reset_index(drop=True)
temp = temp.reset_index(drop=True)
wind['timestamp'] = wind['DATE (MM/DD/YYYY)'] + ' ' + temp['stamp']

print(wind)

  DATE (MM/DD/YYYY)     timestamp
0        2018-01-01  2018-01-01 1
1        2018-02-01  2018-02-01 2
2        2018-03-01  2018-03-01 3
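
If you would rather leave the index of temp untouched, one alternative is to relabel only the series being combined and then parse the result into real datetimes, which was the original goal. This is a sketch under assumptions: the sample values and the format string '%m/%d/%Y %H:%M' are invented for illustration, and it is only valid when both frames have the same number of rows in the same order.

```python
import pandas as pd

# Hypothetical data matching the MM/DD/YYYY column name; temp has a duplicate label.
wind = pd.DataFrame({'DATE (MM/DD/YYYY)': ['08/21/2018', '08/21/2018', '08/21/2018']})
temp = pd.DataFrame({'stamp': ['17:00', '17:10', '17:20']}, index=[0, 1, 1])

# Give the stamp series wind's index so the values pair up by position.
# set_axis raises a ValueError if the lengths differ, and temp itself is not modified.
stamp = temp['stamp'].set_axis(wind.index)

# Both operands now share one unique index, so alignment is unambiguous.
combined = wind['DATE (MM/DD/YYYY)'] + ' ' + stamp

# Parse the strings into datetimes; the format is an assumption about the data.
wind['timestamp'] = pd.to_datetime(combined, format='%m/%d/%Y %H:%M')
print(wind.dtypes)
```

Passing an explicit format avoids pandas guessing between day-first and month-first dates.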
Carrasquillo answered 21/8, 2018 at 17:32 Comment(1)
Oh man, you are awesome. I realize I didn't call reset_index on the temp series. Better check the last file too. Thanks! - Conjurer
