Pandas Concat Giving "InvalidIndexError: Reindexing only valid with uniquely valued Index objects" Error

I have two different dfs that I want to combine using:

pd.concat([df1, df2], axis=1)

The end result should be a df with the date as the index and all of the columns from both.

According to pandas documentation, this should work. And it was working. But now it's not and I have no idea why.

df1:

            gbp_open  gbp_high  gbp_low  gbp_close  gbp_volume
date
2017-03-13  0.8217    0.82246   0.81627  0.8216     000
2017-03-10  0.8224    0.82366   0.82055  0.82255    000
2017-03-09  0.82139   0.82364   0.82     0.8212     000
2017-03-08  0.81943   0.82372   0.8186   0.81937    000
2017-03-07  0.817     0.82163   0.8163   0.8168     000
2017-03-06  0.81351   0.81659   0.8132   0.813      000
2017-03-03  0.8147    0.81854   0.8141   0.81468    000
2017-03-02  0.81492   0.81561   0.81264  0.81485    000
2017-03-01  0.80779   0.81402   0.80629  0.80788    000
2017-02-28  0.80403   0.8059    0.80183  0.8039     000

And df2:

            inr_open  inr_high  inr_low  inr_close  inr_volume
date
2017-03-13  66.485    66.58     66.11    66.485     000
2017-03-10  66.71     66.77     66.5398  66.6805    000
2017-03-09  66.815    66.853    66.60    66.765     000
2017-03-08  66.625    66.83     66.613   66.6162    000
2017-03-07  66.645    66.695    66.58    66.6647    000
2017-03-06  66.71     66.78     66.60    66.773     000
2017-03-03  66.845    66.885    66.74    66.8451    000
2017-03-02  66.69     66.858    66.67    66.858     000
2017-03-01  66.705    66.89     66.7046  66.7051    000
2017-02-28  66.735    66.808    66.59    66.6932    000

I've tried several different solutions, but none of them does what I need, which is to combine the two on the date.

Edit: And strangely enough, I'm using pretty much the exact same code on a different dataset (but same operation) and it is working with no issues.

Edit 2: Maybe this will help. I used df1.join(df2, how='outer') and it worked fine. Well, almost fine. When I checked for repeated index values, one date appeared four times (and it happens to be yesterday, which would explain why this is a recent issue).

How might this contribute to the problem?

xdf.index.value_counts()

2017-04-24    4
2016-11-14    1
2011-03-28    1
2011-09-19    1
2011-09-13    1
2013-12-25    1
2012-07-12    1
2011-08-08    1
2016-11-22    1

Any thoughts?

Theona asked 26/4, 2017 at 2:51 Comment(3)
I think you'd better use join for this pandas.pydata.org/pandas-docs/stable/generated/… - Bug
Did you try parameter ignore_index = True? - Zoara
Well I need ignore_index because I want to consolidate each date's data into one row... And join works, thanks! Still would like to know why the above isn't working - I lifted it directly from the documentation! - Theona

Your edit basically answers the question: your index contains duplicate values, so when you concatenate along axis=1 pandas cannot tell which of the rows sharing a label should be aligned with which, and it raises the error.

So this works, since the indices are unique:

import pandas as pd

df1 = pd.DataFrame(index=[0, 1, 2], columns=['A'], data=[19., 2., -4.])
df2 = pd.DataFrame(index=[2, 1, 0], columns=['B'], data=[17., 28., 9.])
# rows are matched on index labels, not on position
df3 = pd.concat(objs=[df1, df2], axis=1)
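
To make the alignment visible, here is a self-contained version of the same example. The lookups at the end are my own addition, only meant to show that rows are paired by index label rather than by position.

```python
import pandas as pd

df1 = pd.DataFrame(index=[0, 1, 2], columns=['A'], data=[19., 2., -4.])
df2 = pd.DataFrame(index=[2, 1, 0], columns=['B'], data=[17., 28., 9.])

# axis=1 places the frames side by side, matching rows on index labels
df3 = pd.concat(objs=[df1, df2], axis=1)

# label 0 carries A=19.0 from df1 and B=9.0 from df2,
# even though label 0 is the last row of df2
print(df3.loc[0, 'A'], df3.loc[0, 'B'])
```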

But the following will raise the same error you are seeing, because it is not clear which of the two rows with index value "1" in the first dataframe should be aligned with which of the two rows with index value "1" in the second dataframe:

df3 = pd.DataFrame(index=[1,0,1],columns=['A'],data=[19.,2.,-4.])
df4 = pd.DataFrame(index=[0,1,1],columns=['B'],data=[17.,28.,9.])
df5 = pd.concat(objs=[df3,df4],axis=1)

The pd.concat call on the last line raises InvalidIndexError: Reindexing only valid with uniquely valued Index objects, the same error you are seeing.
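
One possible way around it is to find the repeated labels and drop them before concatenating. This is only a sketch: it assumes that keeping the first row for each label is acceptable for your data, which may not be true (you might want to aggregate the duplicates instead). The names df3_unique and df4_unique are mine.

```python
import pandas as pd

df3 = pd.DataFrame(index=[1, 0, 1], columns=['A'], data=[19., 2., -4.])
df4 = pd.DataFrame(index=[0, 1, 1], columns=['B'], data=[17., 28., 9.])

# is_unique is False when any index label occurs more than once
print(df3.index.is_unique, df4.index.is_unique)

# show every row whose label is repeated
print(df3[df3.index.duplicated(keep=False)])

# ASSUMPTION: keeping the first row per label is what you want
df3_unique = df3[~df3.index.duplicated(keep='first')]
df4_unique = df4[~df4.index.duplicated(keep='first')]

# with unique labels on both sides the alignment is unambiguous
df5 = pd.concat([df3_unique, df4_unique], axis=1)
print(df5)
```

In the question's case the same check on df1 and df2 should show which frame carries the repeated 2017-04-24 rows.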

Guernsey answered 26/4, 2017 at 3:35 Comment(0)

Check for duplicate column names. This error might come up if one of the datasets has columns with duplicate names.
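
A quick way to check this could look like the sketch below. The frame is a made-up example, and dropping the later duplicates assumes the first column of each name is the one you want to keep.

```python
import pandas as pd

# hypothetical frame in which the column name 'A' appears twice
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=['A', 'B', 'A'])

# True for every column whose name repeats an earlier column
dup_mask = df.columns.duplicated()
print(df.columns[dup_mask].tolist())

# ASSUMPTION: the first column of each name is the one to keep
df_clean = df.loc[:, ~dup_mask]
print(df_clean.columns.tolist())
```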

Terce answered 23/6, 2022 at 18:53 Comment(2)
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center. - Handmedown
If you're using concat with axis=1, make sure there are no duplicate index values as well - Carbonaceous

Try printing dataframe.index for both dataframes before concatenating. If there is a mismatch, reset the index using reset_index() and then try to concatenate again.

That should work.
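
As a rough sketch of that inspection step: index_report below is a hypothetical helper (not a pandas function), and the two frames are invented data with one date repeated.

```python
import pandas as pd

def index_report(df):
    """Hypothetical helper: summarise index properties that affect alignment."""
    return {
        'dtype': str(df.index.dtype),
        'is_unique': df.index.is_unique,
        'n_duplicated': int(df.index.duplicated().sum()),
    }

# made-up frames; the date 2017-04-24 is repeated in the first one
a = pd.DataFrame({'x': [1, 2, 3]},
                 index=pd.to_datetime(['2017-04-24', '2017-04-24', '2017-04-21']))
b = pd.DataFrame({'y': [10, 20]},
                 index=pd.to_datetime(['2017-04-24', '2017-04-21']))

print(index_report(a))
print(index_report(b))
```

Keep in mind that reset_index() replaces the dates with a positional index, so a later concat with axis=1 pairs rows by position. That only gives the intended result if both frames have the same rows in the same order.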

Heiduc answered 17/9, 2020 at 18:16 Comment(0)
