Pandas column bind (cbind) two data frames

Asked 12/10, 2015 at 18:39 Answered 15/2, 2023 at 0:33

Solved python r pandas dataframe concatenation

I've got a dataframe df_a with id information:

    unique_id lacet_number
15    5570613  TLA-0138365
24    5025490  EMP-0138757
36    4354431  DXN-0025343

and another dataframe df_b, with the same number of rows that I know correspond to the rows in df_a:

     latitude  longitude
0  -93.193560  31.217029
1  -93.948082  35.360874
2 -103.131508  37.787609

What I want to do is simply concatenate the two horizontally (similar to cbind in R) and get:

    unique_id lacet_number      latitude  longitude
0     5570613  TLA-0138365    -93.193560  31.217029
1     5025490  EMP-0138757    -93.948082  35.360874
2     4354431  DXN-0025343   -103.131508  37.787609

What I have tried:

df_c = pd.concat([df_a, df_b], axis=1)

but because concat aligns rows on the index, this gives me an outer join:

    unique_id lacet_number    latitude  longitude
0         NaN          NaN  -93.193560  31.217029
1         NaN          NaN  -93.948082  35.360874
2         NaN          NaN -103.131508  37.787609
15    5570613  TLA-0138365         NaN        NaN
24    5025490  EMP-0138757         NaN        NaN
36    4354431  DXN-0025343         NaN        NaN

The problem is that the indices of the two dataframes do not match. I read the documentation for pandas.concat and saw that there is an ignore_index option, but it only applies to the concatenation axis (in my case the columns), so it is not the right choice here. So my question is: is there a simple way to achieve this?
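To make the ignore_index point concrete, here is a minimal sketch that rebuilds the two frames from the sample rows above. The construction code is an assumption; only the values come from the question.

```python
import pandas as pd

# Sample data copied from the question; how the frames are built is assumed.
df_a = pd.DataFrame(
    {
        "unique_id": [5570613, 5025490, 4354431],
        "lacet_number": ["TLA-0138365", "EMP-0138757", "DXN-0025343"],
    },
    index=[15, 24, 36],
)
df_b = pd.DataFrame(
    {
        "latitude": [-93.193560, -93.948082, -103.131508],
        "longitude": [31.217029, 35.360874, 37.787609],
    }
)

# With axis=1, ignore_index=True only relabels the columns as 0..3.
# Rows are still aligned on the original index labels, so nothing lines up.
df_c = pd.concat([df_a, df_b], axis=1, ignore_index=True)

print(df_c.shape)          # (6, 4)
print(list(df_c.columns))  # [0, 1, 2, 3]
```

The result still has six rows, and the original column names are lost as well.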

Falconet answered 12/10, 2015 at 18:39 Comment(1)

Should mention that cbind() is an R function that concatenates dataframes and/or series ('vectors'), by column (pd.concat(..., axis=1)). However pandas concat() tries to align indices, whereas R's cbind() ignores them. – Bedfellow 10/11, 2023 at 4:39


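If you're sure the rows of the two dataframes correspond by position, you can sidestep index alignment by calling reset_index(drop=True) on the frame whose index is not the default one. This replaces its index with one starting from 0, and drop=True discards the old labels instead of adding them as a column: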

df_c = pd.concat([df_a.reset_index(drop=True), df_b], axis=1)
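If neither frame is guaranteed to have a default index, the same idea can be wrapped in a small helper that resets every input. The sketch below is one possible way to do that; the name `cbind` and the row-count check are assumptions added for illustration, not part of pandas.

```python
import pandas as pd


def cbind(*frames):
    """Hypothetical helper: bind frames column-wise by row position.

    All row labels are discarded, mimicking R's cbind().
    """
    lengths = {len(f) for f in frames}
    if len(lengths) != 1:
        # Positional binding only makes sense when row counts agree.
        raise ValueError(f"row counts differ: {sorted(lengths)}")
    return pd.concat([f.reset_index(drop=True) for f in frames], axis=1)


# Sample data copied from the question.
df_a = pd.DataFrame(
    {
        "unique_id": [5570613, 5025490, 4354431],
        "lacet_number": ["TLA-0138365", "EMP-0138757", "DXN-0025343"],
    },
    index=[15, 24, 36],
)
df_b = pd.DataFrame(
    {
        "latitude": [-93.193560, -93.948082, -103.131508],
        "longitude": [31.217029, 35.360874, 37.787609],
    }
)

df_c = cbind(df_a, df_b)
print(df_c)
```

The length check matters because, without it, frames of different lengths would be padded with NaN rather than rejected.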

Indubitability answered 12/10, 2015 at 19:4 Comment(0)

`DataFrame.join`

While concat is fine, it's simpler to join (here A and B stand for the OP's df_a and df_b):

C = A.join(B)

This still assumes aligned indexes, so reset_index as needed. In OP's example, B's index is already default, so we only need to reset A:

C = A.reset_index(drop=True).join(B)

#    unique_id  lacet_number     latitude  longitude
# 0    5570613   TLA-0138365   -93.193560  31.217029
# 1    5025490   EMP-0138757   -93.948082  35.360874
# 2    4354431   DXN-0025343  -103.131508  37.787609
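To show what "still assumes aligned indexes" means in practice, the following sketch compares a plain join with the reset version. The frames are rebuilt from the OP's sample rows; the construction code itself is an assumption.

```python
import pandas as pd

# A and B rebuilt from the OP's sample rows.
A = pd.DataFrame(
    {
        "unique_id": [5570613, 5025490, 4354431],
        "lacet_number": ["TLA-0138365", "EMP-0138757", "DXN-0025343"],
    },
    index=[15, 24, 36],
)
B = pd.DataFrame(
    {
        "latitude": [-93.193560, -93.948082, -103.131508],
        "longitude": [31.217029, 35.360874, 37.787609],
    }
)

# join defaults to a left join on the index: A's labels (15, 24, 36)
# have no match in B, so B's columns come back entirely NaN.
misaligned = A.join(B)

# Resetting A's index first makes the labels match B's default 0, 1, 2.
C = A.reset_index(drop=True).join(B)

print(misaligned)
print(C)
```

Unlike concat with axis=1, the plain join keeps only A's three rows, so a misalignment shows up as NaN columns rather than extra rows.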

Geesey answered 12/12, 2021 at 9:48 Comment(0)

You can use set_axis to give one frame the other's index labels, then concatenate horizontally or join. Unlike reset_index, this method preserves the index labels of one of the dataframes.

joined_df = pd.concat([df_a.set_axis(df_b.index), df_b], axis=1)
# or using `join`
joined_df = df_a.set_axis(df_b.index).join(df_b)
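The same approach works in the other direction if df_a's labels are the ones worth keeping. A minimal sketch, with the frames rebuilt from the question's sample rows (the construction code and the variable names `keep_a` and `keep_b` are assumptions):

```python
import pandas as pd

# Sample data copied from the question.
df_a = pd.DataFrame(
    {
        "unique_id": [5570613, 5025490, 4354431],
        "lacet_number": ["TLA-0138365", "EMP-0138757", "DXN-0025343"],
    },
    index=[15, 24, 36],
)
df_b = pd.DataFrame(
    {
        "latitude": [-93.193560, -93.948082, -103.131508],
        "longitude": [31.217029, 35.360874, 37.787609],
    }
)

# Keep df_b's labels (0, 1, 2), as in the snippet above.
keep_b = pd.concat([df_a.set_axis(df_b.index, axis=0), df_b], axis=1)

# Keep df_a's labels (15, 24, 36) by relabelling df_b instead.
keep_a = pd.concat([df_a, df_b.set_axis(df_a.index, axis=0)], axis=1)

print(keep_a)
```

Since set_axis needs exactly one label per row, it should raise an error if the two frames differ in length, which doubles as a sanity check.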

Dunston answered 15/2, 2023 at 0:33 Comment(0)
