Compare column names of pandas DataFrames

How can I compare the column names of two different pandas DataFrames? I want to compare my train and test DataFrames, where some columns are missing from the test DataFrame.

Rump answered 6/5, 2018 at 19:31 Comment(4)
To look at column names, use df.columns. Why would your test and train sets have different columns though?Krysta
Do you need to compare the independent variables?Torrid
A man has a question.Lozier
It's too big, and after pd.get_dummies the column counts do not match.Rump
P
38

pandas.Index objects, including dataframe columns, have useful set-like methods, such as intersection and difference.

For example, given dataframes train and test:

train_cols = train.columns
test_cols = test.columns

common_cols = train_cols.intersection(test_cols)
train_not_test = train_cols.difference(test_cols)
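As a minimal sketch of the above, the snippet below builds two small, made-up dataframes (the column names a, b, c, d are purely illustrative, not from the question) and also checks the opposite direction, i.e. columns present in test but missing from train:

```python
import pandas as pd

# Hypothetical toy data: test lacks "b" and has an extra "d".
train = pd.DataFrame({"a": [1], "b": [2], "c": [3]})
test = pd.DataFrame({"a": [4], "c": [5], "d": [6]})

train_cols = train.columns
test_cols = test.columns

# Columns present in both dataframes.
common_cols = train_cols.intersection(test_cols)

# Columns present in one dataframe but not the other.
train_not_test = train_cols.difference(test_cols)
test_not_train = test_cols.difference(train_cols)

# Columns present in exactly one of the two dataframes.
in_only_one = train_cols.symmetric_difference(test_cols)

print("common:", list(common_cols))
print("only in train:", list(train_not_test))
print("only in test:", list(test_not_train))
print("in only one:", list(in_only_one))
```

If the goal is only to know whether the two sets of names match regardless of order, comparing set(train.columns) == set(test.columns) should be enough.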
Psi answered 6/5, 2018 at 19:37 Comment(1)
Use the align methodSkepful
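A rough sketch of what the align suggestion might look like for the pd.get_dummies case mentioned in the question's comments. The dataframes and dummy column names here are invented for illustration, and filling the missing columns with 0 is an assumption that suits one-hot columns but may not suit other data:

```python
import pandas as pd

# Hypothetical one-hot encoded data: test never saw the "blue" category.
train = pd.DataFrame({"color_red": [1, 0], "color_blue": [0, 1], "size": [3, 4]})
test = pd.DataFrame({"color_red": [1], "size": [5]})

# join="left" keeps train's columns; axis=1 aligns columns only, not rows.
# Columns missing from test are created and filled with 0 (an assumption).
train_aligned, test_aligned = train.align(test, join="left", axis=1, fill_value=0)

# join="inner" keeps only the columns both dataframes share.
train_common, test_common = train.align(test, join="inner", axis=1)

print(list(test_aligned.columns))
print(list(test_common.columns))
```

With join="left", any column that exists only in test is dropped from the aligned test frame, so it is worth inspecting test.columns.difference(train.columns) beforehand.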
T
-2
train_column = train.columns
test_column = test.columns

common_column = train_column.intersection(test_column)
train_not_in_test = train_column.difference(test_column)
Togoland answered 2/8, 2021 at 9:5 Comment(2)
Hi Nath. Please make the functional difference from the solution proposed in the highly upvoted answer by jpp more obvious. Explain what benefit your solution offers in comparison. Currently this gives the impression of being functionally identical to an upvoted answer, with no explanation, and because of that it risks being downvoted.Salver
While this code may answer the question, providing additional context regarding how and/or why it solves the problem would improve the answer's long-term value.Intoxicated
