Rename specific column(s) in pandas
Asked Answered
H

7

301

I've got a dataframe called data. How would I rename the only one column header? For example gdp to log(gdp)?

data =
    y  gdp  cap
0   1    2    5
1   2    3    9
2   8    7    2
3   3    4    7
4   6    7    7
5   4    8    3
6   8    2    8
7   9    9   10
8   6    6    4
9  10   10    7
Headliner answered 3/11, 2013 at 21:28 Comment(1)
Does this answer your question? Renaming columns in PandasRage
C
619
data.rename(columns={'gdp':'log(gdp)'}, inplace=True)

The rename show that it accepts a dict as a param for columns so you just pass a dict with a single entry.

Also see related

Cryan answered 3/11, 2013 at 21:32 Comment(1)
that was surprising. It's a fine way to do it but I had not seen too many Pandas api's using dict's as valuesKaralee
Q
48

A much faster implementation would be to use list-comprehension if you need to rename a single column.

df.columns = ['log(gdp)' if x=='gdp' else x for x in df.columns]

If the need arises to rename multiple columns, either use conditional expressions like:

df.columns = ['log(gdp)' if x=='gdp' else 'cap_mod' if x=='cap' else x for x in df.columns]

Or, construct a mapping using a dictionary and perform the list-comprehension with it's get operation by setting default value as the old name:

col_dict = {'gdp': 'log(gdp)', 'cap': 'cap_mod'}   ## key→old name, value→new name

df.columns = [col_dict.get(x, x) for x in df.columns]

Timings:

%%timeit
df.rename(columns={'gdp':'log(gdp)'}, inplace=True)
10000 loops, best of 3: 168 µs per loop

%%timeit
df.columns = ['log(gdp)' if x=='gdp' else x for x in df.columns]
10000 loops, best of 3: 58.5 µs per loop
Quash answered 23/9, 2016 at 9:17 Comment(3)
It's kind of odd that this is faster. I'm assuming that you're using the DataFrame from the question which only has 3 columns (y, gdp, and cap). If you use a much larger number of columns, is the list-comprehension version still about 3 times faster than the using rename? or does one version become much faster than the other?Ansilma
@LelandHepworth I find it doubtful the Pandas authors were concerned over the number of microseconds it took to rename a column.Modify
A much faster than what ?Brenan
F
31

How do I rename a specific column in pandas?

From v0.24+, to rename one (or more) columns at a time,

If you need to rename ALL columns at once,

  • DataFrame.set_axis() method with axis=1. Pass a list-like sequence. Options are available for in-place modification as well.

rename with axis=1

df = pd.DataFrame('x', columns=['y', 'gdp', 'cap'], index=range(5))
df

   y gdp cap
0  x   x   x
1  x   x   x
2  x   x   x
3  x   x   x
4  x   x   x

With 0.21+, you can now specify an axis parameter with rename:

df.rename({'gdp':'log(gdp)'}, axis=1)
# df.rename({'gdp':'log(gdp)'}, axis='columns')
    
   y log(gdp) cap
0  x        x   x
1  x        x   x
2  x        x   x
3  x        x   x
4  x        x   x

(Note that rename is not in-place by default, so you will need to assign the result back.)

This addition has been made to improve consistency with the rest of the API. The new axis argument is analogous to the columns parameter—they do the same thing.

df.rename(columns={'gdp': 'log(gdp)'})

   y log(gdp) cap
0  x        x   x
1  x        x   x
2  x        x   x
3  x        x   x
4  x        x   x

rename also accepts a callback that is called once for each column.

df.rename(lambda x: x[0], axis=1)
# df.rename(lambda x: x[0], axis='columns')

   y  g  c
0  x  x  x
1  x  x  x
2  x  x  x
3  x  x  x
4  x  x  x

For this specific scenario, you would want to use

df.rename(lambda x: 'log(gdp)' if x == 'gdp' else x, axis=1)

Index.str.replace

Similar to replace method of strings in python, pandas Index and Series (object dtype only) define a ("vectorized") str.replace method for string and regex-based replacement.

df.columns = df.columns.str.replace('gdp', 'log(gdp)')
df
 
   y log(gdp) cap
0  x        x   x
1  x        x   x
2  x        x   x
3  x        x   x
4  x        x   x

The advantage of this over the other methods is that str.replace supports regex (enabled by default). See the docs for more information.


Passing a list to set_axis with axis=1

Call set_axis with a list of header(s). The list must be equal in length to the columns/index size. set_axis mutates the original DataFrame by default, but you can specify inplace=False to return a modified copy.

df.set_axis(['cap', 'log(gdp)', 'y'], axis=1, inplace=False)
# df.set_axis(['cap', 'log(gdp)', 'y'], axis='columns', inplace=False)

  cap log(gdp)  y
0   x        x  x
1   x        x  x
2   x        x  x
3   x        x  x
4   x        x  x

Note: In future releases, inplace will default to True.

Method Chaining
Why choose set_axis when we already have an efficient way of assigning columns with df.columns = ...? As shown by Ted Petrou in this answer set_axis is useful when trying to chain methods.

Compare

# new for pandas 0.21+
df.some_method1()
  .some_method2()
  .set_axis()
  .some_method3()

Versus

# old way
df1 = df.some_method1()
        .some_method2()
df1.columns = columns
df1.some_method3()

The former is more natural and free flowing syntax.

Farrier answered 11/9, 2017 at 0:9 Comment(1)
You could also add inplace=True to make the .replace method place-in. Then you do not need to assign it back.Wakayama
R
9

There are at least five different ways to rename specific columns in pandas, and I have listed them below along with links to the original answers. I also timed these methods and found them to perform about the same (though YMMV depending on your data set and scenario). The test case below is to rename columns A M N Z to A2 M2 N2 Z2 in a dataframe with columns A to Z containing a million rows.

# Import required modules
import numpy as np
import pandas as pd
import timeit

# Create sample data
df = pd.DataFrame(np.random.randint(0,9999,size=(1000000, 26)), columns=list('ABCDEFGHIJKLMNOPQRSTUVWXYZ'))

# Standard way - https://mcmap.net/q/99692/-rename-specific-column-s-in-pandas
def method_1():
    df_renamed = df.rename(columns={'A': 'A2', 'M': 'M2', 'N': 'N2', 'Z': 'Z2'})

# Lambda function - https://mcmap.net/q/101759/-how-can-i-change-name-of-arbitrary-columns-in-pandas-df-using-lambda-function
def method_2():
    df_renamed = df.rename(columns=lambda x: x + '2' if x in ['A', 'M', 'N', 'Z'] else x)

# Mapping function - https://mcmap.net/q/99692/-rename-specific-column-s-in-pandas
def rename_some(x):
    if x=='A' or x=='M' or x=='N' or x=='Z':
        return x + '2'
    return x
def method_3():
    df_renamed = df.rename(columns=rename_some)

# Dictionary comprehension - https://mcmap.net/q/101760/-rename-multiple-columns-of-pandas-dataframe-based-on-condition
def method_4():
    df_renamed = df.rename(columns={col: col + '2' for col in df.columns[
        np.asarray([i for i, col in enumerate(df.columns) if 'A' in col or 'M' in col or 'N' in col or 'Z' in col])
    ]})

# Dictionary comprehension - https://mcmap.net/q/101761/-changing-multiple-column-names-but-not-all-of-them-pandas-python
def method_5():
    df_renamed = df.rename(columns=dict(zip(df[['A', 'M', 'N', 'Z']], ['A2', 'M2', 'N2', 'Z2'])))

print('Method 1:', timeit.timeit(method_1, number=10))
print('Method 2:', timeit.timeit(method_2, number=10))
print('Method 3:', timeit.timeit(method_3, number=10))
print('Method 4:', timeit.timeit(method_4, number=10))
print('Method 5:', timeit.timeit(method_5, number=10))

Output:

Method 1: 3.650640267
Method 2: 3.163998427
Method 3: 2.998530871
Method 4: 2.9918436889999995
Method 5: 3.2436501520000007

Use the method that is most intuitive to you and easiest for you to implement in your application.

Renege answered 19/4, 2020 at 22:52 Comment(0)
P
5

Use the pandas.DataFrame.rename funtion. Check this link for description.

data.rename(columns = {'gdp': 'log(gdp)'}, inplace = True)

If you intend to rename multiple columns then

data.rename(columns = {'gdp': 'log(gdp)', 'cap': 'log(cap)', ..}, inplace = True)
Peterus answered 5/9, 2020 at 12:8 Comment(1)
repeats previous answersSpinode
E
0
df.rename(columns=lambda x: {"My_sample": "My_sample_new_name"}.get(x, x))
Erland answered 23/9, 2021 at 11:13 Comment(2)
brilliant. thank you. I ended up using it like this - Works great! df = df.rename(columns=lambda x: {"My_sample": "My_sample_new_name"}.get(x, x))Barb
@Barb i have got some experience in pandas now and i want to tell, that "df.rename(columns={"My_sample": "My_sample_new_name"})" will do the same. :)Erland
C
-1

ewe can rename by re—doing the table

df = pd.DataFrame()
column_names = mydataframe.columns
for i in range(len(mydataframe)):
  column = mydataframe.iloc[:,i]
  df[column_names[i][:-8]+"desigred_texnt"] = column
print(df.columns)
Chinese answered 6/12, 2021 at 20:0 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.