replace index values in pandas dataframe with values from list
A

4

16

I have a dataframe and two lists.

The first list gives the index values in the dataframe that I want to replace.

The second list gives the values I want to use instead.

I don't want to touch any of the other values.

Here is the dataframe:

df =  pd.DataFrame.from_dict({u'Afghanistan': 6532.0,
 u'Albania': 662.0,
 u'Andorra': 2.0,
 u'Angola': 2219.0,
 u'Antigua and Barbuda': 0.0,
 u'Argentina': 6.0,
 u'Armenia': 15.0,
 u'Australia': 108.0,
 u'Azerbaijan': 210.0,
 u'Bahamas': 0.0,
 u'Bahrain': 6.0,
 u'Bangladesh': 5098.0,
 u'Barbados': 0.0,
 u'Belarus': 21.0,
 u'Belize': 0.0,
 u'Benin': 4244.0,
 u'Bhutan': 418.0,
 u'Bolivia (Plurinational State of)': 122.0,
 u'Bosnia and Herzegovina': 43.0,
 u'Botswana': 2672.0,
 u'Brazil': 36.0,
 u'Brunei Darussalam': 42.0,
 u'Bulgaria': 46.0,
 u'Burkina Faso': 6074.0,
 u'Burundi': 18363.0,
 u'Cabo Verde': 2.0,
 u'Cambodia': 12237.0,
 u'Cameroon': 14629.0,
 u'Canada': 206.0,
 u'Central African Republic': 3207.0,
 u'Chad': 3546.0,
 u'Chile': 0.0,
 u'China': 71093.0,
 u'Colombia': 1.0,
 u'Congo': 1678.0,
 u'Cook Islands': 2.0,
 u'Costa Rica': 0.0,
 u'Croatia': 9.0,
 u'Cuba': 0.0,
 u'Cyprus': 0.0,
 u'Czechia': 9.0,
 u"C\xf4te d'Ivoire": 5729.0,
 u'Democratic Republic of the Congo': 8282.0,
 u'Denmark': 14.0,
 u'Djibouti': 183.0,
 u'Dominica': 0.0,
 u'Dominican Republic': 253.0,
 u'Ecuador': 0.0,
 u'Egypt': 2633.0,
 u'El Salvador': 0.0,
 u'Eritrea': 789.0,
 u'Estonia': 9.0,
 u'Ethiopia': 1660.0,
 u'France': 10000.0,
 u'Gabon': 15.0,
 u'Gambia': 336.0,
 u'Georgia': 50.0,
 u'Ghana': 23068.0,
 u'Greece': 56.0,
 u'Grenada': 0.0,
 u'Guatemala': 0.0,
 u'Guinea': 11294.0,
 u'Guyana': 0.0,
 u'Haiti': 992.0,
 u'Honduras': 0.0,
 u'Hungary': 1.0,
 u'Iceland': 0.0,
 u'India': 38835.0,
 u'Indonesia': 3344.0,
 u'Iran (Islamic Republic of)': 11874.0,
 u'Iraq': 726.0,
 u'Israel': 36.0,
 u'Italy': 1457.0,
 u'Jamaica': 0.0,
 u'Japan': 22497.0,
 u'Jordan': 32.0,
 u'Kazakhstan': 245.0,
 u'Kenya': 21002.0,
 u'Kiribati': 0.0,
 u'Kuwait': 6.0,
 u'Kyrgyzstan': 16.0,
 u"Lao People's Democratic Republic": 332.0,
 u'Latvia': 0.0,
 u'Lebanon': 5.0,
 u'Lesotho': 660.0,
 u'Liberia': 5977.0,
 u'Lithuania': 19.0,
 u'Luxembourg': 0.0,
 u'Madagascar': 35256.0,
 u'Malawi': 304.0,
 u'Malaysia': 6187.0,
 u'Maldives': 20.0,
 u'Mali': 1578.0,
 u'Malta': 2.0,
 u'Marshall Islands': 0.0,
 u'Mauritius': 0.0,
 u'Mexico': 30.0,
 u'Micronesia (Federated States of)': 0.0,
 u'Mongolia': 925.0,
 u'Morocco': 7368.0,
 u'Mozambique': 7375.0,
 u'Myanmar': 845.0,
 u'Namibia': 469.0,
 u'Nauru': 0.0,
 u'Nepal': 9397.0,
 u'Netherlands': 1019.0,
 u'New Zealand': 65.0,
 u'Nicaragua': 0.0,
 u'Niger': 21319.0,
 u'Nigeria': 212183.0,
 u'Niue': 0.0,
 u'Norway': 0.0,
 u'Oman': 15.0,
 u'Pakistan': 2064.0,
 u'Palau': 0.0,
 u'Panama': 0.0,
 u'Papua New Guinea': 7135.0,
 u'Paraguay': 0.0,
 u'Peru': 1.0,
 u'Philippines': 7120.0,
 u'Poland': 77.0,
 u'Portugal': 45.0,
 u'Qatar': 46.0,
 u'Republic of Korea': 32647.0,
 u'Republic of Moldova': 687.0,
 u'Romania': 35.0,
 u'Russian Federation': 4800.0,
 u'Rwanda': 2095.0,
 u'Saint Kitts and Nevis': 0.0,
 u'Saint Lucia': 0.0,
 u'Saint Vincent and the Grenadines': 0.0,
 u'San Marino': 1.0,
 u'Sao Tome and Principe': 0.0,
 u'Senegal': 5839.0,
 u'Serbia': 38.0,
 u'Sierra Leone': 3575.0,
 u'Singapore': 141.0,
 u'Slovakia': 0.0,
 u'Somalia': 3965.0,
 u'South Africa': 1459.0,
 u'Spain': 152.0,
 u'Sri Lanka': 16527.0,
 u'Sudan': 2875.0,
 u'Suriname': 0.0,
 u'Swaziland': 10.0,
 u'Sweden': 59.0,
 u'Syrian Arab Republic': 146.0,
 u'Tajikistan': 192.0,
 u'Thailand': 4074.0,
 u'The former Yugoslav republic of Macedonia': 36.0,
 u'Togo': 3578.0,
 u'Tonga': 0.0,
 u'Trinidad and Tobago': 0.0,
 u'Tunisia': 47.0,
 u'Turkey': 16244.0,
 u'Turkmenistan': 113.0,
 u'Uganda': 42554.0,
 u'Ukraine': 817.0,
 u'United Arab Emirates': 69.0,
 u'United Kingdom of Great Britain and Northern Ireland': 104.0,
 u'United Republic of Tanzania': 14649.0,
 u'United States of America': 85.0,
 u'Uruguay': 0.0,
 u'Uzbekistan': 80.0,
 u'Vanuatu': 9.0,
 u'Venezuela (Bolivarian Republic of)': 22.0,
 u'Viet Nam': 16512.0,
 u'Zambia': 30930.0,
 u'Zimbabwe': 1483.0}, orient = 'index')

Here is the 1st list:

list1 = [u'Bolivia (Plurinational State of)', u'Brunei Darussalam', u'Cabo Verde', u'China',
    u'Congo', u'Cook Islands', u'Czechia', u"C\xf4te d'Ivoire", 
    u"Democratic People's Republic of Korea", u'France', u'Iran (Islamic Republic of)', 
    u"Lao People's Democratic Republic", u'Micronesia (Federated States of)', u'Niue', 
    u'Republic of Korea', u'Republic of Moldova', u'Russian Federation', u'Sao Tome and Principe', 
    u'Serbia', u'Somalia', u'Syrian Arab Republic', u'The former Yugoslav republic of Macedonia', 
    u'United Kingdom of Great Britain and Northern Ireland', u'United Republic of Tanzania', 
    u'United States of America', u'Venezuela (Bolivarian Republic of)', u'Viet Nam']

Here is the 2nd list:

list2 = [u'Bolivia', u'Brunei', u'Cape Verde', u'China[1]', u'Democratic Republic of the Congo', 
    u'Cook Islands (NZ)', u'Czech Republic', u'Ivory Coast', u'North Korea', u'France[2]', 
    u'Iran', u'Laos', u'Federated States of Micronesia', u'Niue (NZ)', u'South Korea', 
    u'Moldova[3]', u'Russia', u'S\xe3o Tom\xe9 and Pr\xedncipe', u'Serbia[5]', 
    u'Somalia[6]', u'Syria', u'Macedonia', u'United Kingdom', u'Tanzania', 
    u'United States', u'Venezuela', u'Vietnam']

This is clearly the sort of thing Python excels at, and I suspect a simple for loop will do it, but I can't quite wrap my head around the logic (yet).

Any help gratefully appreciated!

Andonis answered 21/4, 2018 at 2:18 Comment(3)
Not sure what has to be replaced where? - Caucus
You could try to use the replace function in pandas. #27060598 - Spodumene
In the dataframe, some of the index values are not what I want. The 1st list identifies which index values I want to change, and the 2nd list identifies the values I want to change them to. Both lists have the same number of items, and their positions match. - Andonis
G
25

Use rename with a dict that maps each old label to its new one:

df = df.rename(index=dict(zip(list1,list2)))
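
A minimal sketch of what this does, on a tiny stand-in frame (the three-row `small` frame and the shortened `old`/`new` lists are made up for illustration, they are not the question's data). Labels that appear in the mapping are renamed; everything else is left alone, and with the default settings a mapping key that is not in the index is simply skipped.

```python
import pandas as pd

# Hypothetical stand-in for the question's frame.
small = pd.DataFrame.from_dict(
    {"Czechia": 9.0, "Viet Nam": 16512.0, "Zambia": 30930.0},
    orient="index",
)

old = ["Czechia", "Viet Nam", "Atlantis"]        # "Atlantis" is not in the index
new = ["Czech Republic", "Vietnam", "Nowhere"]

# zip pairs the lists by position; dict turns the pairs into old -> new.
renamed = small.rename(index=dict(zip(old, new)))

print(renamed.index.tolist())  # ['Czech Republic', 'Vietnam', 'Zambia']
```

rename returns a new frame, which is why the answer assigns the result back to df.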
Generable answered 21/4, 2018 at 2:28 Comment(1)
Just brilliant! Exactly what I wanted to do, and no looping involved at all! So easy when you know how! THANK YOU - Andonis
W
8

Zip the two lists to create a dictionary that maps the old names to the new names.

Use pandas.DataFrame.rename with the replacements dictionary and leave all other arguments at their defaults.

replacements = {l1:l2 for l1, l2 in zip(list1, list2)}

df2 = df.rename(replacements)
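
Because rename silently skips keys it cannot find, it may be worth checking the mapping before relying on the result. In the question's data, for instance, "Democratic People's Republic of Korea" is in list1 but not in the frame, and 'Congo' is mapped to 'Democratic Republic of the Congo', a label that already exists. Here is a rough sketch of both checks on a made-up four-row frame (`df_demo`, `old`, `new`, `missing` and `dupes` are illustrative names only):

```python
import pandas as pd

# Hypothetical four-row excerpt, not the full frame from the question.
df_demo = pd.DataFrame.from_dict(
    {
        "Congo": 1678.0,
        "Democratic Republic of the Congo": 8282.0,
        "Czechia": 9.0,
        "Zambia": 30930.0,
    },
    orient="index",
)
old = ["Congo", "Czechia", "Democratic People's Republic of Korea"]
new = ["Democratic Republic of the Congo", "Czech Republic", "North Korea"]
replacements = dict(zip(old, new))

# Keys that rename() would ignore because they are not in the index.
missing = [key for key in replacements if key not in df_demo.index]

renamed = df_demo.rename(index=replacements)

# Labels that occur more than once after renaming.
dupes = renamed.index[renamed.index.duplicated()].unique().tolist()

print(missing)  # ["Democratic People's Republic of Korea"]
print(dupes)    # ['Democratic Republic of the Congo']
```

Whether a duplicated label is a problem depends on what you do with the frame afterwards; the sketch only reports it.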
Whitecap answered 21/4, 2018 at 2:27 Comment(0)
D
1

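I believe there's another way when you want to replace the whole index rather than a few labels: pandas.DataFrame.set_index(). It needs one label per row, and a plain list of strings is read as column names, so wrap the labels in pd.Index (or a NumPy array) first.

Usage:

# new_labels must hold one label for every row of df
df = df.set_index(pd.Index(new_labels))

OR

# Use this if you want to assign one of the existing DataFrame columns as the index
df = df.set_index(df_column_id)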
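
One caveat worth checking before relying on set_index for this: a plain list of strings is looked up as column names rather than used as new labels, and an array-like has to supply one label per row. A small sketch on a made-up frame (`frame`, `labels` and `list_worked` are illustrative names):

```python
import pandas as pd

frame = pd.DataFrame({"A": [1, 3, 5], "B": [2, 4, 6]}, index=list("abc"))
labels = ["u", "v", "w"]

# A plain list of strings is treated as column names, so this raises KeyError.
try:
    frame.set_index(labels)
    list_worked = True
except KeyError:
    list_worked = False

# Wrapping the labels in an Index replaces the whole index instead.
replaced = frame.set_index(pd.Index(labels))

print(list_worked)               # False
print(replaced.index.tolist())   # ['u', 'v', 'w']
```

Since this replaces every label, it does not fit the question's case of changing only some index values; rename with a mapping covers that.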
Dungaree answered 21/1, 2023 at 8:49 Comment(0)
O
1

If the new labels are in a list:

  • convert the list into an array
  • use df.set_index(array)

If the new labels are in a column:

  • use df.set_index(column_label)

If the new index should be a MultiIndex, use a list of 1D arrays or a list of column labels, respectively.

import pandas as pd
import numpy as np

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]],
                  index=list('abc'),
                  columns=list('AB'))
print(df)  # original labels a, b, c

new_labels = np.array(list('uvw'))  # one new label per row
df = df.set_index(new_labels)
print(df)  # labels are now u, v, w

   A  B
a  1  2
b  3  4
c  5  6

   A  B
u  1  2
v  3  4
w  5  6

If you want to replace labels according to some correspondence:

  • prepare a mapping, e.g. a dictionary whose keys are the old labels and whose values are the corresponding new labels ({'a': 'u', 'b': 'v', 'c': 'w'})
  • use df.rename(mapping)
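
A short sketch of that last option, assuming the same small frame as above. Passing the mapping positionally targets the row index only, so a key that happens to match a column name ('A' here, added purely for illustration) is ignored unless it is passed through columns=.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]],
                  index=list("abc"),
                  columns=list("AB"))

# Positional mapping: applied to the index, not to the columns.
by_index = df.rename({"a": "u", "A": "X"})

# Column labels need the columns= keyword.
by_columns = df.rename(columns={"A": "X"})

print(by_index.index.tolist())      # ['u', 'b', 'c']
print(by_index.columns.tolist())    # ['A', 'B']
print(by_columns.columns.tolist())  # ['X', 'B']
```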
Olly answered 15/5, 2024 at 13:31 Comment(0)
