Can't access dataframe columns

I'm importing a dataframe from a csv file, but cannot access some of its columns by name. What's going on?

In more concrete terms:

> import pandas

> jobNames = pandas.read_csv("job_names.csv")
> print(jobNames)

   job_id   job_name   num_judgements
0  933985        Foo              180
1  933130        Moo              175
2  933123        Goo              150
3  933094       Flue              120
4  933088        Tru              120

When I try to access the second column, I get an error:

> jobNames.job_name

AttributeError: 'DataFrame' object has no attribute 'job_name'

Strangely, I can access the job_id column thus:

> print(jobNames.job_id)

0    933985
1    933130
2    933123
3    933094
4    933088
Name: job_id, dtype: int64

Edit (to put the accepted answer in context):

It turns out that the first row of the csv file (with the column names) looks like this:

job_id, job_name, num_judgements

Note the spaces after each comma! Those spaces are retained in the column names:

> jobNames.columns[1]

' job_name'

With the leading space, these names are not valid Python identifiers, so those columns can't be reached with attribute (dot) syntax. I can still access them dict-style:

> jobNames[' job_name']
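
For anyone who wants to reproduce this without the original file, here is a minimal sketch. The in-memory `csv_text` is an assumed stand-in for job_names.csv (only the header line is taken from the question), and the variable names are illustrative.

```python
import io

import pandas as pd

# Assumed stand-in for job_names.csv: note the space after each comma.
csv_text = (
    "job_id, job_name, num_judgements\n"
    "933985, Foo, 180\n"
    "933130, Moo, 175\n"
)

df = pd.read_csv(io.StringIO(csv_text))

# The leading spaces from the header line survive in the column names.
column_names = list(df.columns)
print(column_names)

# Dot access looks for 'job_name', which is not among the column names.
print(hasattr(df, "job_name"))

# Bracket access with the exact name, space included, still works.
print(df[" job_name"])
```

Printing `list(df.columns)` (or `repr` of a single name) is a quick way to spot stray whitespace that the normal dataframe printout hides.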
Archer answered 11/8, 2016 at 10:41

When calling pandas.read_csv, pass skipinitialspace=True to skip the whitespace that follows each CSV delimiter.
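
A short sketch of what that looks like, using an assumed in-memory copy of the file (`csv_text` and `clean` are illustrative names, not from the question):

```python
import io

import pandas as pd

# Assumed stand-in for job_names.csv, with a space after each comma.
csv_text = (
    "job_id, job_name, num_judgements\n"
    "933985, Foo, 180\n"
    "933130, Moo, 175\n"
)

# skipinitialspace=True drops the spaces that follow each delimiter,
# in the header line and in the data rows alike.
clean = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

print(list(clean.columns))
print(clean.job_name.tolist())
```

Since the option is applied while parsing, both the column names and the string values come out without the leading space.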

Austronesia answered 11/8, 2016 at 10:45

Another way to remove whitespace from the column names is str.strip:

jobNames.columns = jobNames.columns.str.strip()
print(jobNames.job_name)

0     Foo
1     Moo
2     Goo
3    Flue
4     Tru
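
One caveat worth checking: stripping the column names does not touch the cell values, which keep their leading space when the file is read with default settings. The sketch below assumes an in-memory copy of the file (`csv_text` is a stand-in) and strips the one string column by hand.

```python
import io

import pandas as pd

# Assumed stand-in for job_names.csv, with a space after each comma.
csv_text = (
    "job_id, job_name, num_judgements\n"
    "933985, Foo, 180\n"
    "933130, Moo, 175\n"
)

df = pd.read_csv(io.StringIO(csv_text))

# Fix the header first, as in the answer above.
df.columns = df.columns.str.strip()

# The string values still carry the leading space at this point.
values_before = df.job_name.tolist()

# Strip the values of the string column as a separate step.
df["job_name"] = df["job_name"].str.strip()
values_after = df.job_name.tolist()

print(values_before)
print(values_after)
```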
Moralez answered 11/8, 2016 at 10:44

Another (perhaps inferior) approach is to remove the spaces from the column names:

> jobNames.columns = [s.strip() for s in jobNames.columns]
> jobNames.job_name

0   Foo
1   Moo
2   Goo
3   Flue
4   Tru
Name: job_name, dtype: object    
Archer answered 11/8, 2016 at 10:41
