AttributeError: 'DataFrame' object has no attribute [duplicate]
M

6

45

I keep getting different attribute errors when trying to run this file in IPython. I'm a beginner with pandas, so maybe I'm missing something.

Code:

from pandas import Series, DataFrame
import pandas as pd
import json

nan = float('NaN')
data = []
with open('file.json') as f:
    for line in f:
        data.append(json.loads(line))

df = DataFrame(data, columns=['accepted', 'user', 'object', 'response'])
clean = df.replace('NULL', nan)
clean = clean.dropna()

print clean.value_counts() 

AttributeError: 'DataFrame' object has no attribute 'value_counts'

Any ideas?

Malmsey answered 15/10, 2013 at 22:32 Comment(1)
Gotcha for future searchers: if you select a duplicated column name, you'll get a DataFrame rather than a Series! - Givens
F
59

value_counts is a Series method rather than a DataFrame method (and you are trying to use it on a DataFrame, clean). You need to perform this on a specific column:

clean[column_name].value_counts()
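
A minimal sketch of the difference, using a small made-up frame in place of the asker's JSON data (the values below are hypothetical, only the column names come from the question):

```python
import pandas as pd

# Hypothetical stand-in for the cleaned data from the question.
clean = pd.DataFrame({
    "accepted": ["yes", "no", "yes", "yes"],
    "user": ["a", "b", "a", "c"],
})

# Selecting a single column yields a Series, which has value_counts().
accepted_counts = clean["accepted"].value_counts()
print(accepted_counts)
```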

It doesn't usually make sense to perform value_counts on a DataFrame, though I suppose you could apply it to every entry by flattening the underlying values array:

pd.value_counts(df.values.flatten())
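
As far as I know, newer pandas releases deprecate the top-level pd.value_counts function, so here is a sketch of the same flattening idea that wraps the values in a Series first. The frame is made up for illustration:

```python
import pandas as pd

# Hypothetical data; any DataFrame works the same way.
df = pd.DataFrame({"a": [1, 2, 2], "b": [2, 3, 3]})

# Flatten the 2-D values into one 1-D array, then count every entry.
flat_counts = pd.Series(df.values.flatten()).value_counts()
print(flat_counts)
```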
Facient answered 16/10, 2013 at 0:29 Comment(2)
It is not working when I select the column through iloc. - Hound
value_counts() has been a DataFrame method since pandas 1.1.0; I posted another answer: https://mcmap.net/q/367715/-attributeerror-39-dataframe-39-object-has-no-attribute-duplicate - Jurel
M
15

To get the number of non-null entries in each column of a DataFrame, it's just df.count()
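
A small sketch of how df.count() differs from value_counts() on a single column. The frame and its column names are made up for illustration:

```python
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame({"col": ["x", "x", None, "y"], "other": [1, 2, 3, None]})

# count() reports how many non-null cells each column has.
non_null = df.count()

# value_counts() reports how often each distinct value occurs in one column.
freq = df["col"].value_counts()

print(non_null)
print(freq)
```

So count() answers "how many entries are present", while value_counts() answers "how many times does each value appear".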

Mccarley answered 29/4, 2015 at 0:2 Comment(1)
df.count() produces a different result than df['col'].value_counts(), aka series.value_counts()! But your post is probably helpful for folks who want df.count(). - Adrenal
J
11

value_counts() has been a DataFrame method since pandas 1.1.0:

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.value_counts.html
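
A short sketch of what the DataFrame version does, assuming pandas 1.1.0 or later is installed. It counts unique rows (combinations of column values) rather than values in one column. The data is hypothetical:

```python
import pandas as pd

# Hypothetical data; two of the three rows are identical.
df = pd.DataFrame({"accepted": ["yes", "yes", "no"], "user": ["a", "a", "b"]})

# Needs pandas >= 1.1.0; older versions raise the AttributeError from the question.
row_counts = df.value_counts()
print(row_counts)
```

The result is a Series indexed by (accepted, user) pairs, so row_counts.loc[("yes", "a")] looks up one combination.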

Jurel answered 25/11, 2020 at 12:0 Comment(0)
B
1

I had the same problem: it was working, but now for some reason it is not. I replaced it with a groupby:

grouped = pd.DataFrame(data.groupby(['col1','col2'])['col2'].count())
grouped.columns = ['Value_counts']
grouped
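
A possible variant of the same idea, not the code above: size() counts the rows in each group, and reset_index(name=...) names the count column in one step. The column names are the placeholders from this answer and the data is made up:

```python
import pandas as pd

# Hypothetical data using the placeholder column names from the answer.
data = pd.DataFrame({"col1": ["a", "a", "b"], "col2": ["x", "x", "y"]})

# size() counts rows per (col1, col2) group; reset_index turns the
# group keys back into ordinary columns and names the count column.
grouped = data.groupby(["col1", "col2"]).size().reset_index(name="Value_counts")
print(grouped)
```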
Buoyage answered 18/8, 2021 at 12:53 Comment(0)
V
0

value_counts works only on a Series; it won't work on an entire DataFrame (in pandas versions before 1.1.0). Try selecting a single column and calling the method on that. For example:

df['accepted'].value_counts()

It also won't work if you have duplicate column names. When you select a column whose name is duplicated, pandas returns every column with that name, so you get a DataFrame instead of a Series. In that case, remove the duplicate columns first:

df = df.loc[:,~df.columns.duplicated()]
df['accepted'].value_counts()
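
A self-contained sketch of the duplicate-column case, with made-up data in which two columns are both named 'accepted':

```python
import pandas as pd

# Hypothetical frame where two columns share the name "accepted".
df = pd.DataFrame(
    [["yes", "no", 1], ["yes", "yes", 2]],
    columns=["accepted", "accepted", "user"],
)

# The name matches two columns, so this selection is a DataFrame.
selected = df["accepted"]
print(type(selected).__name__)

# Keep only the first column for each name; now the selection is a Series.
deduped = df.loc[:, ~df.columns.duplicated()]
counts = deduped["accepted"].value_counts()
print(counts)
```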
Virescence answered 21/5, 2020 at 12:40 Comment(0)
F
0

If you are using groupby(), store the result of data.groupby('column_name') in a new variable, then select a column from that variable and call value_counts() on it. For example, after df = data.groupby('city') you can call df['city'].value_counts(). This worked for me.
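
A rough sketch of that pattern with made-up data. Note one assumption: it counts a different column ('accepted') within each city, rather than the grouping column itself, since that is the more common use:

```python
import pandas as pd

# Hypothetical data; "city" is the grouping key from the answer.
data = pd.DataFrame({
    "city": ["NY", "NY", "LA", "LA", "LA"],
    "accepted": ["yes", "no", "yes", "yes", "no"],
})

# Store the groupby object, then select a column and count its values per group.
grouped = data.groupby("city")
per_city = grouped["accepted"].value_counts()
print(per_city)
```

The result is a Series indexed by (city, accepted) pairs.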

Fractionize answered 25/2, 2022 at 0:43 Comment(0)
