Arff Loader : AttributeError: 'dict' object has no attribute 'data'

I am trying to load a .arff file into a NumPy array using the liac-arff library (https://github.com/renatopp/liac-arff).

This is my code.

import arff, numpy as np
dataset = arff.load(open('mydataset.arff', 'rb'))
data = np.array(dataset.data)

When I run it, I get this error:

ArffLoader.py", line 8, in <module>
data = np.array(dataset.data)
AttributeError: 'dict' object has no attribute 'data'

I have seen similar threads, such as "Smartsheet Data Tracker: AttributeError: 'dict' object has no attribute 'append'". I am new to Python and have not been able to resolve this issue. How can I fix it?

Angioma answered 10/3, 2015 at 14:31 Comment(0)

Short version

dataset is a dict. For a dict, you access the values using Python's indexing notation, dataset[key], where key can be a string, integer, float, tuple, or any other immutable data type (it is a bit more complicated than that; more below if you are interested).

In your case, the key is in the form of a string. To access it, you need to give the string you want as an index, like so:

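import arff
import numpy as np

# arff.load returns a plain dict; the rows are stored under the 'data' key
with open('mydataset.arff') as f:  # text mode; Python 3 needs this rather than 'rb'
    dataset = arff.load(f)
data = np.array(dataset['data'])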

(You also shouldn't put the imports on the same line, although this is only a readability issue.)
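To see the difference between the two access styles without needing an .arff file, here is a minimal sketch that uses a hand-written dict as a stand-in for what arff.load returns. The keys mirror the ones liac-arff uses ('description', 'relation', 'attributes', 'data'), but the relation name, attribute names, and rows are made up for illustration.

```python
# Hand-written stand-in for the dict returned by arff.load().
# The relation, attributes and rows below are invented for this example.
dataset = {
    "description": "",
    "relation": "weather",
    "attributes": [("temperature", "REAL"), ("play", ["yes", "no"])],
    "data": [[85.0, "no"], [70.0, "yes"]],
}

# Attribute-style access fails: a dict has no attribute called 'data'.
try:
    dataset.data
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Key-style access works.
rows = dataset["data"]
column_names = [name for name, _type in dataset["attributes"]]

print(error_message)
print(sorted(dataset.keys()))
print(column_names)
print(rows)
```

The same rows list is what you would pass to np.array in the real code.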

More detailed explanation

dataset is a dict, which in some languages is called a map or hash table. In a dict, you access values much as you index into a list or array, except that the "index" can be any data type that is "hashable" (roughly, an object with a hash value that does not change during its lifetime and that can be compared for equality with other objects). This "index" is called a "key". In practice, at least for built-in types and most major packages, only immutable data types are hashable, but there is no actual rule that requires this to be the case.
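A short sketch of the hashability point, using only built-in types plus one hypothetical class (Marker, introduced purely for this example): immutable built-ins work as keys, a list does not, and an ordinary class instance is hashable by default even though it is mutable.

```python
# Immutable built-ins such as str, int and tuple are hashable, so they work as keys.
lookup = {"name": "iris", 42: "answer", (1, 2): "point"}

# A list is mutable and unhashable, so using it as a key raises TypeError.
try:
    lookup[[1, 2]] = "fails"
    list_key_error = None
except TypeError as exc:
    list_key_error = type(exc).__name__


# Instances of an ordinary class are hashable by default (hashed by identity),
# even though they are mutable. Marker is a hypothetical class for this example.
class Marker:
    pass


m = Marker()
lookup[m] = "instance key"
m.label = "changed"  # mutating the instance does not change its hash

print(lookup[(1, 2)])
print(list_key_error)
print(lookup[m])
```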

Do you come from MATLAB? If so, then you are probably trying to use MATLAB's struct access technique. You could think of a dict as a much faster, more flexible struct, but the syntax for accessing values is different.

Camelliacamelopard answered 10/3, 2015 at 15:8 Comment(2)
Thanks, this is working. My programming background is mainly Java and I have just started with Python. The code I used is from here: #27264926. Angioma
@Angioma just edited my answer in the referred question. Thanks! Galen

It's easy to load ARFF data into Python using SciPy.

from scipy.io import arff
import pandas as pd

# loadarff returns a (data, meta) tuple; data[0] is the record array
data = arff.loadarff('dataset.arff')
df = pd.DataFrame(data[0])
df.head()
Santinasantini answered 23/8, 2018 at 15:12 Comment(0)
