Short version
dataset is a dict. For a dict, you access the values using the Python indexing notation, dataset[key], where key could be a string, integer, float, tuple, or any other immutable data type (it is a bit more complicated than that, more below if you are interested).
In your case, the key is in the form of a string. To access it, you need to give the string you want as an index, like so:
import arff
import numpy as np

# arff.load returns a dict; the rows are stored under the 'data' key
with open('mydataset.arff', 'rb') as f:
    dataset = arff.load(f)
data = np.array(dataset['data'])
(you also shouldn't put the imports on the same line, although this is just a readability issue)
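To try the key lookup without an ARFF file at hand, here is a minimal sketch using a hand-written dict as a stand-in for what arff.load returns. Only the 'data' key comes from the question; the 'relation' entry, the values, and the `mixed` dict are made up for illustration.

```python
# Hypothetical stand-in for the dict returned by arff.load(); only the
# 'data' key is taken from the question, everything else is illustrative.
dataset = {
    'relation': 'example',
    'data': [[1.0, 2.0], [3.0, 4.0]],
}

rows = dataset['data']   # look up the value stored under the string key 'data'
first_row = rows[0]      # that value is an ordinary list, indexed by position

# Keys are not limited to strings: ints, floats, and tuples work as well.
mixed = {1: 'int key', 2.5: 'float key', (0, 1): 'tuple key'}
tuple_value = mixed[(0, 1)]
```

The same square-bracket syntax is used for the dict lookup and for the list index; only the kind of "index" differs.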
More detailed explanation
dataset is a dict, which in some languages is called a map or hashtable. In a dict, you access values in a similar way to how you index into a list or array, except the "index" can be any data type that is "hashable" (that is, one that can be reduced to a hash value that is, ideally, a unique identifier for each possible value). This "index" is called a "key". In practice, at least for built-in types and most major packages, only immutable data types are hashable, but there is no actual rule that requires this to be the case.
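The hashability point can be checked directly. The sketch below is only an illustration: `lookup` and the `Point` class are hypothetical names. It shows a tuple accepted as a key, a list rejected, and an instance of a plain user-defined class (which is mutable, yet hashable by default) accepted.

```python
lookup = {}
lookup[(1, 2)] = 'tuple keys work'   # a tuple of hashable items is hashable

try:
    lookup[[1, 2]] = 'list key'      # lists are mutable and not hashable
    list_key_error = None
except TypeError as exc:
    list_key_error = str(exc)        # Python refuses the list as a key


class Point:
    """Hypothetical mutable class; instances hash by identity by default."""

    def __init__(self, x):
        self.x = x


p = Point(1)
lookup[p] = 'mutable object as key'
p.x = 99                             # changing an attribute leaves the hash unchanged
still_found = lookup[p]              # so the same object still finds its entry
```

Built-in mutable containers such as list, dict, and set opt out of hashing, which is why "immutable" is a good rule of thumb even though it is not a strict requirement.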
Do you come from MATLAB? If so, then you are probably trying to use MATLAB's struct access technique. You could think of a dict as a much faster, more flexible struct, but the syntax for accessing values is different.
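As a rough illustration of that difference, assuming the struct-style attempt looked like dataset.data (a guess, since the original code is not shown here), the sketch below uses a hand-written stand-in dict to contrast dot access with key access.

```python
# Hypothetical stand-in for the loaded dataset; the values are made up.
dataset = {'data': [[1.0, 2.0], [3.0, 4.0]]}

try:
    values = dataset.data        # struct-style access: a dict has no such attribute
    dot_access_works = True
except AttributeError:
    dot_access_works = False

values = dataset['data']         # dict-style access: index with the key instead
```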