I have a dataset like this:
my_data = [['Manchester', '23', '80', 'CM',
'Manchester', '22', '79', 'RM',
'Manchester', '19', '76', 'LB'],
['Benfica', '26', '77', 'CF',
'Benfica', '22', '74', 'CDM',
'Benfica', '17', '70', 'RB'],
['Dortmund', '24', '75', 'CM',
'Dortmund', '18', '74', 'AM',
'Dortmund', '16', '69', 'LM']
]
I know I can use train_test_split from sklearn.model_selection (formerly sklearn.cross_validation), and I've tried this:
from sklearn.model_selection import train_test_split
train, test = train_test_split(my_data, test_size = 0.2)
The result is split only into train and test. I want to divide the data into 3 separate, randomly shuffled sets.
Expected: Train, Test, Valid
train_test_split divides your data into train and validation sets. Don't get confused by the names. Test data should be the data where you don't know your output variable. – Ramadan
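Since train_test_split only ever produces two pieces, one common way to get three sets is to call it twice: first hold out a test set, then split the remainder into train and valid. Below is a minimal sketch of that idea, not the only way to do it. The `rows` list is hypothetical toy data (10 rows, so every split is non-empty), and the 60/20/20 proportions and `random_state=42` are assumptions chosen for illustration. Note that train_test_split works on the outer list, so with my_data as written above there are only three items to distribute.

```python
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for my_data: 10 rows of [team, age, rating, position]
rows = [[f"Team{i}", str(20 + i), str(70 + i), "CM"] for i in range(10)]

# First call: hold out 20% of the rows as the test set
train_valid, test = train_test_split(rows, test_size=0.2, random_state=42)

# Second call: carve the validation set out of what is left.
# 0.25 of the remaining 80% equals 20% of the original data.
train, valid = train_test_split(train_valid, test_size=0.25, random_state=42)

print(len(train), len(valid), len(test))  # 6 2 2
```

Both calls shuffle by default, so the three sets are randomized; drop random_state if you don't need the split to be reproducible.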