Why do I get ValueError: Unrecognized data type: x=[...] (of type <class 'list'>) with model.fit() in TensorFlow?

I tried to run the code below, taken from CS50's AI course:

import csv
import tensorflow as tf
from sklearn.model_selection import train_test_split

# Read data in from file
with open("banknotes.csv") as f:
    reader = csv.reader(f)
    next(reader)

    data = []
    for row in reader:
        data.append(
            {
                "evidence": [float(cell) for cell in row[:4]],
                "label": 1 if row[4] == "0" else 0,
            }
        )

# Separate data into training and testing groups
evidence = [row["evidence"] for row in data]
labels = [row["label"] for row in data]
X_training, X_testing, y_training, y_testing = train_test_split(
    evidence, labels, test_size=0.4
)

# Create a neural network
model = tf.keras.models.Sequential()

# Add a hidden layer with 8 units, with ReLU activation
model.add(tf.keras.layers.Dense(8, input_shape=(4,), activation="relu"))

# Add output layer with 1 unit, with sigmoid activation
model.add(tf.keras.layers.Dense(1, activation="sigmoid"))

# Train neural network
model.compile(
    optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"]
)
model.fit(X_training, y_training, epochs=20)

# Evaluate how well model performs
model.evaluate(X_testing, y_testing, verbose=2)

However, I get the following error:

Traceback (most recent call last):
  File "C:\Users\Eric\Desktop\coding\cs50\ai\lectures\lecture5\banknotes\banknotes.py", line 41, in <module>
    model.fit(X_training, y_training, epochs=20)
  File "C:\Users\Eric\Desktop\coding\cs50\ai\.venv\Lib\site-packages\keras\src\utils\traceback_utils.py", line 122, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\Eric\Desktop\coding\cs50\ai\.venv\Lib\site-packages\keras\src\trainers\data_adapters\__init__.py", line 113, in get_data_adapter
    raise ValueError(f"Unrecognized data type: x={x} (of type {type(x)})")
ValueError: Unrecognized data type: x=[...] (of type <class 'list'>)

where "..." is the training data.

Any idea what went wrong? I'm using Python version 3.11.8 and TensorFlow version 2.16.1 on a Windows computer.
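For reference, a quick way to confirm which versions are actually loaded in the environment the script runs in (the last line is only there to report whichever Keras build TensorFlow has pulled in):

import sys
import tensorflow as tf

# Versions in the interpreter that runs the script
print(sys.version)           # 3.11.8 locally
print(tf.__version__)        # 2.16.1 locally
print(tf.keras.__version__)  # the Keras build bundled with this TensorFlow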

I tried running the same code in a Google Colab notebook, and it works: the problem only occurs on my local machine. This is the output I'm expecting:

Epoch 1/20
26/26 [==============================] - 1s 2ms/step - loss: 1.1008 - accuracy: 0.5055
Epoch 2/20
26/26 [==============================] - 0s 2ms/step - loss: 0.8588 - accuracy: 0.5334
Epoch 3/20
26/26 [==============================] - 0s 2ms/step - loss: 0.6946 - accuracy: 0.5917
Epoch 4/20
26/26 [==============================] - 0s 2ms/step - loss: 0.5970 - accuracy: 0.6683
Epoch 5/20
26/26 [==============================] - 0s 2ms/step - loss: 0.5265 - accuracy: 0.7120
Epoch 6/20
26/26 [==============================] - 0s 2ms/step - loss: 0.4717 - accuracy: 0.7655
Epoch 7/20
26/26 [==============================] - 0s 2ms/step - loss: 0.4258 - accuracy: 0.8177
Epoch 8/20
26/26 [==============================] - 0s 2ms/step - loss: 0.3861 - accuracy: 0.8433
Epoch 9/20
26/26 [==============================] - 0s 2ms/step - loss: 0.3521 - accuracy: 0.8615
Epoch 10/20
26/26 [==============================] - 0s 2ms/step - loss: 0.3226 - accuracy: 0.8870
Epoch 11/20
26/26 [==============================] - 0s 2ms/step - loss: 0.2960 - accuracy: 0.9028
Epoch 12/20
26/26 [==============================] - 0s 2ms/step - loss: 0.2722 - accuracy: 0.9125
Epoch 13/20
26/26 [==============================] - 0s 2ms/step - loss: 0.2506 - accuracy: 0.9283
Epoch 14/20
26/26 [==============================] - 0s 2ms/step - loss: 0.2306 - accuracy: 0.9514
Epoch 15/20
26/26 [==============================] - 0s 3ms/step - loss: 0.2124 - accuracy: 0.9660
Epoch 16/20
26/26 [==============================] - 0s 2ms/step - loss: 0.1961 - accuracy: 0.9769
Epoch 17/20
26/26 [==============================] - 0s 2ms/step - loss: 0.1813 - accuracy: 0.9781
Epoch 18/20
26/26 [==============================] - 0s 2ms/step - loss: 0.1681 - accuracy: 0.9793
Epoch 19/20
26/26 [==============================] - 0s 2ms/step - loss: 0.1562 - accuracy: 0.9793
Epoch 20/20
26/26 [==============================] - 0s 2ms/step - loss: 0.1452 - accuracy: 0.9830
18/18 - 0s - loss: 0.1407 - accuracy: 0.9891 - 187ms/epoch - 10ms/step
[0.14066053926944733, 0.9890710115432739]
Wolgast answered 4/4 at 0:28 Comment(1)
I am thinking that the parameters that you are passing to train_test_split are incorrect, see scikit-learn.org/stable/modules/generated/… – Peripteral

https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit

It appears you're passing Model.fit(x, y) data of the wrong type: plain Python lists instead of arrays or tensors.

What I almost always do before handing data off to train_test_split is convert my features and labels to NumPy arrays.

So you can either convert them before handing them off to train_test_split, or do it just before calling model.fit(...).

NOTE: Don't forget to add import numpy as np

So in your case you'd do:

# Convert the Python lists to NumPy arrays so Keras can consume them
X_training_np = np.array(X_training)
y_training_np = np.array(y_training)

model.fit(X_training_np, y_training_np, epochs=...)
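Alternatively, converting once before the split keeps every downstream call (model.fit and model.evaluate) working on NumPy arrays. A minimal sketch of the data-handling part, reusing the evidence and labels lists built from the CSV above (the model definition stays unchanged; train_test_split returns NumPy arrays when you pass NumPy arrays in):

import numpy as np
from sklearn.model_selection import train_test_split

# Convert once, up front: evidence becomes an (n_samples, 4) float array,
# labels an (n_samples,) integer array
evidence = np.array(evidence)
labels = np.array(labels)

# The splits are now NumPy arrays as well, so fit() and evaluate() accept them
X_training, X_testing, y_training, y_testing = train_test_split(
    evidence, labels, test_size=0.4
)

model.fit(X_training, y_training, epochs=20)
model.evaluate(X_testing, y_testing, verbose=2)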
Boris answered 4/4 at 4:1 Comment(1)
As a note, I'd also have to do the same with model.evaluate(). – Wolgast
import tensorflow as tf
from sklearn.model_selection import train_test_split
import csv
import numpy as np


# Read data in from file
with open("banknotes.csv") as f:
    reader = csv.reader(f)
    next(reader)

    data = []
    for row in reader:
        data.append({
            "evidence": [float(cell) for cell in row[:4]],
            "label": 1 if row[4] == "0" else 0
        })

# Separate data into training and testing groups
evidence = [row["evidence"] for row in data]
labels = [row["label"] for row in data]
# convert to numpy arrays
evidence = np.array(evidence)
labels = np.array(labels)
X_training, X_testing, y_training, y_testing = train_test_split(
    evidence, labels, test_size=0.4
)

# Create a sequential neural network
model = tf.keras.models.Sequential()

# Declare the input shape with an explicit Input layer
# (instead of passing input_shape= to the first Dense layer)
model.add(tf.keras.Input(shape=(4,)))

# Add a hidden layer with 8 units, with ReLU activation
model.add(tf.keras.layers.Dense(8, activation="relu"))

# Add output layer with 1 unit, with sigmoid activation
model.add(tf.keras.layers.Dense(1, activation="sigmoid"))

# Train neural network
model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy"]
)
model.fit(X_training, y_training, epochs=20)

# Evaluate how well model performs
model.evaluate(X_testing, y_testing, verbose=2)
Torchwood answered 13/7 at 21:43 Comment(0)
