Keras with TensorFlow backend: MemoryError

I am trying to follow this tutorial to learn a bit about deep learning with Keras, but I keep getting a MemoryError. Can you please point out what is causing it and how to fix it?

Here is the code:

import numpy as np
from keras import models, regularizers, layers
from keras.datasets import imdb

(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

def vectorize_sequences(sequences, dimension=10000):
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1.
    return results


x_train = vectorize_sequences(train_data)

Here is the traceback (the line numbers do not match those of the code shown above):

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/home/uttam/pycharm-2018.2.4/helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "/home/uttam/pycharm-2018.2.4/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/uttam/PycharmProjects/IMDB/imdb.py", line 33, in <module>
    x_train = vectorize_sequences(train_data)
  File "/home/uttam/PycharmProjects/IMDB/imdb.py", line 27, in vectorize_sequences
    results = np.zeros((len(sequences), dimension))
MemoryError
Languish answered 7/11, 2018 at 14:9 Comment(4)
"Probably"? Please include the full error trace. If you are correct, arguably the largest part of your code is irrelevant to the issue and should be removed. - Emma
I have edited the question to add the error trace. - Bauman
So, all the code below x_train = vectorize_sequences(train_data) is irrelevant to the problem (it is never executed). I am removing it; keep this in mind for the future. - Emma
A related question: #68422910 - Warehouse

Yes, you are correct. The problem does arise from vectorize_sequences.
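
A rough size estimate shows why that allocation can fail. The sketch below assumes the 25,000 training reviews of the Keras IMDB dataset and the dimension of 10000 from the question; np.zeros uses float64 (8 bytes per element) unless a dtype is given. The helper name dense_size_bytes is made up for illustration.

```python
def dense_size_bytes(n_samples, dimension, itemsize):
    """Bytes needed for a dense (n_samples, dimension) array."""
    return n_samples * dimension * itemsize

n_samples = 25000   # assumed size of the IMDB training split
dimension = 10000   # num_words used in the question

# Bytes per element for a few NumPy dtypes.
itemsizes = {"float64": 8, "float32": 4, "uint8": 1}

for name, itemsize in itemsizes.items():
    n_bytes = dense_size_bytes(n_samples, dimension, itemsize)
    print(f"{name}: {n_bytes / 2**30:.2f} GiB")
# float64: 1.86 GiB
# float32: 0.93 GiB
# uint8: 0.23 GiB
```

Passing a smaller dtype to np.zeros (for example dtype=np.float32) shrinks the array accordingly, which may already be enough on a machine with a few GB of free RAM.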

You should run that logic in batches (slicing the data, as is done for partial_x_train) or use generators (here is a good explanation and example).
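
A minimal sketch of the batching idea, not the tutorial's own code: the function name vectorize_batches, the batch_size default, and the float32 dtype are assumptions chosen for illustration. Only one batch-sized array exists at a time, so peak memory depends on batch_size rather than on the full dataset.

```python
import numpy as np

def vectorize_batches(sequences, batch_size=512, dimension=10000,
                      dtype=np.float32):
    """Yield multi-hot encoded batches instead of one huge array."""
    for start in range(0, len(sequences), batch_size):
        chunk = sequences[start:start + batch_size]
        # Allocate only a (batch, dimension) array, never the full matrix.
        batch = np.zeros((len(chunk), dimension), dtype=dtype)
        for i, sequence in enumerate(chunk):
            batch[i, sequence] = 1.0
        yield batch

# Tiny stand-in data so the sketch runs without downloading IMDB.
toy_sequences = [[1, 3], [0, 2, 4], [4], [1, 1, 2], [3]]
batches = list(vectorize_batches(toy_sequences, batch_size=2, dimension=5))
print(len(batches), batches[0].shape, batches[-1].shape)
```

For training, the labels would have to be sliced in step with the sequences so that each batch stays aligned with its targets.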

I hope this helps :)

Ns answered 7/11, 2018 at 14:29 Comment(6)
How do you know that the problem is in vectorize_sequences without the full traceback from the OP? - Therm
I've seen that tutorial. - Ns
That doesn't really answer what I asked. You could say it's most likely that part, but being 100% sure of it without the additional information can be misleading. - Therm
@Ns so I just make it a generator function instead of a normal function by replacing return with yield? It gives me TypeError: 'generator' object is not subscriptable for the line x_val = x_train[:10000]. - Bauman
Yes. Generators are evaluated lazily, so you cannot slice them. They take as much data as they need. - Ns
@Ns but it gave me TypeError: 'generator' object is not subscriptable for the line x_val = x_train[:10000]. Can you please show me how I can change the function? Thanks. - Bauman
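
One possible way around the 'generator' object is not subscriptable error, sketched under assumptions: split the raw sequences before vectorizing (lists and arrays can be sliced), and use itertools.islice when a prefix of a generator is needed. The name vectorize_lazily and the toy data are hypothetical stand-ins, not part of the tutorial.

```python
from itertools import islice

import numpy as np

def vectorize_lazily(sequences, dimension=10000):
    """Yield one multi-hot row at a time (hypothetical generator variant)."""
    for sequence in sequences:
        row = np.zeros(dimension, dtype=np.float32)
        row[sequence] = 1.0
        yield row

toy_data = [[0, 2], [1], [3, 4], [2, 3]]  # stand-in for train_data

# Split the raw sequences first; slicing works here, unlike on a generator.
val_data = toy_data[:1]
partial_train_data = toy_data[1:]

# The small validation part can be materialized as a regular array.
x_val = np.stack(list(vectorize_lazily(val_data, dimension=5)))

# The training part stays lazy; islice takes its first two rows.
train_rows = vectorize_lazily(partial_train_data, dimension=5)
first_two = list(islice(train_rows, 2))
print(x_val.shape, len(first_two))
```

Note that islice consumes the rows it returns, so the generator continues from the third row afterwards.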
