How can I change this function to make it more memory-efficient? I keep getting a MemoryError.
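import numpy as np

def vectorize_sequences(sequences, dimension=10000):
    # One row per sequence, one column per word index (multi-hot encoding)
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        # Set every column listed in this sequence to 1
        results[i, sequence] = 1.
    return results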
I call the function here:
x_train = vectorize_sequences(train_data)
x_test = vectorize_sequences(test_data)
The training and test data come from the IMDB dataset for sentiment analysis, loaded like this:
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)
EDIT: I am running this on a 64-bit Ubuntu system with 4 GB of RAM.
Here is the traceback:
Traceback (most recent call last):
  File "/home/uttam/PycharmProjects/IMDB/imdb.py", line 29, in <module>
    x_test = vectorize_sequences(test_data)
  File "/home/uttam/PycharmProjects/IMDB/imdb.py", line 20, in vectorize_sequences
    results = np.zeros((len(sequences), dimension))
MemoryError
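For context, here is a rough estimate of how much memory these two calls ask for, plus one possible lower-memory variant. This is only a sketch under stated assumptions: it assumes each IMDB split holds 25,000 reviews and that np.zeros uses its default float64 dtype. The names dense_bytes and vectorize_sequences_compact, and the dtype parameter, are hypothetical additions for illustration and are not part of the code above.

```python
import numpy as np

def dense_bytes(n_rows, dimension=10000, dtype=np.float64):
    """Bytes needed for a dense (n_rows, dimension) array of the given dtype."""
    return n_rows * dimension * np.dtype(dtype).itemsize

def vectorize_sequences_compact(sequences, dimension=10000, dtype=np.float32):
    """Hypothetical variant: same multi-hot encoding, but with a smaller dtype."""
    results = np.zeros((len(sequences), dimension), dtype=dtype)
    for i, sequence in enumerate(sequences):
        # Mark every word index that occurs in this sequence
        results[i, sequence] = 1
    return results

n_reviews = 25000  # assumption: size of each IMDB split

# Estimated size in GB (10**9 bytes) of one dense array per dtype
print(dense_bytes(n_reviews) / 1e9)                    # float64: 2.0
print(dense_bytes(n_reviews, dtype=np.float32) / 1e9)  # float32: 1.0
print(dense_bytes(n_reviews, dtype=np.uint8) / 1e9)    # uint8:   0.25
```

Under these assumptions, x_train and x_test together need about 4 GB as float64, which would be consistent with the error appearing on the second call on a 4 GB machine. Whether float32 or uint8 is acceptable depends on what the downstream model expects, so treat the dtype choice as something to verify rather than a guaranteed fix.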