Jupyter notebook's kernel keeps dying when I run the code

4

11

I took my first steps in deep learning by following this tutorial, and everything was going well until I needed to train the network in a Jupyter notebook. I tried almost everything, and I always get this error:

The kernel appears to have died. It will restart automatically.

When I check the terminal, I see this:

 [I 18:32:24.897 NotebookApp] Adapting to protocol v5.1 for kernel 0d2f57af-46f5-419c-8c8e-9676c14dd9e3
2019-03-09 18:33:12.906756: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-03-09 18:33:12.907661: I tensorflow/core/common_runtime/process_util.cc:69] Creating new thread pool with default inter op setting: 4. Tune using inter_op_parallelism_threads for best performance.
OMP: Error #15: Initializing libiomp5.dylib, but found libiomp5.dylib already initialized.
OMP: Hint: This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
[I 18:33:13.864 NotebookApp] KernelRestarter: restarting kernel (1/5), keep random ports
WARNING:root:kernel 0d2f57af-46f5-419c-8c8e-9676c14dd9e3 restarted

The code that I'm trying to run is fairly simple (even for me who is just starting to get into deep-learning)

import tensorflow as tf  

mnist = tf.keras.datasets.mnist  
(x_train, y_train),(x_test, y_test) = mnist.load_data()  

x_train = tf.keras.utils.normalize(x_train, axis=1)  
x_test = tf.keras.utils.normalize(x_test, axis=1) 

model = tf.keras.models.Sequential()  
model.add(tf.keras.layers.Flatten())  
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))  
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))  
model.add(tf.keras.layers.Dense(10, activation=tf.nn.softmax))  

model.compile(optimizer='adam',  
              loss='sparse_categorical_crossentropy',  
              metrics=['accuracy'])  

model.fit(x_train, y_train, epochs=3)  

val_loss, val_acc = model.evaluate(x_test, y_test)  
print(val_loss)  
print(val_acc)  

I tried every idea I had and went through almost all of the similar problems I could find on Google.

Friede answered 10/3, 2019 at 0:43 Comment(4)
It also happened to me. I need to test a model on millions of data points, but within a few minutes the Jupyter notebook reported "DEAD KERNEL". - Courson
Install TensorFlow in a separate Anaconda environment and make sure your libraries are updated. - Discretion
Can you reproduce the crash outside of a Jupyter notebook? If so, you've found a bug and should file it on GitHub. - Personable
Is that still an issue? I had the same problem, and I think it was caused by using conda instead of pip. Running conda uninstall tensorflow and then pip install tensorflow worked for me. - Hannon
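
One way to try the "reproduce it outside of Jupyter" check from the comments is to save the notebook code to a script and run it with the same interpreter the kernel uses. This is only a sketch: the helper name run_outside_notebook and the script name train_mnist.py are hypothetical, not part of the original post.

```python
import subprocess
import sys


def run_outside_notebook(script_path):
    """Run a script with the current interpreter and return (exit code, stderr).

    sys.executable is the interpreter running this code, so when called from a
    notebook cell it points at the same environment the kernel uses.
    """
    result = subprocess.run(
        [sys.executable, script_path],
        capture_output=True,  # requires Python 3.7+
        text=True,
    )
    return result.returncode, result.stderr


# Hypothetical usage, assuming the training code was saved as train_mnist.py:
# code, err = run_outside_notebook("train_mnist.py")
# print(code)
# print(err)
```

A non-zero exit code together with the same "OMP: Error #15" text in stderr would suggest the crash is not specific to Jupyter. On POSIX systems a negative exit code means the process was terminated by a signal.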
3

Which version of tensorflow did you download?

From the error log, it looks like there is an OpenMP library issue. I would try reinstalling TensorFlow at the latest stable version.

I had to update my TensorFlow (1.13.1) install to get that code working. Here is the output I got:

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py:435: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
Epoch 1/3
60000/60000 [==============================] - 6s 94us/sample - loss: 0.2652 - acc: 0.9213
Epoch 2/3
60000/60000 [==============================] - 6s 95us/sample - loss: 0.1103 - acc: 0.9660
Epoch 3/3
60000/60000 [==============================] - 6s 100us/sample - loss: 0.0735 - acc: 0.9765
10000/10000 [==============================] - 0s 35us/sample - loss: 0.0875 - acc: 0.9731
0.08748154099322855
0.9731

Depending on which package manager you are using, try upgrading:

For Pip & Python3:

pip3 install tensorflow --upgrade

For Anaconda:

conda update tensorflow

Then run

import tensorflow as tf
print(tf.__version__)

to verify that you have the latest available version.
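
If importing TensorFlow itself is what crashes the kernel, a rough alternative is to look up the install without importing it. The sketch below uses only the standard library; the helper name describe_install is made up for illustration, and it assumes the pip distribution is named "tensorflow" (variants such as tensorflow-cpu would need a different name passed to the version lookup).

```python
import importlib.util
import sys
from importlib import metadata  # Python 3.8+


def describe_install(package):
    """Report which interpreter is running and where `package` would load from.

    find_spec locates a top-level package without executing it, so this is
    safe to run even when importing the package kills the kernel.
    """
    spec = importlib.util.find_spec(package)
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None
    return {
        "interpreter": sys.executable,
        "location": spec.origin if spec else None,
        "version": version,
    }


# Hypothetical usage in a notebook cell:
# print(describe_install("tensorflow"))
```

Running the same lines in a plain terminal Python session and comparing the interpreter and location values can show whether Jupyter is using a different environment than the one that was upgraded.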

Roundtheclock answered 10/3, 2019 at 1:19 Comment(3)
I checked, and I have the latest version, but the problem still persists. - Friede
@Friede I was having the same issue, and upgrading TensorFlow to version 1.13 fixed it for me. - Adaadabel
Have you tried setting the environment variable from the error message, KMP_DUPLICATE_LIB_OK=TRUE, in Bash or whatever shell you're using? It would be a band-aid for now. It looks like the Jupyter kernel is picking up multiple OpenMP links somehow when it should only see one. Did you install TF both in Anaconda and through another platform? You may have multiple installed versions, and Jupyter is confused about which one to use. - Roundtheclock
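
For anyone who wants to try the KMP_DUPLICATE_LIB_OK band-aid from inside the notebook rather than the shell, here is a minimal sketch. The helper name allow_duplicate_openmp is hypothetical, and this assumes the variable is set before anything that loads OpenMP (such as TensorFlow) is imported. The OMP message itself describes this as unsafe and unsupported, so treat it as a temporary diagnostic rather than a fix.

```python
import os


def allow_duplicate_openmp():
    """Set the workaround variable named in the OMP error message.

    The OMP hint warns that this may cause crashes or silently incorrect
    results, so it is a stopgap only.
    """
    os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
    return os.environ["KMP_DUPLICATE_LIB_OK"]


# Run this in the first notebook cell, before `import tensorflow as tf`.
allow_duplicate_openmp()
```

If the kernel stops dying with this set, that points to duplicate OpenMP runtimes in the environment as the underlying cause, which still needs to be cleaned up.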
1

I tried multiple options suggested in various threads: upgrading matplotlib, downgrading matplotlib to a 2.x.x version, upgrading TensorFlow to 1.13.1, and so on. None worked. For me, even a simple dummy plot like the one below failed with "OMP: Error #15" as soon as a plot method was called after the fit method in Keras.

import matplotlib.pyplot as plt  # needed for plt.plot below

acc = [i for i in range(20)]
epochs = range(1, len(acc) + 1)
loss = range(1, len(acc) + 1)
plt.plot(epochs, loss, 'bo', label='Training loss')

The following, as suggested in this post, did the trick for me:

conda install nomkl
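
To check whether the environment really contains more than one copy of the OpenMP runtime, before or after installing nomkl, something like the following rough sketch can help. The helper name find_openmp_runtimes and the file name patterns are my assumptions, based only on the libiomp5.dylib name in the error message.

```python
from pathlib import Path


def find_openmp_runtimes(root, patterns=("libiomp5*", "libomp*")):
    """Return sorted paths under `root` whose file names look like OpenMP runtimes.

    The patterns are a guess based on the library named in the OMP error;
    adjust them for your platform.
    """
    root = Path(root)
    found = set()
    for pattern in patterns:
        found.update(p for p in root.rglob(pattern) if p.is_file())
    return sorted(found)


# Hypothetical usage: scan the active environment (this can take a while).
# import sys
# for path in find_openmp_runtimes(sys.prefix):
#     print(path)
```

Seeing the same library in more than one package directory would be consistent with the "multiple copies of the OpenMP runtime" hint in the log.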
Armoured answered 21/7, 2019 at 17:59 Comment(0)
1

You can try running python -m notebook (or python3 -m notebook) in your command prompt, and then run the code in that kernel. It worked for me.

Moureaux answered 30/9, 2021 at 7:20 Comment(0)
0

Update your TensorFlow package and restart your machine. Also, ensure that you have only one kernel active, then run the code again. That should fix the problem.

To upgrade TensorFlow using pip, use the command below:

pip install tensorflow --upgrade

For pip3, use

pip3 install tensorflow --upgrade

For conda, use

conda update tensorflow

Unreserved answered 9/3, 2020 at 22:34 Comment(0)
