I am using Google Colab on a dataset with 4 million rows and 29 columns. When I run the statement sns.heatmap(dataset.isnull()), it runs for some time, but after a while the session crashes and the instance restarts. This has been happening a lot, and so far I haven't seen any output. What could be the reason? Is the data/calculation too much? What can I do?
I'm not sure what is causing your specific crash, but a common cause is an out-of-memory error. It sounds like you're working with a large enough dataset that this is probable. You might try working with a subset of the dataset and see if the error recurs.
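For instance, a rough sketch of that idea (assuming `dataset` is your DataFrame; the 50,000-row sample size is arbitrary):

import seaborn as sns
import matplotlib.pyplot as plt

# Plot the missing-value pattern on a random sample instead of all 4M rows,
# which keeps the figure (and memory usage) manageable.
sample = dataset.sample(n=50_000, random_state=0)
sns.heatmap(sample.isnull(), cbar=False)
plt.show()

# Or skip the plot entirely and just count missing values per column:
print(dataset.isnull().sum())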
Otherwise, Colab keeps logs in /var/log/colab-jupyter.log. You may be able to get more insight into what is going on by printing its contents. Either run:
!cat /var/log/colab-jupyter.log
Or, to get the messages alone (easier to read):
import json
with open("/var/log/colab-jupyter.log", "r") as fo:
    for line in fo:
        print(json.loads(line)['msg'])
Note: on some instances /var/log does not contain colab-jupyter.log; the folder only lists: alternatives.log, bootstrap.log, dpkg.log, fontconfig.log, lastlog, pip.log.bak-run.sh, wtmp, apt, btmp, faillog, journal, pip.log, private.
Another possible cause: if you're using PyTorch and move your model to the GPU but don't move an internal tensor (e.g. a hidden state) to the GPU as well.
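A rough illustration of what that looks like (the LSTM and tensor shapes here are placeholders, not from the question):

import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.LSTM(input_size=8, hidden_size=16).to(device)  # model lives on the GPU
x = torch.randn(5, 3, 8, device=device)                   # input on the GPU

# Hidden/cell states created without a device default to the CPU, which
# causes a device-mismatch error once the model runs on CUDA:
#   h0 = torch.zeros(1, 3, 16)
#   c0 = torch.zeros(1, 3, 16)

# Fix: create the internal tensors on the same device as the model.
h0 = torch.zeros(1, 3, 16, device=device)
c0 = torch.zeros(1, 3, 16, device=device)
out, _ = model(x, (h0, c0))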
This error often occurs if you enable the GPU runtime but don't actually use it. Change your runtime type (hardware accelerator) to "None" and you should not hit this issue again.
I would first suggest closing your browser and restarting the notebook. Look at the runtime logs and check whether CUDA is mentioned anywhere. If not, do a factory runtime reset and run the notebook, then check your logs again; you should find CUDA mentioned there.
For me, passing specific arguments to the tfms augmentation broke the dataloader and crashed the session. I wasted a lot of time checking that the images weren't corrupt, cleaning up with gc, and more...
What worked for me was to click on the RAM/Disk Resources drop down menu, then 'Manage Sessions' and terminate my current session which had been active for days. Then reconnect and run everything again.
Before that, my code kept crashing even though it had worked perfectly the previous day, so I knew there was nothing wrong coding-wise.
After doing this, I also realized that the n_jobs parameter in GridSearchCV (https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html) plays a massive role in GPU RAM consumption. For example, for me execution works fine and doesn't crash if n_jobs is set to None, 1 (same as None), or 2. Setting it to -1 (using all processors) or anything above 3 crashes everything.
A common cause is an out-of-memory error. One possible reason is that you specified too large a batch size while training your model; try reducing the batch size.
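For example, a Keras-style sketch (the toy model and data below only stand in for your own; the point is the batch_size argument):

import numpy as np
from tensorflow import keras

# Toy stand-ins for your training data.
X_train = np.random.rand(1000, 29)
y_train = np.random.randint(0, 2, size=1000)

model = keras.Sequential([
    keras.layers.Dense(16, activation="relu", input_shape=(29,)),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# A smaller batch_size keeps fewer samples in GPU memory per training step.
model.fit(X_train, y_train, epochs=2, batch_size=32)  # instead of e.g. 512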