I keep rerunning the same mxnet script while I try to iron out some bugs in it (it is a new script, and I am new to mxnet). Pretty often when I run the script I get an error that the GPU is out of memory, and when I check with nvidia-smi, this is what I see:
Wed Dec  5 15:41:29 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.24.02              Driver Version: 396.24.02                 |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:65:00.0  On |                  N/A |
|  0%   54C    P2    68W / 300W |  10891MiB / 11144MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1446      G   /usr/lib/xorg/Xorg                            40MiB |
|    0      1481      G   /usr/bin/gnome-shell                         114MiB |
|    0     10216      G   ...-token=8422C9FC67F51AEC1893FEEBE9DB68C6    31MiB |
|    0     18221      G   /usr/lib/xorg/Xorg                           458MiB |
|    0     18347      G   /usr/bin/gnome-shell                         282MiB |
+-----------------------------------------------------------------------------+
So it seems like most of the memory is in use (10891 MiB of 11144 MiB), but I don't see any process in the list taking up a large portion of the GPU, so there doesn't seem to be anything to kill. My mxnet script has already exited, so I assume it isn't that. I would understand a lag of a few seconds, or even tens of seconds, if the GPU does not know right away that the script no longer needs the memory, but it has been many minutes and the display is still the same.

What gives, and is there some memory cleanup I should do? If so, how? Thank you for any tips to a newbie.
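
For reference, here is the arithmetic behind that observation, as a small Python sketch. It uses only the figures copied by hand from the nvidia-smi output above; the variable names are just mine for illustration, and nothing here queries the GPU.

```python
# Per-process GPU memory in MiB, keyed by PID, copied from the process table above.
listed_mib = {1446: 40, 1481: 114, 10216: 31, 18221: 458, 18347: 282}

# "Memory-Usage" reported for GPU 0 in the same output.
used_mib = 10891
total_mib = 11144

# Memory attributed to the listed processes versus memory reported as in use.
listed_total = sum(listed_mib.values())
unaccounted = used_mib - listed_total

print(f"listed processes: {listed_total} MiB")
print(f"not attributed to any listed process: {unaccounted} MiB of {used_mib} MiB in use")
```

So the five listed processes account for under 1 GiB, and the rest of the used memory is not attributed to anything I can see in the table.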