I am facing some issues with GCP and the AI Platform (JupyterLab). I am unable to keep a stable connection to the server for any length of time: I keep getting a 'server connection error' message. From there, one of two things happens:
- either nothing else happens and my cell keeps running, or
- the cells stop running and the status 'No Kernel!' appears at the top right of the notebook. When I select a kernel (Python 3) again, I can sometimes keep working. Other times the cell shows the running status (with the * to its left) while the kernel status at the bottom left stays on 'connected' instead of 'busy'. In that case I have to restart the kernel and run all the cells again, which can take a very long time.
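A minimal sketch of how the two cases above could be told apart, assuming the kernel process itself is what dies in the second case. The helper name `start_heartbeat` and the file name `heartbeat.log` are hypothetical, not part of JupyterLab or AI Platform. A background thread in the kernel appends a timestamp to a file at a fixed interval; if the timestamps keep coming after the error message, the kernel is still alive and only the browser connection dropped.

```python
import threading
from datetime import datetime
from pathlib import Path


def start_heartbeat(path="heartbeat.log", interval=5.0):
    """Append a timestamp to `path` every `interval` seconds.

    Runs in a daemon thread inside the kernel process, so the
    timestamps stop as soon as the kernel dies or is restarted.
    Returns an Event; call .set() on it to stop the heartbeat.
    """
    stop = threading.Event()

    def beat():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval):
            with Path(path).open("a") as f:
                f.write(datetime.now().isoformat() + "\n")

    threading.Thread(target=beat, daemon=True).start()
    return stop
```

Calling `stop = start_heartbeat()` in the first cell and then watching the file from a terminal on the instance (for example with `tail -f heartbeat.log`) should show whether the kernel outlives the 'server connection error'.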
Sometimes this happens as soon as I run the first cell after (re)starting the instance, sometimes a bit later. The longest I have been able to work on the notebook without any issue is about 20 to 30 minutes, which is quite annoying.
Configuration of my main instance:
- 16 CPUs
- 60 GB of RAM
- An NVIDIA P100 GPU
I have tried different types of instance and I keep having the same problem. My network at home is stable.