Jupyter Notebook kernel keeps dying - low memory?

I am trying two different lines of code that both involve computing combinations of rows of a df with 500k rows.

I think because of the large number of combinations, the kernel keeps dying. Is there any way to resolve this?

Both lines of code that crash are:

pd.merge(df.assign(key=0), df.assign(key=0), on='key').drop('key', axis=1)

and

from itertools import combinations
index_comb = list(combinations(df.index, 2))

Both are different ways to achieve the same desired df, but the kernel fails on both.

Would appreciate any help :/

Update: I tried running the code in my terminal and it gives me a "Killed: 9" error. Is it using too much memory in the terminal as well?

Lillia answered 11/3, 2019 at 7:43 Comment(0)

There is no solution within Jupyter itself that I know of. Jupyter Notebook simply is not designed to handle huge quantities of data. Run your code as a script from a terminal instead; that should work.
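For scale: with 500k rows there are on the order of 10^11 two-row combinations, so any approach that materialises all of them at once (the full cross join, or list(combinations(...))) will exhaust RAM long before it finishes. A rough sanity check, just counting index pairs rather than measuring real memory:

from math import comb

n_rows = 500_000
n_pairs = comb(n_rows, 2)                     # 500000 * 499999 / 2
print(f"{n_pairs:,} pairs")                   # ~1.25e11 pairs

# Even at a bare 16 bytes per (i, j) index pair that is about 2 TB:
print(f"{n_pairs * 16 / 1e12:.1f} TB just for the index pairs")

# If each pair only needs to be looked at once, iterating lazily
# avoids building the whole list:
# from itertools import combinations
# for i, j in combinations(df.index, 2):
#     ...  # process one pair at a time

Running the same code from a terminal does not change this arithmetic, which is why the Killed: 9 shows up there too.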

In case you run into the same problem when using a terminal, look here: Python Killed: 9 when running a code using dictionaries created from 2 csv files

Edit: I ended up finding a way to potentially solve this: increasing your container size should prevent Jupyter from running out of memory. To do so, open Jupyter's settings.cfg file in the home directory of your notebook ($CHORUS_NOTEBOOK_HOME). The line to edit is this one:

#default memory per container

MEM_LIMIT_PER_CONTAINER="1g"

The default value should be 1 GB per container; increasing this to 2 or 4 GB should help with memory-related crashes. However, I am unsure of any implications this has on performance, so be warned!
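If you want to check whether a bigger container could realistically be enough, you can estimate the footprint of the data first. This is only a sketch (the file name is whatever you actually load, and the cross-join estimate is very rough):

import pandas as pd

df = pd.read_csv("x.csv")                     # whatever file you load in the notebook

base = df.memory_usage(deep=True).sum()       # bytes held by the original df
print(f"df itself: {base / 1e9:.2f} GB")

# A cross join via merge on a constant key produces len(df) ** 2 rows,
# each carrying both copies of the columns, so very roughly:
print(f"full cross join: {2 * base * len(df) / 1e9:.0f} GB")

For the 500k-row case in the question that estimate comes out far above anything a container limit can cover, so raising the limit mainly helps for data that is only somewhat too big for the default 1 GB.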

Jerome answered 11/3, 2019 at 8:29 Comment(8)
Thanks :) I don't really use the terminal - would reading CSVs and all that work? pd.read_csv('x.csv') - would I need to alter this to point to the directory where the CSV is as well? Lillia
Also, if you want to output a df as a CSV from the terminal, can you direct that CSV to live in a specific directory? @neweyes Lillia
Code almost never needs to be altered when coming from a notebook. IPython works under Jupyter, so you should simply be able to copy-paste your code into a file (located in the same directory as the notebook) and run it by typing: ipython filename.py Jerome
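As a minimal sketch of such a file (the paths here are made up; point them at wherever your CSV actually lives and wherever you want the output written):

import pandas as pd

# Read the input from an explicit path instead of relying on the
# script's working directory:
df = pd.read_csv("/path/to/data/x.csv")

# ... whatever processing you do in the notebook ...

# to_csv also accepts a full path, so the output can live anywhere:
df.to_csv("/path/to/output/result.csv", index=False)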
I am having trouble understanding your second question... Do you just want to save your output at a specific location? Jerome
Sorry about that - I updated the post - I tried it in the terminal and I get a Killed: 9 error - is there too little memory for the terminal as well? I don't understand @neweyes Lillia
Edited a link into my answer. Jerome
@Jerome Could you elaborate on what "huge quantities" of data means? I've used datasets in the 10Ks without a hitch, and I'm just wondering where the performance degradation occurs that makes you say this. Troytroyer
First of all, I am unsure where a significant performance degradation actually happens with increasing data; Jupyter just crashes at some point, when the data becomes too big. How big "too big" is depends on how much memory you have enabled for your containers. I recently found out how to change this and will edit it into my answer. Jerome
