Get the current contents of the entire Jupyter Notebook

I have a Jupyter Notebook running. I want to be able to access the source of the current Jupyter Notebook from within Python. My end goal is to pass it into ast.parse so I can do some analysis on the user's code. Ideally, I'd be able to do something like this:

import ast
ast.parse(get_notebooks_code())

Obviously, if the source code was an IPYNB file, there'd be an intermediary step of extracting the code from the Python cells, but that's a relatively easy problem to solve.
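As a rough illustration of that extraction step, here is a minimal sketch. It assumes an nbformat-4 style JSON document, and the helper name `code_from_ipynb` is made up for this example. It skips cells that start with a cell magic and comments out line magics and shell escapes, since `ast.parse` would reject them. The magic detection is a simple heuristic and will misfire on, for example, a continuation line that happens to begin with `%`.

```python
import ast
import json


def code_from_ipynb(nb_json: str) -> str:
    """Join the source of the Python code cells found in a notebook JSON string.

    Hypothetical helper for illustration; assumes the nbformat-4 layout
    (a top-level "cells" list with "cell_type" and "source" keys).
    """
    notebook = json.loads(nb_json)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The source may be stored as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        # Cells that start with a cell magic (e.g. %%javascript) are not Python.
        if source.lstrip().startswith("%%"):
            continue
        lines = []
        for line in source.splitlines():
            stripped = line.lstrip()
            if stripped.startswith(("%", "!")):
                # Heuristic: treat the line as a line magic or shell escape and
                # replace it with a no-op that keeps the original indentation.
                indent = line[: len(line) - len(stripped)]
                line = indent + "pass  # " + stripped
            lines.append(line)
        chunks.append("\n".join(lines))
    return "\n".join(chunks)


example_notebook = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["%matplotlib inline\n", "x = 1\n"]},
        {"cell_type": "code", "source": "%%javascript\nconsole.log(1)"},
        {"cell_type": "code", "source": "def f():\n    return x"},
    ],
})
tree = ast.parse(code_from_ipynb(example_notebook))
node_types = [type(node).__name__ for node in tree.body]
```

This only covers the file-to-string step; it does nothing about the file on disk being out of date.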

So far, I've found code that will use the list_running_servers function of the IPython object in order to make a request and match up kernel IDs - this gives me the filename of the currently running notebook. This would work, except for the fact that the source code on disk may not match up with what the user has in the browser (until you save a new checkpoint).
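The matching step of that approach can be sketched roughly as follows. The two helper names are invented for this example, and the session dictionaries imitate what the server's `api/sessions` endpoint is assumed to return (a kernel `id` plus a notebook path). In a live notebook the connection file path would typically come from `ipykernel.get_connection_file()` and the session list from an HTTP request to each running server; both are left out here so the sketch runs on its own.

```python
import os
import re
from typing import Optional


def kernel_id_from_connection_file(path: str) -> Optional[str]:
    """Extract the kernel ID from a connection file named kernel-<id>.json."""
    match = re.fullmatch(r"kernel-(.+)\.json", os.path.basename(path))
    return match.group(1) if match else None


def notebook_path_for_kernel(sessions: list, kernel_id: str) -> Optional[str]:
    """Return the notebook path of the session that uses the given kernel."""
    for session in sessions:
        if session.get("kernel", {}).get("id") == kernel_id:
            # Assumption: the path is available either at the top level of
            # the session or nested under a "notebook" key.
            return session.get("path") or session.get("notebook", {}).get("path")
    return None


# Made-up data shaped like the assumed session listing.
sample_sessions = [
    {"id": "s1", "kernel": {"id": "abc-123"}, "notebook": {"path": "work/demo.ipynb"}},
    {"id": "s2", "kernel": {"id": "def-456"}, "path": "other.ipynb"},
]
kernel_id = kernel_id_from_connection_file("/run/user/1000/jupyter/kernel-abc-123.json")
found_path = notebook_path_for_kernel(sample_sessions, kernel_id)
```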

I've seen some ideas involving extracting the data using JavaScript, but that requires either a separate cell with a magic or calling the display.Javascript function, which fires asynchronously and therefore doesn't let me pass the result to ast.parse.

Does anyone have any clever ideas for how to dynamically get the current notebook's source code as a string in Python for immediate processing? I'm perfectly fine if this needs to be an extension or even a kernel wrapper; I just need to get the source code somehow.

Starling answered 1/7, 2018 at 8:29 Comment(1)
I agree, it would be nice if there were something that ties into Jupyter/IPython natively in Python. Something with an interface like jupyter.notebook.dumps(). I haven't found it yet either. - Rowell

Well, this isn't exactly what I wanted, but here's my current strategy. I need to run some Python code based on the user's code, but it doesn't actually have to be connected to the user's code directly. So I'm just going to run the following magic afterwards:

%%javascript
// Get source code from the code cells, skipping %%javascript cells like this one.
var source_code = Jupyter.notebook.get_cells().filter(function(cell) {
    return cell.cell_type == "code" &&
        !cell.code_mirror.getValue().startsWith("%%javascript");
}).map(function(cell) {
    return cell.code_mirror.getValue();
}).join("\n");
// Embed the code as a Python string literal.
source_code = JSON.stringify(source_code);
var instructor_code = "student_code="+source_code;
instructor_code += "\nimport ast\nprint(ast.dump(ast.parse(student_code)))\nprint('Great')";
// Run the Python code along with additional code I wanted.
var kernel = Jupyter.notebook.kernel;
var t = kernel.execute(instructor_code, { 'iopub' : {'output' : function(x) {
    if (x.msg_type == "error") {
        console.error(x.content);
        element.text(x.content.ename+": "+x.content.evalue+"\n"+x.content.traceback.join("\n"));
    } else if (x.msg_type == "stream") {
        // Only stream messages (print output) carry a text field.
        element.html(x.content.text.replace(/\n/g, "<br>"));
        console.log(x);
    } else {
        console.log(x);
    }
}}});
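The Python that gets sent to the kernel could be a little more defensive than a bare ast.parse call, since student code will often not parse at all. The following is only a sketch of what that instructor-side code might look like; `summarize_student_code` is a made-up name, and the particular summary (function names and imported names) is just an example of an analysis.

```python
import ast


def summarize_student_code(student_code: str) -> dict:
    """Parse the code and report defined functions and imported names.

    Hypothetical helper; returns an error description instead of raising
    when the code is not valid Python.
    """
    try:
        tree = ast.parse(student_code)
    except SyntaxError as exc:
        return {"ok": False, "error": f"line {exc.lineno}: {exc.msg}"}
    functions = sorted(
        node.name for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)
    )
    # For "from m import a", this records "a" (the imported name), not "m".
    imports = sorted(
        alias.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.Import, ast.ImportFrom))
        for alias in node.names
    )
    return {"ok": True, "functions": functions, "imports": imports}


good = summarize_student_code("import math\ndef area(r):\n    return math.pi * r ** 2\n")
bad = summarize_student_code("def broken(:\n")
```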
Starling answered 1/7, 2018 at 21:6 Comment(0)

What about combining https://mcmap.net/q/75972/-how-do-i-get-the-current-ipython-jupyter-notebook-name and https://mcmap.net/q/555923/-get-only-the-code-out-of-jupyter-notebook? That gives something like

%%javascript
IPython.notebook.kernel.execute('nb_name = "' + IPython.notebook.notebook_name + '"')

and

import os
from nbformat import read, NO_CONVERT

nb_full_path = os.path.join(os.getcwd(), nb_name)
with open(nb_full_path) as fp:
    notebook = read(fp, NO_CONVERT)
cells = notebook['cells']
code_cells = [c for c in cells if c['cell_type'] == 'code']
for no_cell, cell in enumerate(code_cells):
    print(f"####### Cell {no_cell} #########")
    print(cell['source'])
    print("")

I get

####### Cell 0 #########
%%javascript
IPython.notebook.kernel.execute('nb_name = "' + IPython.notebook.notebook_name + '"')

####### Cell 1 #########
import os
from nbformat import read, NO_CONVERT

nb_full_path = os.path.join(os.getcwd(), nb_name)
with open(nb_full_path) as fp:
    notebook = read(fp, NO_CONVERT)
cells = notebook['cells']
code_cells = [c for c in cells if c['cell_type'] == 'code']
for no_cell, cell in enumerate(code_cells):
    print(f"####### Cell {no_cell} #########")
    print(cell['source'])
    print("")
Hagi answered 31/1, 2022 at 1:20 Comment(4)
Doesn't this rely on the latest version of the code being saved before being executed? - Starling
Yes, indeed, @AustinCoryBart. You can save the notebook from within itself, though, if you want. Something like from IPython.display import display, Javascript; display(Javascript('IPython.notebook.save_checkpoint();')), from https://mcmap.net/q/465101/-save-an-ipython-notebook-programmatically-from-within-itself - Studer
Is there a way to guarantee that the save finishes before the execution loads the file? I seem to recall going down this path and that being a major issue. - Starling
I am not sure if any of these functions run in a separate thread. As long as they don't, you should be fine... but I can't say for sure without more investigation. - Studer
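One possible workaround for the timing question raised in these comments is to record the file's modification time before triggering the save and then poll until it changes. This is only a sketch under assumptions: `wait_for_save` is a made-up helper, a newer modification time is taken as a sign that the save happened, and that sign does not strictly prove the write has finished.

```python
import os
import time


def wait_for_save(path: str, previous_mtime: float,
                  timeout: float = 10.0, poll: float = 0.1) -> bool:
    """Return True once the file's mtime is newer than previous_mtime.

    Returns False if that does not happen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if os.path.getmtime(path) > previous_mtime:
                return True
        except OSError:
            # The file may be briefly missing while it is being replaced.
            pass
        time.sleep(poll)
    return False
```

Usage would be along the lines of reading os.path.getmtime(nb_full_path) first, triggering the save, and only opening the file if wait_for_save(nb_full_path, old_mtime) returns True.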

First of all, use the following code inside your current Notebook:

import os
print(os.getenv("JPY_SESSION_NAME"))

It will show you the full path of your current Notebook.

JPY_SESSION_NAME is an environment variable that is set in the kernel's environment when the Notebook session starts, so it may be missing on setups that do not define it.

Note: If you rename your Notebook, you will need to restart it to get the new path in JPY_SESSION_NAME.
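Since the variable may be absent, a slightly more careful lookup could fall back to None instead of printing it directly. This is a minimal sketch; `notebook_path_from_env` is a made-up name, and the value is returned as-is because this answer does not say which directory a relative value would be relative to.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def notebook_path_from_env(env: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return the path stored in JPY_SESSION_NAME, or None if it is unset or empty."""
    name = env.get("JPY_SESSION_NAME")
    if not name:
        return None
    # Assumption: the value is a usable file path. It is not resolved here,
    # because the base directory of a relative value is not known.
    return Path(name)


path = notebook_path_from_env()
if path is not None and path.is_file():
    # This reads the saved file, so unsaved edits in the browser are not included.
    notebook_text = path.read_text(encoding="utf-8")
```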

You're welcome.

Tayler answered 24/5 at 11:34 Comment(1)
Please don't post the same answer at multiple questions. If you think that the questions are so similar that this should be an option, choose one and mark the others as duplicates. - Refrigerant
