Converting to (not from) ipython Notebook format
T

10

76

IPython Notebook comes with nbconvert, which can export notebooks to other formats. But how do I convert text in the opposite direction? I ask because I already have materials, and a good workflow, in a different format, but I would like to take advantage of Notebook's interactive environment.

A likely solution: A notebook can be created by importing a .py file, and the documentation states that when nbconvert exports a notebook as a python script, it embeds directives in comments that can be used to recreate the notebook. But the information comes with a disclaimer about the limitations of this method, and the accepted format is not documented anywhere that I could find. (A sample is shown, oddly enough, in the section describing notebook's JSON format). Can anyone provide more information, or a better alternative?

Edit (1 March 2016): The accepted answer no longer works, because for some reason this input format is not supported by version 4 of the Notebook API. I have added a self-answer showing how to import a notebook with the current (v4) API. (I am not un-accepting the current answer, since it solved my problem at the time and pointed me to the resources I used in my self-answer.)

Tarsometatarsus answered 25/4, 2014 at 11:48 Comment(0)
M
44

The following works for IPython 3, but not IPython 4.

The IPython API has functions for reading and writing notebook files. You should use this API and not create JSON directly. For example, the following code snippet converts a script test.py into a notebook test.ipynb.

import IPython.nbformat.current as nbf

# Context managers make sure both files are closed after use
with open('test.py', 'r') as src:
    nb = nbf.read(src, 'py')
with open('test.ipynb', 'w') as dst:
    nbf.write(nb, dst, 'ipynb')

Regarding the format of the .py file understood by nbf.read it is best to simply look into the parser class IPython.nbformat.v3.nbpy.PyReader. The code can be found here (it is not very large):

https://github.com/ipython/ipython/blob/master/jupyter_nbformat/v3/nbpy.py

Edit: This answer was originally written for IPython 3. I don't know how to do this properly with IPython 4. Here is an updated version of the link above, pointing to the version of nbpy.py from the IPython 3.2.1 release:

https://github.com/ipython/ipython/blob/rel-3.2.1/IPython/nbformat/v3/nbpy.py

Basically you use special comments such as # <codecell> or # <markdowncell> to separate the individual cells. Look at the line.startswith statements in PyReader.to_notebook for a complete list.
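To illustrate how such marker comments delimit cells, here is a minimal standard-library sketch. It is not the PyReader implementation: the `MARKER` pattern and the `split_cells` helper are hypothetical names made up for this example, and markdown sources keep their leading "# " prefix.

```python
import re

# Matches marker comments such as "# <codecell>" or "# <markdowncell>".
# (Hypothetical pattern for illustration; the real parser uses startswith checks.)
MARKER = re.compile(r"^#\s*<(\w+cell)[^>]*>\s*$")


def split_cells(text):
    """Return a list of (cell_type, source) pairs from marker-delimited text."""
    cells = []
    kind, lines = None, []
    for line in text.splitlines():
        match = MARKER.match(line)
        if match:
            # A new marker closes the previous cell, if there was one.
            if kind is not None:
                cells.append((kind, "\n".join(lines).strip()))
            kind, lines = match.group(1), []
        elif kind is not None:
            lines.append(line)
    if kind is not None:
        cells.append((kind, "\n".join(lines).strip()))
    return cells


sample = """# <markdowncell>

# Some *markdown* text

# <codecell>

print("Hello, IPython")
"""

for cell_type, source in split_cells(sample):
    print(cell_type, repr(source))
```

Lines that appear before the first marker (such as a coding line) are simply skipped by this sketch.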

Misdo answered 25/4, 2014 at 12:8 Comment(10)
That's good to know, thanks. I had no plans to generate JSON directly, but rather to generate python with the right directives and let notebook import it; your code makes it possible to do this automatically rather than manually. But the question remains: what notebook directives can the python script contain?Tarsometatarsus
@Tarsometatarsus I've now added some additional notes about the special directives in .py notebook files.Misdo
Thanks! That and the sample .py file I linked to in my question seem like a sufficient reference. For completeness, the current directives are <nbformat> (3.0), <codecell>, <htmlcell>, <markdowncell>, <rawcell>, and <headingcell level=N>. (I'm ignoring the deprecated <plaintextcell>.)Tarsometatarsus
Sorry but I'm re-opening the question, because I can't get this to work yet! I can't open any .py file in notebook, as I should be able to. I've opened a separate question about it because I suspect the problem might be with my installation, but if you have any ideas I'd appreciate them.Tarsometatarsus
@Tarsometatarsus I think the notebook application can only open .ipynb, not .py files. How did you "export" the .py files from notebook? Was it with "File->Download as"? I only have IPython 1.2.1 here at the moment and this version creates files with the <...> tags. So I'm afraid I don't have any additional expertise to offer.Misdo
At least on my installation, this import statement gives UserWarning: IPython.nbformat.current is deprecated.Beheld
That link appears dead now, is there an alternative?Ranking
@Alex, how did that work for you? As far as I can tell, the python import/export format has been dropped from notebook format v.4-- the tokens "codecell" and "markdowncell" don't even appear in the source. I've resorted to reading python with a v3 notebook reader, then saving as v4.Tarsometatarsus
@Tarsometatarsus yes I think you're right, I couldn't see any of that code in v4 either. I had to resort to converting it to a v3 notebook, opening it and letting IPython convert it to v4 like you. It is by no means a clean solution, but seems to work.Ranking
@Alex, I have added a self-answer showing how to import notebooks and save them in v4 format immediately. (Also how to get around a nasty bug...)Tarsometatarsus
T
45

Since the code in the accepted answer does not work anymore, I have added this self-answer that shows how to import into a notebook with the current (v4) API.

Input format

Versions 2 and 3 of the IPython Notebook API can import a python script with special structuring comments, and break it up into cells as desired. Here's a sample input file (original documentation here). The first two lines are ignored, and optional. (In fact, the reader will ignore coding: and <nbformat> lines anywhere in the file.)

# -*- coding: utf-8 -*-
# <nbformat>3.0</nbformat>

# <markdowncell>

# The simplest notebook. Markdown cells are embedded in comments, 
# so the file is a valid `python` script. 
# Be sure to **leave a space** after the comment character!

# <codecell>

print("Hello, IPython")

# <rawcell>

# Raw cell contents are not formatted as markdown

(The API also accepts the obsolete directives <htmlcell> and <headingcell level=...>, which are immediately transformed to other types.)
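If your materials live in another format, one way to target this input format is to generate it. The sketch below is an assumption-laden illustration, not part of the notebook API: `to_nbpy` is a hypothetical helper that takes (cell_type, source) pairs, where cell_type is "code", "markdown" or "raw", and renders them with the directives shown above.

```python
def to_nbpy(cells):
    """Render (cell_type, source) pairs as a v3-style annotated script.

    Hypothetical helper for illustration; cell_type is "code",
    "markdown" or "raw".
    """
    out = ["# -*- coding: utf-8 -*-", "# <nbformat>3.0</nbformat>", ""]
    for cell_type, source in cells:
        out.append("# <%scell>" % cell_type)
        out.append("")
        for line in source.splitlines():
            if cell_type == "code":
                out.append(line)
            else:
                # Non-code cells live in comments; keep a space after '#'.
                out.append(("# " + line).rstrip())
        out.append("")
    return "\n".join(out)


print(to_nbpy([
    ("markdown", "The simplest notebook."),
    ("code", 'print("Hello, IPython")'),
]))
```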

How to import it

For some reason, this format is not supported by version 4 of the Notebook API. It's still a nice format, so it's worth the trouble to support it by importing into version 3 and upgrading. In principle it's just two lines of code, plus i/o:

from nbformat import v3, v4  # formerly IPython.nbformat; a separate package since IPython 4

with open("input-file.py") as fpin:
    text = fpin.read()

nbook = v3.reads_py(text)
nbook = v4.upgrade(nbook)  # Upgrade v3 to v4

jsonform = v4.writes(nbook) + "\n"
with open("output-file.ipynb", "w") as fpout:
    fpout.write(jsonform)

But not so fast! In fact, the notebook API has a nasty bug: if the last cell in the input is a markdown cell, v3.reads_py() will lose it. The simplest workaround is to tack on a bogus <markdowncell> at the end: the bug will delete it, and everyone is happy. So do the following before you pass text to v3.reads_py():

text += """
# <markdowncell>

# If you can read this, reads_py() is no longer broken! 
"""
Tarsometatarsus answered 1/3, 2016 at 10:8 Comment(10)
Nice. Check also this script I've written that works with spyder and pycharm cell markers plus some extra candies for slide making. PS: Might want to add this to your edit (and i think you meant 2016) :)Staphyloplasty
Oops! Yeah indeed it's 2016 :-)Tarsometatarsus
Hmm tried quickly your script above but getting an error message ValueError: dictionary update sequence element #0 has length 1; 2 is required. That is on the fpout.write statement.Staphyloplasty
Using python 2.7.11 :: Anaconda 2.4.1 (x86_64), jupyter 4.0.6 and notebook 4.1.0. What about you?Staphyloplasty
Oops! I forgot a call to v4.writes(), which generates json. Fixed now, thanks for catching it!Tarsometatarsus
Great. Can I integrate your code to my script (with appropriate credit) for completeness?Staphyloplasty
Awesome. Will do so when I get a chance. Shouldn't take long. Out of curiosity, are there any docs for the tags like # <codecell>, # <markdowncell>, etc. Websearch didn't give me anything useful.Staphyloplasty
The only place I've found a list of the cell types is the source (see the accepted answer for the link). To see which cells are ignored or obsolete, see there and also skip over to v4/convert.py.Tarsometatarsus
Great. Thanks. Included your code on master branch :).Staphyloplasty
Thank you for posting this! I've added a semi-generic version of it, with the adjustment, and a simple unittest: script | testNeona
I
44

very old question, i know. but there is jupytext (also available on pypi) that can convert from ipynb to several formats and back.

when jupytext is installed you can use

$ jupytext --to notebook test.py

in order to generate test.ipynb.

jupytext has a lot more interesting features that can come in handy when working with notebooks.
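The same conversion can also be driven from Python. The sketch below assumes jupytext's `read` and `write` functions and that the output format is inferred from the file extension; `default_notebook_path` and `script_to_notebook` are hypothetical helper names.

```python
from pathlib import Path


def default_notebook_path(script_path):
    """Return the script path with its suffix swapped for .ipynb."""
    return Path(script_path).with_suffix(".ipynb")


def script_to_notebook(script_path, notebook_path=None):
    """Convert a script to a notebook, roughly like `jupytext --to notebook`."""
    import jupytext  # third-party; install with `pip install jupytext`

    target = Path(notebook_path) if notebook_path else default_notebook_path(script_path)
    notebook = jupytext.read(str(script_path))
    jupytext.write(notebook, str(target))
    return target
```

Calling `script_to_notebook("test.py")` should then produce test.ipynb next to the script.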


here is a more recent question on that topic.

Immanent answered 20/11, 2018 at 7:9 Comment(2)
Nice to know, thanks for the update! I'll check it out.Tarsometatarsus
@alexis, could you add this answer to the text of your question? Because it really works in fall 2019 and it is absolutely hassle free.Atomic
F
11

A Python code example showing how to build an IPython notebook in V4 format:

# -*- coding: utf-8 -*-
import os

try:
    from base64 import encodebytes  # Python 3
except ImportError:
    from base64 import encodestring as encodebytes  # Python 2

# On IPython 3 and earlier, import these from IPython.nbformat.v4.nbbase instead.
from nbformat.v4 import (
    new_code_cell, new_markdown_cell, new_notebook,
    new_output, new_raw_cell
)

# some random base64-encoded *text*
png = encodebytes(os.urandom(5)).decode('ascii')
jpeg = encodebytes(os.urandom(6)).decode('ascii')

cells = []
cells.append(new_markdown_cell(
    source='Some NumPy Examples',
))


cells.append(new_code_cell(
    source='import numpy',
    execution_count=1,
))

cells.append(new_markdown_cell(
    source='A random array',
))

cells.append(new_raw_cell(
    source='A random array',
))

cells.append(new_markdown_cell(
    source=u'## My Heading',
))

cells.append(new_code_cell(
    source='a = numpy.random.rand(100)',
    execution_count=2,
))
cells.append(new_code_cell(
    source='a = 10\nb = 5\n',
    execution_count=3,
))
cells.append(new_code_cell(
    source='a = 10\nb = 5',
    execution_count=4,
))

cells.append(new_code_cell(
    source=u'print "ünîcødé"',
    execution_count=3,
    outputs=[new_output(
        output_type=u'execute_result',
        data={
            'text/plain': u'<array a>',
            'text/html': u'The HTML rep',
            'text/latex': u'$a$',
            'image/png': png,
            'image/jpeg': jpeg,
            'image/svg+xml': u'<svg>',
            'application/json': {
                'key': 'value'
            },
            'application/javascript': u'var i=0;'
        },
        execution_count=3
    ),new_output(
        output_type=u'display_data',
        data={
            'text/plain': u'<array a>',
            'text/html': u'The HTML rep',
            'text/latex': u'$a$',
            'image/png': png,
            'image/jpeg': jpeg,
            'image/svg+xml': u'<svg>',
            'application/json': {
                'key': 'value'
            },
            'application/javascript': u'var i=0;'
        },
    ),new_output(
        output_type=u'error',
        ename=u'NameError',
        evalue=u'NameError was here',
        traceback=[u'frame 0', u'frame 1', u'frame 2']
    ),new_output(
        output_type=u'stream',
        text='foo\rbar\r\n'
    ),new_output(
        output_type=u'stream',
        name='stderr',
        text='\rfoo\rbar\n'
    )]
))

nb0 = new_notebook(cells=cells,
    metadata={
        'language': 'python',
    }
)

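import codecs
import nbformat as nbf  # on IPython 3 and earlier: import IPython.nbformat as nbf

# Write the notebook as version 4; the context manager closes the file
with codecs.open('test.ipynb', encoding='utf-8', mode='w') as f:
    nbf.write(nb0, f, 4)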
Finished answered 26/7, 2015 at 17:21 Comment(1)
Thanks for the effort. It doesn't really fit the question (importing a file), but it could be useful to other aspiring notebook hackers.Tarsometatarsus
C
11

Hope I'm not too late.

I just published a Python package on PyPI called p2j. This package creates a Jupyter notebook .ipynb from a Python source code .py.

pip install p2j
p2j script.py

[Image: example of a Jupyter notebook (.ipynb) generated from a .py file]

PyPI: https://pypi.org/project/p2j/

GitHub: https://github.com/remykarem/python2jupyter

Crawler answered 7/3, 2019 at 4:42 Comment(3)
Thanks for the report. Did you think about headings, raw cells, comments that should remain python comments, and other notebook format features? There's no mention of them in your github page. Also, it would be smart to add special handling for emacs directives and the like (# -*- coding: utf-8 -*- ), since you use them yourself. Finally, please indicate whether you are using the jupyter api or coding the json generation yourself.Tarsometatarsus
Hi @alexis. I've just updated the readme on my GitHub repo, and included the directives. Hopefully that would answer your first question. For your last question, I'm coding the JSON generation by myself.Crawler
you're not too late for us future readers! :-) I noticed that I had to delete a lot of white space (blank lines) and hit "run" many times to step through, but I got some nice results. Unfortunately my plot is interactive using widgets and my notebook only shows an image of the plot not a real, interactive one. I guess I'll have to research that myself and do some post-modification of the converted notebook: Jupyter Notebook: interactive plot with widgetsStillness
C
7

Given the example by Volodimir Kopey, I put together a bare-bones script to convert a .py obtained by exporting from a .ipynb back into a V4 .ipynb.

I hacked this script together when I edited (in a proper IDE) a .py I had exported from a Notebook and I wanted to go back to Notebook to run it cell by cell.

The script handles only code cells. The exported .py does not contain much else, anyway.

import nbformat
from nbformat.v4 import new_code_cell,new_notebook

import codecs

sourceFile = "changeMe.py"     # <<<< change
destFile = "changeMe.ipynb"    # <<<< change


def parsePy(fn):
    """ Generator that parses a .py file exported from a IPython notebook and
extracts code cells (whatever is between occurrences of "In[*]:").
Returns a string containing one or more lines
"""
    with open(fn,"r") as f:
        lines = []
        for l in f:
            l1 = l.strip()
            if l1.startswith('# In[') and l1.endswith(']:') and lines:
                yield "".join(lines)
                lines = []
                continue
            lines.append(l)
        if lines:
            yield "".join(lines)

# Create the code cells by parsing the file in input
cells = []
for c in parsePy(sourceFile):
    cells.append(new_code_cell(source=c))

# This creates a V4 Notebook with the code cells extracted above
nb0 = new_notebook(cells=cells,
                   metadata={'language': 'python',})

with codecs.open(destFile, encoding='utf-8', mode='w') as f:
    nbformat.write(nb0, f, 4)

No guarantees, but it worked for me
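For reference, here is roughly what the input looks like and how it gets split. This is a standard-library variant of the same idea that works on a string instead of a file and strips blank lines; the sample text is an assumed example of an exported script (the exact header may vary between versions), and `split_exported` is a hypothetical name.

```python
import re

# Marker comments written by the script export, e.g. "# In[3]:" or "# In[ ]:".
CELL_MARKER = re.compile(r"^# In\[[^\]]*\]:\s*$")

# Assumed sample of an exported script; the header lines may differ.
sample_export = """#!/usr/bin/env python
# coding: utf-8

# In[1]:


import math


# In[2]:


print(math.pi)
"""


def split_exported(text):
    """Return the stripped source of each non-empty chunk between markers."""
    chunks, current = [], []
    for line in text.splitlines():
        if CELL_MARKER.match(line):
            chunks.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    chunks.append("\n".join(current).strip())
    return [chunk for chunk in chunks if chunk]


cells = split_exported(sample_export)
for number, source in enumerate(cells):
    print(number, repr(source))
```

Note that the header lines before the first marker end up as a cell of their own, which is also what happens in the script above.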

Camlet answered 7/10, 2015 at 13:56 Comment(1)
Thanks! I already have a whole toolchain using the # <...cell> format, but your solution could be useful to others-- especially since your format is generated when Notebook exports. Could you document it better (parsePy returns lists of lines? One per cell? How can you tell markup from code cells?) and add a sample input file to the answer text?Tarsometatarsus
O
5

I wrote an extension for vscode that might help. It converts the python files to ipython notebooks. It's in early stages so if any error occurs, feel free to submit an issue.

Jupyter Notebook Converter

Ondrea answered 27/2, 2018 at 17:5 Comment(0)
S
4

I took the liberty of taking and modifying the code of P. Toccaceli and alexis so that it also works with PyCharm- and Spyder-style cell markers, and released it on GitHub.

Staphyloplasty answered 23/2, 2016 at 12:8 Comment(1)
Thanks. Should be useful for anyone working with these formats.Tarsometatarsus
A
1

Some improvements to @p-toccaceli's answer. Now it also restores markdown cells. Additionally, it trims empty leading and trailing lines from each cell.

    import nbformat
    from nbformat.v4 import new_code_cell,new_markdown_cell,new_notebook

    import codecs

    sourceFile = "changeMe.py"     # <<<< change
    destFile = "changeMe.ipynb"    # <<<< change


    def parsePy(fn):
        """ Generator that parses a .py file exported from a IPython notebook and
    extracts code cells (whatever is between occurrences of "In[*]:").
    Returns a string containing one or more lines
    """
        with open(fn,"r") as f:
            lines = []
            for l in f:
                l1 = l.strip()
                if l1.startswith('# In[') and l1.endswith(']:') and lines:
                    yield ("".join(lines).strip(), 0)
                    lines = []
                    continue
                elif l1.startswith('# ') and l1[2:].startswith('#') and lines:
                    yield ("".join(lines).strip(), 0)

                    yield (l1[2:].strip(), 1)
                    lines = []
                    continue
                lines.append(l)
            if lines:
                yield ("".join(lines).strip(), 0)

    # Create the code cells by parsing the file in input
    cells = []
    for c, code in parsePy(sourceFile):
        if len(c) == 0:
            continue
        if code == 0:
            cells.append(new_code_cell(source=c))
        elif code == 1:
            cells.append(new_markdown_cell(source=c))

    # This creates a V4 Notebook with the code cells extracted above
    nb0 = new_notebook(cells=cells,
                       metadata={'language': 'python',})

    with codecs.open(destFile, encoding='utf-8', mode='w') as f:
        nbformat.write(nb0, f, 4)
Adalie answered 28/1, 2020 at 9:8 Comment(1)
Note that there is already an accepted answer to this question. Please edit your answer to ensure that it improves upon other answers already present in this question. While it might answer the question, just adding some code does not help OP or future community members understand the issue or solution.Saleswoman
Z
0

You can use the script py2nb from https://github.com/sklam/py2nb

You will have to use a certain syntax for your *.py but it's rather simple to use (look at the example in the 'samples' folder)

Zavras answered 8/3, 2018 at 11:18 Comment(1)
What would be the benefit of this ad hoc solution (with an ad hoc format and unknown bugs and limitations), instead of the format supported by notebook itself?Tarsometatarsus
