Testing a Jupyter Notebook

I am trying to come up with a method for testing a number of Jupyter notebooks. A test should run when a new notebook is added to a GitHub branch and submitted as a pull request. The tests are not complicated: they mostly just check that the notebook runs end-to-end without errors, plus perhaps a few asserts. However:

  • There are certain calls in some cells that need to be mocked, e.g. a call to download the data from a database.
  • There may be some magic cells in the notebooks which run a pip command or something else.

I am open to using any testing library, such as pytest or unittest, although pytest is preferred.

I looked at a few libraries for testing notebooks, such as nbmake, treon, and testbook, but I was unable to make them work. I also tried converting the notebook to a Python file, but the magic cells were converted to get_ipython().run_cell_magic(...) calls, which became an issue because pytest runs plain Python rather than IPython, and get_ipython() is only available in IPython.

So I am wondering: what is a good way to test Jupyter notebooks with all of that in mind? Any help is appreciated.

Geniegenii answered 11/1, 2022 at 18:29 Comment(0)

Here is my own solution using testbook. Let's say I have a notebook called my_notebook.ipynb with the following content:

[Screenshot of my_notebook.ipynb showing its cells; not reproduced here.]
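Based on the test below, the notebook roughly contains cells like these. This is only a sketch: the exact query and the x = 7 assignment are assumptions inferred from the asserts.

# imports cell
import pandas as pd
from google.cloud import bigquery

# cell that pulls data from BigQuery into a DataFrame
client = bigquery.Client()
dataframe = client.query("SELECT ...").result().to_dataframe()

# a later cell computing a value the test checks
x = 7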

The trick is to inject a cell before my call to bigquery.Client and mock it:

from testbook import testbook

@testbook('./my_notebook.ipynb')
def test_get_details(tb):
    # Inject a cell that patches bigquery.Client before the notebook calls it.
    tb.inject(
        """
        import mock
        mock_client = mock.MagicMock()
        mock_df = pd.DataFrame()
        mock_df['week'] = range(10)
        mock_df['count'] = 5
        p1 = mock.patch.object(bigquery, 'Client', return_value=mock_client)
        mock_client.query().result().to_dataframe.return_value = mock_df
        p1.start()
        """,
        before=2,   # index of the notebook cell to inject before
        run=False   # do not run it now; it executes with the rest of the notebook below
    )
    tb.execute()
    dataframe = tb.get('dataframe')
    assert dataframe.shape == (10, 2)

    x = tb.get('x')
    assert x == 7
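Saved in a regular test module (the file name below is just an example), this runs like any other pytest test. Note that import mock refers to the standalone mock package; with only the standard library, from unittest import mock works the same way.

pytest test_my_notebook.py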
Geniegenii answered 18/1, 2022 at 20:47 Comment(1)
This is excellent!! – Wira

One straightforward approach I've already used is to execute the entire notebook with nbconvert.

A notebook failed.ipynb that raises an exception will result in a failed run, thanks to the --execute option, which tells nbconvert to execute the notebook prior to conversion.

jupyter nbconvert --to notebook --execute failed.ipynb
# ...
# Exception: FAILED
echo $?
# 1

A correct notebook passed.ipynb, on the other hand, results in a successful export.

jupyter nbconvert --to notebook --execute passed.ipynb
# [NbConvertApp] Converting notebook passed.ipynb to notebook
# [NbConvertApp] Writing 1172 bytes to passed.nbconvert.ipynb
echo $?
# 0

As the cherry on the cake, you can do the same through the API and wrap it in pytest!

import nbformat
import pytest
from nbconvert.preprocessors import ExecutePreprocessor

@pytest.mark.parametrize("notebook", ["passed.ipynb", "failed.ipynb"])
def test_notebook_exec(notebook):
    with open(notebook) as f:
        nb = nbformat.read(f, as_version=4)
    ep = ExecutePreprocessor(timeout=600, kernel_name='python3')
    try:
        # preprocess() executes every cell and raises if any cell errors
        assert ep.preprocess(nb) is not None, f"Got empty notebook for {notebook}"
    except Exception:
        assert False, f"Failed executing {notebook}"

Running the test gives:

pytest test_nbconv.py
# FAILED test_nbconv.py::test_notebook_exec[failed.ipynb] - AssertionError: Failed executing failed.ipynb
# PASSED test_nbconv.py::test_notebook_exec[passed.ipynb]

Notes

This doesn't convert the notebook to a different format per se; instead, it runs nbconvert preprocessors on the notebook and/or converts it to other notebook formats.

  • The Python code example is just a quick draft; it can be improved considerably.
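If some cells should not run in CI at all, for example a cell that downloads data from a database or a !pip install cell, one option (not part of the original answer) is to tag those cells in Jupyter and strip them before execution with nbconvert's TagRemovePreprocessor. This is a minimal sketch; the function name and the "skip-ci" tag are only examples.

import nbformat
from nbconvert.preprocessors import ExecutePreprocessor, TagRemovePreprocessor

def execute_without_tagged_cells(path, tag="skip-ci"):
    nb = nbformat.read(path, as_version=4)
    # Drop every cell carrying the chosen tag before executing the rest.
    remover = TagRemovePreprocessor(remove_cell_tags={tag}, enabled=True)
    nb, _ = remover.preprocess(nb, {})
    # Execute what remains; raises if any remaining cell errors.
    ep = ExecutePreprocessor(timeout=600, kernel_name="python3")
    ep.preprocess(nb, {})
    return nb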
Eruct answered 11/1, 2022 at 21:10 Comment(3)
Thanks for the answer. Can you explain how you mock an object used in the notebook with this method (e.g. the BigQuery Client)? Also, how do you deal with a magic cell (e.g. "!pip install pandas")? – Geniegenii
@shahins magics will be executed; you can check what is going on by using the --debug flag. For the mock part, I don't know. Mocking is a complex topic if you want to ensure that interfaces for all cells are available. Maybe each notebook should be required to provide a "mock" mode, a way of executing it without everything plugged in. Without this it will be hell to try to mock everything from the outside. – Eruct
Nice, I think this solution works well as long as you don't need to mock objects. In my case, however, that is a major requirement. – Geniegenii
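As a rough sketch of the "mock mode" idea from the comment above: the notebook's first cell could switch between a real and a fake client based on an environment variable set by the test harness. The variable name and the fake data below are purely illustrative, not something from the answer.

# first notebook cell: choose a real or a fake BigQuery client
import os
import pandas as pd
from unittest import mock

if os.environ.get("NOTEBOOK_MOCK_MODE") == "1":
    client = mock.MagicMock()
    client.query.return_value.result.return_value.to_dataframe.return_value = (
        pd.DataFrame({"week": range(10), "count": [5] * 10})
    )
else:
    from google.cloud import bigquery
    client = bigquery.Client()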

There is also nbval (documentation at https://nbval.readthedocs.io/en/latest/, source at https://github.com/computationalmodelling/nbval). It is a pytest plugin.

The basic idea is to execute the notebook and compare the computed output with the output saved in the notebook file. Each cell is regarded as a test: if the cell's re-computed output matches the one on disk, the test passes; otherwise it fails.

The tests can be run using, for example:

$ py.test --nbval my_notebook.ipynb

In lax mode, the output of a cell is ignored, and the notebook passes as long as no exceptions are raised. This is a convenient way to check whether any interface changes have broken the notebook (good for testing notebook-based documentation):

$ py.test --nbval-lax my_notebook.ipynb

It is possible to skip particular cells, ignore the output they produce, or expect exceptions to be raised. It is also possible to define regex-based replacements that, in effect, ignore output changes expected to differ between runs (such as memory addresses, execution dates and times, or run times).
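For example, nbval recognizes special comments at the top of a code cell; the cell contents below are only illustrative (expensive_download is a hypothetical function).

# NBVAL_IGNORE_OUTPUT
# This cell runs, but its output (here a timestamp) is not compared with the saved output.
import datetime
print(datetime.datetime.now())

# NBVAL_SKIP
# This cell is not executed at all during the test run.
expensive_download()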

I don't think mocking is possible.

The package is available on PyPI (https://pypi.org/project/nbval/).

A video on the nbval package (from Thomas Kluyver at EuroSciPy 2017) is available at https://youtu.be/o-6nuopYIcs?si=NACh-NZaBkx4CZWq .

Moira answered 17/12, 2023 at 11:18 Comment(0)
