Search Jupyter notebook markdown cells from command line

I use ag to search through my notes. My notes are written down in Markdown files and Markdown cells contained within Jupyter notebooks.

I can search the Markdown files conveniently with ag --markdown .... It would be very handy to do something similar with the Jupyter notebook files, but that would require ag to understand the notebook format.

My question: is there a way to search only the Markdown cells of a Jupyter notebook file for a given string? Any pattern matcher used in the solution is acceptable to me (ag, grep, ack, ...).

P.S. The notebooks are stored as JSON. Here's a sample:

$ head notebook.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "THIS IS A MARKDOWN STRING"
   ]
  },
  {
Infertile answered 9/5, 2018 at 14:11 Comment(2)
Can you expand on your question? From the sounds of it, you're just asking how to provide a path to the directory you wish to search, which is the standard approach to using ag.Inness
Thanks for your interest. Hopefully I've clarified my question -- see the edit.Infertile

I'd use jq to extract all the Markdown cells from a Jupyter notebook. For instance, if you just want to print all the Markdown source, you could use the following:

$ jq '.cells[] | select(.cell_type == "markdown") | .source[]' notebook.ipynb

jq is fast, and it is used in far more elaborate setups, for example when keeping IPython notebooks in git: Using IPython notebooks under version control
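
If jq isn't available, the same filter can be approximated with Python's standard library. This is only a sketch: the function name markdown_sources is made up for illustration, and it assumes the notebook uses the top-level "cells" layout shown in the question.

```python
import json


def markdown_sources(notebook):
    """Return the text of every Markdown cell in a parsed notebook dict."""
    texts = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        # On disk, "source" may be one string or a list of line strings.
        texts.append(source if isinstance(source, str) else "".join(source))
    return texts


# Tiny in-memory notebook, modelled on the sample in the question.
sample = json.loads("""
{"cells": [
  {"cell_type": "markdown", "metadata": {}, "source": ["THIS IS A MARKDOWN STRING"]},
  {"cell_type": "code", "metadata": {}, "source": ["print('not markdown')"]}
]}
""")
print(markdown_sources(sample))
```

For a real file, the dict would come from json.load on the opened .ipynb instead of the inline sample.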

Inness answered 6/6, 2018 at 5:35 Comment(1)
This is indeed an order of magnitude faster than @gboffi's answer. Thanks.Infertile

I don't know if ag can be interfaced with a filter, but to get the Markdown out of a notebook file the following Python code will suffice:

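import sys

import nbformat

# Read the notebook named on the command line, keeping its format version as-is.
nb = nbformat.read(sys.argv[1], nbformat.NO_CONVERT)
# Print the source of every Markdown cell.
for cell in nb.cells:
    if cell.cell_type == 'markdown':
        print(cell.source)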
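
To go from dumping the Markdown to actually searching it, which is what the question asks for, something like the following grep-style sketch might do. It reads the JSON directly with the standard library, so it does not need nbformat; the function name search_markdown, the script name nbgrep.py, and the output format are all invented here for illustration.

```python
import json
import re
import sys


def search_markdown(path, pattern):
    """Yield (cell_index, line_number, line) for Markdown lines matching pattern."""
    regex = re.compile(pattern)
    with open(path, encoding="utf-8") as fh:
        notebook = json.load(fh)
    for cell_index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        # "source" may be one string or a list of line strings.
        text = source if isinstance(source, str) else "".join(source)
        for line_number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                yield cell_index, line_number, line


def main(argv):
    # Hypothetical usage: python nbgrep.py PATTERN notebook.ipynb [more.ipynb ...]
    if len(argv) < 2:
        print("usage: nbgrep.py PATTERN NOTEBOOK...", file=sys.stderr)
        return
    pattern, *paths = argv
    for path in paths:
        for cell_index, line_number, line in search_markdown(path, pattern):
            print(f"{path}:cell {cell_index}:{line_number}:{line}")


if __name__ == "__main__":
    main(sys.argv[1:])
```

Cell indices count all cells from zero, code cells included, and line numbers are relative to the start of each cell.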
Almsman answered 5/6, 2018 at 14:30 Comment(0)