PySpark: How can I suppress %run output in a PySpark cell when importing variables from another notebook?

I am using multiple notebooks in PySpark and import variables across these notebooks using %run path. Every time I run the command, all variables that were displayed in the source notebook are displayed again in the current notebook (the notebook in which I call %run). But I do not want them to be displayed in the current notebook; I only want to work with the imported variables. How do I suppress this output every time? Note, I am not sure if it matters, but I am working in Databricks. Thank you!

Command example:

%run /Users/myemail/Nodebook
Linkwork answered 2/3, 2020 at 10:14 Comment(0)

You can use the "Hide Result" option in the menu at the upper right of the cell:

[screenshot: the cell menu with the "Hide Result" toggle]

Chalcography answered 4/11, 2021 at 19:23 Comment(1)
While this is possible, it also hides any error output, so if your notebook fails you won't be able to tell from the output. Palua

The correct answer is the one stated in the comments on another answer ("you can't"). I ran into the problem that markdown titles (which I use to divide a notebook into sections) were shown whenever the notebook was %run in another notebook.

I found a workaround that lets you keep headings in the notebook being loaded. Instead of using a %md markdown cell, you can give a cell a title by opening the dropdown menu at the top right of the cell and choosing "Show title". This gives your cell a title like this:

[screenshot: a cell with a title added via "Show title"]

This title text does not show up when you import the notebook using %run in another notebook.

Decibel answered 14/2, 2023 at 13:45 Comment(0)

This is expected behaviour. The %run command allows you to include another notebook within a notebook, which lets you concatenate notebooks that represent key ETL steps, Spark analysis steps, or ad-hoc exploration. However, it lacks the ability to build more complex data pipelines.

Notebook workflows are a complement to %run because they let you return values from a notebook. This allows you to easily build complex workflows and pipelines with dependencies. You can properly parameterize runs (for example, get a list of files in a directory and pass the names to another notebook—something that’s not possible with %run) and also create if/then/else workflows based on return values. Notebook workflows allow you to call other notebooks via relative paths.

You implement notebook workflows with dbutils.notebook methods. These methods, like all of the dbutils APIs, are available only in Scala and Python. However, you can use dbutils.notebook.run to invoke an R notebook.
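As a minimal sketch of this pattern (the notebook path, widget name, timeout, and return value below are illustrative assumptions, not part of the original answer; dbutils is predefined in Databricks notebooks, so no import is needed):

# --- In the child notebook: read a parameter and return a value ---
date = dbutils.widgets.get("date")   # read the "date" argument passed by the caller
print(f"processing {date}")
dbutils.notebook.exit("OK")          # return "OK" to the calling notebook

# --- In the calling notebook: run the child with a 60-second timeout ---
result = dbutils.notebook.run("/Users/myemail/Notebook", 60, {"date": "2020-03-02"})
if result == "OK":                   # branch on the returned value
    print("child notebook succeeded")
else:
    print("child notebook returned: " + str(result))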

For more details, refer to "Databricks - Notebook workflows".

Livialivid answered 5/3, 2020 at 8:52 Comment(2)
Unfortunately, I only have access to the Community Edition of Databricks right now. But I imagine you can pass arguments across notebooks with something like this: dbutils.notebook.run("notebook-name", 60, {"argument": "data", "argument2": "data2", ...}) Linkwork
This is NOT an answer to the question - it is simply a restatement of the issue. Our development uses notebooks as modules for shared routines, and each module has a markdown comment block at the top with history and usage information. Every notebook that includes them displays all of this everywhere - it's ugly. That is not "expected"; that is just what is. That is the question that was asked, and the accepted answer should be "you can't". Difficulty
