How to install a library on a Databricks cluster using a command in the notebook?

I want to install a library on my Azure Databricks cluster, but I cannot use the UI method: my cluster changes every time, and during the transition I cannot add a library to it through the UI. Is there a Databricks utility command for doing this?

Nanon answered 5/3, 2020 at 11:0 Comment(1)
Have you tried the Databricks libraries CLI to install the library from DBFS? Addington

@CHEEKATLAPRADEEP-MSFT's answer is awesome! Just one addition:

If you want all your notebooks / clusters to have the same libs installed, you can take advantage of cluster-scoped or global (new feature) init scripts.

The example below retrieves packages from PyPI:

#!/bin/sh

# Install dependencies
pip install --upgrade boto3 psycopg2-binary requests simple-salesforce

You can even use a private package index - for example AWS CodeArtifact:

# Install AWS CLI
pip install --upgrade awscli

# Configure pip to use CodeArtifact as its primary index
aws codeartifact login --region <REGION> --tool pip --domain <DOMAIN> --domain-owner <AWS_ACCOUNT_ID> --repository <REPO>
# Keep the public PyPI index as an additional source
pip config set global.extra-index-url https://pypi.org/simple

Note: the cluster instance profile must be allowed to get CodeArtifact credentials (arn:aws:iam::aws:policy/AWSCodeArtifactReadOnlyAccess).

Cheers

Workmanlike answered 22/3, 2021 at 23:17 Comment(4)
Follow-up question: how do you configure the instance to get the AWS credentials? Pontificals
@Pontificals We currently attach an "Instance profile" in the Advanced options of the cluster configuration page. Workmanlike
@CHEEKATLAPRADEEP's answer? I don't see that here. Are you referring to some other post, or perhaps it was deleted? Goya
That answer was removed by a Stack Overflow moderator and can't be restored until another moderator chimes in. Steere

You can use the %pip install command to install the required libraries from within your notebook code. This documentation provides further detail on its usage: https://docs.databricks.com/libraries/notebooks-python-libraries.html. For example:

%pip install requests

For older runtimes there was the dbutils.library utility (https://docs.databricks.com/dev-tools/databricks-utils.html#dbutils-library), but it has been deprecated.
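
If the library has to be attached to the cluster itself rather than to the notebook session, one option is to call the Databricks Libraries REST API (POST /api/2.0/libraries/install) from code. The sketch below is only an illustration under assumptions: the function names build_install_payload and install_libraries are made up for this example, and the workspace URL, personal access token and cluster ID are placeholders you would have to supply yourself.

```python
import json
import urllib.request


def build_install_payload(cluster_id, pypi_packages):
    """Build the JSON body for POST /api/2.0/libraries/install (hypothetical helper)."""
    return {
        "cluster_id": cluster_id,
        "libraries": [{"pypi": {"package": name}} for name in pypi_packages],
    }


def install_libraries(host, token, cluster_id, pypi_packages):
    """Send the install request and return the HTTP status code (hypothetical helper)."""
    body = json.dumps(build_install_payload(cluster_id, pypi_packages)).encode("utf-8")
    request = urllib.request.Request(
        url=host.rstrip("/") + "/api/2.0/libraries/install",
        data=body,
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )
    # Installation is asynchronous: a successful response only means the request was accepted.
    with urllib.request.urlopen(request) as response:
        return response.status
```

A call would look like install_libraries("https://<workspace-url>", "<personal-access-token>", "<cluster-id>", ["nltk"]). Since the cluster ID is just a parameter, the same call can be repeated whenever the cluster changes.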

Shampoo answered 24/4, 2023 at 16:29 Comment(0)

You need to run a simple command in a separate cell. Do not put anything in that cell apart from the pip install, like this:

pip install nltk
Pristine answered 23/7 at 11:27 Comment(0)
