Do you know how to install the 'ODBC Driver 17 for SQL Server' on a Databricks cluster?

I'm trying to connect from a Databricks notebook to an Azure SQL Data Warehouse using the pyodbc Python library. When I execute the code, I get this error:

Error: ('01000', "[01000] [unixODBC][Driver Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found (0) (SQLDriverConnect)")

I understand that I need to install this driver, but I have no idea how to do it. I have a Databricks cluster running Runtime 6.4, Standard_DS3_v2.

Paganini answered 4/4, 2020 at 1:50 Comment(2)
Have a look at this link: https://stackoverflow.com/questions/54132249/how-to-install-pyodbc-in-databricks - Stella
I've got the same issue using Azure Synapse Analytics. - Peirsen

By default, Azure Databricks does not have the ODBC driver installed.

Run the following commands in a single cell to install the MS SQL ODBC driver on an Azure Databricks cluster.

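%sh
# Add Microsoft's package signing key and repository.
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
# 16.04 is the Ubuntu release in this path; check yours with: cat /etc/*release
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
# Install the driver, accepting the license non-interactively.
sudo ACCEPT_EULA=Y apt-get -q -y install msodbcsql17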
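
The repository path in the script above hard-codes an Ubuntu release. As a rough sketch, the release can instead be read from /etc/os-release and substituted into the same packages.microsoft.com URL pattern. The helper names (ubuntu_version, prod_list_url, detect_prod_list_url) are made up for this illustration, and the sketch assumes the file contains a VERSION_ID line, as it does on standard Ubuntu images.

```python
from pathlib import Path


def ubuntu_version(os_release_text: str) -> str:
    """Return the VERSION_ID value (e.g. '20.04') from os-release content."""
    for line in os_release_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "VERSION_ID":
            # Values are usually quoted, e.g. VERSION_ID="20.04"
            return value.strip().strip('"')
    raise ValueError("VERSION_ID not found")


def prod_list_url(version: str) -> str:
    """Build the Microsoft package list URL for a given Ubuntu release."""
    return f"https://packages.microsoft.com/config/ubuntu/{version}/prod.list"


def detect_prod_list_url(path: str = "/etc/os-release") -> str:
    """Read the local os-release file and return the matching URL."""
    return prod_list_url(ubuntu_version(Path(path).read_text()))
```

Calling detect_prod_list_url() in a notebook cell on the cluster would give the URL to use in place of the hard-coded one in the curl command.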

Houseman answered 27/4, 2020 at 9:48 Comment(3)
You mean "... MS SQL ODBC ..."? - Gregorygregrory
The same script fails when run as an init script: "invalid option "- E: Invalid operation update E: Unable to locate package msodbcsql17 E: Command line option ' ' [from -y ] is not understood in combination with the other options. - Indifferent
This works. To find the right Ubuntu version, run %sh cat /etc/*release first, as you are almost certainly not using Ubuntu 16.04 on any Databricks system. - Adulterant

Run the following commands in a cell to install the SQL ODBC driver on an Azure Databricks cluster.

%sh
if ! [[ "16.04 18.04 20.04 22.04" == *"$(lsb_release -rs)"* ]];
then
    echo "Ubuntu $(lsb_release -rs) is not currently supported.";
    exit;
fi

curl https://packages.microsoft.com/keys/microsoft.asc | sudo tee /etc/apt/trusted.gpg.d/microsoft.asc

curl https://packages.microsoft.com/config/ubuntu/$(lsb_release -rs)/prod.list | sudo tee /etc/apt/sources.list.d/mssql-release.list

sudo apt-get update
sudo ACCEPT_EULA=Y apt-get install -y msodbcsql17
# optional: for bcp and sqlcmd
sudo ACCEPT_EULA=Y apt-get install -y mssql-tools
echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc
source ~/.bashrc
# optional: for unixODBC development headers
sudo apt-get install -y unixodbc-dev

Reference: Installing Microsoft ODBC Driver for SQL Server
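
Once the package is installed, one way to confirm that pyodbc can see it is to look for the driver name in pyodbc.drivers() before connecting. The following is only a sketch: the helper names and the server, database, and credential arguments are placeholders invented for the example, and it assumes SQL authentication with the usual DRIVER/SERVER/DATABASE/UID/PWD keywords. The password is wrapped in braces (with any closing brace doubled) so that characters such as ';' do not break the string.

```python
DRIVER_NAME = "ODBC Driver 17 for SQL Server"


def driver_available(installed, name=DRIVER_NAME):
    """True if the driver name appears in a list such as pyodbc.drivers()."""
    return name in installed


def brace(value):
    """Wrap a value in braces, doubling any '}' so ';' and '}' survive."""
    return "{" + value.replace("}", "}}") + "}"


def build_connection_string(server, database, user, password, driver=DRIVER_NAME):
    """Assemble an ODBC connection string for SQL authentication."""
    return (
        f"DRIVER={{{driver}}};SERVER={server};DATABASE={database};"
        f"UID={user};PWD={brace(password)}"
    )


def connect(server, database, user, password):
    """Connect with pyodbc, failing early if the driver is not registered."""
    import pyodbc  # imported lazily so the helpers above work without it

    if not driver_available(pyodbc.drivers()):
        raise RuntimeError(f"{DRIVER_NAME!r} is not installed on this node")
    return pyodbc.connect(build_connection_string(server, database, user, password))
```

If connect() raises the RuntimeError, the "Can't open lib" error from the question would be expected as well, which points back to the installation step rather than the connection details.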

Bartie answered 20/2 at 13:37 Comment(1)
This worked for me. I was struggling after upgrading to DBR 14.1. - Mephistopheles

Instead of using the ODBC driver, why not use the Spark driver for Azure Synapse (aka SQL Data Warehouse)? Databricks clusters have this driver installed by default (com.databricks.spark.sqldw).

Documentation: https://docs.databricks.com/data/data-sources/azure/synapse-analytics.html#language-python

Example of use:

df = spark.read \
    .format("com.databricks.spark.sqldw") \
    .option("url", "jdbc:sqlserver://<the-rest-of-the-connection-string>") \
    .option("tempDir", "wasbs://<your-container-name>@<your-storage-account-name>.blob.core.windows.net/<your-directory-name>") \
    .option("forwardSparkAzureStorageCredentials", "true") \
    .option("dbTable", "my_table_in_dw") \
    .load()

Slowly answered 28/4, 2020 at 0:50 Comment(3)
If my answer was helpful, you can accept it by clicking the check mark beside the answer to toggle it from greyed out to filled in. This can be beneficial to other community members. Thank you. - Houseman
Thank you very much for your advice. I could successfully install the ODBC driver on the Databricks cluster but, as you said, the Spark driver is recommended for most operations with Azure Synapse. - Paganini
The ODBC driver is required to execute stored procedures. - Voluptuous

I am trying to install the driver in a global init script for Azure Databricks Runtime 14.1 LTS, and cluster creation fails with the error below. The same script runs fine on Databricks Runtime 12.2 LTS.

Error message: 'Failed to add 1 container to the compute. Will attempt retry: false. Reason: Global init script failure'

Script used:

curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/20.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get -q -y install msodbcsql17

Buckeen answered 8/4 at 12:54 Comment(0)