Databricks CLI: SSLError, can't find local issuer certificate

I have installed and configured the Databricks CLI, but when I try using it I get an error indicating that it can't find a local issuer certificate:

$ dbfs ls dbfs:/databricks/cluster_init/
Error: SSLError: HTTPSConnectionPool(host='dbc-12345678-1234.cloud.databricks.com', port=443): Max retries exceeded with url: /api/2.0/dbfs/list?path=dbfs%3A%2Fdatabricks%2Fcluster_init%2F (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1123)')))

Does the above error indicate that I need to install a certificate, or somehow configure my environment so that it knows how to find the correct certificate?

My environment is Windows 10 with WSL (Ubuntu 20.04) (the command above is from WSL/Ubuntu command line).

The Databricks CLI was installed into an Anaconda environment including the following certificates and SSL packages:

$ conda list | grep cert
ca-certificates           2020.6.20            hecda079_0    conda-forge
certifi                   2020.6.20        py38h32f6830_0    conda-forge
$ conda list | grep ssl
openssl                   1.1.1g               h516909a_1    conda-forge
pyopenssl                 19.1.0                     py_1    conda-forge

I get a similar error when I attempt to use the REST API with curl:

$ curl -n -X GET https://dbc-12345678-1234.cloud.databricks.com/api/2.0/clusters/list
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
Palermo answered 14/9, 2020 at 16:6 Comment(0)

This problem can be solved by disabling SSL certificate verification. In the Databricks CLI you can do so by specifying insecure = True in your Databricks configuration file, .databrickscfg.
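For reference, a minimal .databrickscfg illustrating this (the host and token values are placeholders, not real credentials):

```ini
[DEFAULT]
host = https://dbc-12345678-1234.cloud.databricks.com
token = <personal-access-token>
insecure = True
```

Note that disabling verification removes protection against man-in-the-middle attacks, so prefer pointing the CLI at the correct CA bundle (see the answers below using REQUESTS_CA_BUNDLE) where possible.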

Gujranwala answered 14/12, 2020 at 8:30 Comment(4)
Shouldn't you set insecure = True? It's funny that setting it as False works too :-). Not setting insecure gives the SSL error.Cotter
Exactly. You can see in their source code that they do not check the value of that argument, only whether it is set at all. If it is, they set verify = False, which is then used by their ApiClient for requests (that's why I mistakenly wrote insecure = False :)).Gujranwala
This seems to work, however this is a hot-fix solution (disabling SSL should never be a persistent solution). Does anyone know how to add the root certificate to be considered by dbfs?Muffle
fyi, this doesn't work with the newer databricks cliBauske

I established trust with my Databricks instance by setting the environment variable REQUESTS_CA_BUNDLE.

➜ databricks workspace list
Error: SSLError: HTTPSConnectionPool(host='HOSTNAME.azuredatabricks.net', port=443): Max retries exceeded with url: /api/2.0/workspace/list?path=%2F (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1129)')))
➜ export REQUESTS_CA_BUNDLE=path/to/ca-bundle
➜ databricks workspace list
Users
Shared
Repos

From the GitHub issue:

Download the root CA certificate used to sign the Databricks certificate. Determine the path to the CA bundle and set the environment variable REQUESTS_CA_BUNDLE. See SSL Cert Verification for more information.
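As a sketch of why this works: the legacy Databricks CLI makes its HTTP calls through the `requests` library, which (unless an explicit bundle is passed) consults the REQUESTS_CA_BUNDLE and CURL_CA_BUNDLE environment variables before falling back to certifi's default bundle. A simplified illustration of that lookup order (not the library's actual code):

```python
import os

def resolve_verify(verify=True):
    """Simplified sketch of how `requests` picks a CA bundle:
    an explicit path or False wins; otherwise REQUESTS_CA_BUNDLE,
    then CURL_CA_BUNDLE, are consulted; otherwise True means
    "use certifi's default bundle"."""
    if verify is True:
        return (os.environ.get("REQUESTS_CA_BUNDLE")
                or os.environ.get("CURL_CA_BUNDLE")
                or True)
    return verify
```

This is why exporting REQUESTS_CA_BUNDLE fixes the CLI without any code or config changes: the variable is picked up on every request.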

Tetracaine answered 8/9, 2022 at 13:51 Comment(2)
The REQUESTS_CA_BUNDLE is set with the cluster initialization, right? If I'm not the one setting up the cluster, can I still do this? Where is export REQUESTS_CA_BUNDLE=path/to/ca-bundle run?Eladiaelaeoptene
Are you referring to this article? Set REQUESTS_CA_BUNDLE on the compute cluster if you need to establish trust from Databricks to external endpoints with a custom CA. The way I interpreted the original question is that we want to establish trust from an external client running the Databricks CLI to the Databricks host with a custom CA. In my case this would be my local computer.Tetracaine

There is a similar issue on GitHub for the Azure CLI. The solution is practically the same. Combining that with Erik's answer:

  1. Download the certificate using your browser and save it to disk

    • Open Chrome and go to the Databricks website
    • Press CTRL + SHIFT + I to open the dev tools
    • Click the Security tab
    • Click the View certificate button
    • Click the Details tab
    • In the Certification Hierarchy (the top panel), click the highest node in the tree
    • Click Export the selected certificate
    • Choose where you want to save it (e.g. /home/cert/certificate.crt)
  2. Use the SET command on Windows or export on Linux to create an environment variable called REQUESTS_CA_BUNDLE pointing to the file you downloaded in Step 1. (Keep in mind that this needs to be done on the machine where you run dbfs, not on the cluster.) For instance:

    Linux

    export REQUESTS_CA_BUNDLE=/home/cert/certificate.crt
    

    Windows

    set REQUESTS_CA_BUNDLE=c:\temp\cert\certificate.crt
    
  3. Try running your command dbfs ls dbfs:/databricks/cluster_init/ again

    $ dbfs ls dbfs:/databricks/cluster_init/
    

It should work!
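One pitfall with the browser export: requests expects a PEM (Base64-encoded) file, and some browsers also offer a binary DER export. A small sanity check you can run before pointing REQUESTS_CA_BUNDLE at the file (the path used here is just the example from Step 1):

```python
def looks_like_pem_cert(path):
    # requests/OpenSSL expect one or more Base64 "BEGIN CERTIFICATE" blocks;
    # a DER export is raw binary and fails with a parse error instead.
    with open(path, "rb") as f:
        return b"-----BEGIN CERTIFICATE-----" in f.read()

# e.g. looks_like_pem_cert("/home/cert/certificate.crt")
```

If this returns False, re-export the certificate choosing the Base64/PEM format.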

Myke answered 29/9, 2022 at 14:19 Comment(0)
