I am trying to connect to abfss directly (without mounting to DBFS) and to open a JSON file using open() in Databricks

I am trying to connect to abfss directly (without mounting to DBFS) and to open a JSON file using the open() method in Databricks.

    json_file = open("abfss://@.dfs.core.windows.net/test.json")

Databricks is unable to open the file present in the Azure blob container, and I get the error below:

    FileNotFoundError: [Errno 2] No such file or directory: 'abfss://@.dfs.core.windows.net/test.json'

I have done all the configuration settings using a service principal. Please suggest another way of opening the file using the abfss direct path.

Poulter answered 9/4, 2021 at 15:30 Comment(0)

The open method works only with local files; it doesn't know anything about abfss or other cloud storage. You have the following choices:

  1. Use dbutils.fs.cp to copy the file from ADLS to the local disk of the driver node and then work with it there, e.g. dbutils.fs.cp("abfss://...", "file:/tmp/my-copy") (a sketch follows this list).
  2. Copy the file from ADLS to the driver node using the Azure SDK.
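
Here is a minimal sketch of the first option, assuming the cluster is already configured for the storage account via the service principal; <container> and <storage-account> are placeholders for your own values:

    import json

    # Copy the file from ADLS to the local disk of the driver node.
    dbutils.fs.cp(
        "abfss://<container>@<storage-account>.dfs.core.windows.net/test.json",
        "file:/tmp/test.json",
    )

    # The built-in open() now works, because the file is on the local filesystem.
    with open("/tmp/test.json") as f:
        data = json.load(f)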

The first method is easier to use than the second; a sketch of the SDK approach follows as well.
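
This sketch of the second option uses the azure-storage-file-datalake and azure-identity packages (assumed to be installed on the cluster, e.g. via %pip install); the tenant ID, client ID, secret, storage account, and container are all placeholders:

    from azure.identity import ClientSecretCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    # Authenticate with the same service principal used in the Spark config.
    credential = ClientSecretCredential(
        tenant_id="<tenant-id>",
        client_id="<client-id>",
        client_secret="<client-secret>",
    )

    service = DataLakeServiceClient(
        account_url="https://<storage-account>.dfs.core.windows.net",
        credential=credential,
    )
    file_client = service.get_file_client(
        file_system="<container>", file_path="test.json"
    )

    # Download the file to the driver node; open() then works as usual.
    with open("/tmp/test.json", "wb") as f:
        f.write(file_client.download_file().readall())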

Qualitative answered 9/4, 2021 at 16:16 Comment(2)
Thanks @Alex Ott, I will try to implement this. – Poulter
This is such a great solution, far better than using SAS token URLs to access files. I used this with openpyxl to extract sheets from xlsx files in Synapse notebooks. Before your solution, I kept getting 'file not found' errors; openpyxl needs the local file system. Copying from ADLS to local is perfect for my use case. Thank you. – Intricate
