Trouble with Azure Synapse: pipeline cannot execute a stored procedure that works in Develop script

Here is some context about the issue we are facing in Azure Synapse:

  • We created a stored procedure (it creates a VIEW that reads all the parquet files in a certain folder) in a Develop script, and it ran successfully.
  • We also created the VIEW manually, successfully, in a database on the Serverless SQL pool.
  • The container where the data lives currently has a Private access level. My user has the Storage Blob Data Contributor role.
  • In the Azure Synapse pipeline, we can use Copy Data to load new parquet files into the container (ADLS Gen2).
  • When we run everything from Integrate (Synapse pipeline), an error pops up:

{ "errorCode": "2402", "message": "Execution fail against sql server. Please contact SQL Server team if you need further support. Sql error number: 13807. Error Message: Content of directory on path 'https://xxx.blob.core.windows.net/data/folder/*.parquet' cannot be listed.

If we switch the container's access level to Public, everything works smoothly, but we want to keep it Private.

Is there anything else we should do to make our Synapse pipeline work correctly? Any additional permission setup, or something else?

Thank you so much in advance.

Regards,

Mateo

Recondite answered 19/1/2023 at 19:19

I tried to reproduce the issue and got a similar error.

As per the Microsoft documentation, this error means that whoever is querying Azure Data Lake can't list the files in storage. Possible causes:

  • The Azure AD user relying on Azure AD pass-through authentication isn't authorized to list the files in Data Lake Storage.
  • The shared access signature key or workspace managed identity that the Azure AD or SQL user reads the data with isn't authorized to list the files in storage.

Grant the Storage Blob Data Contributor role to your Synapse workspace's managed identity.

Go to Storage account => Access Control (IAM) => Add role assignment => select the role Storage Blob Data Contributor => select your workspace name as the member => click Review + assign.
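If you prefer to script that role assignment, here is a minimal Azure CLI sketch; the subscription, resource group, storage account, and principal ID values are placeholders for your own (the workspace managed identity's principal ID is shown on the workspace's Identity blade):

# Assign Storage Blob Data Contributor on the storage account
# to the Synapse workspace's managed identity (all values are placeholders)
az role assignment create \
    --assignee "<workspace-managed-identity-principal-id>" \
    --role "Storage Blob Data Contributor" \
    --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account>"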

OR

To get data from private containers when running from a pipeline, you need to authorize the data source's access to them. For this you can use an access key or shared access credentials; to hand such a credential to the data source you need to create a scoped credential. But you can't create scoped credentials in a Serverless SQL pool; you have to use a Dedicated SQL pool.

The workaround is therefore to use a Dedicated SQL pool.

-- Wrap the SAS token in a database scoped credential
CREATE DATABASE SCOPED CREDENTIAL SasToken
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = 'SAS token';
GO

-- Create an external data source that points at the storage account
-- and authenticates with that credential
CREATE EXTERNAL DATA SOURCE mysample1
WITH (
    LOCATION = 'storage account',  -- your storage account URL, e.g. https://<account>.blob.core.windows.net/<container>
    CREDENTIAL = SasToken
);

Now you can create a stored procedure that creates the view, and then execute it from a Stored procedure or Script activity in the pipeline.
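As a minimal sketch of that last step, using the serverless-style OPENROWSET syntax the question's view already relies on (a dedicated SQL pool would typically use an external table instead); the procedure name, view name, and folder path are hypothetical placeholders:

-- Hypothetical procedure that (re)creates the view over the parquet folder.
-- dbo.usp_RefreshParquetView, dbo.vwParquetData, and 'folder/*.parquet' are placeholders;
-- the BULK path is relative to the mysample1 data source's LOCATION.
CREATE OR ALTER PROCEDURE dbo.usp_RefreshParquetView
AS
BEGIN
    EXEC('
        CREATE OR ALTER VIEW dbo.vwParquetData AS
        SELECT *
        FROM OPENROWSET(
            BULK ''folder/*.parquet'',
            DATA_SOURCE = ''mysample1'',
            FORMAT = ''PARQUET''
        ) AS result;
    ');
END
GO

The pipeline's Stored procedure activity (or a Script activity running EXEC dbo.usp_RefreshParquetView;) can then call it; with the credential above authorizing the data source, the listing error 13807 should no longer occur.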

Standup answered 20/1/2023 at 9:33
Pratik, thank you very much for your time. What you proposed worked smoothly, and we will definitely take into account the Dedicated SQL pool workaround you mentioned at the end of your answer. Kind regards, Mateo – Recondite
