Failing to upload larger blobs to Azure: azure.core.exceptions.ServiceRequestError: The operation did not complete (write) (_ssl.c:2317)

I'm trying to upload some larger blobs (>50MB) to my Azure storage container using the Python SDK:

import os

from azure.storage.blob import BlobServiceClient

connect_str = os.environ['AZURE_STORAGE_CONNECTION_STRING']
blob_service_client = BlobServiceClient.from_connection_string(connect_str)

def upload_blob(file_path):
    if os.path.exists(file_path):
        with open(file_path, 'rb') as data:
            blob_client = blob_service_client.get_blob_client(container='foo', blob=file_path)

            print(f"Uploading file {file_path} to blob storage...")
            print(os.path.getsize(file_path))
            return blob_client.upload_blob(data, length=os.path.getsize(file_path))
    else:
        print(f"File {file_path} not found. Please store the file first before uploading")
        return False

When I run this, however, I get an azure.core.exceptions.ServiceRequestError:

Traceback (most recent call last):
  File "C:/Users/.../storage_controller.py", line 96, in <module>
    upload_blob(config.VECTORIZER_PATH)
  File "C:/Users/.../storage_controller.py", line 34, in upload_blob
    return blob_client.upload_blob(data, length=os.path.getsize(file_path))
  File "C:\Users\...\venv\lib\site-packages\azure\core\tracing\decorator.py", line 83, in wrapper_use_tracer
    return func(*args, **kwargs)
  File "C:\Users\...\venv\lib\site-packages\azure\storage\blob\_blob_client.py", line 496, in upload_blob
    return upload_block_blob(**options)
  File "C:\Users\...\venv\lib\site-packages\azure\storage\blob\_upload_helpers.py", line 104, in upload_block_blob
    **kwargs)
  File "C:\Users\...\venv\lib\site-packages\azure\storage\blob\_generated\operations\_block_blob_operations.py", line 207, in upload
    pipeline_response = self._client._pipeline.run(request, stream=False, **kwargs)
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\_base.py", line 211, in run
    return first_node.send(pipeline_request)  # type: ignore
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\_base.py", line 71, in send
    response = self.next.send(request)
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\_base.py", line 71, in send
    response = self.next.send(request)
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\_base.py", line 71, in send
    response = self.next.send(request)
  [Previous line repeated 4 more times]
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\policies\_redirect.py", line 157, in send
    response = self.next.send(request)
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\_base.py", line 71, in send
    response = self.next.send(request)
  File "C:\Users\...\venv\lib\site-packages\azure\storage\blob\_shared\policies.py", line 515, in send
    raise err
  File "C:\Users\...\venv\lib\site-packages\azure\storage\blob\_shared\policies.py", line 489, in send
    response = self.next.send(request)
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\_base.py", line 71, in send
    response = self.next.send(request)
  File "C:\Users\...\venv\lib\site-packages\azure\storage\blob\_shared\policies.py", line 290, in send
    response = self.next.send(request)
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\_base.py", line 71, in send
    response = self.next.send(request)
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\_base.py", line 71, in send
    response = self.next.send(request)
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\_base.py", line 103, in send
    self._sender.send(request.http_request, **request.context.options),
  File "C:\Users\...\venv\lib\site-packages\azure\storage\blob\_shared\base_client.py", line 312, in send
    return self._transport.send(request, **kwargs)
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\transport\_requests_basic.py", line 284, in send
    raise error
azure.core.exceptions.ServiceRequestError: The operation did not complete (write) (_ssl.c:2317)

I tried a couple of things and found some suggestions for chunking and for using put_blob methods to handle larger files, but these don't seem to be available in the current version of the SDK, which is supposed to handle larger files by itself. Smaller files (e.g. .txt files with one line) upload absolutely fine, however. Is this an issue with the Azure SDK, or is my own networking/SSL misconfigured, and how can I resolve it?

Thanks in advance!

Aridatha answered 2/7, 2020 at 11:56 Comment(2)
You mention it might be your own networking. Perhaps add some info on how that's set up, if it's not a direct connection (corporate LAN etc.). - Amontillado
If you want to upload the blob in chunks, please refer to github.com/Azure/azure-sdk-for-python/blob/… - Deponent

To summarize the solution:

If you want to upload a file to Azure Blob Storage in chunks with the azure.storage.blob package, you can use the method BlobClient.stage_block to upload each chunk. After uploading, call BlobClient.commit_block_list to assemble the staged chunks into one blob.

For example:

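import uuid

from azure.storage.blob import BlobBlock, BlobServiceClient

# Instantiate a new BlobServiceClient using a connection string
# (connection_string is your storage account connection string)
blob_service_client = BlobServiceClient.from_connection_string(connection_string)
# Instantiate a new ContainerClient (the container name is left blank here; use your own)
container_client = blob_service_client.get_container_client('')
blob_client = container_client.get_blob_client("csvfile.csv")

# Upload the file block by block
block_list = []
# 4 MiB per block. A block blob holds at most 50,000 blocks,
# so 1 KiB chunks would cap out at about 51 MB.
chunk_size = 4 * 1024 * 1024
with open('csvfile.csv', 'rb') as f:
    while True:
        read_data = f.read(chunk_size)
        if not read_data:
            break  # done
        # Use the same ID for staging and for the commit list
        blk_id = str(uuid.uuid4())
        blob_client.stage_block(block_id=blk_id, data=read_data)
        block_list.append(BlobBlock(block_id=blk_id))

# Commit the staged blocks, in order, as a single blob
blob_client.commit_block_list(block_list)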

For more details, please refer to here
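
One thing to watch with chunked uploads: a block blob can hold at most 50,000 committed blocks, so the chunk size limits how large a file you can upload this way. Below is a rough sketch for choosing a chunk size. The helper names blocks_needed and pick_chunk_size and the 4 MiB default are my own assumptions, not part of the SDK, and the sketch does not check the service's maximum block size.

```python
MAX_BLOCKS = 50_000  # Azure limit on committed blocks in one block blob


def blocks_needed(file_size: int, chunk_size: int) -> int:
    """Number of blocks a file of file_size bytes is split into."""
    return (file_size + chunk_size - 1) // chunk_size  # ceiling division


def pick_chunk_size(file_size: int, preferred: int = 4 * 1024 * 1024) -> int:
    """Smallest chunk size >= preferred that keeps the upload within MAX_BLOCKS."""
    smallest_allowed = (file_size + MAX_BLOCKS - 1) // MAX_BLOCKS
    return max(preferred, smallest_allowed)


if __name__ == "__main__":
    size = 50 * 1024 * 1024  # a 50 MiB file
    print(blocks_needed(size, 1024))                   # 51200, over the limit
    print(blocks_needed(size, pick_chunk_size(size)))  # 13
```

You could pass the result of pick_chunk_size(os.path.getsize(path)) as the chunk size in the read loop, so that very large files still fit within the block limit.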

Deponent answered 7/7, 2020 at 2:2 Comment(2)
I submitted 2 edits. The break should come right after read_data; otherwise it tries to stage an empty block. I also reduced ID creation to one line; otherwise you create one unique ID when you stage and another, inconsistent ID when you append to the list. With my changes, the code works for me. Thanks for posting! Great answer! - Duckling
Thanks for sharing this. Most Google search results show outdated versions of the SDK that won't work. - Poco
