Checking if a blob exists in Azure Storage
I've got a very simple question (I hope!) - I just want to find out if a blob (with a name I've defined) exists in a particular container. I'll be downloading it if it does exist, and if it doesn't then I'll do something else.

I've done some searching on the intertubes and apparently there used to be a function called DoesExist or something similar... but as with so many of the Azure APIs, this no longer seems to be there (or if it is, has a very cleverly disguised name).

Mintz answered 15/4, 2010 at 5:23 Comment(1)
Thanks everyone. As I'm using StorageClient (and would prefer to keep all my Azure Storage access going through that library) I went with the FetchAttributes-and-check-for-exceptions method that smarx suggested. It does 'feel' a bit off, in that I don't like having exceptions thrown as a normal part of my business logic - but hopefully this can be fixed in a future StorageClient version :) - Mintz
D
234

The new API has an .Exists() method. Just make sure that you use GetBlockBlobReference, which only builds a local reference and doesn't call the server; Exists() is what makes the request. That makes the function as easy as:

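public static bool BlobExistsOnCloud(CloudBlobClient client,
    string containerName, string key)
{
    // The two Get...Reference calls only build local references;
    // Exists() is the call that actually contacts the storage service.
    return client.GetContainerReference(containerName)
                 .GetBlockBlobReference(key)
                 .Exists();
}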
Dentiform answered 10/5, 2013 at 14:58 Comment(12)
Is there .. a ... python version? - Highborn
@MyName Well, python is not my language of choice, so I'm not familiar with it. However, looking at the open source SDK for Python, it looks like you could use the list_blobs function and give it the "prefix" that is the entire filename. Granted, you will have to protect yourself from cases such as "File1" and "File11" (where the blob name is the prefix of another blob). But that's an option I see available. - Dentiform
@Dentiform Ah interesting, thank you for the suggestion, I'll give it a shot when the next opportunity arises. I actually ended up doing try {} catch {} in python (try: except: pass), which still doesn't feel right :( - Highborn
Wonder what you get charged for checking blob exists? This defo seems like a better way to go than attempting to download the blob. - Madelenemadelin
@anpatel, python version: len(blob_service.list_blobs(container_name, file_name)) > 0 - Roundup
If it doesn't make the call to the server, how does it know for sure the blob exists / not? - Penstock
.Exists() calls off to the server. GetBlockBlobReference() does not. - Dentiform
You may want to update your answer to say which NuGet package should be installed. - Incuse
NOTE: As of Microsoft.WindowsAzure.Storage version 8.1.4.0 (.NET Framework v4.6.2), the Exists() method is not available; use ExistsAsync() instead. This is the version that installs for .NET Core projects. - Silicon
What happens here if the container doesn't exist? - Minnaminnaminnie
Hi, thanks for your answer. Is there a JavaScript version of the above check? - Mikkanen
It should be noted that this call to Exists() might throw a StorageException if you do not have permission to do something on the container level, even though working with the CloudBlobReference to retrieve or upload blobs is perfectly possible. - Secrete
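The "File1" vs "File11" caveat from the comments above can be sketched in a few lines. This is only an illustration of the exact-match guard: fake_list_blobs is a hypothetical stand-in for a prefix listing call, not the real SDK function, and the blob names are made up.

```python
from typing import Iterable, List


def fake_list_blobs(all_names: Iterable[str], prefix: str) -> List[str]:
    """Hypothetical stand-in for a prefix listing: returns names starting with prefix."""
    return [name for name in all_names if name.startswith(prefix)]


def blob_exists_by_listing(listed_names: Iterable[str], blob_name: str) -> bool:
    """Return True only if blob_name appears exactly among the listed names."""
    return any(name == blob_name for name in listed_names)


# The container holds "File11" and "File12", but not "File1".
container = ["File11", "File12"]
listed = fake_list_blobs(container, "File1")

print(len(listed) > 0)                          # True: the naive length check is fooled
print(blob_exists_by_listing(listed, "File1"))  # False: the exact-name comparison is not
```

With a real listing call, the same exact-name comparison would be applied to whatever names the listing returns.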
T
51

Note: This answer is out of date now. Please see Richard's answer for an easy way to check for existence

No, you're not missing something simple... we did a good job of hiding this method in the new StorageClient library. :)

I just wrote a blog post to answer your question: http://blog.smarx.com/posts/testing-existence-of-a-windows-azure-blob.

The short answer is: use CloudBlob.FetchAttributes(), which does a HEAD request against the blob.

Textbook answered 16/4, 2010 at 5:25 Comment(5)
FetchAttributes() takes a long time to run (in development storage at least) if the file hasn't been fully committed yet, i.e. just consists of uncommitted blocks. - Ja
If you are going to fetch the blob anyway like the OP intends to do, why not try and download the content right away? If it's not there it will throw just like FetchAttributes. Doing this check first is just an extra request, or am I missing something? - Pluviometer
Marnix makes an excellent point. If you're going to download it anyway, just try to download it. - Textbook
@Marnix: If you call something like OpenRead it won't throw or return an empty Stream or anything like that. You'll only get errors when you start downloading from it. It's a lot easier to handle this all in one place :) - Impi
@Porges: designing cloud applications is all about "design for failure". There are a lot of discussions about how to handle this situation properly. But in general, I would also just download it and then handle the missing-blob errors. Not only that, but if I check for existence on every blob, I increase the number of storage transactions and thus my bill. You can still have one place for handling exceptions/errors. - Wristlet
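The "just try to download it" approach from the comments above, as a rough sketch. Everything here is hypothetical: BlobNotFound stands in for whatever "not found" error your client library raises, and fake_download stands in for the real download call. The point is only that one request covers both the existence check and the download.

```python
from typing import Callable, Dict, Optional


class BlobNotFound(Exception):
    """Hypothetical stand-in for the client library's 'blob not found' error."""


def download_or_default(download: Callable[[str], bytes], name: str,
                        default: Optional[bytes] = None) -> Optional[bytes]:
    """Attempt the download directly; treat 'not found' as a normal outcome."""
    try:
        return download(name)
    except BlobNotFound:
        # A missing blob is handled here, in one place; other errors propagate.
        return default


# In-memory fake storage, used only to make the sketch runnable.
_fake_store: Dict[str, bytes] = {"report.csv": b"a,b\n1,2\n"}


def fake_download(name: str) -> bytes:
    try:
        return _fake_store[name]
    except KeyError:
        raise BlobNotFound(name) from None
```

Calling download_or_default(fake_download, "missing.csv") returns None without a separate existence request, so each blob costs one storage transaction instead of two.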
S
17

Seems lame that you need to catch an exception to test if the blob exists.

public static bool Exists(this CloudBlob blob)
{
    try
    {
        blob.FetchAttributes();
        return true;
    }
    catch (StorageClientException e)
    {
        if (e.ErrorCode == StorageErrorCode.ResourceNotFound)
        {
            return false;
        }
        else
        {
            throw;
        }
    }
}
Seller answered 4/5, 2010 at 21:10 Comment(0)
H
10

If the blob is public you can, of course, just send an HTTP HEAD request -- from any of the zillions of languages/environments/platforms that know how to do that -- and check the response.

The core Azure APIs are RESTful XML-based HTTP interfaces. The StorageClient library is one of many possible wrappers around them. Here's another that Sriram Krishnan did in Python:

http://www.sriramkrishnan.com/blog/2008/11/python-wrapper-for-windows-azure.html

It also shows how to authenticate at the HTTP level.

I've done a similar thing for myself in C#, because I prefer to see Azure through the lens of HTTP/REST rather than through the lens of the StorageClient library. For quite a while I hadn't even bothered to implement an ExistsBlob method. All my blobs were public, and it was trivial to do HTTP HEAD.
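For what it's worth, here is roughly what that HEAD check might look like in Python using only the standard library. It is a minimal sketch that assumes the blob is publicly readable; the function name is mine, and treating only 200 as "exists" and only 404 as "missing" is an assumption you may want to adjust.

```python
import urllib.error
import urllib.request


def public_blob_exists(url: str, timeout: float = 10.0) -> bool:
    """Send an HTTP HEAD request and report whether the blob URL answers 200."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; only 404 means "not there".
        if err.code == 404:
            return False
        raise  # 403, 500, etc. are real problems, not "missing"
```

You would call it with the blob's full URL, which for a public blob has the form https://<account>.blob.core.windows.net/<container>/<blob>.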

Hyps answered 17/4, 2010 at 16:49 Comment(0)
V
8

Here's a different solution if you don't like the other solutions:

I am using version 12.4.1 of the Azure.Storage.Blobs NuGet Package.

I get an Azure.Pageable<BlobItem> that enumerates all of the blobs in a container, then use LINQ to check whether any BlobItem's Name property equals the blob name I'm looking for (assuming the container name is valid and the container exists).

using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;
using System.Linq;
using System.Text.RegularExpressions;

public class AzureBlobStorage
{
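    private readonly BlobServiceClient _blobServiceClient;

    // Connection string used to build the service client.
    public string ConnectionString { get; }

    public AzureBlobStorage(string connectionString)
    {
        this.ConnectionString = connectionString;
        _blobServiceClient = new BlobServiceClient(this.ConnectionString);
    }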

    public bool IsContainerNameValid(string name)
    {
        return Regex.IsMatch(name, "^[a-z0-9](?!.*--)[a-z0-9-]{1,61}[a-z0-9]$", RegexOptions.Singleline | RegexOptions.CultureInvariant);
    }

    public bool ContainerExists(string name)
    {
        return (IsContainerNameValid(name) ? _blobServiceClient.GetBlobContainerClient(name).Exists() : false);
    }

    public Azure.Pageable<BlobItem> GetBlobs(string containerName, string prefix = null)
    {
        // Returns null when the container name is invalid or the container does not exist.
        return ContainerExists(containerName)
            ? _blobServiceClient.GetBlobContainerClient(containerName)
                .GetBlobs(BlobTraits.All, BlobStates.All, prefix, default(System.Threading.CancellationToken))
            : null;
    }

    public bool BlobExists(string containerName, string blobName)
    {
        // GetBlobs returns null for an invalid or missing container,
        // so guard against that before enumerating.
        var blobs = GetBlobs(containerName);
        return blobs != null && blobs.Any(b => b.Name == blobName);
    }
}

Hopefully this helps someone in the future.

Viator answered 7/4, 2020 at 14:23 Comment(0)
O
6

The new Windows Azure Storage Library already contains the Exists() method. It's in Microsoft.WindowsAzure.Storage.dll.

Available as NuGet Package
Created by: Microsoft
Id: WindowsAzure.Storage
Version: 2.0.5.1

See also msdn

Oppidan answered 3/5, 2013 at 9:31 Comment(0)
V
4

This is the way I do it. Showing full code for those who need it.

        // Parse the connection string and return a reference to the storage account.
        CloudStorageAccount storageAccount = CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("AzureBlobConnectionString"));

        CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();

        // Retrieve reference to a previously created container.
        CloudBlobContainer container = blobClient.GetContainerReference("ContainerName");

        // Retrieve reference to a blob named "test.csv"
        CloudBlockBlob blockBlob = container.GetBlockBlobReference("test.csv");

        if (blockBlob.Exists())
        {
          //Do your logic here.
        }
Vicinage answered 26/2, 2018 at 14:33 Comment(0)
D
3

If your blob is public and you need just metadata:

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD";
        string code = "";
        try
        {
            HttpWebResponse response = (HttpWebResponse)request.GetResponse();
            code = response.StatusCode.ToString();
        }
        catch 
        {
        }

        return code; // if "OK" blob exists
Damiondamita answered 31/12, 2018 at 11:57 Comment(0)
B
3

With Azure Blob storage library v12, you can use BlobBaseClient.Exists()/BlobBaseClient.ExistsAsync()

Answered on another similar question: https://mcmap.net/q/152505/-check-if-file-exists-on-blob-storage-with-azure-functions

Bertiebertila answered 7/8, 2020 at 1:39 Comment(0)
G
3

Java version of the same (using the new v12 SDK)

This uses the Shared Key Credential authorization (account access key)

public void downloadBlobIfExists(String accountName, String accountKey, String containerName, String blobName) {
    // create a storage client using creds
    StorageSharedKeyCredential credential = new StorageSharedKeyCredential(accountName, accountKey);
    String endpoint = String.format(Locale.ROOT, "https://%s.blob.core.windows.net", accountName);
    BlobServiceClient storageClient = new BlobServiceClientBuilder().credential(credential).endpoint(endpoint).buildClient();

    BlobContainerClient container = storageClient.getBlobContainerClient(containerName);
    BlobClient blob = container.getBlobClient(blobName);
    if (blob.exists()) {
        // download blob
    } else {
        // do something else
    }
}
Garlicky answered 4/9, 2020 at 8:47 Comment(2)
This only tests if a container exists - Froward
@Froward Thank you for pointing that out. I seem to have pasted my answer for an incorrect question. I have updated the answer. Please take a look. - Garlicky
G
2

If you don't like using the exception method then the basic C# version of what judell suggests is below. Beware, though, that you really ought to handle other possible responses too.

HttpWebRequest myReq = (HttpWebRequest)WebRequest.Create(url);
myReq.Method = "HEAD";
HttpWebResponse myResp = (HttpWebResponse)myReq.GetResponse();
if (myResp.StatusCode == HttpStatusCode.OK)
{
    return true;
}
else
{
    return false;
}
Granddaughter answered 18/11, 2011 at 8:49 Comment(2)
HttpWebRequest.GetResponse throws an exception if there's a 404. So I don't see how your code would circumvent the need to handle exceptions? - Kendakendal
Fair point. Seems rubbish to me that GetResponse() throws at that point! I would expect it to return the 404 as that is the response!!! - Granddaughter
J
2

With the updated SDK, once you have the CloudBlobReference you can call Exists() on your reference.

See http://msdn.microsoft.com/en-us/library/microsoft.windowsazure.storage.blob.cloudblockblob.exists.aspx

Juvenal answered 5/8, 2013 at 22:16 Comment(0)
S
2

Although most answers here are technically correct, most code samples are making synchronous/blocking calls. Unless you're bound by a very old platform or code base, HTTP calls should always be done asynchronously, and the SDK fully supports it in this case. Just use ExistsAsync() instead of Exists().

bool exists = await client.GetContainerReference(containerName)
    .GetBlockBlobReference(key)
    .ExistsAsync();
Shaeffer answered 17/10, 2018 at 16:6 Comment(3)
You're correct, the old .Exists() is not the best option. However, while the old API is synchronous, using await causes ExistsAsync to also be synchronous. So, I would agree that HTTP calls should usually be asynchronous. But this code isn't that. Still, +1 for the new API! - Dentiform
Thanks, but I couldn't disagree more. Exists() is synchronous in that it blocks a thread until it completes. await ExistsAsync() is asynchronous in that it does not. Both follow the same logical flow in that the next line of code doesn't begin until the previous one is done, but it's the nonblocking nature of ExistsAsync that makes it asynchronous. - Shaeffer
And... I've learned something new! :) softwareengineering.stackexchange.com/a/183583/38547 - Dentiform
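The blocking vs. non-blocking distinction in the comments above can be illustrated by analogy in Python, whose asyncio follows the same await model. This is only a sketch: fake_exists_async is a hypothetical stand-in for an ExistsAsync-style call, with asyncio.sleep simulating the network wait, and the blob names are made up.

```python
import asyncio
import time
from typing import List


async def fake_exists_async(name: str, delay: float = 0.2) -> bool:
    """Hypothetical stand-in for an ExistsAsync-style call."""
    # Awaiting yields the thread while "the network" responds; nothing is blocked.
    await asyncio.sleep(delay)
    return name.startswith("present")


async def check_all(names: List[str]) -> List[bool]:
    # Within each coroutine the code still runs top to bottom, but because
    # each wait releases the thread, the three checks can overlap.
    return await asyncio.gather(*(fake_exists_async(n) for n in names))


def main() -> None:
    start = time.perf_counter()
    results = asyncio.run(check_all(["present-a", "missing-b", "present-c"]))
    elapsed = time.perf_counter() - start
    print(results, f"{elapsed:.2f}s")


if __name__ == "__main__":
    main()
```

A blocking version with time.sleep in place of asyncio.sleep would hold the thread for each wait in turn, which is the difference Shaeffer describes.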
