Copying storage data from one Azure account to another

I would like to copy a very large storage container from one Azure storage account into another (which also happens to be in another subscription).

I would like an opinion on the following options:

  1. Write a tool that connects to both storage accounts and copies blobs one at a time using CloudBlob's DownloadToStream() and UploadFromStream() (a sketch of this approach follows the list). This seems to be the worst option: it incurs transfer costs and will be quite slow, because the data has to come down to the machine running the tool and then be re-uploaded back to Azure.

  2. Write a worker role to do the same - this should in theory be faster and incur no transfer cost, since the data never leaves the data center. However, it's more work.

  3. Upload the tool to a running instance, bypassing a proper worker role deployment, and pray the tool finishes before the instance gets recycled/reset.

  4. Use an existing tool - I have not found anything suitable.
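
Here is roughly what option 1 would look like with the 1.x StorageClient library (a sketch only; account names, keys, and container names are placeholders):

using System.IO;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

class NaiveBlobCopy
{
    static void Main()
    {
        // Both accounts are addressed directly; every byte is pulled down to
        // this machine and pushed back up, which is the cost/speed concern.
        var source = new CloudStorageAccount(
                new StorageCredentialsAccountAndKey("sourceaccount", "<key1>"), true)
            .CreateCloudBlobClient().GetContainerReference("mycontainer");
        var target = new CloudStorageAccount(
                new StorageCredentialsAccountAndKey("destaccount", "<key2>"), true)
            .CreateCloudBlobClient().GetContainerReference("mycontainer");
        target.CreateIfNotExist();

        foreach (CloudBlob sourceBlob in source.ListBlobs(
            new BlobRequestOptions { UseFlatBlobListing = true }))
        {
            CloudBlob targetBlob = target.GetBlobReference(sourceBlob.Name);

            using (Stream stream = targetBlob.OpenWrite())
            {
                sourceBlob.DownloadToStream(stream);
            }
        }
    }
}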

Any suggestions on the approach?

Update: I just found out that this functionality has finally been introduced (REST APIs only for now) for all storage accounts created on July 7th, 2012 or later:

http://msdn.microsoft.com/en-us/library/windowsazure/dd894037.aspx
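
For reference, that functionality is the Copy Blob REST operation: an authenticated PUT against the destination blob whose x-ms-copy-source header names the source blob. Roughly like this (all account, container, and blob names below are placeholders; the source must be publicly readable or carry a shared access signature):

PUT https://destaccount.blob.core.windows.net/destcontainer/myblob HTTP/1.1
x-ms-version: 2012-02-12
x-ms-copy-source: https://sourceaccount.blob.core.windows.net/srccontainer/myblob
Authorization: SharedKey destaccount:<signature>
Date: <UTC date>
Content-Length: 0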

Debark answered 20/12, 2011 at 21:19 Comment(2)
Try the Azure Storage Synctool.Succussion
Azure Storage Synctool is a bit raw - it only supports storage-to-local (meaning I'd need two steps: download my entire container, then re-upload it - not a big deal), and more importantly, it does not resume - that could be a problem. The homegrown solution we ended up building (it was really easy) supports resuming, does storage-to-storage, and uses CopyFromBlob when source and destination are on the same account.Debark

Since there's no direct way to migrate data from one storage account to another, you'd need to do something like what you were thinking. If this is within the same data center, option #2 is the best bet, and will be the fastest (especially if you use an XL instance, giving you more network bandwidth).

As far as complexity, it's no more difficult to create this code in a worker role than it would be with a local application. Just run this code from your worker role's Run() method.

To make things more robust, you could list the blobs in your containers, then place specific file-move request messages into an Azure queue (and optimize by putting more than one object name per message). Then use a worker role thread to read from the queue and process objects. Even if your role is recycled, at worst you'd reprocess one message. For performance increase, you could then scale to multiple worker role instances. Once the transfer is complete, you simply tear down the deployment.
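
A minimal sketch of that queue-driven worker, using the 1.x StorageClient and ServiceRuntime libraries (the queue name, message format, and configuration setting names are assumptions, not part of this answer):

using System;
using System.IO;
using System.Threading;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.ServiceRuntime;
using Microsoft.WindowsAzure.StorageClient;

public class CopyWorkerRole : RoleEntryPoint
{
    public override void Run()
    {
        var sourceAccount = CloudStorageAccount.Parse(
            RoleEnvironment.GetConfigurationSettingValue("SourceConnectionString"));
        var destAccount = CloudStorageAccount.Parse(
            RoleEnvironment.GetConfigurationSettingValue("DestConnectionString"));

        CloudBlobClient sourceClient = sourceAccount.CreateCloudBlobClient();
        CloudBlobClient destClient = destAccount.CreateCloudBlobClient();

        CloudQueue queue = destAccount.CreateCloudQueueClient()
            .GetQueueReference("copy-requests");

        while (true)
        {
            // Hide the message long enough to finish one blob; if the role is
            // recycled mid-copy, the message reappears and is simply retried.
            CloudQueueMessage message = queue.GetMessage(TimeSpan.FromMinutes(10));
            if (message == null)
            {
                Thread.Sleep(1000);
                continue;
            }

            // Assumed message format: "container|blobName"
            string[] parts = message.AsString.Split('|');
            CloudBlob sourceBlob = sourceClient
                .GetContainerReference(parts[0]).GetBlobReference(parts[1]);
            CloudBlob destBlob = destClient
                .GetContainerReference(parts[0]).GetBlobReference(parts[1]);

            using (Stream target = destBlob.OpenWrite())
            {
                sourceBlob.DownloadToStream(target);
            }

            // Delete only after a successful copy, so failures are retried.
            queue.DeleteMessage(message);
        }
    }
}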

UPDATE - On June 12, 2012, the Windows Azure Storage API was updated, and now allows cross-account blob copy. See this blog post for all the details.

Bryanbryana answered 20/12, 2011 at 21:31 Comment(2)
One interesting thing to note is that if the containers are on the same storage account (which is unfortunately not the case for me), then there is a way to copy blobs without passing through the client - not sure which API does it, but running Fiddler while copying with Azure Storage Explorer shows the x-ms-copy-source header.Debark
OK, it's called "CopyFromBlob". One could check whether the source account matches the destination and use that method, otherwise fall back to the DownloadToStream/UploadFromStream combo.Debark
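
A minimal sketch of the branching described in that comment, using the 1.x StorageClient API (the client and blob reference variables are assumed to exist):

if (sourceClient.Credentials.AccountName == destClient.Credentials.AccountName)
{
    // Same account: server-side copy, the bytes never leave the data center.
    targetBlob.CopyFromBlob(sourceBlob);
}
else
{
    // Cross-account: stream the bytes through this machine.
    using (Stream stream = targetBlob.OpenWrite())
    {
        sourceBlob.DownloadToStream(stream);
    }
}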

You can also use AzCopy, which is part of the Azure SDK.

Just click the download button for Windows Azure SDK and choose WindowsAzureStorageTools.msi from the list to download AzCopy.

After installing, you'll find AzCopy.exe here: %PROGRAMFILES(X86)%\Microsoft SDKs\Windows Azure\AzCopy

You can get more information on using AzCopy in this blog post: AzCopy – Using Cross Account Copy Blob

You could also remote-desktop into an instance and use this utility for the transfer.
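
For example, copying a whole container between accounts looks roughly like this (account names, container names, and keys are placeholders; /S recurses through the container):

AzCopy /Source:https://sourceaccount.blob.core.windows.net/mycontainer /Dest:https://destaccount.blob.core.windows.net/mycontainer /SourceKey:<key1> /DestKey:<key2> /S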

Update:

You can also copy blob data between storage accounts using Microsoft Azure Storage Explorer. Reference link

Ophiology answered 19/6, 2013 at 9:16 Comment(0)

Here is some code that leverages the .NET SDK for Azure, available at http://www.windowsazure.com/en-us/develop/net:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Microsoft.WindowsAzure.StorageClient;
using System.IO;
using System.Net;

namespace benjguinAzureStorageTool
{
    class Program
    {
        private static Context context = new Context();

        static void Main(string[] args)
        {
            try
            {
                string usage = string.Format("Possible Usages:\n"
                + "benjguinAzureStorageTool CopyContainer account1SourceContainer account2SourceContainer account1Name account1Key account2Name account2Key\n"
                );


                if (args.Length < 1)
                    throw new ApplicationException(usage);

                int p = 1;

                switch (args[0])
                {
                    case "CopyContainer":
                        if (args.Length != 7) throw new ApplicationException(usage);
                        context.Storage1Container = args[p++];
                        context.Storage2Container = args[p++];
                        context.Storage1Name = args[p++];
                        context.Storage1Key = args[p++];
                        context.Storage2Name = args[p++];
                        context.Storage2Key = args[p++];

                        CopyContainer();
                        break;


                    default:
                        throw new ApplicationException(usage);
                }

                Console.BackgroundColor = ConsoleColor.Black;
                Console.ForegroundColor = ConsoleColor.Yellow;
                Console.WriteLine("OK");
                Console.ResetColor();
            }
            catch (Exception ex)
            {
                Console.WriteLine();
                Console.BackgroundColor = ConsoleColor.Black;
                Console.ForegroundColor = ConsoleColor.Yellow;
                Console.WriteLine("Exception: {0}", ex.Message);
                Console.ResetColor();
                Console.WriteLine("Details: {0}", ex);
            }
        }


        private static void CopyContainer()
        {
            CloudBlobContainer container1Reference = context.CloudBlobClient1.GetContainerReference(context.Storage1Container);
            CloudBlobContainer container2Reference = context.CloudBlobClient2.GetContainerReference(context.Storage2Container);
            if (container2Reference.CreateIfNotExist())
            {
                Console.WriteLine("Created destination container {0}. Permissions will also be copied.", context.Storage2Container);
                container2Reference.SetPermissions(container1Reference.GetPermissions());
            }
            else
            {
                Console.WriteLine("destination container {0} already exists. Permissions won't be changed.", context.Storage2Container);
            }


            // A flat listing with BlobListingDetails.All enumerates every blob
            // in the source container, including snapshots and metadata.
            foreach (var b in container1Reference.ListBlobs(
                new BlobRequestOptions(context.DefaultBlobRequestOptions)
                { UseFlatBlobListing = true, BlobListingDetails = BlobListingDetails.All }))
            {
                var sourceBlobReference = context.CloudBlobClient1.GetBlobReference(b.Uri.AbsoluteUri);
                var targetBlobReference = container2Reference.GetBlobReference(sourceBlobReference.Name);

                Console.WriteLine("Copying {0}\n to\n{1}",
                    sourceBlobReference.Uri.AbsoluteUri,
                    targetBlobReference.Uri.AbsoluteUri);

                // The bytes stream through this machine: download from the
                // source account, re-upload to the target account.
                using (Stream targetStream = targetBlobReference.OpenWrite(context.DefaultBlobRequestOptions))
                {
                    sourceBlobReference.DownloadToStream(targetStream, context.DefaultBlobRequestOptions);
                }
            }
        }
    }
}
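
The Context class the program depends on is not included in the answer. A minimal sketch of what it would need to provide, inferred from the calls above (the property names are dictated by the usage; the client wiring and timeout are assumptions):

using System;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

namespace benjguinAzureStorageTool
{
    class Context
    {
        public string Storage1Container { get; set; }
        public string Storage2Container { get; set; }
        public string Storage1Name { get; set; }
        public string Storage1Key { get; set; }
        public string Storage2Name { get; set; }
        public string Storage2Key { get; set; }

        // Shared request options; the ten-minute timeout is an arbitrary choice.
        public BlobRequestOptions DefaultBlobRequestOptions =
            new BlobRequestOptions { Timeout = TimeSpan.FromMinutes(10) };

        public CloudBlobClient CloudBlobClient1
        {
            get { return CreateClient(Storage1Name, Storage1Key); }
        }

        public CloudBlobClient CloudBlobClient2
        {
            get { return CreateClient(Storage2Name, Storage2Key); }
        }

        private static CloudBlobClient CreateClient(string name, string key)
        {
            var account = new CloudStorageAccount(
                new StorageCredentialsAccountAndKey(name, key), true);
            return account.CreateCloudBlobClient();
        }
    }
}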
Germ answered 24/12, 2011 at 11:15 Comment(1)
Thanks! I had the solution coded up already, but it's always nice to see some code shared out. Plus, I was creating a MemoryStream - it's cleaner (and more efficient) to do an OpenWrite() on the target blob.Debark

It's very simple with AzCopy. Download the latest version from https://azure.microsoft.com/en-us/documentation/articles/storage-use-azcopy/ and run it from a command prompt.

Copy a blob within a storage account:

AzCopy /Source:https://myaccount.blob.core.windows.net/mycontainer1 /Dest:https://myaccount.blob.core.windows.net/mycontainer2 /SourceKey:key /DestKey:key /Pattern:abc.txt

Copy a blob across storage accounts:

AzCopy /Source:https://sourceaccount.blob.core.windows.net/mycontainer1 /Dest:https://destaccount.blob.core.windows.net/mycontainer2 /SourceKey:key1 /DestKey:key2 /Pattern:abc.txt

Copy a blob from the secondary region

If your storage account has read-access geo-redundant storage enabled, then you can copy data from the secondary region.

Copy a blob to the primary account from the secondary:

AzCopy /Source:https://myaccount1-secondary.blob.core.windows.net/mynewcontainer1 /Dest:https://myaccount2.blob.core.windows.net/mynewcontainer2 /SourceKey:key1 /DestKey:key2 /Pattern:abc.txt
Weakfish answered 16/11, 2015 at 9:23 Comment(0)

I'm a Microsoft Technical Evangelist and I have developed a free sample tool (no support/no guarantee) to help in these scenarios.

The binaries and source-code are available here: https://blobtransferutility.codeplex.com/

The Blob Transfer Utility is a GUI tool to upload and download thousands of small/large files to/from Windows Azure Blob Storage.

Features:

  • Create batches to upload/download
  • Set the Content-Type
  • Transfer files in parallel
  • Split large files in smaller parts that are transferred in parallel

The 1st and 3rd features are the answer to your problem.

You can learn from the sample code how I did it, or you can simply run the tool and do what you need to do.

Hansom answered 11/3, 2013 at 15:4 Comment(0)

  1. Write your tool as a simple .NET command-line or WinForms application.

  2. Create and deploy a dummy web/worker role with RDP enabled.

  3. Log in to the machine via RDP.

  4. Copy your tool over the RDP connection.

  5. Run the tool on the remote machine.

  6. Delete the deployed role.

Like you, I am not aware of any off-the-shelf tool that supports a copy-between function. You might consider simply installing Cloud Storage Studio in the role, dumping the blobs to disk, and then re-uploading them. http://cerebrata.com/Products/CloudStorageStudiov2/Details.aspx?t1=0&t2=7

Murdock answered 20/12, 2011 at 21:33 Comment(3)
Not sure why you'd propose RDP. Aside from manual operation being required, you'd need to deploy your WinForms app with your deployment. A simple worker role running a straightforward task is simpler and can scale to multiple instances for faster performance.Bryanbryana
The original poster specifically noted that he'd find it quicker to write a 'tool' than a worker role. I agree that I'd usually write a worker to do this, but in this case I tried to provide an answer that paid heed to that constraint. You wouldn't need to deploy the WinForms app; you'd literally just drag it across the RDP link, as you would when connecting to and working with an on-premise server.Murdock
Actually, I ended up doing pretty much that - copy-pasting my WinForms tool onto a running web role and running it from there. Even so, it was not lightning fast: it took about 22 hours to copy some 400,000 items (mostly medium-sized JPEGs) - 46.5 GB worth of data - where I would have expected no more than 5 hours.Debark

You could use 'Azure Storage Explorer' (free) or some other such tool. These tools provide a way to download and upload content. You will need to create containers and tables manually - and of course this will incur a transfer cost - but if you are short on time and your content is of reasonable size, this is a viable option.

Crackdown answered 9/1, 2012 at 23:10 Comment(1)
As originally stated, the container that I had to copy was very large - Azure Storage Explorer does not fare well in that scenario. I ended up writing a custom tool - it was very straightforward - and I ran it from the Azure web role to avoid the transfer cost.Debark

I recommend using azcopy; you can copy an entire storage account, a container, a directory, or a single blob. Here is an example of cloning an entire storage account:

azcopy copy 'https://{SOURCE_ACCOUNT}.blob.core.windows.net{SOURCE_SAS_TOKEN}' 'https://{DESTINATION_ACCOUNT}.blob.core.windows.net{DESTINATION_SAS_TOKEN}' --recursive

You can get a SAS token from the Azure Portal. Navigate to the storage account overview (for both source and destination), then in the side navigation click "Shared access signature" and generate your own.
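
Alternatively, the Azure CLI can generate an account-level SAS from the command line (a sketch; the account name, key, and expiry are placeholders, and the destination token needs write/create permissions rather than just read/list):

az storage account generate-sas --account-name mystorageaccount --account-key <key> --services b --resource-types sco --permissions rl --expiry 2021-12-31T23:59Z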

More examples here

Paley answered 5/5, 2021 at 18:4 Comment(0)

I had to do something similar to move 600 GB of content from a local file system to Azure Storage. After a couple of iterations of code, I ended up taking 'Azure Storage Explorer' and extending it: the ability to select folders instead of just files, recursive drilling into the selected folders, and loading a list of source/destination copy statements into an Azure queue. I then extended the upload section in 'Azure Storage Explorer' to pull from the queue and execute the copy operations.

Then I launched about 10 instances of the tool, each pulling from the queue and executing copy operations, and moved the 600 GB of items in just over 2 days. I also added smarts to use file modification timestamps: files that had already been copied are skipped, and files already in sync are never enqueued. Now I can run "updates" or syncs across the whole library of content within an hour or two.

Antisana answered 13/7, 2012 at 0:7 Comment(0)

Try CloudBerry Explorer. It copies blobs within and between subscriptions.

For copying between subscriptions, change the storage container's access level from Private to Public Blob.

The copy process took a few hours to complete, and it continues even if you reboot your machine. You can check its status by refreshing the target container in the Azure management UI and watching the timestamp, which keeps updating until the copy completes.


Bainbrudge answered 27/6, 2016 at 3:41 Comment(0)
