azure table storage Export Data to Flat or XML File for SQL
Asked Answered
C

4

5

I am looking for capability to Export data from SQL Azure Azure Table Storage to Some Flat file or XML file, so that we can import that to SQL / MYSQL Server and do the data analysis.

What happens is we get lot of transaction from our text API per hour, ( Partition Strategy is hourly), since Azure table storage doesn't support the count and Sum. So for data analysis puropose we want to export the data to Flat Files and import to SQL Server/MySQL to analyze it.

Question:

  1. Any util or Tutorial how to export data from Azure table storage to Flat file? Partition by Patition basis and fast? Any sample code or Utility? Any proven example / tutorial?

  2. MS SQL and Azure Table, any tool or utility which can automatically extract and load data to SQL sever every hour? Any such offering or export tool on road map by Microsoft?

Any support will be highly appreciated.

Collinear answered 9/6, 2013 at 12:0 Comment(0)
D
5

If you're looking for a tool to export data from Azure table storage to a flat file, may I suggest you take a look at Cerebrata's Azure Management Studio (Commercial, NOT Free) or ClumsyLeaf's TableXplorer (Commercial, NOT Free). Both of these tools have the capability to export data into CSV and XML file format.

Since both of the tools are GUI based, I don't think you can automate the export process. For automation, I would suggest you look into Cerebrata's Azure Management Cmdlets as it provides a PowerShell based interface to export data into CSV or XML format.

Since I was associated with Cerebrata in the past, I can only talk about that. The tool won't export on a partition-by-partition basis but if you know the all the PartitionKey values in your table, you can specify a query to export data for each partition.

If automation is one of the key requirement, you could simply write a console application which runs once per hour and extracts the data for the past hour. You could use .Net Storage Client library to fetch the data. To do so, first define a class which derives from TableEntity class. Something like below:

public class CustomEntity : TableEntity
{
    public string Attribute1
    {
        get;
        set;
    }

    public string Attribute2
    {
        get;
        set;
    }

    public string AttributeN
    {
        get;
        set;
    }

    public static string GetHeaders(string delimiter)
    {
        return "\"Attribute1\"" + delimiter + "\"Attribute2\"" + delimiter + "\"AttributeN\"";
    }

    public string ToDelimited(string delimiter)
    {
        return "\"" + Attribute1 + "\"" + delimiter + "\"" + Attribute2 + "\"" + delimiter + "\"" + AttributeN + "\"";
    }
}

Then your application could query the table storage on an hourly basis and save the data to a file:

    DateTime currentDateTime = DateTime.UtcNow;
    //Assuming the PartitionKey follows the following strategy for naming: YYYYMMDDHH0000
    var fromPartitionKey = currentDateTime.AddHours(-1).ToString("YYYYmmDDHH0000");
    var toPartitionKey = currentDateTime.ToString("YYYYmmDDHH0000");
    var filterExpression = string.Format("PartitionKey ge '{0}' and PartitionKey lt '{1}'", fromPartitionKey, toPartitionKey);
    var tableName = "<your table name>";
    var cloudStorageAccount = new CloudStorageAccount(new StorageCredentials("<account name>", "<account key>"), true);
    var cloudTableClient = cloudStorageAccount.CreateCloudTableClient();
    var table = cloudTableClient.GetTableReference(tableName);
    TableQuery<CustomEntity> query = new TableQuery<CustomEntity>()
    {
        FilterString = filterExpression,
    };

    var entities = table.ExecuteQuery<CustomEntity>(query).ToList();
    if (entities.Count > 0)
    {
        StringBuilder sb = new StringBuilder();
        sb.Append(CustomEntity.GetHeaders(",") + "\n");
        foreach (var entity in entities)
        {
            sb.Append(entity.ToDelimited(",") + "\n");
        }
        var fileContents = sb.ToString();
        //Now write this string to a file.
    }

As far as importing this data into a relational database, I'm pretty sure if you look around you'll find many utilities which will be able to do that.

Dunnock answered 9/6, 2013 at 14:44 Comment(0)
S
12

You can use Azure Storage Explorer. It is free and supported by Microsoft. Browse to the appropriate storage account, click on the table storage you want to export and look for the export option in the explorer. Table storage export

Sharitasharity answered 22/7, 2020 at 7:9 Comment(2)
As at 2020, this should be the accepted answer. I would also add that the Cloud Explorer in Visual Studio 2019 allows to "Export to CSV file" as well: open the storage table (double click from the Cloud Explorer) and then the option is available at the far right of the toolbar at the top of the table.Nervy
What about the performance? I am trying with the azcopy v7.x, but it seems like it gets stuck after <1h. The tables I want to export are on the range of 10-18M records. I know the Azure Storage Explorer uses the azcopy tool itself.Lariat
D
5

If you're looking for a tool to export data from Azure table storage to a flat file, may I suggest you take a look at Cerebrata's Azure Management Studio (Commercial, NOT Free) or ClumsyLeaf's TableXplorer (Commercial, NOT Free). Both of these tools have the capability to export data into CSV and XML file format.

Since both of the tools are GUI based, I don't think you can automate the export process. For automation, I would suggest you look into Cerebrata's Azure Management Cmdlets as it provides a PowerShell based interface to export data into CSV or XML format.

Since I was associated with Cerebrata in the past, I can only talk about that. The tool won't export on a partition-by-partition basis but if you know the all the PartitionKey values in your table, you can specify a query to export data for each partition.

If automation is one of the key requirement, you could simply write a console application which runs once per hour and extracts the data for the past hour. You could use .Net Storage Client library to fetch the data. To do so, first define a class which derives from TableEntity class. Something like below:

public class CustomEntity : TableEntity
{
    public string Attribute1
    {
        get;
        set;
    }

    public string Attribute2
    {
        get;
        set;
    }

    public string AttributeN
    {
        get;
        set;
    }

    public static string GetHeaders(string delimiter)
    {
        return "\"Attribute1\"" + delimiter + "\"Attribute2\"" + delimiter + "\"AttributeN\"";
    }

    public string ToDelimited(string delimiter)
    {
        return "\"" + Attribute1 + "\"" + delimiter + "\"" + Attribute2 + "\"" + delimiter + "\"" + AttributeN + "\"";
    }
}

Then your application could query the table storage on an hourly basis and save the data to a file:

    DateTime currentDateTime = DateTime.UtcNow;
    //Assuming the PartitionKey follows the following strategy for naming: YYYYMMDDHH0000
    var fromPartitionKey = currentDateTime.AddHours(-1).ToString("YYYYmmDDHH0000");
    var toPartitionKey = currentDateTime.ToString("YYYYmmDDHH0000");
    var filterExpression = string.Format("PartitionKey ge '{0}' and PartitionKey lt '{1}'", fromPartitionKey, toPartitionKey);
    var tableName = "<your table name>";
    var cloudStorageAccount = new CloudStorageAccount(new StorageCredentials("<account name>", "<account key>"), true);
    var cloudTableClient = cloudStorageAccount.CreateCloudTableClient();
    var table = cloudTableClient.GetTableReference(tableName);
    TableQuery<CustomEntity> query = new TableQuery<CustomEntity>()
    {
        FilterString = filterExpression,
    };

    var entities = table.ExecuteQuery<CustomEntity>(query).ToList();
    if (entities.Count > 0)
    {
        StringBuilder sb = new StringBuilder();
        sb.Append(CustomEntity.GetHeaders(",") + "\n");
        foreach (var entity in entities)
        {
            sb.Append(entity.ToDelimited(",") + "\n");
        }
        var fileContents = sb.ToString();
        //Now write this string to a file.
    }

As far as importing this data into a relational database, I'm pretty sure if you look around you'll find many utilities which will be able to do that.

Dunnock answered 9/6, 2013 at 14:44 Comment(0)
A
2

To update this thread with new Azure features, the free Microsoft maintained azcopy can now export Azure Table Storage to flat files in blob storage: https://azure.microsoft.com/en-us/documentation/articles/storage-use-azcopy/#copy-entities-in-an-azure-table-with-azcopy

Aestivate answered 11/12, 2015 at 12:16 Comment(0)
M
1

You may want to try Enzo Cloud Backup (http://bluesyntax.net/backup20.aspx). It provides the ability to directly export Azure Tables into a SQL database (SQL Server or SQL Azure). You can also call stored procedures to transform the data on the fly. And you can also put this on a schedule using the cloud agent, or the .NET API that comes with it. It's a commercial product, that comes with a free Community Edition, however this specific feature requires the advanced edition.

Murrelet answered 9/6, 2013 at 22:34 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.