How to store audit data in Azure
Asked Answered
S

3

7

We're in the design phase for building an audit trail in an existing web application. The application runs on Windows Azure and uses a SQL Azure database.

The audit logs must be filtered by user, or by object type (eg. show all action of a user, or show all actions that are performed on a object).

We have to choose how to store the data, should we use SQL Azure, or should we use table storage? We prefer table storage (cheaper)..

however the 'problem' with table storage is how to define the partition key. We have several thousand customers (the appplication users) in our SQL database, each in their own tenant. Using the tenant ID as partition key is not specific enough, so we have to add something to the partition key. So there's the issue: given the requirements for filtering, we can add a user ID to the partition key to make filtering by user easy, or we can add an object ID to make filtering by object easy.

So we see two possible solutions:
- use SQL Azure instead of table storage
- use table storage and use two tables with different partition keys, which means we duplicate all entries

Any ideas what's the best approach for our situation? Are there other, better solutions?

Superimpose answered 1/8, 2016 at 8:23 Comment(3)
are you trying to audit only SQL database activities like DDL,DML..or entire user interactionsAtli
We want to store certain events (CRUD operations), these events can occur on SQL data, but also on blob data (we have a lot of serialized data in blobs)Superimpose
If all of your audit "records" would fit in the size criteria in Table Storage (1 MB), then I would assume it's almost preference and how you want to be able to access them. I would personally go for SQL Azure if the team is familiar with SQL/SQL Server and keep in mind handling Concurrency with Table StorageSlippy
S
2

DocumentDB on Azure might be worth considering. https://azure.microsoft.com/en-us/documentation/articles/documentdb-use-cases/ You can have audit trail stored in DocDB as JSON documents (user, activity, object fields and can index on all fields )

Schatz answered 2/8, 2016 at 0:4 Comment(2)
Thanks for the suggestions, we will further investigate this approachSuperimpose
Marked as answer because we decided to use DocumentDBSuperimpose
A
0

Azure Table Storage is appropriate to store log data. As Azure App services use Azure Table Storage to store the diagnosis logs.

In think you can consider to set the PartitionKey as your user's tenant name, and the RowKey is the user's ID. As according the Table Storage Data Model, we only need to keep:

Together the PartitionKey and RowKey uniquely identify every entity within a table

Alternatively, you can clarify your concern about:

Using the tenant ID as partition key is not specific enough, so we have to add something to the partition key

Additionally, you can refer https://azure.microsoft.com/en-us/documentation/articles/storage-table-design-guide/#overview for more info about design Azure Table Storage.

Any update, please feel free to let me know.

Advocacy answered 2/8, 2016 at 8:6 Comment(2)
Maybe I don't understand it correctly, but if we would use Tenant ID for partition key and User ID for row key, we can only store one record per user? We would need to store many records per user (basically all actions that a user performs) so that wouldn't work, right?Superimpose
Yes, you are right. You can try to use Tenant ID (name) for partition key, and user ID for a custom key, and you can generate a uuid as row key for the log entity id.Advocacy
T
0

If you're worried about filtering in multiple ways - you could always write the same data to multiple partitions. It works really well. For example, in our app we have Staff and Customers. When there is an interaction we want to track/trace that applied to both of them (perhaps an over the phone Purchase), we will write the same information (typically as json) to our audit tables.

{
     PurchaseId: 9485,
     CustomerId: 138,
     StaffId: 509,
     ProductId: 707958,
     Quantity: 20
     Price: 31.99,
     Date: '2017-08-15 15:48:39'
 }

And we will write that same row to the following partitions: Product_707958, Customer_138, Staff_509. The row key is the same across the three rows in each partition: Purchase_9485. Now if I want to go and query everything that has happened for a given staff, customer, or item, I just grab the whole partition. The storage is dirt cheap, so who cares if you write it to multiple places?

Also, an idea for you considering you have multiple tenants - you could make the table name Tenant_[SomeId]. There are some other issues you might have to deal with, but it is in a sense another key to get at schema-less data.

Timothee answered 15/8, 2017 at 13:57 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.