How to partition Azure tables used for storing logs
Asked Answered

We have recently updated our logging to use Azure table storage, which, owing to its low cost and high performance when querying by row and partition key, is well suited to this purpose.

We are trying to follow the guidelines given in the document Designing a Scalable Partitioning Strategy for Azure Table Storage. As we are making a great number of inserts to this table (and hopefully an increasing number as we scale), we need to ensure that we don't hit our limits, which would result in lost logs. We structured our design as follows:

  • We have an Azure storage account per environment (DEV, TEST, PROD).

  • We have a table per product.

  • We are using TicksReversed+GUID for the Row Key, so that we can query blocks of results between certain times with high performance (a sketch of how we build this key follows this list).

  • We originally chose to partition the table by Logger, which for us meant broad areas of the product such as API, Application, Performance and Caching. However, due to the low number of partitions, we were concerned that this resulted in so-called "hot" partitions, where many inserts are performed on one partition in a given time period. So we changed to partitioning on Context (for us, the class name or API resource).

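For reference, here is roughly how we build the TicksReversed+GUID row key mentioned above (a minimal C# sketch; the "d19" zero-padding, the underscore separator and the class/method names are our illustrative choices, not anything prescribed by the service):

    using System;

    static class LogKeys
    {
        // RowKey = reversed ticks (newest first when sorted lexically) + a GUID for uniqueness.
        public static string MakeRowKey(DateTime utcNow)
        {
            // Zero-pad to 19 digits so string comparison matches numeric comparison.
            string reversedTicks = (DateTime.MaxValue.Ticks - utcNow.Ticks).ToString("d19");
            return reversedTicks + "_" + Guid.NewGuid().ToString("N");
        }
    }
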
However, in practice we have found this is less than ideal, because when we look at our logs at a glance we would like them to appear in order of time. We instead end up with blocks of results grouped by context, and we would have to get all partitions if we want to order them by time.

Some ideas we had were:

  • use blocks of time (say 1 hour) for partition keys to order them by time (results in hot partitions for 1 hour)

  • use a few random GUIDs for partition keys to try to distribute the logs (we lose the ability to query quickly on features such as Context).

As this is such a common application of Azure table storage, there must be some sort of standard procedure. What is the best practice for partitioning Azure tables that are used for storing logs?

Solution constraints

  • Use cheap Azure storage (Table Storage seems the obvious choice)

  • Fast, scalable writes

  • Low chance of lost logs (i.e. from exceeding the partition write rate of 2,000 entities per second in Azure Table Storage).

  • Reading ordered by date, most recent first.

  • If possible, to partition on something that would be useful to query (such as product area).

Parnell answered 24/4, 2015 at 8:34 Comment(1)
Ivan, could you share what decision you made and how you solved the issue?Malignity

I have come across a situation similar to the one you encountered; based on my experience I can say:

Whenever a query is fired against an Azure storage table, it does a full table scan if a proper partition key is not provided. In other words, the table is indexed on PartitionKey, and partitioning the data properly is the key to getting fast results.

That said, you now have to think about what kinds of queries you will fire against the table, such as logs that occurred during a time period, logs for a particular product, and so on.

One way is to use reverse ticks truncated to hour precision, instead of the exact ticks, as the Partition Key. That way an hour's worth of data can be queried via a single partition key. Depending on the number of rows that fall into each partition, you could change the precision to a day. It is also wise to store related data together, meaning the data for each product goes into a different table. That way you reduce the number of partitions and the number of rows in each partition.
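
A minimal sketch of that idea in C# (truncating to the hour, the "d19" padding and the names here are illustrative choices, not something prescribed by the service):

    using System;

    static class PartitionKeys
    {
        // All logs written in the same hour share one partition key, so an hour's
        // worth of data can be fetched with a single partition-scoped query.
        public static string HourPartitionKey(DateTime utc)
        {
            var hour = new DateTime(utc.Year, utc.Month, utc.Day, utc.Hour, 0, 0, DateTimeKind.Utc);
            return (DateTime.MaxValue.Ticks - hour.Ticks).ToString("d19");
        }

        // Widen to day precision if the hourly partitions hold too few rows.
        public static string DayPartitionKey(DateTime utc)
        {
            return (DateTime.MaxValue.Ticks - utc.Date.Ticks).ToString("d19");
        }
    }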

Basically, ensure that you know the partition keys in advance (exact or range) and fire queries against such specific partition keys to get results faster.

To speed up writing to the table, you can use batch operations. Be cautious, though: if one entity in the batch fails, the whole batch operation fails. Proper retry and error checking can save you here.
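
For illustration, with the storage client library that was current at the time (Microsoft.WindowsAzure.Storage), a batched insert might look roughly like this (class and method names are made up); entities in one batch must share a PartitionKey and a batch holds at most 100 operations:

    using System.Collections.Generic;
    using System.Linq;
    using Microsoft.WindowsAzure.Storage.Table;

    static class LogWriter
    {
        public static void InsertLogs(CloudTable table, IEnumerable<DynamicTableEntity> logs)
        {
            // Group by PartitionKey (a batch cannot span partitions), then chunk to 100.
            foreach (var partition in logs.GroupBy(e => e.PartitionKey))
            {
                foreach (var chunk in partition.Select((e, i) => new { e, i })
                                               .GroupBy(x => x.i / 100, x => x.e))
                {
                    var batch = new TableBatchOperation();
                    foreach (var entity in chunk)
                        batch.Insert(entity);

                    // If any entity is rejected the whole batch fails, so wrap this in
                    // your own retry/error handling on top of the client's retry policy.
                    table.ExecuteBatch(batch);
                }
            }
        }
    }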

At the same time, you could use blob storage to store a lot of related data. The idea is to store a chunk of related serialized data as one blob. You can hit one such blob to get all the data in it and do further projections on the client side. For example, an hour's worth of data for a product would go into one blob; you can devise a specific blob prefix naming pattern and hit the exact blob when needed. This helps you get your data quickly, rather than doing a table scan for each query.

I have been using the blob approach for a couple of years with no trouble. I convert my collection to IList<IDictionary<string,string>> and use binary serialization plus GZip to store each blob. I use Reflection.Emit-based helper methods to access entity properties quickly, so serialization and deserialization don't take a toll on the CPU and memory.
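
A rough sketch of that blob pattern (here using JSON via Json.NET instead of my Reflection.Emit-based binary serializer, and a made-up "product/yyyyMMddHH" naming scheme, just to show the shape of it):

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.IO.Compression;
    using System.Text;
    using Microsoft.WindowsAzure.Storage.Blob;
    using Newtonsoft.Json;

    static class LogBlobArchive
    {
        // Serialize an hour's worth of rows for one product, gzip the payload and
        // upload it under a predictable name, e.g. "api/2015042413.json.gz", so the
        // exact blob can be fetched later without scanning anything.
        public static void UploadHourChunk(CloudBlobContainer container, string product,
                                           DateTime hourUtc, IList<IDictionary<string, string>> rows)
        {
            string blobName = string.Format("{0}/{1:yyyyMMddHH}.json.gz", product, hourUtc);
            byte[] json = Encoding.UTF8.GetBytes(JsonConvert.SerializeObject(rows));

            using (var buffer = new MemoryStream())
            {
                using (var gzip = new GZipStream(buffer, CompressionMode.Compress, leaveOpen: true))
                {
                    gzip.Write(json, 0, json.Length);
                }
                buffer.Position = 0;
                container.GetBlockBlobReference(blobName).UploadFromStream(buffer);
            }
        }
    }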

Storing data in blobs helps me store more for less and get my data faster.

Sennar answered 24/4, 2015 at 18:46 Comment(7)
Thanks for your answer Amit. The hour-precision approach is what I suggested as my first idea in the question, and the problem associated with this is that these hour-long partitions will be hot partitions for that period, and this risks lost logs if the number of writes hits the Azure maximum. You are correct that we could just adjust the size of these partitions as we scale, but it seems a little messy and a particularly hot hour could still bring down the system.Parnell
Hi Ivan, in that case you should consider the second part of my answer, using Blob storage.Sennar
Blob storage would be great as it is even cheaper than Table storage, but how would this work if we have to add tens of logs per second to each of the various blobs? Maybe I misunderstand how blob storage works, but don't we need to serialize large chunks of data at once?Parnell
You are right; to store a chunk of data in a blob, you need to collect that chunk somewhere else, such as Azure Cache, and periodically dump it as a blob (you cannot keep it on the same VM, because if that VM dies you lose the uncommitted logs). This involves using multiple Azure services and may not be ideal in all situations (considering the cost). It is similar to the way Azure diagnostics logs work; as you might have observed, they use a partition key with a precision of 1 minute.Sennar
Archiving the logs as blobs is a possible solution, thanks for your suggestion. It would appear from these answers that table storage is actually not the silver bullet for logging it originally appeared to be.Parnell
Well, Table Storage throttles requests above 2,000 operations per second per partition, so ideally you will have to split the load over different partitions. This seems to be the only working solution as of now; they may make changes to the service in coming releases, though.Sennar
The Azure documentation says, "Typically, you should use the Blob service instead of the Table service to store log data". It also suggests the same blob approach I mentioned above. More details here (look for the log data anti-pattern): azure.microsoft.com/en-us/documentation/articles/…Sennar

There is a very general trick for avoiding hot spots when writing, at the cost of increasing read costs a bit.

Define N partitions (10 or so). When writing a row, stuff it into a random partition. Each partition can still be sorted by time internally.

When reading you need to read from all N partitions (possibly filtered and ordered by time) and merge the query results.

This increases write scalability by a factor of N and increases query cost by the same number of round-trips and queries.
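
A rough illustration of this pattern in C# with the table SDK of that era (N, the "Logs-i" partition naming and the client-side merge are all just example choices):

    using System;
    using System.Linq;
    using Microsoft.WindowsAzure.Storage.Table;

    static class SpreadPartitions
    {
        const int N = 10;                          // number of write partitions (example)
        static readonly Random Rng = new Random();

        // Write: pick one of N partitions at random so no single partition gets hot.
        public static void WriteLog(CloudTable table, DynamicTableEntity log)
        {
            log.PartitionKey = "Logs-" + Rng.Next(N);
            table.Execute(TableOperation.Insert(log));
        }

        // Read: query the same time range in all N partitions, then merge by RowKey.
        public static DynamicTableEntity[] ReadRange(CloudTable table, string fromRowKey, string toRowKey)
        {
            string timeRange = TableQuery.CombineFilters(
                TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.GreaterThanOrEqual, fromRowKey),
                TableOperators.And,
                TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.LessThanOrEqual, toRowKey));

            return Enumerable.Range(0, N)
                .SelectMany(i => table.ExecuteQuery(new TableQuery<DynamicTableEntity>().Where(
                    TableQuery.CombineFilters(
                        TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, "Logs-" + i),
                        TableOperators.And,
                        timeRange))))
                .OrderBy(e => e.RowKey)            // reversed ticks: ascending RowKey = newest first
                .ToArray();
        }
    }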

Also, you could consider storing logs somewhere else. The very tight artificial limits on Azure products cause labor costs that you otherwise would not have.

Choose N to be higher than the minimum needed to sustain the 20,000 operations per second per account limit (with 2,000 operations per second per partition, that minimum is 10), so that randomly occurring hotspots are unlikely. Picking N to be twice as high as minimally needed should be enough.

Eyeful answered 24/4, 2015 at 10:26 Comment(9)
Thanks for your answer usr. Yes, this is equivalent to the second suggestion I made in the question which was to use a few random GUIDs for partition keys. The failing of this is that we give up the powerful Azure functionality of querying by partition, and also like you point out when we reassemble the data to query it we end up having to query all the partitions. When you say consider storing logs somewhere else, where would you suggest? Logs and table storage seem to make an obvious as well as recommended pairing, so I'm not sure what would be better.Parnell
Any old SQL database has far more insert throughput than 2000 rows per second. If you use one of the well-known SQL Server fast insert techniques you can expect 100k rows per second per CPU core. So a tiny VM will suffice to bear all load. The querying and indexing capabilities are also unrestricted.Eyeful
That is true, but SQL is far too expensive for this purpose. This is the original reason we are migrating our logging from Azure SQL.Parnell
I like @Eyeful's answer the most, but instead of doing a round-robin based on random partitions, why not use the hundredths of the milliseconds as your natural selector? Your maximum upper limit for the storage account is 20,000 transactions per second, so dividing the traffic by 10 caps how saturated any one hot spot can get.Larry
@Larry what would be the advantage of using the hundredth of the milliseconds as opposed to random selection?Eyeful
@Eyeful The randomness could still create a hotspot because the different nodes are not in step (e.g. multiple web servers sending logs to the same storage account). If every node uses 0.Y00 (the Y part), then the maximum saturation point within that digit is 2,000 operations. That would mean the highest saturation point is 20k ops with the above logic, which is fine since storage accounts are capped at 20k ops per second. Maybe use the tenth part of the millisecond to span storage accounts. :)Larry
@Larry OK, the idea is nice. But Windows time has a resolution of 15ms most of the time. I added one paragraph about hotspots.Eyeful
@Larry except usr's answer is the same as the 2nd suggestion I gave in the OP. And it doesn't increase read costs "a bit", it multiplies them by the number of partitions created.Parnell
@Ivan true. It's the best you can do, I think. Here's another idea: if you want to query logs not only by time but also by product/context, then store them multiple times: sorted by time, by product, by context. Each time, use the random distribution technique. This is manual secondary indexing. What a luxury to be on Azure...Eyeful

If I'm reading the question correctly, here are the solution constraints:

  • Use Table storage
  • High scale write
  • Separated by product area
  • Automatically ordered by time

There are several good solutions already presented, but I don't think there's an answer that satisfies all the constraints perfectly.

The solution that seems closest to satisfying your constraints was provided by usr. Divide your product area partitions into N, but don't use GUIDs, just use a number (ProductArea-5). Using GUIDs makes the querying problem much more difficult. If you use a number, you can query all of the partitions for a product area in a single query or even in parallel. Then continue to use TicksReversed+GUID for RowKey.

Single Query: PartitionKey ge 'ProductArea' and PartitionKey le 'ProductArea-~' and RowKey ge 'StartDateTimeReverseTicks' and RowKey le 'EndDateTimeReverseTicks'

Parallel Queries: PartitionKey eq 'ProductArea-1' and RowKey ge 'StartDateTimeReverseTicks' and RowKey le 'EndDateTimeReverseTicks' ... PartitionKey eq 'ProductArea-N' and RowKey ge 'StartDateTimeReverseTicks' and RowKey le 'EndDateTimeReverseTicks'

This solution doesn't satisfy 'automatically ordered by time', but you can do a client-side sort by RowKey to see them in order. If having to sort client-side is okay for you, then this solution should work to satisfy the rest of the constraints.
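
For reference, the single-query filter above can be assembled with the client library's helpers roughly like this (the product-area prefix and the reverse-tick bounds are placeholders, and the class/method names are made up):

    using Microsoft.WindowsAzure.Storage.Table;

    static class ProductAreaQuery
    {
        // Covers every PartitionKey from "ProductArea" up to "ProductArea-~"
        // ('~' sorts after the digits), plus the RowKey time-range filter.
        public static string BuildFilter(string productArea, string startReverseTicks, string endReverseTicks)
        {
            string partitionRange = TableQuery.CombineFilters(
                TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.GreaterThanOrEqual, productArea),
                TableOperators.And,
                TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.LessThanOrEqual, productArea + "-~"));

            string rowKeyRange = TableQuery.CombineFilters(
                TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.GreaterThanOrEqual, startReverseTicks),
                TableOperators.And,
                TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.LessThanOrEqual, endReverseTicks));

            return TableQuery.CombineFilters(partitionRange, TableOperators.And, rowKeyRange);
        }
    }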

Ahlgren answered 28/4, 2015 at 16:18 Comment(5)
Thanks Brent. No, the solutions already presented are all re-phrasings of my two suggested ideas in the OP, with the exception of Guarav's suggestion to store two copies. However, you are correct that it is not necessary to use a GUID for a partition key; a number is sufficient. There is no point in using numbered product areas over labelled areas as you suggest, since all partitions will still need to be hit when reading. This just decreases readability.Parnell
Your interpretation of the solution constraints is close. 1) We use Azure, and are migrating from Azure SQL to Table Storage for cost reasons - but we are open to suggestions. 2) Correct. 3) Not necessarily, but Table Storage requires partitioning for performance and product area is a possible choice (this is the gist of the question). 4) Not "automatically" per se - when reading we want them ordered by time, BUT we can't do this client-side as Azure imposes a read limit of 1000 entities per query. I will update the question with solution constraints.Parnell
The storage library lets you query segmented (1000 entities at a time, as you mentioned) or unsegmented (all the entities which match the filter/query). You can write a simple helper method which pulls all entities using the continuation token (1000 at a time until all entities are fetched). Once you have all the entities on the client side, you can do an OrderBy, which is an O(n log n) operation. This handles #4.Sennar
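
(For reference, such a helper might look roughly like this with the same SDK, using ExecuteQuerySegmented and the continuation token; the class and method names are made up:)

    using System.Collections.Generic;
    using Microsoft.WindowsAzure.Storage.Table;

    static class TableQueryHelper
    {
        // Follows continuation tokens past the 1000-entities-per-segment limit
        // and returns everything that matches the query.
        public static List<DynamicTableEntity> QueryAll(CloudTable table, TableQuery<DynamicTableEntity> query)
        {
            var results = new List<DynamicTableEntity>();
            TableContinuationToken token = null;
            do
            {
                var segment = table.ExecuteQuerySegmented(query, token);
                results.AddRange(segment.Results);
                token = segment.ContinuationToken;
            } while (token != null);
            return results;
        }
    }
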
@Sennar Yes, you can work around the segmentation with the continuation token, but this is not a scalable solution! As the number of entities increases performance will deteriorate quickly. This could be an augmentation to storing in blobs as your answer suggested.Parnell
You are right; querying, say, 100K rows with the continuation token is going to be very slow. You will have to chunk your data properly with PK and RK and pull the chunks in parallel, using, say, a semaphore.Sennar

Not really a concrete answer to your question, but here are some of my thoughts:

What you really need to think about is how you are going to query your data, and then design your data storage/partitioning strategy based on that (keeping in mind the partitioning strategy guide). For example:

  • If you need to view logs for all loggers within a given date/time range, then your current approach might not be appropriate because you would need to query across multiple partitions in parallel.
  • Your current approach would work if you want to query for a specific logger within a given date/time range.
  • Another thing that was suggested to me is to make appropriate use of blob storage & table storage. If there's some data which does not require querying that often, you can simply push that data into blob storage (think about old logs - you don't really need to keep them in tables if you're not going to query them too often). Whenever you need such data, you can simply extract it from blob storage, push it into table storage and run your ad-hoc queries against that data.

Possible Solution

One possible solution would be to store multiple copies of the same data and use each copy accordingly. Since storage is cheap, you can save two copies of the same data. In the 1st copy you could have PK = Date/Time and RK = whatever you decide, and in the 2nd copy you could have PK = Logger and RK = TicksReversed+GUID. Then, when you want to fetch all logs irrespective of the logger, you simply query the 1st copy (PK = Date/Time), and if you want to query logs for a specific logger type, you simply query the 2nd copy (PK = Logger, RK >= Date/Time Start & RK <= Date/Time End).
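
A sketch of that dual-write idea (the table names, the coarse hourly date/time bucket used for the 1st copy's PK, and the entity shape are illustrative choices, not part of the answer):

    using System;
    using Microsoft.WindowsAzure.Storage.Table;

    static class DualWriteLogger
    {
        // Copy 1: PK = date/time bucket, RK = TicksReversed+GUID  -> "all loggers, ordered by time".
        // Copy 2: PK = logger,           RK = TicksReversed+GUID  -> "one logger over a time range".
        public static void Write(CloudTable byTimeTable, CloudTable byLoggerTable,
                                 string logger, string message, DateTime utcNow)
        {
            string rowKey = (DateTime.MaxValue.Ticks - utcNow.Ticks).ToString("d19")
                            + "_" + Guid.NewGuid().ToString("N");

            var byTime = new DynamicTableEntity(utcNow.ToString("yyyyMMddHH"), rowKey);
            byTime.Properties["Logger"] = new EntityProperty(logger);
            byTime.Properties["Message"] = new EntityProperty(message);

            var byLogger = new DynamicTableEntity(logger, rowKey);
            byLogger.Properties["Message"] = new EntityProperty(message);

            byTimeTable.Execute(TableOperation.Insert(byTime));
            byLoggerTable.Execute(TableOperation.Insert(byLogger));
        }
    }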

You may also find this link helpful: http://azure.microsoft.com/en-us/documentation/articles/storage-table-design-guide/

Deciduous answered 24/4, 2015 at 11:36 Comment(1)
Thanks for your answer Guarav. Your first two bullet points correctly summarise my problem. We don't yet have the need to begin archiving logs, but it is something we will consider. Storing multiple copies is a solution I hadn't considered; however, unless we made the second copy with a service at times of lull, the time-based-PK first copy would still result in hot partitions and potentially lost logs.Parnell
