AppFabric Caching - Proper use of DataCacheFactory and DataCache

Asked 24/9, 2010 at 5:0 Answered 2/12, 2010 at 5:5

I am looking for the most performant way to arrange usage of the DataCache and DataCacheFactory for AppFabric caching calls, with between 400 and 700 cache gets per page load (and barely any puts). It seems that using a single static DataCacheFactory (or possibly a couple in a round-robin setup) is the way to go.

Do I call GetCache("cacheName") for every DataCache object request, or do I make one static at the time DataCache factory is initialized and use that for all calls?

Do I have to handle exceptions, check for fail codes and attempt retries?

Do I have to consider contention when more than one thread tries to use the cache store and wants the same item (by key)?

Is there some kind of documentation which properly explores the design and usage of this?

Some information I have gathered so far from the forum:

http://social.msdn.microsoft.com/Forums/en-AU/velocity/thread/98d4f00d-3a1b-4d7c-88ba-384d3d5da915

"Creating the factory involves connecting to the cluster and can take some time. But once you have the factory object and the cache that you want to work with, you can simply reuse those objects to do puts and gets into the cache, and you should see much faster performance."

http://social.msdn.microsoft.com/Forums/en-US/velocity/thread/0c1d7ce2-4c1b-4c63-b525-5d8f98bb8a49

"Creating single DataCacheFactory (singleton) is more performing than creating multiple DataCacheFactory. you should not create DataCacheFactory for each call, it will have performance hit."

"Please try to encapsulate round-robin algorithm (having 3/4/5 factory instances) in your singleton and compare load-test results."

http://blogs.msdn.com/b/velocity/archive/2009/04/15/pushing-client-performance.aspx

"You can increase the number of clients to increase the cache throughput. But sometimes if you want to have smaller set of clients and increase throughput, a trick is to use multiple DataCacheFactory instances. The DataCacheFactory instance creates a connection to the servers (e.g. if there are 3 servers, it will create 3 connections) and multiplexes all requests from the datacaches on to these connections. So if the put/get volume is very high, these TCP connections might be bottlenecked. So one way is to create multiple DataCacheFactory instances and then use the operations on them."
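For anyone who wants to load-test the round-robin suggestion quoted above, here is a minimal sketch of how a fixed set of instances could be handed out in rotation. RoundRobinPool is a hypothetical helper, not part of AppFabric, and it is written generically so it compiles without the caching assemblies; in this scenario T would be DataCacheFactory (or DataCache).

```csharp
using System;
using System.Threading;

// Hypothetical helper (not an AppFabric type): builds a fixed number of
// instances up front and hands them out in rotation, safely across threads.
public sealed class RoundRobinPool<T>
{
    private readonly T[] _items;
    private int _counter = -1;

    public RoundRobinPool(int size, Func<T> create)
    {
        if (size < 1) throw new ArgumentOutOfRangeException("size");
        if (create == null) throw new ArgumentNullException("create");

        _items = new T[size];
        for (int i = 0; i < size; i++)
        {
            _items[i] = create(); // e.g. () => new DataCacheFactory()
        }
    }

    public T Next()
    {
        // Interlocked.Increment wraps on overflow instead of throwing;
        // the cast to uint keeps the resulting index non-negative.
        uint ticket = unchecked((uint)Interlocked.Increment(ref _counter));
        return _items[ticket % (uint)_items.Length];
    }
}
```

All instances are created eagerly in the constructor, so the connection cost is paid once at startup rather than on the first request that happens to reach each slot. Whether three, four or five instances actually help is something only a load test against your own cluster can show.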

Here is what is in use so far: the property is called and, if the return value is not null, an operation is performed.

private static DataCache Cache
{
    get
    {
        if (_cacheFactory == null)
        {
            lock (Sync)
            {
                if (_cacheFactory == null)
                {
                    try
                    {
                        _cacheFactory = new DataCacheFactory();
                    }
                    catch (DataCacheException ex)
                    {
                        if (_logger != null)
                        {
                            _logger.LogError(ex.Message, ex);
                        }
                    }
                }
            }
        }

        DataCache cache = null;

        if (_cacheFactory != null)
        {
            cache = _cacheFactory.GetCache(_cacheName);
        }

        return cache;
    }
}

See this question on the Microsoft AppFabric forum: http://social.msdn.microsoft.com/Forums/en-AU/velocity/thread/e0a0c6fb-df4e-499f-a023-ba16afb6614f

Xanthate answered 24/9, 2010 at 5:0 Comment(1)

There is an answer to this now in the forum. Check the link above. – Xanthate 11/11, 2010 at 2:4

Here is the answer from the forum post:

Hi. Sorry for the delayed response, but I want to say that these are great questions and will probably be useful to others.

There shouldn't be a need for more than one DataCacheFactory per thread unless you are requiring different configurations. For example, if you programmatically configure the DataCacheFactory with the DataCacheFactoryConfiguration class, then you might want to create one that has local cache enabled and another that does not. In this case, you would use different DataCacheFactory objects depending on the configuration you require for your scenario. But other than differences in configuration, you should not see a performance gain by creating multiple DataCacheFactories.

On the same subject, there is a MaxConnectionsToServer setting (either programmatic in DataCacheFactoryConfiguration or in the application configuration file as an attribute of the dataCacheClient element). This determines the number of channels per DataCacheFactory that are opened to the cache cluster. If you have high throughput requirements and also available CPU/Network bandwidth, increasing this setting to 3 or higher can increase throughput. We don't recommend increasing this without cause or to a value that is too high for your needs. You should change the value and then test your scenario to observe the results. We hope to have more official guidance on this in the future.

Once you have a DataCacheFactory, you do not need to call GetCache() multiple times to get multiple DataCache objects. Every call to GetCache() for the same cache on the same factory returns the same DataCache object. Also, once you have the DataCache object, you do not need to continue to call DataCacheFactory for it. Just store the DataCache object and continue to use it. However, do not let the DataCacheFactory object get disposed. The life of the DataCache object is tied to the DataCacheFactory object.
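Put together, the two points above (one factory with MaxConnectionsToServer raised, and one stored DataCache) might look like the sketch below. The class name CacheClient, the host name "cachehost1", the port 22233 and the cache name "cacheName" are placeholders chosen for illustration, and the value 3 is only the starting point mentioned above, not a recommendation.

```csharp
using System.Collections.Generic;
using Microsoft.ApplicationServer.Caching;

public static class CacheClient
{
    // Placeholder cache name; substitute your own.
    private const string CacheName = "cacheName";

    // Both objects live in static fields for the life of the process.
    // The factory is never disposed, because the DataCache depends on it.
    private static readonly DataCacheFactory Factory = CreateFactory();

    // Obtained once and reused for every Get/Put.
    public static readonly DataCache Cache = Factory.GetCache(CacheName);

    private static DataCacheFactory CreateFactory()
    {
        var config = new DataCacheFactoryConfiguration();

        // Placeholder endpoint; list your cache hosts here.
        config.Servers = new List<DataCacheServerEndpoint>
        {
            new DataCacheServerEndpoint("cachehost1", 22233)
        };

        // Channels opened to the cluster per factory; change and then load-test.
        config.MaxConnectionsToServer = 3;

        return new DataCacheFactory(config);
    }
}
```

One trade-off to be aware of: if the cluster is unreachable when the type is first touched, the static initializer fails and stays failed for the lifetime of the AppDomain. The lazy, logged initialization shown in the question is more forgiving in that respect; the part worth carrying over is storing the DataCache instead of calling GetCache() on every request.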

You should never have to worry about contention with Get requests. However, with Put/Add requests, there can be contention if multiple data cache clients are updating the same key at the same time. In this case, you will get an exception with an error code of ERRCA0017, RetryLater and a substatus of ES0005, KeyLatched. However, you can easily add exception handling and retry logic to attempt the update again when errors such as these occur. This can be done for RetryLater codes with various substatus values. For more information, see http://msdn.microsoft.com/en-us/library/ff637738.aspx. You can also use pessimistic locking by using the GetAndLock() and PutAndUnlock() APIs. If you use this method it is your responsibility to make sure that all cache clients use pessimistic locking. A Put() call will wipe out an object that was previously locked by GetAndLock().
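The retry logic described above can be sketched as a small generic helper. Retry.Execute is a hypothetical name, not an AppFabric API, and the attempt count and delay are arbitrary; the helper takes the "is this error worth retrying" decision as a delegate so that it compiles without the caching assemblies.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not an AppFabric type): runs an operation and
// retries it a bounded number of times when the error looks transient.
public static class Retry
{
    public static T Execute<T>(Func<T> operation, Func<Exception, bool> isTransient,
                               int maxAttempts, TimeSpan delay)
    {
        if (operation == null) throw new ArgumentNullException("operation");
        if (isTransient == null) throw new ArgumentNullException("isTransient");
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception ex)
            {
                // Rethrow on the final attempt or when the error is not transient.
                if (attempt >= maxAttempts || !isTransient(ex))
                {
                    throw;
                }
            }

            Thread.Sleep(delay); // brief pause before the next attempt
        }
    }
}

// Possible use with AppFabric (assumes a DataCache named cache):
//
// Retry.Execute(
//     () => cache.Put(key, value),
//     ex => ex is DataCacheException
//           && ((DataCacheException)ex).ErrorCode == DataCacheErrorCode.RetryLater,
//     3, TimeSpan.FromMilliseconds(50));
```

Keeping the attempt count small matters here: with hundreds of gets per page load, an unbounded retry loop against a struggling cluster would make the page slower than simply falling back to the underlying data source.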

I hope this helps. Like I said, we hope to get this type of guidance into some formal content soon. But it is better to share it here on the forum until then. Thanks!

Jason Roth

Xanthate answered 2/12, 2010 at 5:5 Comment(1)

MaxConnectionsToServer is a powerful configuration setting. It provides parallelism without having to code it explicitly. Be sure, however, to catch DataCacheException and retry when the ErrorCode is RetryLater or Timeout, because conflicts may occur much more frequently. – Thralldom 18/6, 2011 at 4:30

Do I call GetCache("cacheName") for every DataCache object request, or do I make one static at the time DataCache factory is initialized and use that for all calls?

I suppose really the answer should be: try it both ways and see if there's a difference. But one static DataCache seems to me to make more sense than a corresponding call to GetCache for every call to Get.

That 'Pushing Client Performance' article suggests that there's a sweet spot in the number of DataCacheFactory instances that gets you maximum performance, beyond which the memory overhead starts working against you. It's a shame they didn't give any guidelines (or even a rule of thumb) on where this spot might be.

I haven't come across any documentation on maximising performance - I think AppFabric is still too new for these guidelines to have been shaken out yet. I did have a look in the Contents for the Pro AppFabric book, but it seems much more concerned generally with the workflow (Dublin) side of AppFabric rather than the caching (Velocity) piece.

One thing I would say though: is there any possibility for you to cache 'chunkier' objects so you can make fewer calls to Get? Could you cache collections rather than individual objects and then unpack the collections on the client? 700 cache gets per page load seems to me to be a huge number!

Coomb answered 24/9, 2010 at 15:8 Comment(3)

Yes, it is a huge number, but I'm adding a caching provider for a nasty system and can't really change its core caching requirements. Thanks for your comments; still, it highlights the fact that there is not much good information out there for my specific problem. – Xanthate 26/9, 2010 at 23:1

With respect to pulling large numbers of items from the cache, have you had a look at the tagging and region functionality yet? If not, have a read through this blog post on the subject: blogs.msdn.com/b/skaufman/archive/2010/04/22/… – Baumbaugh 28/9, 2010 at 8:2

@Baumbaugh Tags and regions might be a way to simplify some of the code by reducing the number of calls to cache.Get, but I'm not sure they'd offer a significant performance benefit as I think the code would still be pulling the same number of items from the cache per page i.e. I think you'd still be hitting the same bottlenecks. – Coomb 28/9, 2010 at 14:3
