Does WeakReference make a good cache?
J

7

20

I have a cache that uses WeakReferences to the cached objects so that they are automatically removed from the cache under memory pressure. My problem is that the cached objects are collected very soon after they have been stored in the cache. The cache runs in a 64-bit application, and even though more than 4 GB of memory is still available, all the cached objects are collected (they are usually in the generation 2 heap at that moment). No garbage collections are induced manually, as Process Explorer shows.

What methods can I apply to make the objects live a little longer?

Jephum answered 30/5, 2009 at 17:46 Comment(0)
A
18

Using WeakReferences as the primary means of referencing cached objects is not really a great idea because, as Josh said, you're at the mercy of any future behavioral changes to WeakReference and the GC.

However, if your cache needs any kind of resurrection capability, using WeakReferences for items that are pending purge is useful. When an item meets the eviction criteria, rather than evicting it immediately, you change its reference to a weak reference. If anything requests it before it is GC'ed, you restore its strong reference, and the object can live again. I have found this useful for some caches that have hard-to-predict hit rate patterns with "resurrections" frequent enough to be beneficial.
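The demote-then-resurrect idea can be sketched roughly as follows. This is a minimal illustration, not a production cache: the class name `ResurrectingCache` and its `Add`, `Evict`, and `TryGet` methods are hypothetical, and the eviction policy that decides when to call `Evict` is deliberately left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: entries are held strongly until evicted,
// and only weakly afterwards, so a late request can resurrect them.
public sealed class ResurrectingCache<TKey, TValue> where TValue : class
{
    private readonly Dictionary<TKey, TValue> _strong =
        new Dictionary<TKey, TValue>();
    private readonly Dictionary<TKey, WeakReference<TValue>> _pendingPurge =
        new Dictionary<TKey, WeakReference<TValue>>();
    private readonly object _lock = new object();

    public void Add(TKey key, TValue value)
    {
        lock (_lock)
        {
            _pendingPurge.Remove(key);
            _strong[key] = value;
        }
    }

    // Called by whatever eviction policy is in use (LRU, timeout, ...).
    // The entry is demoted to a weak reference instead of being dropped.
    public void Evict(TKey key)
    {
        lock (_lock)
        {
            if (_strong.TryGetValue(key, out TValue value))
            {
                _strong.Remove(key);
                _pendingPurge[key] = new WeakReference<TValue>(value);
            }
        }
    }

    public bool TryGet(TKey key, out TValue value)
    {
        lock (_lock)
        {
            if (_strong.TryGetValue(key, out value))
                return true;

            if (_pendingPurge.TryGetValue(key, out WeakReference<TValue> weak))
            {
                _pendingPurge.Remove(key);
                if (weak.TryGetTarget(out value))
                {
                    // Not collected yet: restore the strong reference.
                    _strong[key] = value;
                    return true;
                }
            }

            value = null;
            return false;
        }
    }
}
```

Whether a demoted entry is still there on the next `TryGet` depends entirely on whether the GC has run and whether other code still holds the object, so the weak stage is best treated as a bonus and not as a guarantee.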

If you have predictable hit rate patterns, then I would forgo the WeakReference option and perform explicit evictions.

Avalon answered 30/5, 2009 at 18:26 Comment(11)
The problem is that as soon as I have a strong reference to a cached object, I am in danger of running out of memory before I can set it as collectible. - Jephum
That's the job of your eviction strategy. Every cache needs two main things: the ability to register an instance in the cache, and a background process that determines which items should be evicted based on a set of rules. There are many existing eviction strategies that may work: LRU (Least Recently Used, i.e. oldest out), LFU (Least Frequently Used, i.e. an item with 10 hits gets dropped in favor of the items with 100 hits), etc. You can't cache every object that gets added forever...you gotta add a background thread to handle eviction. - Avalon
But that background thread will clean the cache only every now and then, and eventually too late. - Jephum
That's up to you. It will clean it up as frequently as you need it to. Or, if that doesn't work, you can use thread signaling to have it clean up on a lazy schedule, plus on demand. For example, you could use a ManualResetEvent to have your main cache code signal its background worker that it should wake up immediately and purge, reset the event, and go back to sleep. And you have me wondering now what exactly you're caching and how much of it. If you are seriously worried that your CACHE, of all things, is going to consume more memory than a 64-bit machine offers...it sounds like you have other - Avalon
problems. You need to consider whether what you're caching, and the volume of items that you're caching, really, truly do need to be cached, because if you're consuming every available scrap of memory your process has on a 64-bit system, something is gravely wrong. - Avalon
It's quite simple: the cache should use all free (physical) memory that is available, but as soon as another process or another method in the same process requires the memory, it should remove old items. - Jephum
I am honestly not sure how you could accomplish that. Your "cache" would need to be aware not only of its own memory usage, but also that of the entire program, AND external, unrelated processes. Using WeakReferences as the dominant form of reference in your cache is not going to work, because your objects are going to get collected as soon as possible, defeating the purpose of having the cache. You are actually going to need to hook into the system and watch for messages regarding memory requests...and that is going to severely impact the performance of your cache...rendering it useless. - Avalon
Just out of curiosity...why do you need to consume every scrap of memory? Seems highly inefficient...can you give us a more high-level overview of what you're trying to do that would require such a massive cache? - Avalon
Unused memory is useless memory, so why not use it? The cache should store a few large tables of several hundred MB each that are expensive to fetch, so I want to keep as many of them in memory as possible while it is available, but not take memory away from the rest of the system. - Jephum
Well, assuming you actually need to cache hundreds of megs of database tables, then I guess you do what you gotta do. Just make sure that you're not prematurely optimizing a performance issue that may never exist. Database servers these days are incredibly efficient applications, and handle caching of data pages about as optimally as is possible. Unless you're writing a database yourself, I would say let the database do what it does best, and put effort into making your application do something useful before you worry about filling "wasted" memory just for the sake of filling it. - Avalon
The database may be fast, but the network may not be. But even if the database runs on the same machine as the application server, it takes some time to load a table with some million rows into the memory of the application server, which proved to be the bottleneck in our scenario. Having it already there is a nice thing. - Jephum
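The signaling approach suggested in the comments above (a background worker that purges on a lazy schedule, plus on demand when the cache signals it) might look roughly like this. It is only a sketch: `PurgeWorker`, `RequestPurge`, and the `purge` callback are hypothetical names, and what the callback actually evicts is up to the cache.

```csharp
using System;
using System.Threading;

// Hypothetical sketch of a purge thread that runs on a timer
// and can also be woken immediately via a ManualResetEvent.
public sealed class PurgeWorker : IDisposable
{
    private readonly ManualResetEvent _purgeNow = new ManualResetEvent(false);
    private readonly Action _purge;
    private readonly TimeSpan _interval;
    private readonly Thread _thread;
    private volatile bool _stopping;

    public PurgeWorker(Action purge, TimeSpan interval)
    {
        _purge = purge ?? throw new ArgumentNullException(nameof(purge));
        _interval = interval;
        _thread = new Thread(Run) { IsBackground = true, Name = "cache-purge" };
        _thread.Start();
    }

    // Called by the cache when it wants an immediate purge.
    public void RequestPurge() => _purgeNow.Set();

    private void Run()
    {
        while (true)
        {
            // Wakes up when signaled, or after the lazy interval elapses.
            _purgeNow.WaitOne(_interval);
            _purgeNow.Reset();
            if (_stopping)
                return;
            _purge();
        }
    }

    public void Dispose()
    {
        _stopping = true;
        _purgeNow.Set();   // wake the worker so it can observe _stopping
        _thread.Join();
        _purgeNow.Dispose();
    }
}
```

A cache could call `RequestPurge()` from its `Add` path whenever its own size estimate crosses a threshold, so that eviction does not have to wait for the next scheduled pass.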
W
7

There is one situation where a WeakReference-based cache may be good: when the usefulness of an item in the cache is predicated upon the existence of a reference to it. In such a situation, a weak interning cache may be useful. For example, consider an application which deserializes many large immutable objects, many of which are expected to be duplicates, and which has to perform many comparisons between them. If X and Y are references to some immutable class type, testing X.Equals(Y) will be very fast if both variables point to the same instance, but may be very slow if they point to distinct instances that happen to be equal. If a deserialized object happens to match another object to which a reference already exists, fetching a reference to that latter object from the dictionary (requiring one slow comparison) may expedite future comparisons. On the other hand, if it matched an item in the dictionary but the dictionary held the only reference to that item, there would be little advantage to using the dictionary object instead of simply keeping the object that was read in; probably not enough advantage to justify the cost of the comparison. For an interning cache, having WeakReferences get invalidated as soon as possible once no other references to an object exist would be a good thing.
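A weak interning cache along these lines could be sketched as below. The `WeakInternPool` class and its `Intern` method are hypothetical names chosen for illustration; the sketch assumes `T` has a sensible `Equals`/`GetHashCode` (or that a comparer is supplied) and never shrinks the bucket dictionary itself.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: returns a canonical instance for equal objects,
// but only for as long as somebody else still references that instance.
public sealed class WeakInternPool<T> where T : class
{
    // Buckets keyed by hash code; each holds weak references to canonical instances.
    private readonly Dictionary<int, List<WeakReference<T>>> _buckets =
        new Dictionary<int, List<WeakReference<T>>>();
    private readonly IEqualityComparer<T> _comparer;

    public WeakInternPool(IEqualityComparer<T> comparer = null)
    {
        _comparer = comparer ?? EqualityComparer<T>.Default;
    }

    public T Intern(T candidate)
    {
        if (candidate == null) throw new ArgumentNullException(nameof(candidate));
        int hash = _comparer.GetHashCode(candidate);

        lock (_buckets)
        {
            if (!_buckets.TryGetValue(hash, out List<WeakReference<T>> bucket))
            {
                bucket = new List<WeakReference<T>>();
                _buckets[hash] = bucket;
            }

            // Drop entries whose targets have already been collected.
            bucket.RemoveAll(w => !w.TryGetTarget(out _));

            foreach (WeakReference<T> weak in bucket)
            {
                // One potentially slow comparison here buys fast
                // reference-equality checks later on.
                if (weak.TryGetTarget(out T existing) && _comparer.Equals(existing, candidate))
                    return existing;
            }

            bucket.Add(new WeakReference<T>(candidate));
            return candidate;
        }
    }
}
```

Typical use would be `record = pool.Intern(record);` right after deserialization, so that equal records end up sharing one instance while any of them is still in use.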

Whale answered 30/10, 2012 at 3:24 Comment(0)
K
5

In .NET, a WeakReference is not considered a reference from the GC's standpoint at all, so any object that has only weak references will be collected in the next GC run (for the appropriate generation).

That makes weak references completely inappropriate for caching, as your experience shows.

You need a "real" cache component, and the most important thing about caching is to get one whose eviction policy (that is, the rules about when to drop an object from the cache) is a good match for your application's usage pattern.

Kirwan answered 30/5, 2009 at 19:57 Comment(0)
K
3

No, WeakReference is not good for that because the behavior of the garbage collector can and will change over time and your cache should not be relying on today's behavior. Also many factors outside of your control could affect memory pressure.

There are many implementations of a cache for .NET. You could probably find a dozen on CodePlex. I guess what you need to add to it is something that looks at the application's current working set and uses that as a trigger for purging.

One more note about why your objects are being collected so frequently. The GC is very aggressive at cleaning up Gen0 objects. If your objects are very short-lived (that is, the only reference left to them soon after creation is a weak reference), then the GC is doing what it's designed to do by cleaning them up as quickly as it can.

Kathernkatheryn answered 30/5, 2009 at 17:50 Comment(3)
"a dozen on CodePlex" ... and one in the Framework. The ASP.NET cache, System.Web.Caching.Cache, can be used in non-ASP.NET applications and is quite powerful. Microsoft's documentation says it's not recommended for use in client apps (but doesn't really say why), but I've used it successfully in a wide variety of apps. - Fugal
I tried HttpRuntime.Cache, but it doesn't work as I would like. I continuously added items to the cache and soon got an OutOfMemoryException instead of having the cache evict items. - Jephum
"HttpRuntime.Cache, but it doesn't work as i like... OutOfMemoryException". Surprising, but you could try adjusting the cache configuration, e.g. the privateBytesLimit and percentagePhysicalMemoryUsedLimit properties. Or there may be some other reason for the OutOfMemoryException. - Fugal
L
2

I believe the problem you are having is that the garbage collector removes weakly referenced objects not only in response to memory pressure; it will sometimes collect quite aggressively just because the runtime thinks some objects have likely become unreachable.

You may be better off using e.g. System.Runtime.Caching.MemoryCache which can be configured with a memory limit, or custom eviction policies for the items.
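As a rough sketch of what such a configuration can look like: the `TableCache` class, the `GetOrLoad` helper, the cache name, and all the numeric limits below are made up for illustration, while the settings keys and `CacheItemPolicy` come from System.Runtime.Caching (on .NET Core and later this assumes the System.Runtime.Caching NuGet package is referenced).

```csharp
using System;
using System.Collections.Specialized;
using System.Runtime.Caching;

// Hypothetical wrapper around a MemoryCache configured with memory limits.
public static class TableCache
{
    private static readonly MemoryCache Cache = new MemoryCache(
        "tables",
        new NameValueCollection
        {
            // Illustrative values only; tune them for your own workload.
            { "cacheMemoryLimitMegabytes", "2048" },
            { "physicalMemoryLimitPercentage", "50" },
            { "pollingInterval", "00:00:30" }
        });

    // Returns the cached value, or loads and caches it on a miss.
    // Two threads may both call load() for the same key; that is accepted here.
    public static T GetOrLoad<T>(string key, Func<T> load) where T : class
    {
        if (Cache.Get(key) is T cached)
            return cached;

        T value = load(); // must not return null: MemoryCache rejects null values
        var policy = new CacheItemPolicy
        {
            // Evict entries that have not been read for 20 minutes (assumed value).
            SlidingExpiration = TimeSpan.FromMinutes(20)
        };
        Cache.Set(key, value, policy);
        return value;
    }
}
```

The limits are enforced by periodic polling and not at the moment of each insert, so the cache can temporarily exceed them between polls.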

Lizzielizzy answered 16/9, 2013 at 14:49 Comment(0)
T
0

The answer actually depends on the usage characteristics of the cache you are trying to build. I have successfully used a WeakReference-based caching strategy to improve performance in many of my projects where the cached objects are expected to be used in short bursts of multiple reads. As others pointed out, objects that are only weakly referenced are pretty much garbage from the GC's point of view and will be collected whenever the next GC cycle runs. It has nothing to do with memory utilization.

If, however, you need a cache that survives such brutality from the GC, you need to use or mimic the functionality provided by the System.Runtime.Caching namespace. Keep in mind that you'd need an additional thread that cleans up the cache when memory usage crosses your thresholds.

Tideland answered 15/5, 2013 at 4:14 Comment(0)
B
0

A bit late, but here's a relevant use case:

I need to cache two types of objects: large (deserialised) data files that take 10 minutes to load and cost 15 GB of RAM each, and smaller (dynamically compiled) objects that contain internal references to those data files (the smaller objects are also cached because they take ~10s to generate). These caches are hidden within the factories that supply the objects (the former component having no knowledge of the latter), and have different eviction policies.

When my 'data file' cache evicts an object, it replaces it with a weak reference, so if that object is still available when next requested, we can resurrect it (and renew its cache timeout). In this way we avoid losing (or accidentally duplicating) any object before it is truly defunct (i.e. not used anywhere else). Notice that neither cache is required to be aware of the other, and that no other client objects need to be aware that there are any caches at all (e.g. we avoid needing 'keepalives', callbacks, registration, retrieve-and-return scopes, etc.; things get a lot simpler).

So although using WeakReference by itself (instead of a cache) is a terrible idea (because modern GCs are typically tuned to the size of the L2 CPU cache, and regular code will burn through this many times per minute), it's very useful as a way to hide your caches from the rest of your code.

Bostick answered 21/2, 2015 at 2:8 Comment(0)
