TL;DR I'm looking for a way to obtain IEqualityComparer<T>
from IComparer<T>
, no matter which datatype is T
, including case-insensitive options if T
is string
. Or I need a different solution for this problem.
Here's full story: I'm implementing simple, generic cache with LFU policy. Requirement is that it must be possible to select whether the cache will be case sensitive or case insensitive -- if string
happens to be the datatype for cache keys (which is not necessary). In the solution I primarily develop the cache for, I expect hundreds of billions of cache lookups, and cache sizes of max 100.000 entries. Because of that numbers I immediately resigned from using any string manipulation that causes allocations (such as .ToLower().GetHashCode()
etc.) and instead opted to use IComparer
and IEqualityComparer
, as they are standard BCL features. User of this cache can pass the comparers to constructor. Here are relevant fragments of the code:
public class LFUCache<TKey,TValue>
{
private readonly Dictionary<TKey,CacheItem> entries;
private readonly SortedSet<CacheItem> lfuList;
private class CacheItem
{
public TKey Key;
public TValue Value;
public int UseCount;
}
private class CacheItemComparer : IComparer<CacheItem>
{
private readonly IComparer<TKey> cacheKeyComparer;
public CacheItemComparer(IComparer<TKey> cacheKeyComparer)
{
this.cacheKeyComparer = cacheKeyComparer;
if (cacheKeyComparer == null)
this.cacheKeyComparer = Comparer<TKey>.Default;
}
public int Compare(CacheItem x, CacheItem y)
{
int UseCount = x.UseCount - y.UseCount;
if (UseCount != 0) return UseCount;
return cacheKeyComparer.Compare(x.Key, y.Key);
}
}
public LFUCache(int capacity, IEqualityComparer<TKey> keyEqualityComparer,
IComparer<TKey> keyComparer) // <- here's my problem
{
// ...
entries = new Dictionary<TKey, CacheItem>(keyEqualityComparer);
lfuList = new SortedSet<CacheItem>(new CacheItemComparer(keyComparer));
}
// ...
}
The keyEqualityComparer
is used to manage cache entries (so e.g. the key "ABC" and "abc" are equal if user wants to). The keyComparer
is used to manage cache entries sorted by UseCount
so that it's easy to select the least frequently used one (implemented in CacheItemComparer
class).
Example correct usage with custom comparison:
var cache = new LFUCache<string, int>(10000,
StringComparer.InvariantCultureIgnoreCase,
StringComparer.InvariantCultureIgnoreCase);
(That looks stupid, but StringComparer
implements both IComparer<string>
and IEqualityComparer<string>
.) The problem is that if user gives incompatible comparers (i.e. case insensitive keyEqualityComparer
and case sensitive keyComparer
), then the most likely outcome is invalid LFU statistics, and thus lower cache hits at best. The other scenario is also less than desired. Also if the key is more sophisticated (I'll have something resembling Tuple<string,DateTime,DateTime>
), it's possible to mess it up more severely.
That's why I'd like to only have a single comparer argument in constructor, but that doesn't seem to work. I'm able to create IEqualityComparer<T>.Equals()
with help of IComparer<T>.Compare()
, but I'm stuck at IEqualityComparer<T>.GetHashCode()
-- which is very important, as you know. If I had got access to private properties of the comparer to check if it's case sensitive or not, I would have used CompareInfo
to get hash code.
I like this approach with 2 different data structures, because it gives me acceptable performance and controllable memory consumption -- on my laptop around 500.000 cache additions/sec with cache size 10.000 elements. Dictionary<TKey,TValue>
is just used to find data in O(1), and SortedSet<CacheItem>
inserts data in O(log n), find element to remove by calling lfuList.Min
in O(log n), and find the entry to increment use count also in O(log n).
Any suggestions on how to solve this are welcome. I'll appreciate any ideas, including different designs.
IEqualityComparer<T>
andIComparer<T>
. Then at least you don't have pass in the same object to two different parameters. – StagingSortedSet
by yourself. If you incrementUseCount
there could only be a change of one positon upwards in this set. Maybe there is an implementation of O(1) for touch. Insert should be O(1) because it's added at the end of the list (maybe the SortedSet will do this). I'll try a new sample implementation. – Buryat