Another option is to wrap SortedSet, but instead of storing your type T in it directly, you store the value tuple (T value, int counter), where counter goes up by 1 with each new instance of value that is inserted. Essentially, you're forcing the values to be distinct. You can efficiently use GetViewBetween() to find the largest value of counter for a particular value, then increment it to get the counter for a newly added value. And unlike the count dictionary solution, you can use GetViewBetween() to replicate the functionality that equal_range, lower_bound, and upper_bound provide in C++. Here is some code showing what I mean:
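using System.Collections;
using System.Collections.Generic;
using System.Linq;

// T must be orderable by Comparer<T>.Default (for example, by implementing IComparable<T>).
public class SortedMultiSet<T> : IEnumerable<T>
{
    // Each value is paired with a counter so that duplicates stay distinct in the set.
    private SortedSet<(T value, int counter)> set =
        new SortedSet<(T value, int counter)>();

    public void Add(T value)
    {
        // All stored copies of value live between (value, 0) and (value, int.MaxValue).
        var view = set.GetViewBetween((value, 0), (value, int.MaxValue));
        int nextCounter = view.Count > 0 ? view.Max.counter + 1 : 0;
        set.Add((value, nextCounter));
    }

    // Removes a single copy of value (the one with the highest counter).
    public bool RemoveOne(T value)
    {
        var view = set.GetViewBetween((value, 0), (value, int.MaxValue));
        if (view.Count == 0) return false;
        set.Remove(view.Max);
        return true;
    }

    // Removes every copy of value; clearing the view removes them from the underlying set.
    public bool RemoveAll(T value)
    {
        var view = set.GetViewBetween((value, 0), (value, int.MaxValue));
        bool result = view.Count > 0;
        view.Clear();
        return result;
    }

    // Returns a view over all elements in [min, max], duplicates included.
    public SortedMultiSet<T> GetViewBetween(T min, T max)
    {
        var result = new SortedMultiSet<T>();
        result.set = set.GetViewBetween((min, 0), (max, int.MaxValue));
        return result;
    }

    public IEnumerator<T> GetEnumerator() =>
        set.Select(x => x.value).GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}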
Now you can write something like this:
var multiset = new SortedMultiSet<int>();
foreach (int i in new int[] { 1, 2, 2, 3, 4, 5, 5, 6, 7, 7, 8 })
{
    multiset.Add(i);
}
foreach (int i in multiset.GetViewBetween(2, 7))
{
    Console.Write(i + " ");
}
// Output: 2 2 3 4 5 5 6 7 7
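To make the comparison with C++ a bit more concrete, here is a rough sketch of how lower_bound, upper_bound, and the size of an equal_range could be emulated. It works on a bare SortedSet of (value, counter) tuples rather than on the wrapper class above. The helper names LowerBound, UpperBound, and CountOf are made up for this illustration, and UpperBound assumes that no counter ever reaches int.MaxValue.

```csharp
using System;
using System.Collections.Generic;

// Build the underlying set the same way SortedMultiSet.Add does.
var set = new SortedSet<(int value, int counter)>();
foreach (int v in new[] { 1, 2, 2, 3, 5, 5, 5, 8 })
{
    var copies = set.GetViewBetween((v, 0), (v, int.MaxValue));
    int next = copies.Count > 0 ? copies.Max.counter + 1 : 0;
    set.Add((v, next));
}

// Hypothetical helper: smallest stored value >= key (like lower_bound), or null.
int? LowerBound(int key)
{
    var tail = set.GetViewBetween((key, 0), (int.MaxValue, int.MaxValue));
    return tail.Count > 0 ? tail.Min.value : (int?)null;
}

// Hypothetical helper: smallest stored value > key (like upper_bound), or null.
// Assumes counters never reach int.MaxValue, so (key, int.MaxValue) sorts
// after every stored copy of key and is never itself an element.
int? UpperBound(int key)
{
    var tail = set.GetViewBetween((key, int.MaxValue), (int.MaxValue, int.MaxValue));
    return tail.Count > 0 ? tail.Min.value : (int?)null;
}

// Hypothetical helper: number of copies of key (the size of its equal_range).
int CountOf(int key) =>
    set.GetViewBetween((key, 0), (key, int.MaxValue)).Count;

Console.WriteLine(LowerBound(4));          // 5
Console.WriteLine(UpperBound(5));          // 8
Console.WriteLine(CountOf(5));             // 3
Console.WriteLine(LowerBound(9).HasValue); // False
```

These helpers return values rather than iterators, so they are only an approximation of the C++ functions. To walk the elements from a bound onward, you would enumerate the view itself.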
In the past, there were some issues where GetViewBetween() ran in O(output size) time rather than O(log n) time, but I think those have been resolved. Back then, it would walk the nodes of the view to cache the count; it now uses hierarchical counts to perform Count operations efficiently. See this StackOverflow post and this library code.