Maybe I am not using the right data structure. I need to use a set, but I also want to efficiently return the k-th smallest element. Can TreeSet in Java do this? There seems to be no built-in method of TreeSet for this.
I don't believe that TreeSet has a method that directly does this. There are binary search trees that do support O(log n) random access (they are sometimes called order statistic trees), and there are Java implementations of this data structure available. These structures are typically implemented as binary search trees that store information in each node counting how many elements are to the left or right of the node, so a search down the tree can find the appropriate element by descending into the appropriate subtree at each step. The classic "Introduction to Algorithms, Third Edition" book by Cormen, Leiserson, Rivest, and Stein explores this data structure in the chapter "Augmenting Data Structures" if you are curious how to implement one yourself.
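To make the subtree-count idea concrete, here is a minimal sketch of such a tree. The class name OrderStatisticTree and its methods are made up for illustration, and the tree is deliberately left unbalanced to keep the example short, so the O(log n) bound only holds once rebalancing (for example red-black rotations that also maintain the sizes) is added.

```java
import java.util.NoSuchElementException;

// Hypothetical, unbalanced order statistic tree, for illustration only.
// A real implementation would rebalance and keep the sizes up to date while doing so.
class OrderStatisticTree<T extends Comparable<T>> {
    private final class Node {
        final T key;
        Node left, right;
        int size = 1; // number of nodes in the subtree rooted at this node

        Node(T key) { this.key = key; }
    }

    private Node root;

    private int size(Node n) { return n == null ? 0 : n.size; }

    public int size() { return size(root); }

    // Inserts key if it is absent; returns true if the set changed.
    public boolean add(T key) {
        int before = size(root);
        root = insert(root, key);
        return size(root) != before;
    }

    private Node insert(Node n, T key) {
        if (n == null) return new Node(key);
        int cmp = key.compareTo(n.key);
        if (cmp < 0) n.left = insert(n.left, key);
        else if (cmp > 0) n.right = insert(n.right, key);
        // cmp == 0: duplicate, nothing to insert
        n.size = 1 + size(n.left) + size(n.right);
        return n;
    }

    // Returns the k-th smallest key, with k counted from 0.
    public T select(int k) {
        if (k < 0 || k >= size(root)) throw new NoSuchElementException("k=" + k);
        Node n = root;
        while (true) {
            int leftSize = size(n.left);
            if (k < leftSize) {
                n = n.left;            // answer lies in the left subtree
            } else if (k == leftSize) {
                return n.key;          // exactly leftSize keys are smaller
            } else {
                k -= leftSize + 1;     // skip the left subtree and this node
                n = n.right;
            }
        }
    }
}
```

Each step of select discards either the left subtree plus the current node or the right subtree, which is why the cost is proportional to the height of the tree.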
Alternatively, you may be able (in some cases) to use the TreeSet's tailSet method and a modified binary search to try to find the kth element. Specifically, look at the first and last elements of the TreeSet, then (if possible given the contents) pick some element that is halfway between the two and pass it as an argument to tailSet to get a view of the elements of the set after the midpoint. Using the number of elements in the tailSet, you could then decide whether you've found the element, or whether to explore the left or right half of the tree. This is a slightly modified interpolation search over the tree, and could potentially be fast. However, I don't know the internal complexity of the tailSet methods, so this could actually be worse than the order statistic tree. It also might fail if you can't compute the "midpoint" of two elements, for example, if you are storing Strings in your TreeSet.
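A rough sketch of that idea for a set of Integers, where a midpoint between two values is easy to compute. The class and method names are hypothetical, and it uses a plain binary search over the value range rather than interpolation. Note that it relies on size() of the tailSet view at every step, so its real cost depends on how expensive that call is.

```java
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical helper: binary search over the VALUE range of the set,
// using tailSet(...).size() to count how many elements lie below a candidate.
class TailSetSelect {
    // Returns the k-th smallest element of the set, with k counted from 0.
    static int kthSmallest(NavigableSet<Integer> set, int k) {
        int n = set.size();
        if (k < 0 || k >= n) throw new IndexOutOfBoundsException("k=" + k);
        long lo = set.first(); // long arithmetic avoids int overflow in the midpoint
        long hi = set.last();
        // Invariant: at most k elements are strictly smaller than lo.
        while (lo < hi) {
            long mid = lo + (hi - lo + 1) / 2;
            int smaller = n - set.tailSet((int) mid, true).size();
            if (smaller <= k) {
                lo = mid;      // the answer is mid or larger
            } else {
                hi = mid - 1;  // too many elements below mid
            }
        }
        return (int) lo;
    }

    public static void main(String[] args) {
        TreeSet<Integer> set = new TreeSet<>();
        for (int v : new int[]{3, 8, 15, 42, 100}) set.add(v);
        System.out.println(kthSmallest(set, 2));
    }
}
```

The search converges on the largest value with at most k elements below it, which is exactly the k-th smallest element of the set.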
Getting the size of the tailSet or headSet of a TreeSet requires iterating through the elements and counting them, so your suggestion about that probably isn't going to help. – Desired
You just need to iterate to element k. One way to do that would be to use one of Guava's Iterables.get methods:
T element = Iterables.get(set, k);
There's no built-in method to do this because a Set is not a List, and index-based operations like that are generally reserved for Lists. A TreeSet is more appropriate for things like finding the closest contained element that is >= some value.
One thing you could do if the fastest possible access to the kth smallest element were really important would be to use an ArrayList rather than a TreeSet and handle inserts by binary searching for the insertion point and either inserting the element at that index or replacing the existing element at that index, depending on the result of the search. Then you could get the kth smallest element in O(1) by just calling get(k).
You could even create an implementation of SortedSet that handles all that and adds the get(index) method if you really wanted.
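A minimal sketch of the sorted ArrayList approach, assuming elements with a natural ordering. The class name SortedArraySet is invented here, and it does not implement the full SortedSet interface. Keep in mind that while get(k) is O(1), each insert is O(n) because ArrayList has to shift the elements after the insertion point.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical wrapper that keeps an ArrayList sorted and free of duplicates.
class SortedArraySet<T extends Comparable<? super T>> {
    private final List<T> items = new ArrayList<>();

    // Inserts value at its sorted position, or replaces an equal element.
    // Returns true if the set grew.
    public boolean add(T value) {
        int pos = Collections.binarySearch(items, value);
        if (pos >= 0) {
            items.set(pos, value);    // already present: replace in place
            return false;
        }
        items.add(-pos - 1, value);   // binarySearch returns -(insertionPoint) - 1
        return true;
    }

    public boolean contains(T value) {
        return Collections.binarySearch(items, value) >= 0;
    }

    // Returns the k-th smallest element, with k counted from 0.
    public T get(int k) {
        return items.get(k);
    }

    public int size() {
        return items.size();
    }
}
```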
Use TreeSet.iterator() to get an iterator in ascending order and call next() k times:
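// Example for Integers
Iterator<Integer> it = treeSet.iterator();
int i = 0;
Integer current = null;
// Advance k times; current ends up as the k-th smallest element (counting from 1)
while (it.hasNext() && i < k) {
    current = it.next();
    i++;
}
// current is null if k <= 0, and the largest element if the set has fewer than k elements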
https://github.com/geniot/indexed-tree-map
I had the same problem. So I took the source code of java.util.TreeMap and wrote IndexedTreeMap. It implements my own IndexedNavigableMap:
public interface IndexedNavigableMap<K, V> extends NavigableMap<K, V> {
K exactKey(int index);
Entry<K, V> exactEntry(int index);
int keyIndex(K k);
}
The implementation is based on updating node weights in the red-black tree whenever it changes. The weight of a node is the number of nodes beneath it, plus one for the node itself. For example, when a tree is rotated to the left:
private void rotateLeft(Entry<K, V> p) {
    if (p != null) {
        Entry<K, V> r = p.right;
        int delta = getWeight(r.left) - getWeight(p.right);
        p.right = r.left;
        p.updateWeight(delta);
        if (r.left != null) {
            r.left.parent = p;
        }
        r.parent = p.parent;
        if (p.parent == null) {
            root = r;
        } else if (p.parent.left == p) {
            delta = getWeight(r) - getWeight(p.parent.left);
            p.parent.left = r;
            p.parent.updateWeight(delta);
        } else {
            delta = getWeight(r) - getWeight(p.parent.right);
            p.parent.right = r;
            p.parent.updateWeight(delta);
        }
        delta = getWeight(p) - getWeight(r.left);
        r.left = p;
        r.updateWeight(delta);
        p.parent = r;
    }
}
updateWeight simply updates weights up to the root:
void updateWeight(int delta) {
    weight += delta;
    Entry<K, V> p = parent;
    while (p != null) {
        p.weight += delta;
        p = p.parent;
    }
}
And when we need to find the element by index, here is the implementation that uses the weights:
public K exactKey(int index) {
    if (index < 0 || index > size() - 1) {
        throw new ArrayIndexOutOfBoundsException();
    }
    return getExactKey(root, index);
}

private K getExactKey(Entry<K, V> e, int index) {
    if (e.left == null && index == 0) {
        return e.key;
    }
    if (e.left == null && e.right == null) {
        return e.key;
    }
    if (e.left != null && e.left.weight > index) {
        return getExactKey(e.left, index);
    }
    if (e.left != null && e.left.weight == index) {
        return e.key;
    }
    return getExactKey(e.right, index - (e.left == null ? 0 : e.left.weight) - 1);
}
The weights also come in very handy for finding the index of a key:
public int keyIndex(K key) {
    if (key == null) {
        throw new NullPointerException();
    }
    Entry<K, V> e = getEntry(key);
    if (e == null) {
        throw new NullPointerException();
    }
    if (e == root) {
        return getWeight(e) - getWeight(e.right) - 1; // index to return
    }
    int index = 0;
    int cmp;
    if (e.left != null) {
        index += getWeight(e.left);
    }
    Entry<K, V> p = e.parent;
    // split comparator and comparable paths
    Comparator<? super K> cpr = comparator;
    if (cpr != null) {
        while (p != null) {
            cmp = cpr.compare(key, p.key);
            if (cmp > 0) {
                index += getWeight(p.left) + 1;
            }
            p = p.parent;
        }
    } else {
        Comparable<? super K> k = (Comparable<? super K>) key;
        while (p != null) {
            if (k.compareTo(p.key) > 0) {
                index += getWeight(p.left) + 1;
            }
            p = p.parent;
        }
    }
    return index;
}
You can find the result of this work at https://github.com/geniot/indexed-tree-map.
TreeSet<Integer> a=new TreeSet<>();
a.add(1);
a.add(2);
a.add(-1);
System.out.println(a.toArray()[0]);
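This can be helpful: toArray() returns the elements in ascending order, so index k of the array is the k-th smallest element (index 0 above prints -1). Each call copies the whole set, so it costs O(n).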
[Below, I abbreviate "kth smallest element search operation" as "Kth op."]
You need to give more details. Which operations will your data structure provide? Is K in the Kth operation very small compared to N, or can it be anything? How often will you have insertions and deletions compared to lookups? How often will you have Kth smallest element searches compared to lookups? Are you looking for a quick solution of a couple of lines using the Java library, or are you willing to spend some effort to build a custom data structure?
The operations to provide could be any subset of:
LookUp (find an element by its key; where key is comparable and can be anything)
Insert
Delete
Kth
Here are some possibilities:
If there will be no/very few insertions&deletions, you can just sort the elements and use an array, with O(Log(N)) look up time and O(1) for Kth.
If O(Log(N)) for LookUp, Insert, Delete and O(k) for Kth op. is good enough, probably the easiest implementation would be Skip Lists. (The Wikipedia article is very good if you need more detail.)
If K is small enough, or Kth operations will only come after the "insertions&deletions phase", you can keep the smallest K elements in a heap, sorting after the insertions&deletions for O(N + k Log k) time. (You will also need a separate Hash for LookUp.)
If K is arbitrary and O(N) is good enough for Kth operation, you can use a Hash for O(1) time lookup, and use a "one-sided-QuickSort" algorithm for Kth operations (the basic idea is do a quick sort but on every binary divide recurse only on the side you really need; which would give (this is a gross simplification) N (1/2 + 1/4 + 1/8 + ... ) = O(N) expected time)
You can build an Augmented "simple" Interval Tree structure with each node keeping the number of nodes in its subtree, so that LookUp, Insert, Delete, Kth all compute in O(Log N) time as long as the tree is balanced, but it might be difficult to implement if you are a novice.
etc. etc. The set of alternatives is as infinite as the possible interpretations of your question.
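The "one-sided QuickSort" option in the list above is usually called quickselect. Here is a minimal sketch of it, assuming the elements are copied out of whatever collection holds them. The class and method names are illustrative, and the random pivot gives O(N) expected time rather than a worst-case guarantee.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical quickselect helper: partition like quicksort,
// but continue only on the side that contains index k.
class QuickSelect {
    private static final Random RNG = new Random();

    // Returns the k-th smallest element, with k counted from 0.
    static <T extends Comparable<? super T>> T kthSmallest(Collection<T> elements, int k) {
        if (k < 0 || k >= elements.size()) throw new IndexOutOfBoundsException("k=" + k);
        List<T> a = new ArrayList<>(elements); // work on a copy
        int lo = 0, hi = a.size() - 1;
        while (lo < hi) {
            int p = partition(a, lo, hi);
            if (k == p) return a.get(p);
            if (k < p) hi = p - 1; else lo = p + 1;
        }
        return a.get(lo); // lo == hi == k at this point
    }

    // Lomuto partition around a random pivot; returns the pivot's final index.
    private static <T extends Comparable<? super T>> int partition(List<T> a, int lo, int hi) {
        Collections.swap(a, lo + RNG.nextInt(hi - lo + 1), hi);
        T pivot = a.get(hi);
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a.get(i).compareTo(pivot) < 0) {
                Collections.swap(a, i, store++);
            }
        }
        Collections.swap(a, store, hi);
        return store;
    }
}
```

Because only one side is processed after each partition, the expected work shrinks geometrically, which is where the N (1/2 + 1/4 + 1/8 + ...) estimate comes from.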
Could you use a ConcurrentSkipListSet and its toArray() method? ConcurrentSkipListSet is sorted by the natural order of its elements. The only thing I am not sure about is whether toArray() is O(n) or O(1); since it has to copy every element into a new array, O(n) seems the likely answer.
If toArray() were O(1), then you could use skipList.toArray()[k] to get the k-th smallest element.
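I know this question is quite old, but since TreeSet implements NavigableSet you have access to the subSet method, which returns a view in constant time.
subSet(k, k + 1).first();
The first() call takes log(n) time where n is the size of the original set. Note that subSet takes element values as its bounds, not positions, so this returns the smallest element in the value range [k, k + 1). That is the k-th smallest element only when the set holds the consecutive integers 0, 1, 2, and so on. This does create some unnecessary objects which could be avoided with a more robust implementation of TreeSet, but it avoids using a third-party library.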
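A small check of how subSet(k, k + 1).first() behaves can be useful here, because subSet is bounded by element values rather than by positions. The helper name viaSubSet is made up for this sketch, and it returns null instead of letting first() throw on an empty view.

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical demo of what subSet(k, k + 1).first() actually selects.
class SubSetDemo {
    // Returns subSet(k, k + 1).first(), or null when no element lies in [k, k + 1).
    static Integer viaSubSet(TreeSet<Integer> set, int k) {
        SortedSet<Integer> view = set.subSet(k, k + 1); // bounds are VALUES, upper bound exclusive
        return view.isEmpty() ? null : view.first();
    }

    public static void main(String[] args) {
        TreeSet<Integer> dense = new TreeSet<>();
        for (int v : new int[]{0, 1, 2, 3}) dense.add(v);
        System.out.println(viaSubSet(dense, 2));  // prints 2: the value 2 happens to sit at index 2

        TreeSet<Integer> sparse = new TreeSet<>();
        for (int v : new int[]{10, 20, 30}) sparse.add(v);
        System.out.println(viaSubSet(sparse, 1)); // prints null: no element has a value in [1, 2)
    }
}
```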