A TreeSet or TreeMap that allows duplicates
I need a Collection that sorts its elements but does not remove duplicates.

I have gone for a TreeSet, since a TreeSet actually adds its values to a backing TreeMap:

public boolean add(E e) {
    return m.put(e, PRESENT)==null;
}

And the TreeMap removes the duplicates using the Comparator's compare logic.

I have written a Comparator that returns 1 instead of 0 for equal elements. Hence, for equal elements, the TreeSet with this Comparator will not overwrite the duplicate and will simply sort it.

I have tested it with simple String objects, but I need a Set of custom objects.

public static void main(String[] args)
{
    List<String> strList = Arrays.asList("d", "b", "c", "z", "s", "b", "d", "a");
    Set<String> strSet = new TreeSet<String>(new StringComparator());
    strSet.addAll(strList);
    System.out.println(strSet);
}

class StringComparator implements Comparator<String>
{
    @Override
    public int compare(String s1, String s2)
    {
        if(s1.compareTo(s2) == 0){
            return 1;
        }
        else{
            return s1.compareTo(s2);
        }
    }
}

Is this approach fine or is there a better way to achieve this?

EDIT

Actually, I have an ArrayList of the following class:

class Fund 
{
    String fundCode;
    BigDecimal fundValue;
    .....

    public boolean equals(Object obj) {
    // uses fundCode for equality
    }
}

I need all the fundCodes with the highest fundValue.

Pastrami answered 7/3, 2014 at 13:44 Comment(8)
Would keeping a count of the number of occurrences of each element be good enough for you? (In other words, in your real code are the duplicates utterly equivalent, or do you need to preserve some differences? An example would be a case-insensitive-but-case-preserving set or map.)Gash
This won't be a Set. You need a sorted list or something similar. From javadoc: A Set is a Collection that cannot contain duplicate elements. It's not a good idea to break the contract.Farmelo
https://mcmap.net/q/158645/-which-java-collection-should-i-useBugbear
If you can use third-party libraries, then the Guava libraries may be helpful. See docs.guava-libraries.googlecode.com/git/javadoc/com/google/… (more info about collections: code.google.com/p/guava-libraries/wiki/…)Conversation
@JonSkeet, actually I have an ArrayList of the Fund class, and equality is checked by fundCode. I need all the Fund objects with the highest fund value. I have updated my question accordinglyPastrami
possible duplicate of Sorted collection in JavaCitrate
@Farmelo OK, I deleted my comment. I was too much influenced by the OP's idea of a sorted data structure (i.e. a list which sorts on insert operations). An externally sorted list is of course valid.Citrate
Possible duplicate of Why is there no SortedList in Java?Racine
C
3

I need all the fundCodes with the highest fundValue.

If that's the only reason you want to sort, I would recommend not sorting at all. Comparison-based sorting typically has a complexity of O(n log n), whereas finding the maximum has a complexity of only O(n) and takes a single pass over your list:

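List<Fund> maxFunds = new ArrayList<Fund>();
// fundValue is a BigDecimal, so compare with compareTo rather than > or ==
BigDecimal max = null;
for (Fund fund : funds) {
    BigDecimal value = fund.getFundValue();
    if (max == null || value.compareTo(max) > 0) {
        // strictly larger value found: discard the earlier candidates
        maxFunds.clear();
        max = value;
    }
    if (value.compareTo(max) == 0) {
        maxFunds.add(fund);
    }
}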

You can avoid writing that code by using a third-party library like Guava. See: How to get max() element from List in Guava

Citrate answered 7/3, 2014 at 14:20 Comment(0)
W
13

You can use a PriorityQueue.

PriorityQueue<Integer> pQueue = new PriorityQueue<Integer>(); 

PriorityQueue(): Creates a PriorityQueue with the default initial capacity (11) that orders its elements according to their natural ordering.

This is a link to doc: https://docs.oracle.com/javase/8/docs/api/java/util/PriorityQueue.html
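
A minimal sketch of how a PriorityQueue treats duplicates, assuming Integer elements and natural ordering. The class name PriorityQueueDuplicates and the helper drainSorted are illustrative, not part of the answer. Only poll() is guaranteed to hand elements back in sorted order; iterating over the queue directly is not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative class name; not part of the original answer.
class PriorityQueueDuplicates {

    // Copies the values into a PriorityQueue and drains it with poll(),
    // which always removes the current smallest element.
    static List<Integer> drainSorted(List<Integer> values) {
        PriorityQueue<Integer> queue = new PriorityQueue<>(values);
        List<Integer> result = new ArrayList<>();
        while (!queue.isEmpty()) {
            result.add(queue.poll());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(4, 2, 4, 1, 2);
        // Duplicates are kept; the drained list is in ascending order.
        System.out.println(drainSorted(input));
    }
}
```

If you need repeated sorted traversal without emptying the collection, a sorted List may fit better than a queue.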

Wellesz answered 17/2, 2019 at 22:52 Comment(1)
This hint should be available in the official documentation of TreeSet. After a few minutes of googling, I finally found this SO page. I know Set doesn't allow duplicates, but the natural-order functionality during insertion brought me to TreeSetKendricks
T
3

You can sort a List using Collections.sort.

Given your Fund class:

List<Fund> sortMe = new ArrayList<Fund>(...);
Collections.sort(sortMe, new Comparator<Fund>() {
  @Override
  public int compare(Fund left, Fund right) {
    return left.fundValue.compareTo(right.fundValue);
  }
});
// sortMe is now sorted
Taxaceous answered 7/3, 2014 at 14:7 Comment(0)
D
2

While it's not possible directly, there are a few workarounds.


What you should never do under any circumstances

It may not seem obvious, but tampering with the Comparator argument or with the compareTo(T other) method is unacceptable, and IDEs will usually flag that something is wrong. The problem is that a comparison that violates its contract breaks TreeSet in unexpected ways. For example, instantiating it like this

new TreeSet<Integer>((x, y) -> y - x == 0 ? 1 : y - x);

would indeed allow you to store duplicate Integer elements in the TreeSet. However, it cripples the TreeSet, because every stored element becomes impossible to remove. The remove method does nothing, because the TreeSet can never find an element equal to the one passed in (remember that TreeSet compares elements only through the Comparator or the compareTo method and never calls equals).
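
A small sketch that makes the failure visible, using the same comparator as above. The class name BrokenComparatorDemo and the helper newBrokenSet are illustrative only.

```java
import java.util.TreeSet;

// Illustrative class name; the comparator is the one from the answer above.
class BrokenComparatorDemo {

    static TreeSet<Integer> newBrokenSet() {
        // Never returns 0, so no two elements are ever considered equal.
        return new TreeSet<Integer>((x, y) -> y - x == 0 ? 1 : y - x);
    }

    public static void main(String[] args) {
        TreeSet<Integer> set = newBrokenSet();
        set.add(5);
        set.add(5);
        System.out.println(set.size());      // both copies were stored
        System.out.println(set.contains(5)); // lookup cannot find either copy
        System.out.println(set.remove(5));   // removal fails for the same reason
    }
}
```

Lookups fail for the same reason as removals: both rely on the comparator returning 0 for a match.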


Worse workaround

Instead of passing elements of type T directly, you can create a wrapping class that contains T along with some id (UUID for example).

record TreeSetEntry<T>(T value, UUID uuid) {
    TreeSetEntry(T value) {
        this(value, UUID.randomUUID());
    }
}

Using this as the element type of the TreeSet (with a proper comparator, of course) produces a set that accepts multiple TreeSetEntry elements with equal values.

new TreeSet<>(Comparator.comparingInt(TreeSetEntry<Integer>::value).thenComparing(TreeSetEntry::uuid));

This way of overcoming TreeSet's limitations is correct on its own, but it comes at the cost of significant memory overhead, because two additional objects are created for each value.


Way better approach

Really, if you need to store duplicates in a TreeSet as separate elements, the only useful information is what each element is and how many times it currently appears. Replacing TreeSet<T> with TreeMap<T, Long> is the best approach I've found. You do have to manage the presence and absence of each value as a key in the TreeMap, but this can be automated through either inheritance or delegation, by introducing methods that add or remove one unit of a value.

The inheritance example is shown below. This approach retains all the functionality from TreeMap.

class TreeCounter<T> extends TreeMap<T, Long> {
    TreeCounter(Comparator<T> comparator) {
        super(comparator);
    }

    void increase(T item) {
        increase(item, 1);
    }

    void increase(T item, int difference) {
        var count = this.getOrDefault(item, 0L);
        this.put(item, count + difference);
    }

    void decrease(T item) {
        decrease(item, 1);
    }

    void decrease(T item, int difference) {
        var currentCount = this.getOrDefault(item, 0L);
        // Drop the key entirely once the count would reach zero or below
        if (currentCount <= difference) {
            this.remove(item);
            return;
        }

        this.put(item, currentCount - difference);
    }
}

Since a TreeSet in Java is backed by a TreeMap, both allocate approximately the same amount of memory per entry, so this workaround is very memory-efficient, especially when many duplicate elements need to be stored.
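
The delegation variant mentioned above could look roughly like the following. This is only a sketch: the class name SortedBag and its methods are hypothetical, and unlike the inheritance version it exposes just the operations it defines rather than the whole TreeMap API.

```java
import java.util.Comparator;
import java.util.TreeMap;

// Hypothetical delegation-based counterpart to TreeCounter; all names are illustrative.
class SortedBag<T> {
    private final TreeMap<T, Long> counts;

    SortedBag(Comparator<? super T> comparator) {
        this.counts = new TreeMap<>(comparator);
    }

    // Records one more occurrence of item.
    void add(T item) {
        counts.merge(item, 1L, Long::sum);
    }

    // Removes one occurrence; returns false if the item was absent.
    boolean remove(T item) {
        Long current = counts.get(item);
        if (current == null) {
            return false;
        }
        if (current == 1L) {
            counts.remove(item);
        } else {
            counts.put(item, current - 1);
        }
        return true;
    }

    // Number of stored occurrences of item (0 if absent).
    long count(T item) {
        return counts.getOrDefault(item, 0L);
    }

    // Smallest element according to the comparator; throws NoSuchElementException if empty.
    T first() {
        return counts.firstKey();
    }

    public static void main(String[] args) {
        SortedBag<String> bag = new SortedBag<>(Comparator.<String>naturalOrder());
        bag.add("b");
        bag.add("a");
        bag.add("b");
        bag.remove("b");
        System.out.println(bag.first() + " " + bag.count("b"));
    }
}
```

Delegation keeps callers from bypassing the counting logic through inherited methods such as put, at the cost of having to forward any TreeMap operation you still need.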

Dwyer answered 28/8, 2023 at 18:25 Comment(0)
F
0

A TreeSet uses either a Comparator or Comparable to compare and store objects. equals is never called, which is why the set does not recognize the duplicate.
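
To illustrate the point, here is a short sketch using the JDK's String.CASE_INSENSITIVE_ORDER comparator. The class name CompareNotEqualsDemo and the helper method are illustrative. Two strings that are not equal according to equals are still treated as duplicates, because only the comparator is consulted.

```java
import java.util.TreeSet;

// Illustrative class name; demonstrates that TreeSet consults the comparator, not equals.
class CompareNotEqualsDemo {

    static TreeSet<String> caseInsensitiveSet() {
        return new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
    }

    public static void main(String[] args) {
        TreeSet<String> set = caseInsensitiveSet();
        System.out.println(set.add("a"));      // first element is stored
        System.out.println(set.add("A"));      // rejected: the comparator returns 0
        System.out.println("a".equals("A"));   // yet equals says they differ
        System.out.println(set.size());
    }
}
```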

Fording answered 20/2, 2015 at 20:50 Comment(0)
P
0

Instead of a TreeSet, we can use a List and implement the Comparable interface.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class Fund implements Comparable<Fund> {

    String fundCode;
    int fundValue;

    public Fund(String fundCode, int fundValue) {
        super();
        this.fundCode = fundCode;
        this.fundValue = fundValue;
    }

    public String getFundCode() {
        return fundCode;
    }

    public void setFundCode(String fundCode) {
        this.fundCode = fundCode;
    }

    public int getFundValue() {
        return fundValue;
    }

    public void setFundValue(int fundValue) {
        this.fundValue = fundValue;
    }

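    @Override
    public int compareTo(Fund compareFund) {
        // Descending order by fund value; Integer.compare avoids the
        // overflow that plain subtraction can cause.
        return Integer.compare(compareFund.getFundValue(), this.fundValue);
    }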

    public static void main(String args[]){

        List<Fund> funds = new ArrayList<Fund>();

        Fund fund1 = new Fund("a",100);
        Fund fund2 = new Fund("b",20);
        Fund fund3 = new Fund("c",70);
        Fund fund4 = new Fund("a",100);
        funds.add(fund1);
        funds.add(fund2);
        funds.add(fund3);
        funds.add(fund4);

        Collections.sort(funds);

        for(Fund fund : funds){
            System.out.println("Fund code: " + fund.getFundCode() +  "  Fund value : " + fund.getFundValue());
        }
    }
}
Pedicle answered 24/3, 2015 at 18:41 Comment(0)
P
0

Add the elements to an ArrayList and then sort them using the Collections.sort utility. Either implement Comparable and write your own compareTo method according to your key, or pass a Comparator.

This does not remove duplicates, and the list can still be sorted:

List<Integer> list = new ArrayList<>();

Collections.sort(list, new Comparator<Integer>() {
    @Override
    public int compare(Integer left, Integer right) {
        // Your comparison logic goes here; natural ordering is only a placeholder.
        return left.compareTo(right);
    }
});
Propertius answered 13/7, 2016 at 17:27 Comment(0)
G
-1

I found a way to get TreeSet to store duplicate keys.

The problem originated when I wrote some code in Python using SortedContainers. I have a spatial index of objects in which I want to find all objects between a start and end longitude.

The longitudes can be duplicated, but I still need to be able to add and remove specific objects from the index efficiently. Unfortunately, I could not find a Java equivalent of Python's SortedKeyList, which separates the sort key from the type being stored.

To illustrate, suppose we have a large list of retail purchases and want to get all purchases whose cost falls in a specific range.

// We are using TreeSet as a SortedList
TreeSet<PriceBase> _index = new TreeSet<PriceBase>();

// populate the index with the purchases.
// Note that 2 of these have the same cost
_index.add(new Purchase("candy", 1.03));
Purchase _bananas = new Purchase("bananas", 1.45);
_index.add(_bananas);
_index.add(new Purchase("celery", 1.45));
_index.add(new Purchase("chicken", 4.99));

// Range scan. This view should contain "candy", "bananas", "celery"
NavigableSet<PriceBase> _range = _index.subSet(
    new PriceKey(0.99), true, new PriceKey(3.99), true);

// we can also remove specific items from the list and
// it finds the specific object even though the sort
// key is the same
_index.remove(_bananas);

There are 3 classes created for the list

  • PriceBase: Base class that returns the sort key (the price).
  • Purchase: subclass that contains transaction data.
  • PriceKey: subclass used for the range search.

When I initially implemented this with TreeSet it worked except in the case where the prices are the same. The trick is to define the compareTo() so that it is polymorphic:

  1. If we are comparing Purchase to PriceKey then only compare the price.
  2. If we are comparing Purchase to Purchase then compare the price and the name if the prices are the same.

For example here are the compareTo() functions for the PriceBase and Purchase classes.

// in PriceBase
@Override
public int compareTo(PriceBase _other) {
    return Double.compare(this.getPrice(), _other.getPrice());
}

// in Purchase
@Override
public int compareTo(PriceBase _other) {

    // compare by price
    int _compare = super.compareTo(_other);

    if(_compare != 0) {
        // prices are not equal
        return _compare;
    }

    if (!(_other instanceof Purchase)) {
        throw new RuntimeException("Right compare must be a Purchase");
    }

    // compare by item name
    Purchase _otherPurchase = (Purchase) _other;
    return this.getName().compareTo(_otherPurchase.getName());
}

This trick allows the TreeSet to sort the purchases by price while still doing a full comparison when a purchase needs to be uniquely identified.

In summary, I needed an object index that supports a range scan where the key is a continuous value such as a double, and where add/remove is efficient.

I understand there are many other ways to solve this problem, but I wanted to avoid writing my own tree class. My solution seems like a hack, and I am surprised that I can't find anything else. If you know of a better way, please comment.
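
One possible alternative, offered as a sketch rather than as the approach described above: key a TreeMap by price and keep a list of purchases per price, so that no polymorphic compareTo is needed. The class PriceIndex, its nested Purchase class, and all method names are hypothetical and stand in for the classes in this answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical price index; class, field, and method names are illustrative.
class PriceIndex {

    static final class Purchase {
        final String name;
        final double price;

        Purchase(String name, double price) {
            this.name = name;
            this.price = price;
        }
    }

    // Each distinct price maps to every purchase made at that price.
    private final TreeMap<Double, List<Purchase>> byPrice = new TreeMap<>();

    void add(Purchase p) {
        byPrice.computeIfAbsent(p.price, k -> new ArrayList<>()).add(p);
    }

    // Removes this specific purchase object; Purchase does not override
    // equals, so the match is by identity.
    boolean remove(Purchase p) {
        List<Purchase> bucket = byPrice.get(p.price);
        if (bucket == null || !bucket.remove(p)) {
            return false;
        }
        if (bucket.isEmpty()) {
            byPrice.remove(p.price);
        }
        return true;
    }

    // All purchases with low <= price <= high, in ascending price order.
    // subMap throws IllegalArgumentException if low > high.
    List<Purchase> range(double low, double high) {
        List<Purchase> result = new ArrayList<>();
        for (List<Purchase> bucket : byPrice.subMap(low, true, high, true).values()) {
            result.addAll(bucket);
        }
        return result;
    }
}
```

The trade-off is that removal within a bucket is linear in the number of purchases sharing that exact price, which is usually small for a continuous key.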

Gabriella answered 5/3, 2020 at 19:8 Comment(0)
