Is there a no-duplicate List implementation out there?
Asked Answered
I

12

102

I know about SortedSet, but in my case I need something that implements List, and not Set. So is there an implementation out there, in the API or elsewhere?

It shouldn't be hard to implement myself, but I figured why not ask people here first?

Ironbark answered 6/11, 2008 at 13:25 Comment(3)
Why does it need to implement List? Sets are iterable, like lists, so I suppose the receiving method is enforcing List for some other reason.Heptad
@Heptad That's right, it's an external demand, and the data structure includes a hell of a lot more than one List.Ironbark
If the user wants a LIST, then it's clear that needs methods of the LIST interface that are not present um the SET interface...Copalite
T
106

There's no Java collection in the standard library to do this. LinkedHashSet<E> preserves ordering similarly to a List, though, so if you wrap your set in a List when you want to use it as a List you'll get the semantics you want.

Alternatively, the Commons Collections (or commons-collections4, for the generic version) has a List which does what you want already: SetUniqueList / SetUniqueList<E>.

Tolkan answered 6/11, 2008 at 13:38 Comment(7)
The Commons class is exactly what I need, but my boss told me to implement it myself eventually. 10x anyway!Ironbark
Ah well, nothing like reinventing the wheel! You'll know now if the need comes up again, anyway. collections15 is a pretty useful thing to have kicking around; MultiMaps in particular ease the pain of something one ends up implementing oneself a lot.Tolkan
@skaffman: he's not actually an idiot, but sometimes he makes moves that are... well, odd. Anyway, I'm not gonna introduce bugs into the product. In today's market, I'm happy with my job and not looking to slam doors and burn bridges, if you get my point.Ironbark
I'm quite surprise when SetUniqueList doesn't have parameterized type.Evetta
wrapping LinkedHashSet in a list doesn't make it easy or efficient to implement all of the index-based operations on a list. i don't think that's a reasonable approach. and also, there are reason not to pull in the commons things, for example, on a mobile platform where size matters.Excommunicative
Jeffrey: On mobile platforms the system will usually remove unused classes, but sure, there's plenty of reasons you might not go down one of these "normal" solutions. There's always some trade-off to be made, and no solution will fix all cases.Tolkan
Just keep in mind, sorting will NOT work on that "list"Rempe
A
25

Here is what I did and it works.

Assuming I have an ArrayList to work with the first thing I did was created a new LinkedHashSet.

LinkedHashSet<E> hashSet = new LinkedHashSet<E>()

Then I attempt to add my new element to the LinkedHashSet. The add method does not alter the LinkedHasSet and returns false if the new element is a duplicate. So this becomes a condition I can test before adding to the ArrayList.

if (hashSet.add(E)) arrayList.add(E);

This is a simple and elegant way to prevent duplicates from being added to an array list. If you want you can encapsulate it in and override of the add method in a class that extends the ArrayList. Just remember to deal with addAll by looping through the elements and calling the add method.

Adaptation answered 24/4, 2014 at 17:36 Comment(3)
Yeah, I think, this is the best solution for it, you can also simply use a normal HashSet, not a Linked, and then you can use your list as you want, you can also deside what to do in some situations, like in adding an element inside a list before a specific index, you can deside that you want to move the duplicated item to this position or not.Mame
Best solution here... Will post my UniqueList class codeCopalite
This worked for me, in my BFS Graph algorithm. Because I had some nodes that I added to a Queue (LinkedList) just if they weren't already in.Residuary
I
12

So here's what I did eventually. I hope this helps someone else.

class NoDuplicatesList<E> extends LinkedList<E> {
    @Override
    public boolean add(E e) {
        if (this.contains(e)) {
            return false;
        }
        else {
            return super.add(e);
        }
    }

    @Override
    public boolean addAll(Collection<? extends E> collection) {
        Collection<E> copy = new LinkedList<E>(collection);
        copy.removeAll(this);
        return super.addAll(copy);
    }

    @Override
    public boolean addAll(int index, Collection<? extends E> collection) {
        Collection<E> copy = new LinkedList<E>(collection);
        copy.removeAll(this);
        return super.addAll(index, copy);
    }

    @Override
    public void add(int index, E element) {
        if (this.contains(element)) {
            return;
        }
        else {
            super.add(index, element);
        }
    }
}   
Ironbark answered 6/11, 2008 at 14:15 Comment(4)
Be careful - LinkedList.contains() needs to scan the entire list to determine if an object is contained in the List. This means that when you are adding objects to a large List, the entire List is scanned for each add operation (in the worst case). This can end up being SLOW.Nestornestorian
Also, your addAll override doesn't check for duplicates in the collection being passed to addAll().Nestornestorian
@mattb How would you solve this problem then: On Android, when binding objects to a list item view, we are given the position of the item in the view adapter. Since sets have no index, the only way is to check whether or not the object exists when using lists is to iterate through and look for an existing copy.Orangeade
the performance-problems of this solution could be fixed with a simple, additional Set<Integer> which stores the hashCodes of the elements (instead of searching through the entire list) - which would require all of the elements to properly implement hashCode(), of course but with helper frameworks like Lombok, this really is no problem ... its kind of trivial, actually. One could even optimize that solution with a red/black-tree for hashCodes ... small memory overhead for large performance gains; welcome to the world of cloud computing ;-)Overelaborate
M
6

Why not encapsulate a set with a list, sort like:

new ArrayList( new LinkedHashSet() )

This leaves the other implementation for someone who is a real master of Collections ;-)

Mollescent answered 6/11, 2008 at 13:32 Comment(3)
This constructor copies the contents of the Set into the new List, rather than wrapping it.Tolkan
@Calum, that is correct, but instead of worrying about not adding duplicates to a List, he can add his objects to a Set (and let the Set worry about filtering out duplicates) and just wrap that Set in a List when passing it to the external method.Nestornestorian
This copies a set to a list but you don't have any well-known ordering. But this what the question is all about.Wiggly
N
5

You should seriously consider dhiller's answer:

  1. Instead of worrying about adding your objects to a duplicate-less List, add them to a Set (any implementation), which will by nature filter out the duplicates.
  2. When you need to call the method that requires a List, wrap it in a new ArrayList(set) (or a new LinkedList(set), whatever).

I think that the solution you posted with the NoDuplicatesList has some issues, mostly with the contains() method, plus your class does not handle checking for duplicates in the Collection passed to your addAll() method.

Nestornestorian answered 6/11, 2008 at 16:28 Comment(4)
I'd love to learn of these contains() issues. As for the addAll(), I create a copy of the given collection and remove all objects already in 'this'. How does that not handle duplicates?Ironbark
As I mentioned in my comment to your class posting, contains() has to scan the entire list (in the worst case) to find if the object is contained in the list. If you have a list of 1 million items and add 10 it it individually, then (in the worst case) over ten million items are scanned.Nestornestorian
As for addAll(), if the Collection passed to addAll contains duplicates itself, they are not detected. For example: your list {A, B, C, D} parameter list {B, D, E, E, E}. You create a copy of the parameter, and after removeAll it contains {E, E, E}.Nestornestorian
The addAll() issue is not really relevant to me, as I use NoDuplicatesList throughout the procedure, and addAll() should be receiving another NoDuplicatesList as its parameter. What would you suggest to improve the contains() performance?Ironbark
C
3

I needed something like that, so I went to the commons collections and used the SetUniqueList, but when I ran some performance test, I found that it seems not optimized comparing to the case if I want to use a Set and obtain an Array using the Set.toArray() method.

The SetUniqueTest took 20:1 time to fill and then traverse 100,000 Strings comparing to the other implementation, which is a big deal difference.

So, if you worry about the performance, I recommend you to use the Set and Get an Array instead of using the SetUniqueList, unless you really need the logic of the SetUniqueList, then you'll need to check other solutions...

Testing code main method:

public static void main(String[] args) {


SetUniqueList pq = SetUniqueList.decorate(new ArrayList());
Set s = new TreeSet();

long t1 = 0L;
long t2 = 0L;
String t;


t1 = System.nanoTime();
for (int i = 0; i < 200000; i++) {
    pq.add("a" + Math.random());
}
while (!pq.isEmpty()) {
    t = (String) pq.remove(0);
}
t1 = System.nanoTime() - t1;

t2 = System.nanoTime();
for (int i = 0; i < 200000; i++) {
    s.add("a" + Math.random());
}

s.clear();
String[] d = (String[]) s.toArray(new String[0]);
s.clear();
for (int i = 0; i < d.length; i++) {
    t = d[i];

}
t2 = System.nanoTime() - t2;

System.out.println((double)t1/1000/1000/1000); //seconds
System.out.println((double)t2/1000/1000/1000); //seconds
System.out.println(((double) t1) / t2);        //comparing results

}

Regards, Mohammed Sleem

Cohesion answered 21/7, 2009 at 20:50 Comment(0)
C
1

My lastest implementation: https://github.com/marcolopes/dma/blob/master/org.dma.java/src/org/dma/java/util/UniqueArrayList.java

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;

/**
 * Extends <tt>ArrayList</tt> and guarantees no duplicate elements
 */
public class UniqueArrayList<T> extends ArrayList<T> {

    private static final long serialVersionUID = 1L;

    public UniqueArrayList(int initialCapacity) {
        super(initialCapacity);
    }

    public UniqueArrayList() {
        super();
    }

    public UniqueArrayList(T[] array) {
        this(Arrays.asList(array));
    }

    public UniqueArrayList(Collection<? extends T> col) {
        addAll(col);
    }


    @Override
    public void add(int index, T e) {
        if (!contains(e)) super.add(index, e);
    }

    @Override
    public boolean add(T e) {
        return contains(e) ? false : super.add(e);
    }

    @Override
    public boolean addAll(Collection<? extends T> col) {
        Collection set=new LinkedHashSet(this);
        set.addAll(col);
        clear();
        return super.addAll(set);
    }

    @Override
    public boolean addAll(int index, Collection<? extends T> col) {
        Collection set=new LinkedHashSet(subList(0, index));
        set.addAll(col);
        set.addAll(subList(index, size()));
        clear();
        return super.addAll(set);
    }

    @Override
    public T set(int index, T e) {
        return contains(e) ? null : super.set(index, e);
    }

    /** Ensures element.equals(o) */
    @Override
    public int indexOf(Object o) {
        int index=0;
        for(T element: this){
            if (element.equals(o)) return index;
            index++;
        }return -1;
    }


}
Copalite answered 4/3, 2016 at 5:32 Comment(0)
H
0

The documentation for collection interfaces says:

Set — a collection that cannot contain duplicate elements.
List — an ordered collection (sometimes called a sequence). Lists can contain duplicate elements.

So if you don't want duplicates, you probably shouldn't use a list.

Highway answered 6/11, 2008 at 13:35 Comment(4)
I specifically mentioned that I need a List implementation. Trust me, there's a reason.Ironbark
Is the reason because you're interacting with an API that is taking a List as a parameter (instead of a Collection)? That's a bit annoying to have to deal withNestornestorian
Actually the API takes a Map<AccountType, Map<AccountType, List<Account>>>, which means holding somewhere in the vicinity of dozens to hundreds of lists... bah.Ironbark
Constructing probability functions with element-probability pairs can involve not have duplicates, although duplicate elements could just be merged.Aerometer
H
0

Off the top of my head, lists allow duplicates. You could quickly implement a UniqueArrayList and override all the add / insert functions to check for contains() before you call the inherited methods. For personal use, you could only implement the add method you use, and override the others to throw an exception in case future programmers try to use the list in a different manner.

Harbor answered 6/11, 2008 at 13:40 Comment(1)
I was ready to fall back to this idea (which eventually I had to) if no one suggested anything better =8-) See my own answer above.Ironbark
C
-1

in add method, why not using HashSet.add() to check duplicates instead of HashSet.consist(). HashSet.add() will return true if no duplicate and false otherwise.

Cockpit answered 13/1, 2012 at 13:7 Comment(1)
What is HashSet#consist() ?Intensity
M
-1

What about this? Just check the list before adding with a contains for an already existing object

while (searchResult != null && searchResult.hasMore()) {
    SearchResult nextElement = searchResult.nextElement();
    Attributes attributes = nextElement.getAttributes();

    String stringName = getAttributeStringValue(attributes, SearchAttribute.*attributeName*);
   
   if(!List.contains(stringName)){
    List.add(stringName);
   }
}
Magnetite answered 2/12, 2021 at 11:0 Comment(1)
If we wanted to implement it ourselves, we wouldn't askArsenite
M
-3

I just made my own UniqueList in my own little library like this:

package com.bprog.collections;//my own little set of useful utilities and classes

import java.util.HashSet;
import java.util.ArrayList;
import java.util.List;
/**
*
* @author Jonathan
*/
public class UniqueList {

private HashSet masterSet = new HashSet();
private ArrayList growableUniques;
private Object[] returnable;

public UniqueList() {
    growableUniques = new ArrayList();
}

public UniqueList(int size) {
    growableUniques = new ArrayList(size);
}

public void add(Object thing) {
    if (!masterSet.contains(thing)) {
        masterSet.add(thing);
        growableUniques.add(thing);
    }
}

/**
 * Casts to an ArrayList of unique values
 * @return 
 */
public List getList(){
    return growableUniques;
}

public Object get(int index) {
    return growableUniques.get(index);
}

public Object[] toObjectArray() {
    int size = growableUniques.size();
    returnable = new Object[size];
    for (int i = 0; i < size; i++) {
        returnable[i] = growableUniques.get(i);
    }
    return returnable;
    }
}

I have a TestCollections class that looks like this:

package com.bprog.collections;
import com.bprog.out.Out;
/**
*
* @author Jonathan
*/
public class TestCollections {
    public static void main(String[] args){
        UniqueList ul = new UniqueList();
        ul.add("Test");
        ul.add("Test");
        ul.add("Not a copy");
        ul.add("Test"); 
        //should only contain two things
        Object[] content = ul.toObjectArray();
        Out.pl("Array Content",content);
    }
}

Works fine. All it does is it adds to a set if it does not have it already and there's an Arraylist that is returnable, as well as an object array.

Municipality answered 6/10, 2011 at 19:22 Comment(1)
Yeah, you should add a bit more methods to it for implementing the List interface.Mame

© 2022 - 2024 — McMap. All rights reserved.