Java equivalent of Python 'in' - for set membership test?
Asked Answered
J

2

14

I want to check if an item exists in an item set.

I want to do this in java:

def is_item_in_set(item, item_set):
    return item in item_set

I've managed writing this:

boolean isItemInSet(String item, String[] itemSet) {
    for(int i =0; i < itemSet.length; ++i) {
        if(item.equals(itemSet[i])) {
            return true;
        }
    }
    return false;
}

Is there a better way for testing set-membership in Java?

Juglandaceous answered 31/3, 2013 at 8:51 Comment(10)
Those two pieces of code are not equivalent.Brusquerie
Are they now? Did you refer to the .contains()? I've replaced it with .equals().Juglandaceous
There is an anti-pattern in your Python, why not just return item in item_set? They are also not the same as the latter example is not as efficient, it's O(n) while the Python example is O(1).Bogie
Well, not to mention that true and false are not value in Python unless you define them yourself ;)Ovovitellin
@Ovovitellin Very True.Bogie
Yeah, the logic remains, though. :)Juglandaceous
@Lattyware, not so much an antipattern as an abominationGentleness
@Lattyware: Hrm, the first might be O(n) depending on the type of item_set!Ovovitellin
@Lattyware: "They are also not the same as the latter example is not as efficient" - My exact point why I asked the question!!Juglandaceous
@Ovovitellin In Python, one tends to name things semantically as typing isn't enforced - if something is called x_set it should be a set. Of course, you are right, it could well not be.Bogie
O
16

You can't do it with a straight array, but you can with a Set<T> by calling .contains. If you feel like you will be doing a lot of isItemInSet calls, consider using Sets instead of arrays -- you will be much happier.

For example, using a HashSet<T> makes isItemInSet an O(1) operation (on average). Set insertion and deletion are also similarly fast. Indeed, a HashSet<T> in Java is essentially the same as a Python set() (similar underlying concept and performance characteristics) -- you will see a big improvement in speed with many calls to query, insert or delete on the set.

Ovovitellin answered 31/3, 2013 at 8:55 Comment(11)
Note that Set<T> is an interface. You will want to use an implementing subclass like HashSet<T> (recommended for most uses).Ovovitellin
:) Shall accept as soon as I get my HashSet<T> up and running!Juglandaceous
It would be nice if you could point out how much performance gains I might get. Thank you! :)Juglandaceous
@Suhas - it depends on too many variables to give a simple answer.Edme
@StephenC I guess I should research more, then. Hmm. :)Juglandaceous
If you're not using a set in python you should do that too!Fatigue
@jedwards: Under certain assumptions, HashSet offers constant-time performance. The assumption is that the hash function evenly distributes output values, which is generally true for the built-in hashCode functions.Ovovitellin
@Ovovitellin - You also need to factor in 1) the time taken to build the HashSet, 2) the extra space to store the HashSet, 3) the size of the HashSet, 4) the cost of the element's equals method ... if you are going to answer the "how much" question.Edme
@StephenC: Eh? I believe we were discussing only the time efficiency gain of isItemInSet. These other questions relate to other aspect. The time taken to build it is clearly linear (but no longer than it takes to build the array), the extra space is linear, the size of the hashset would be the same size as the original array, and the element's equals method cost is a red herring because any container (even an array) suffers that same cost.Ovovitellin
@Ovovitellin - We are talking about the big picture. The total equation of how much you save by using HashSet versus (say) an array to represent a set. And even the space usage affects the (time) cost because of the effect it can have on GC times. The big picture is very complicated. And equals is not a red herring because different data structures require different numbers of calls to equals.Edme
Sets are so verbose to construct that I would just go with Arrays.asList("a", "b", "c").contains() in most cases, despite the impurity of it. If only Set.of were null safe.Explicate
A
1

With Java 9 (and later versions) around the corner and using Set as specified by @nneonneo, we can achieve membership test in one line like below:

Set.of(item1, item2, item3).contains(reqItem)
Ailment answered 25/2, 2019 at 10:46 Comment(1)
Unfortunately, this throws an NPE if reqItem is null, making Set.of a pretty useless API for most things I do.Explicate

© 2022 - 2024 — McMap. All rights reserved.