Check if Vec contains all elements from another Vec
Asked Answered
L

4

17

There is a method contains that can be used to check if a particular element exists in a Vec. How to check if all elements from a Vec are contained in another Vec? Is there something more concise than iterating manually and checking all elements explicitly?

Laetitia answered 6/10, 2020 at 13:13 Comment(6)
other_vector.iter().all(|e| vector.contains(e)) doesn't seem too bad.Ironist
An other alternative would be to load both in a HashSet or BTreeSet and use is_subset/is_superset. That would have higher constant overhead, but it should be linear rather than quadratic which is probably a good idea if the collections are quite large.Wakeful
@Ironist actually, but almost all languages I have been worked with had something out-of-the-box. So I asked in case I missed something and there's some method for that in Rust.Laetitia
Really? I don't know a single language supporting a subset check for a vector-like datastructure out of the box. For set datastructures, sure, but not for a vector.Elery
@SvenMarnach At least this oneLaetitia
I'm indeed not familiar with Java's standard library.Elery
S
34

You have two main choices:

  • naively check each element from one vector to see if it's in the other. This has time complexity O(n^2) but it's also very simple and has low overhead:

    assert!(b.iter().all(|item| a.contains(item)));
    
  • create a set of all of elements of one of the vectors, and then check if elements of the other are contained it it. This has O(n) time complexity, but higher overhead including an extra heap allocation:

    let a_set: HashSet<_> = a.iter().copied().collect();
    assert!(b.iter().all(|item| a_set.contains(item)));
    

Which one is "better" will depend on your requirements. If you only care about speed, the better choice will still depend on the number of elements in your vectors, so you should test both with realistic data. You could also test with a BTreeSet, which has different performance characteristics from HashSet.


Here are some rough benchmarks (source) for how the implementations vary with the size of the input. In all tests, b is half the size of a and contains a random subset of a's elements:

Size of a Vec::contains HashSet::contains BtreeSet::contains
10 14 386 327
100 1,754 3,187 5,371
1000 112,306 31,233 88,340
10000 2,821,867 254,801 728,268
100000 29,207,999 2,645,703 6,611,666

Times in nanoseconds.

The naive O(n^2) solution is fastest when the number of elements is small. The overhead of allocating a HashSet or BTreeSet is overshadowed by the impact of the number of comparisons when the size is more than about 200. BTreeSet is mostly a lot slower than HashSet, but is slightly faster when the number of elements is very small.

Sea answered 6/10, 2020 at 14:8 Comment(0)
D
2

If you have sorted vectors, you can do the search in linear time:

    let mut vec = vec![0, 2, 4, 3, 6, 3, 5, 1, 0];
    let mut v = vec![1, 4, 3, 3, 1];

    vec.sort_unstable();
    v.sort_unstable();

    // Remove duplicates elements in v
    v.dedup();

    let mut vec_iter = vec.iter();
    assert!(v.iter().all(|&x| vec_iter.any(|&item| item == x)));

Reference: C++ has std::includes which does exactly that.

Dyslogistic answered 1/7, 2021 at 18:45 Comment(2)
This will give incorrect results if v has repeated elements - or at least different results. It is correct if the requirement is that the other vec contains the same number of each item.Sea
I have edited my answer to handle the case of duplicate elements in v.Dyslogistic
P
0

Adding on to the other answers here: depending on how your Vecs are set up, you may be able to save yourself some comparison by comparing their lengths. The vec a can only contain b if its length is greater than or equal to b's. (AND b has no duplicates.)

assert!(a.len() >= b.len());
Processional answered 16/10, 2023 at 3:31 Comment(0)
Z
-4

You could also sort the vectors and then test them for equality:

fn main() {
    let mut v1 = vec![2, 3, 1];
    let mut v2 = vec![3, 1, 2];
    
    v1.sort();
    v2.sort();

    assert_eq!(v1, v2);
}
Zelaya answered 6/10, 2020 at 13:18 Comment(1)
Note that they're asking about subsetting, not just equality e.g. v1 might have just 1 and 3 but not 2.Wakeful

© 2022 - 2024 — McMap. All rights reserved.