Count elements lower than a given value in a std::set
Asked Answered
D

6

7

I need to find how many elements are lower than a given one in a std::set.

I thought the right function to use was std::lower_bound which returns an iterator to the first element which is greater or equal to the given one....so the index of this iterator is what I'm looking for...but I can't find the index from the iterator:

#include <iostream>
#include <algorithm>
#include <set>

int main()
{
    std::set<int> mySet;
    mySet.insert( 1 );
    mySet.insert( 2 );
    mySet.insert( 3 );
    mySet.insert( 4 );

    std::set<int>::const_iterator found = std::lower_bound( mySet.begin(), mySet.end(), 2 );

    if ( found != mySet.end() )
        std::cout << "Value 2 was found at position " << ( found - mySet.begin() ) << std::endl;
else
        std::cout << "Value 2 was not found" << std::endl;
}

This does not compile:

16:63: error: no match for 'operator-' (operand types are 'std::set<int>::const_iterator {aka std::_Rb_tree_const_iterator<int>}' and 'std::set<int>::iterator {aka std::_Rb_tree_const_iterator<int>}')
16:63: note: candidates are:
In file included from /usr/include/c++/4.9/vector:65:0,
                 from /usr/include/c++/4.9/bits/random.h:34,
                 from /usr/include/c++/4.9/random:49,
                 from /usr/include/c++/4.9/bits/stl_algo.h:66,
                 from /usr/include/c++/4.9/algorithm:62,
                 from 3:

Using std::vector instead of std::set works perfectly.

Looks like operator- is not valid for a std::set::iterator. Why? Then, how can you easily (without calling std::previous or std::next untill bound is reached...this would not be efficient) find the position of a given iterator in the container? If you can't, then what alterantive can I use to find the index of a given element...?

Dillingham answered 9/3, 2016 at 14:22 Comment(0)
S
3

Looks like operator- is not valid for a std::set::iterator. Why?

Indeed, an implementation of std::set::iterator::operator-() can't exist in constant complexity since the elements are not contiguous in memory.


Then, how can you easily (without calling std::previous or std::next until bound is reached...this would not be efficient) find the position of a given iterator in the container?

You can't, std::set::iterator is not a RandomAccessIterator. See std::distance() documentation:

Complexity

Linear.


If you can't, then what alterantive can I use to find the index of a given element...?

I'd suggest to count your elements without having to compute an iterator distance: std::count_if() can help us:

#include <iostream>
#include <algorithm>
#include <set>

int main()
{
    std::set<int> mySet;
    mySet.insert( 1 );
    mySet.insert( 2 );
    mySet.insert( 3 );
    mySet.insert( 4 );

    const std::size_t lower_than_three = std::count_if(
         std::begin(mySet)
        , std::end(mySet)
        , [](int elem){ return elem < 3; } );
    std::cout << lower_than_three << std::endl;    
}

Demo

Saline answered 9/3, 2016 at 14:28 Comment(0)
I
4

The correct way to do a lower bound search is with std::set's own lower_bound function, which is specially designed to work with this sorted, associative, non-random-access container.

So, instead of this:

std::lower_bound( mySet.begin(), mySet.end(), 2 );

use this:

mySet.lower_bound(2);

This is logarithmic in the size of the container, which is much better than a std::count_if approach (which doesn't know about the sortyness of the comparator, and therefore must visit all the nodes and is thus linear).

However, you then must also use std::distance from the beginning to the lower bound, which is not only linear but also necessarily "slow" in practice (due to the non-random-access).

Nathan's solution seems optimal given that you do not want to simply find the lower bound, but find its distance from the "start" of the container.

Inventor answered 25/4, 2019 at 15:18 Comment(3)
I never noticed std::[multi]set (and map) had their own lower_bound :)Saline
@YSC: Magic! :DInventor
That's an interesting answers.Dillingham
S
3

Looks like operator- is not valid for a std::set::iterator. Why?

Indeed, an implementation of std::set::iterator::operator-() can't exist in constant complexity since the elements are not contiguous in memory.


Then, how can you easily (without calling std::previous or std::next until bound is reached...this would not be efficient) find the position of a given iterator in the container?

You can't, std::set::iterator is not a RandomAccessIterator. See std::distance() documentation:

Complexity

Linear.


If you can't, then what alterantive can I use to find the index of a given element...?

I'd suggest to count your elements without having to compute an iterator distance: std::count_if() can help us:

#include <iostream>
#include <algorithm>
#include <set>

int main()
{
    std::set<int> mySet;
    mySet.insert( 1 );
    mySet.insert( 2 );
    mySet.insert( 3 );
    mySet.insert( 4 );

    const std::size_t lower_than_three = std::count_if(
         std::begin(mySet)
        , std::end(mySet)
        , [](int elem){ return elem < 3; } );
    std::cout << lower_than_three << std::endl;    
}

Demo

Saline answered 9/3, 2016 at 14:28 Comment(0)
L
2

Since std::set::iterator is a BidirectionalIterator we cannot subtract from it unless we use the decrement operator. What we can do though is just walk the set and count the iterations until we reach a number bigger than what we are looking for.

std::set<int> mySet;
// fill values
int counter = 0;
for (auto it = mySet.begin(), *it < some_value && it != mySet.end(); ++it)
{
    if (e < some_value)
        counter++;
}

This is a worst mySet.size() iterations which is as fast as you can get it when dealing with a BidirectionalIterator.

Also note that std::lower_bound does not have O(log N) complexity since we are not using a RandomAccessIterator. When using a non-RandomAccessIterator it has linear complexity.

Loan answered 9/3, 2016 at 14:34 Comment(0)
P
1

You can use following code for this:

#include <algorithm>
#include <set>
#include <iostream>

int main()
{
    std::set<int> mySet;
    mySet.insert( 1 );
    mySet.insert( 2 );
    mySet.insert( 3 );
    mySet.insert( 4 );

    std::set<int>::const_iterator found = std::lower_bound( mySet.begin(), mySet.end(), 2 );
    std::size_t dist = std::distance(found, mySet.end());
    std::cout << "Number of lower bound elements: " << dist << std::endl;
}
Pleo answered 9/3, 2016 at 14:29 Comment(1)
Please forget about std::lower_bound with set/map iterators, it is o(n). Use set.lower_bound instead.Athalee
C
1

Expanding on all the existing answers - you can always write your own operator-.

template<class T, class = typename 
    std::enable_if<
    std::is_same<
    typename T::iterator_category,
    std::bidirectional_iterator_tag
>::value>::type>
typename std::iterator_traits<T>::difference_type operator-(const T& a, const T& b)
{
    return std::distance(b, a);
}
Cutshall answered 9/3, 2016 at 14:52 Comment(1)
Nice approach! Thanks.Dillingham
S
0

It is possible to use an augmented set data structure. Finding the position of a key can be done in O(log n) time using C++ Policy based data structures. At the cost of storing sub-tree size in each node, it is possible to get the order of a key (order_of_key) or the key by order (find_by_order) in O(log n) time.

The example below shows the snippet from the question using an ordered set with tree order statistics.

#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
#include <ext/pb_ds/detail/standard_policies.hpp>
#include <functional>
using namespace __gnu_pbds;

typedef tree<int, null_type, less<int>, rb_tree_tag, tree_order_statistics_node_update> ordered_set;


int main()
{
    ordered_set mySet;
    mySet.insert( 1 );
    mySet.insert( 2 );
    mySet.insert( 3 );
    mySet.insert( 4 );

    auto it = mySet.find(2);

    if ( found != mySet.end() )
        std::cout << "Value 2 was found at position " << mySet.order_of_key(2) << std::endl;
else
        std::cout << "Value 2 was not found" << std::endl;
    return 0;
}

See the following links for more details:

Schnozzle answered 29/7, 2024 at 21:54 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.