C++ map lower_bound/upper_bound
Asked Answered
B

3

5

I understand that the underlying data structure for map in C++ is a self-balancing Binary Search Tree. Since in these data structures, finding a lower bound and an upper bound to a key has lots of use, you'd think that map lower_bound, and upper_bound functions will give you that capability. It's a bummer that these functions don't deliver that. Does anyone know why lower_bound behaves the way it does? (it gives you the key that is NOT BEFORE the given key).

Bogosian answered 13/4, 2020 at 17:29 Comment(3)
nusure what you expected... std::prev(m.lower_bound(key)) (ignoring bound check).Reciprocate
Because that is how upper and lower bound work in math? en.wikipedia.org/wiki/Upper_and_lower_boundsNerynesbit
I don't think the mathematical definition makes a huge relevance here. but don't want to get stuck in that discussion. My main point is that I have a lot of usage to find an element with largest key, smaller than a given key. The way it is right now, I have to find the lower_bound and back-iterate to find that.Bogosian
F
11

I've been using C++ since before SGI even introduced the STL, and for some reason still manage to mess up using these methods, including even embarrassing myself when presenting them to a class. I think my problems are that:

  1. The names already have an intuitive but different meaning in mathematics. Given the mathematical meanings, it seems weird that in a big set or map, upper_bound and lower_bound are actually the same or adjacent elements.

  2. The names "upper_bound" and "lower_bound" sound like there is a some kind of symmetry between the two, when there is absolutely not. I'd have a much easier time if the names were something like least_ge (least greater than or equal to) for lower_bound and least_gt (least greater than) for upper_bound.

If someone has a mnemonic or logic to make these easy to internalize, please share it. Otherwise, it just feels like they wrote two useful functions but used two random mathematical terms to name those functions, with no way to derive the semantics from the names. At that point, why not use up made up names like egptr and pbase? I mean at least I don't have any pre-existing intuitions to overcome about the names of streambuf methods...

At any rate here are what I believe are the basic rules you have to remember:

  • lower_bound(X) returns the lowest element v such that v >= X

  • upper_bound(X) returns the lowest element v such that v > X

  • To traverse the half-open interval [L,H), start with lower_bound(L) and stop at (don't process) lower_bound(H). This is usually what you want, because it's most common to traverse half-open intervals in C++--e.g., [buf, buf+nbytes), or [0,array_size), or [begin(), end()).

  • To traverse the closed interval [L,H], start at lower_bound(L) and stop at upper_bound(H).

  • To traverse the open interval (L,H), start at upper_bound(L) and stop at lower_bound(H).

  • In a non-empty container, the mirror image of lower_bound(X) is std::prev(upper_bound(X)) and the mirror image of upper_bound(X) is std::prev(lower_bound(X)). Of course, if an element is equal to begin(), then you can't step it backwards with std::prev, so you need extra logic to deal with the fact that this point cannot be represented with an iterator value.

  • In a multiset/multimap, the first v is lower_bound(v) if that element is indeed v. The last v is std::prev(upper_bound(v)) if the container is not empty and that element is v, but remember to check that the container is not empty before attempting prev on end().

Ferdinandferdinanda answered 15/5, 2021 at 22:8 Comment(3)
cppreference.com's std::equal_range description got some hints: "...The first iterator may be alternatively obtained with std::lower_bound(), the second - with std::upper_bound()." Also its 'possible implementation' uses these functions such as make_pair(lower_bound(...), upper_bound(...)). @WangXiaojie's answer mentioned this as well. Also, see the answers to this question.Typecast
Seems like the root cause of the confusion is the uniformity of the interfaces. The terms "lower_bound" and "upper_bound" came from a more general concept, not only for the typical std::map. std::map contains unique keys - so there is no "range" of multiple equivalent elements and thus it's weird to say "lower"/"upper" bound. But std::map has these functions with the same names for interface uniformity and convenience.Typecast
@starriet주녕차 For me the main confusion is that c.lower_bound(x) can be greater than x. This is counter to uses of the term in other contexts (e.g., lattices, complexity theory, order theory) where elements can never be less than their lower bound. E.g., if you look up these terms by reading wikipedia on bounds en.wikipedia.org/wiki/Upper_and_lower_bounds then C++ is guaranteed to confuse you.Ferdinandferdinanda
M
2

From the point view of usual math convention, the upper_bound is the "least true upper-bound" (never equal) and the lower_bound is the "least upper-bound" (could equal). The fact that lower_bound is actually an "upper-bound" in usual math convention may cause confusion among users.

A way to rationalize the name of lower_bound/upper_bound is considering them in the context of another method called equal_range. The lower_bound is really the "lower_bound" of the equal_range, similarly the upper_bound.

Mythify answered 12/8, 2021 at 15:25 Comment(0)
G
1

This is not only in the map. It is in STL.

lower_bound for your x find such y that x <= y. And the upper_bound x < y.

Goolsby answered 13/4, 2020 at 17:53 Comment(1)
Yeah, not quite sure what's the use of that. I was hoping to be able to give a key and find the largest key smaller than the given key. Map being implemented as an RB-Tree justifies this feature even more.Bogosian

© 2022 - 2024 — McMap. All rights reserved.