std::lower_bound and std::find on a plain array
Asked Answered
M

4

19

I like to use std::algorithm whenever I can on plain arrays. Now I have 2 doubts; suppose I want to use std::lower_bound what happens if the value I provide as argument is not found?

int a[] = {1,2,3,4,5,6};
int* f = std::lower_bound(a,a+6,20);

The result I have when printing *f is 20.

The same happens if I use std::find.

int a[] = {1,2,3,4,5,6};
int* f = std::find(a,a+6,20);

The result I have when printing *f is 20.

  1. Is it always the case that the return value is the original argument when this is not found?
  2. In terms of performance std::lower_bound performs better of std::find since it implements a binary search algorithm. If the array is big say max 10 elements, could std::find perform better? Behind the scenes std::lower_bound calls std::advance and std::distance ..maybe I might save on these calls as well?

Thanks a lot

AFG

Megganmeggi answered 1/6, 2012 at 15:53 Comment(6)
The result you have is 20? You mean f? Or *f? If it's f, that's irrelevant, and unlikely. If it's *f, you shouldn't be dereferencing f there.Hills
I correct my question in "the result I have when printing *f" is 20. Is there anyway to test that that element was not in the lower bound/find result?Megganmeggi
Concerning your performance questions: What do you want to accomplish? You seem to forgo using STL containers while at the same time using STL functions and worrying about their complexity. I am curious.Backset
@AbruzzoForteeGentile: Test: if (f == a+6)Hills
If this test is safe even if for arrays allocated in stack I am expecting good peformance exploiting this usage. Thanks a lot.Megganmeggi
Re (1), of course not; if that were guaranteed, then if your value were not found, the stdlib would have to dynamically allocate a new copy of it & return a pointer with a (by definition) unknown ownership... which would be absurd on multiple levels. As others have mentioned, what you really got was an end iterator, which you're not allowed to dereference - but the particular flavour of undefined behaviour you got by dereferencing just happened to return the 20 off the stack (but that's uninteresting because it's UB, unrelated objects need not be stored consecutively, padding exists, etc.)Maulmain
H
19

The result I have is 20. (Later edited to: The result I have when printing *f is 20.)

No, the result you get is a+6. Dereferencing that invokes undefined behavior. It might print 20, it might print "Shirley MacLaine", or it might blow up your car.

Is it always the case that the return value is the original argument when this is not found?

The return value will always be the 2nd argument in your case, because 20 is larger than any other value in the array. If the value is not found, but smaller than some existing value, the return value points to the next larger item.

From cppreference.com, the return value of std::lower_bound is "iterator pointing to the first element that is not less than value, or last if no such element is found."

In terms of performance ...

Measure it. No other advice here will stand up to your actual empirical evidence.

Behind the scenes std::lower_bound calls std::advance and std::distance ..maybe I might save on these calls as well?

Almost certainly not. Those calls are almost certainly optimized in your case to single (or very few) instructions.

Hairpin answered 1/6, 2012 at 15:56 Comment(0)
C
7

In your example you must not dereference f, because it's equal to a+6. You have anyway, so you're in UB territory, but I suppose that the value 20 happens to be on the stack immediately after the array a.

It's true that for small enough arrays, a linear search might be faster than a binary search. 10 is "small", not "big". If you have a program which is doing a lot of searches into small arrays, you can time each one and see.

There should be basically no overhead for std::advance and std::distance - any halfway competent C++ compiler will inline everything, and they'll turn into pointer addition and subtraction.

Classicize answered 1/6, 2012 at 15:57 Comment(0)
K
7

There's one significant difference between the iterators returned by lower_bound and find. If lower_bound doesn't find the item, it will return the iterator where the item should be inserted to retain the sort order. If find doesn't find the item, it will return the ending iterator (i.e. the second argument to find). In your example since you're trying to find something off the end of the array, both return the same iterator - but that's a complete coincidence.

Kush answered 1/6, 2012 at 16:7 Comment(0)
P
4

You could use the following implementation

int a[] = {1,2,3,4,5,6};

int f = lower_bound(a,a+6,20)-a;

Now if 20 is present in the array it will return the index of the element in the array a (0-based indexing used). If 20 is not present in the array it will return 6 i.e. the length of the array.

In worst case the item to be searched is present at (n-1)th index [when n is the size of the array]. Then f will be n-1.

f will be n or equal to the size of the array only when the item being searched is not present in the array.

Hope it answers your question.

Personage answered 4/6, 2018 at 4:59 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.