Why use non-member begin and end functions in C++11?
D

7

224

Every standard container has a begin and end method for returning iterators for that container. However, C++11 has apparently introduced free functions called std::begin and std::end which call the begin and end member functions. So, instead of writing

auto i = v.begin();
auto e = v.end();

you'd write

auto i = std::begin(v);
auto e = std::end(v);

In his talk, Writing Modern C++, Herb Sutter says that you should always use the free functions now when you want the begin or end iterator for a container. However, he does not go into detail as to why you would want to. Looking at the code, it saves you all of one character. So, as far as the standard containers go, the free functions seem to be completely useless. Herb Sutter indicated that there were benefits for non-standard containers, but again, he didn't go into detail.

So, the question is what exactly do the free function versions of std::begin and std::end do beyond calling their corresponding member function versions, and why would you want to use them?

Dumbbell answered 29/9, 2011 at 6:0 Comment(3)
It's one fewer character, save those dots for your children: xkcd.com/297Forestaysail
I'd somehow hate to use them because I'd have to repeat std:: all the time.Leafage
@MichaelChourdakis: Apparently you don't. See the first example here: en.cppreference.com/w/cpp/algorithm/findPericycle
U
178

How do you call .begin() and .end() on a C-array ?

Free-functions allow for more generic programming because they can be added afterwards, on a data-structure you cannot alter.
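
As a minimal sketch of that point: the same helper can accept a built-in array and a std::vector once it goes through std::begin and std::end. The names sum_of and array_end are made up for illustration; array_end only mirrors the kind of overload the standard library provides for arrays.

```cpp
#include <cstddef>
#include <iterator>
#include <numeric>
#include <vector>

// Hypothetical helper: sums any range that std::begin/std::end accept.
template <typename Range>
int sum_of(const Range& r) {
    return std::accumulate(std::begin(r), std::end(r), 0);
}

// Roughly how an array overload of end can be written: the array is taken
// by reference, so its size N is deduced from the type.
template <typename T, std::size_t N>
T* array_end(T (&a)[N]) {
    return a + N;
}
```

Calling sum_of on an int[4] and on a std::vector<int> with the same values gives the same result; calling it on an int* does not compile, because a pointer carries no size.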

Unclasp answered 29/9, 2011 at 6:7 Comment(16)
Except, how would you get end on a C array? They don't have their length as part of them. begin is as easy as passing the pointer, and end is as easy as passing the pointer plus the length. So, begin seems pointless, and you can't have end - unless end can take a length (Herb didn't give such an example), in which case, you're still not getting anything over just passing the pointer plus the length. So, begin and end are still pointless.Dumbbell
However, for C strings, i.e. char*, you could look for the \0 character, although that is a rather slow approach compared to storing the length.Lucretialucretius
@JonathanMDavis: you can have the end for statically declared arrays (int foo[5]) using template programming tricks. Once it has decayed to a pointer, you're of course out of luck.Unclasp
template<typename T, size_t N> T* end(T (&a)[N]) { return a + N; }Fettle
@JonathanMDavis: As the others indicated, it is certainly possible to get begin and end on a C array as long as you haven't already decayed it to a pointer yourself - @Huw spells it out. As for why you'd want to: imagine that you refactored code that was using an array to use a vector (or vice-versa, for whatever reason). If you've been using begin and end, and perhaps some clever typedeffing, the implementation code won't have to change at all (except perhaps some of the typedefs).Laris
@JonathanMDavis: Arrays are not pointers. And for everyone: For the sake of ending this ever-prominent confusion, stop referring to (some) pointers as "decayed arrays". There's no such terminology in the language, and there really isn't a use for it. Pointers are pointers, arrays are arrays. Arrays can be converted to a pointer to their first element implicitly, but the result is still just a regular old pointer, with no distinction from others. Of course you can't get the "end" of a pointer, case closed.Anni
@MatthieuM I suspect that it's a linguistic problem, but whether the array is static or not is irrelevant. You can use end on any C style array, static or local. You can't use it on a pointer.Locale
@JamesKanze: I was intending to make a distinction between plain arrays (size known at compile time) and variable-length arrays (since gcc allows those even in C++). The latter don't work with templates.Unclasp
@MatthieuM I'll admit that VLAs hadn't occurred to me. (Things like end() are one of the reasons C++ didn't adopt them. g++ is doing users a disservice to support them in C++.)Locale
Well, other than arrays there are a large number of APIs which expose container like aspects. Obviously you can't modify a 3rd party API but you can easily write these free standing begin/end functions.Extensible
@james: In theory g++ could support VLAs with end, since the value that end should have is well defined as &arr[0]+sizeof(arr)/sizeof(*arr). Admittedly, that would require special magic treatment of an std::end overload, but it could still be done. Special compiler magic is required for some library parts already...Gwin
@KevinCathcart: Computing the end is possible, but deducing the array type is not. It is not possible for a function to take an array as parameter. The closest (and only possibility to emulate this afaik) is to have a template taking a reference to an array (as shown by @Huw), however this requires a compile-time known size :/Unclasp
That is why such a function would need special compiler support to bypass the limitation. I do agree that there is no way to implement it without either special compiler magic, or some form of non-standard extension. Perhaps G++ could reuse the parameter forward declaration extension. Thus template<typename T> T* end(size_t N; T (&a)[N]) {return a+N;}. That would be treated as having one parameter in pretty much all respects, although in the generated code it would actually have two parameters, with the compiler supplying sizeof(arr)/sizeof(*arr) at the call site (unless inlined of course).Gwin
@Anni The word decay might not be in the language standard but I'm pretty sure everyone understands that it refers to how an array is treated when it is passed to a function. The same code that would work (e.g. begin(c_array)) within main() where the sizeof the c_array is known will not work in an function which is passed the c_array as a parameter. Which I'm sure you are aware of. The compiler will give a diagnostic something like: error: cannot build range expression with array function parameter 'c' since parameter with array type 'int []' is treated as pointer type 'int *'Columelliform
I don't know if I am right, but isn't using a non-member function more object oriented (encapsulation)?Chantay
@Lexshard: So, the principle of least privilege applied to C++ translates to "Prefer non-member, non-friends, functions" because, indeed, those functions are least privilege and therefore are agnostic to the internal representation (better encapsulation). This principle is for the writer of the function, the users don't really get to choose. Here, the use of free-functions is for another reason: since C++ does not offer the ability to add member functions to built-in types (such as arrays), only free-functions can be used for them.Unclasp
F
37

Using the begin and end free functions adds one layer of indirection. Usually that is done to allow more flexibility.

In this case I can think of a few uses.

The most obvious use is for C-arrays (not C pointers).

Another is when trying to use a standard algorithm on a non-conforming container (i.e. the container is missing a .begin() method). Assuming you can't just fix the container, the next best option is to overload the begin function. Herb is suggesting you always use the begin function to promote uniformity and consistency in your code, instead of having to remember which containers support method begin and which need function begin.

As an aside, the next C++ rev should copy D's pseudo-member notation. If a.foo(b,c,d) is not defined it instead tries foo(a,b,c,d). It's just a little syntactic sugar to help us poor humans who prefer subject then verb ordering.

Firecure answered 29/9, 2011 at 14:43 Comment(4)
The pseudo-member notation looks like C#/.Net extension methods. They do come in useful for various situations though - like all features - can be prone to 'abuse'.Lightness
The pseudo-member notation is a boon for coding with Intellisense; hitting "a." shows relevant verbs, freeing up brain power from memorizing lists, and helping to discover relevant API functions can help prevent duplicating functionality, without having to shoehorn non-member functions into classes.Externalize
There are proposals to get that into C++, which use the term Unified Function Call Syntax (UFCS).Kenner
Next C++ rev refers to? Was that implemented?Organize
T
36

Consider the case when you have a library that contains a class:

class SpecialArray;

It has two methods:

int SpecialArray::arraySize();
int SpecialArray::valueAt(int);

To iterate over its values with member calls, you need to inherit from this class and define begin() and end() methods, so that you can write

auto i = v.begin();
auto e = v.end();

But if you always use

auto i = begin(v);
auto e = end(v);

you can do this:

// Plain overloads, not template<> specializations.
SpecialArrayIterator begin(SpecialArray & arr)
{
  return SpecialArrayIterator(&arr, 0);
}

SpecialArrayIterator end(SpecialArray & arr)
{
  return SpecialArrayIterator(&arr, arr.arraySize());
}

where SpecialArrayIterator is something like:

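class SpecialArrayIterator
{
public:
   SpecialArrayIterator(SpecialArray * p, int i)
    :parray(p), index(i)
   {
   }
   // Declarations only; definitions are omitted for brevity.
   SpecialArrayIterator & operator ++();
   SpecialArrayIterator & operator --();
   SpecialArrayIterator operator ++(int);
   SpecialArrayIterator operator --(int);
   int operator *()
   {
     return parray->valueAt(index);
   }
   // Iterators compare against other iterators, not against the array.
   bool operator ==(const SpecialArrayIterator &) const;
   bool operator !=(const SpecialArrayIterator &) const;
   // etc
private:
   SpecialArray *parray;
   int index;
   // etc
};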

Now i and e can be used to iterate over SpecialArray and access its values.
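
For a version that compiles end to end, here is a minimal sketch. Only arraySize() and valueAt() come from the answer; the vector-backed storage, the const-correct overloads and the total helper are assumptions made so the example is self-contained, and the iterator implements just enough for a range-based for loop.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the library class. The storage is invented.
class SpecialArray {
public:
    explicit SpecialArray(std::vector<int> values) : values_(std::move(values)) {}
    int arraySize() const { return static_cast<int>(values_.size()); }
    int valueAt(int i) const { return values_[static_cast<std::size_t>(i)]; }
private:
    std::vector<int> values_;
};

// Minimal forward-only iterator built on the two public methods.
class SpecialArrayIterator {
public:
    SpecialArrayIterator(const SpecialArray* p, int i) : parray(p), index(i) {}
    int operator*() const { return parray->valueAt(index); }
    SpecialArrayIterator& operator++() { ++index; return *this; }
    bool operator==(const SpecialArrayIterator& o) const {
        return parray == o.parray && index == o.index;
    }
    bool operator!=(const SpecialArrayIterator& o) const { return !(*this == o); }
private:
    const SpecialArray* parray;
    int index;
};

// Free functions added without touching SpecialArray itself.
SpecialArrayIterator begin(const SpecialArray& arr) {
    return SpecialArrayIterator(&arr, 0);
}
SpecialArrayIterator end(const SpecialArray& arr) {
    return SpecialArrayIterator(&arr, arr.arraySize());
}

// The range-based for loop finds begin/end through argument-dependent lookup.
int total(const SpecialArray& arr) {
    int sum = 0;
    for (int value : arr) sum += value;
    return sum;
}
```

A full iterator would also need the usual typedefs (value_type, iterator_category and so on) before it could be passed to most standard algorithms.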

Tinny answered 29/9, 2011 at 15:6 Comment(2)
This should not include the template<> lines. You are declaring a new function overload, not specializing a template.Gluttonous
I'm confused what overloading the increment and decrement operators is doing since the functions are just prototypes.Euphemiah
F
18

To answer your question, the free functions begin() and end() by default do nothing more than call the container's member .begin() and .end() functions. From <iterator>, included automatically when you use any of the standard containers like <vector>, <list>, etc., you get:

template< class C > 
auto begin( C& c ) -> decltype(c.begin());
template< class C > 
auto begin( const C& c ) -> decltype(c.begin()); 

The second part of your question is why prefer the free functions if all they do is call the member functions anyway. That really depends on what kind of object v is in your example code. If the type of v is a standard container type, like vector<T> v; then it doesn't matter if you use the free or member functions, they do the same thing. If your object v is more generic, like in the following code:

template <class T>
void foo(T& v) {
  auto i = v.begin();     
  auto e = v.end(); 
  for(; i != e; i++) { /* .. do something with i .. */ } 
}

Then using the member functions breaks your code for T = C arrays, C strings, enums, etc. By using the non-member functions, you advertise a more generic interface that people can easily extend. By using the free function interface:

template <class T>
void foo(T& v) {
  using std::begin;  // needed so that built-in arrays find std::begin
  using std::end;
  auto i = begin(v);
  auto e = end(v);
  for(; i != e; i++) { /* .. do something with i .. */ }
}

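The code now works when T is a C array, including a string literal (which is an array of char). A bare char* still will not work, because a pointer does not carry its length. Now writing a small amount of adapter code: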

enum class color { RED, GREEN, BLUE };
static color colors[]  = { color::RED, color::GREEN, color::BLUE };
// Qualify the inner calls: unqualified begin(colors) would only find these overloads.
color* begin(const color&) { return std::begin(colors); }
color* end(const color&)   { return std::end(colors); }

We can get your code to be compatible with iterable enums too. I think Herb's main point is that using the free functions is just as easy as using the member functions, and it gives your code backward compatibility with C sequence types and forward compatibility with non-stl sequence types (and future-stl types!), with low cost to other developers.
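
Putting the pieces together, a minimal sketch might look like this. count_elements is a made-up stand-in for foo that returns something observable, and the using-declarations are what let one template serve built-in arrays, standard containers and the enum adapter.

```cpp
#include <cstddef>
#include <iterator>
#include <list>

// Hypothetical stand-in for foo: counts the elements it can iterate over.
template <class T>
std::size_t count_elements(T& v) {
    using std::begin;  // fallback for arrays and standard containers
    using std::end;
    std::size_t n = 0;
    for (auto i = begin(v), e = end(v); i != e; ++i) ++n;
    return n;
}

// Adapter added afterwards: makes a color value iterate over all colors.
enum class color { RED, GREEN, BLUE };
static color colors[] = { color::RED, color::GREEN, color::BLUE };
color* begin(const color&) { return std::begin(colors); }
color* end(const color&)   { return std::end(colors); }
```

Passing an int[5], a std::list<int> or a color all compile against the same template. A string literal such as "abc" is a char array of four elements, so the terminating null character is counted too.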

Fauver answered 11/2, 2013 at 5:34 Comment(2)
Nice examples. I wouldn't take an enum or any other fundamental type by reference, though; they will be cheaper to copy than they are to indirect to.Kenner
Kinda outdated but have in mind that strings now have begin() and end() methods tooAdon
M
10

One benefit of std::begin and std::end is that they serve as extension points for implementing standard interface for external classes.

If you'd like to use a CustomContainer class with a range-based for loop or a template function which expects .begin() and .end() methods, you'd obviously have to implement those methods.

If the class does provide those methods, that's not a problem. When it doesn't, you'd have to modify it.

This is not always feasible, for example when using an external library, especially a commercial and closed source one.

In such situations, std::begin and std::end come in handy, since one can provide an iterator API without modifying the class itself, by overloading free functions instead.

Example: suppose that you'd like to implement count_if function that takes a container instead of a pair of iterators. Such code might look like this:

template<typename ContainerType, typename PredicateType>
std::size_t count_if(const ContainerType& container, PredicateType&& predicate)
{
    using std::begin;
    using std::end;

    return std::count_if(begin(container), end(container),
                         std::forward<PredicateType>(predicate));
}

Now, for any class you'd like to use with this custom count_if, you only have to add two free functions, instead of modifying those classes.

Now, C++ has a mechanism called Argument Dependent Lookup (ADL), which makes this approach even more flexible.

In short, ADL means that when a compiler resolves an unqualified function name (i.e. a function without a namespace, like begin instead of std::begin), it will also consider functions declared in the namespaces of its arguments. For example:

namespace some_lib
{
    // let's assume that CustomContainer stores elements sequentially,
    // and has data() and size() methods, but not begin() and end() methods:

    class CustomContainer
    {
        ...
    };
}

namespace some_lib
{    
    const Element* begin(const CustomContainer& c)
    {
        return c.data();
    }

    const Element* end(const CustomContainer& c)
    {
        return c.data() + c.size();
    }
}

// somewhere else:
some_lib::CustomContainer c;
std::size_t n = count_if(c, somePredicate);

In this case, it doesn't matter that the qualified names are some_lib::begin and some_lib::end - since CustomContainer is in some_lib:: too, the compiler will use those overloads in count_if.

That's also the reason for having using std::begin; and using std::end; in count_if. This allows us to use unqualified begin and end, therefore allowing for ADL and allowing the compiler to pick std::begin and std::end when no other alternatives are found.

We can eat the cookie and have the cookie - i. e. have a way to provide custom implementation of begin/end while the compiler can fall back to standard ones.
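
The whole arrangement can be sketched as one compilable unit. The int element type, the vector storage, the app namespace and the names count_matching and is_even are all assumptions introduced for the example; they are not part of any real library.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

namespace some_lib {
// Hypothetical container: has data() and size(), but no begin()/end().
class CustomContainer {
public:
    explicit CustomContainer(std::vector<int> v) : storage_(std::move(v)) {}
    const int* data() const { return storage_.data(); }
    std::size_t size() const { return storage_.size(); }
private:
    std::vector<int> storage_;
};

// Free functions in the same namespace as the class, so ADL can find them.
const int* begin(const CustomContainer& c) { return c.data(); }
const int* end(const CustomContainer& c) { return c.data() + c.size(); }
}  // namespace some_lib

namespace app {
template <typename Container, typename Predicate>
std::size_t count_matching(const Container& container, Predicate predicate) {
    using std::begin;  // fallback when ADL finds nothing better
    using std::end;
    return static_cast<std::size_t>(
        std::count_if(begin(container), end(container), predicate));
}
}  // namespace app

bool is_even(int x) { return x % 2 == 0; }
```

Here app::count_matching never mentions some_lib, yet a some_lib::CustomContainer argument resolves to some_lib::begin and some_lib::end, while a std::vector or a built-in array falls back to the std versions.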

Some notes:

  • For the same reason, there are other similar functions: std::rbegin/rend, std::size and std::data.

  • As other answers mention, the std:: versions have overloads for naked arrays. That's useful, but is simply a special case of what I've described above.

  • Using std::begin and friends is a particularly good idea when writing template code, because this makes those templates more generic. For non-template code you might just as well use methods, when applicable.
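
As a small illustration of the first note, the same using-then-unqualified-call pattern carries over to std::rbegin and std::rend, assuming a C++14 or later compiler. reversed_copy is a made-up name, and fixing the element type to int is a simplification for the sketch.

```cpp
#include <iterator>
#include <vector>

// Hypothetical helper: copies any int range into a vector in reverse order.
template <typename Range>
std::vector<int> reversed_copy(const Range& r) {
    using std::rbegin;  // C++14: also has an overload for built-in arrays
    using std::rend;
    return std::vector<int>(rbegin(r), rend(r));
}
```

It accepts a built-in array and a std::vector<int> alike, for the same reason the begin/end versions do.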

P. S. I'm aware that this post is nearly 7 years old. I came across it because I wanted to answer a question which was marked as a duplicate and discovered that no answer here mentions ADL.

Mum answered 13/6, 2018 at 22:0 Comment(1)
Good answer, particularly explaining ADL overtly, rather than leaving it up to the imagination like everyone else did - even when they were showing it in action!Kenner
I
5

Whereas the non-member functions don't provide any benefit for the standard containers, using them enforces a more consistent and flexible style. If you at some time want to extend an existing non-std container class, you'd rather define overloads of the free functions than alter the existing class's definition. So for non-std containers they are very useful. Always using the free functions also makes your code more flexible: you can substitute a non-std container for the std container more easily, and the underlying container type is more transparent to your code, as it supports a much wider variety of container implementations.

But of course this always has to be weighed properly, and over-abstraction is not good either. Although using the free functions is not that much of an over-abstraction, it nevertheless breaks compatibility with C++03 code, which at this young age of C++11 might still be an issue for you.

Independent answered 30/9, 2011 at 12:37 Comment(2)
In C++03, you can just use boost::begin()/end(), so there's no real incompatibility :)Kelantan
@MarcMutz-mmutz Well, boost dependency is not always an option (and is quite an overkill if used only for begin/end). So I would regard that an incompatibility to pure C++03, too. But like said, it is a rather small (and getting smaller) incompatibility, as C++11 (at least begin/end in particular) is getting more and more adoption, anyway.Independent
C
0

Ultimately the benefit is in code that is generalized such that it's container agnostic. It can operate on a std::vector, an array, or a range without changes to the code itself.

Additionally, containers, even non-owned containers can be retrofitted such that they can also be used agnostically by code using non-member range based accessors.

See here for more detail.

Charioteer answered 28/4, 2020 at 17:5 Comment(0)
