RAII and smart pointers in C++

6

212

In practice with C++, what is RAII, what are smart pointers, how are these implemented in a program and what are the benefits of using RAII with smart pointers?

Carpometacarpus answered 27/12, 2008 at 16:13 Comment(0)
S
346

A simple (and perhaps overused) example of RAII is a File class. Without RAII, the code might look something like this:

File file("/path/to/file");
// Do stuff with file
file.close();

In other words, we must make sure that we close the file once we've finished with it. This has two drawbacks. Firstly, wherever we use File, we have to call File::close(); if we forget to do this, we hold onto the file longer than we need to. Secondly, what happens if an exception is thrown before we close the file?

Java solves the second problem using a finally clause:

File file = new File("/path/to/file");
try {
    // Do stuff with file
} finally {
    file.close();
}

or, since Java 7, a try-with-resources statement:

try (File file = new File("/path/to/file")) {
   // Do stuff with file
}

C++ solves both problems using RAII - that is, closing the file in the destructor of File. So long as the File object is destroyed at the right time (which it should be anyway), closing the file is taken care of for us. So, our code now looks something like:

File file("/path/to/file");
// Do stuff with file
// No need to close it - destructor will do that for us

This cannot be done in Java since there's no guarantee when the object will be destroyed, so we cannot guarantee when a resource such as file will be freed.
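To make the idea concrete, here is a minimal sketch of what such a File class might look like. The interface (a path plus an fopen-style mode string, and a write() member) is an assumption for illustration; the answer does not show the real class.

```cpp
#include <cassert>
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical File class; the name and interface are assumptions for illustration.
class File {
public:
    explicit File(const std::string& path, const char* mode = "r")
        : handle_(std::fopen(path.c_str(), mode)) {
        if (!handle_) {
            // No object is created if opening fails, so no destructor runs.
            throw std::runtime_error("could not open " + path);
        }
    }

    // The destructor releases the resource on every exit path.
    ~File() { std::fclose(handle_); }

    // A copy would make two objects close the same handle, so copying is disabled.
    File(const File&) = delete;
    File& operator=(const File&) = delete;

    void write(const std::string& text) {
        assert(handle_ != nullptr);  // guaranteed by the constructor
        std::fputs(text.c_str(), handle_);
    }

private:
    std::FILE* handle_;
};
```

Because the constructor either succeeds or throws, any File object that exists holds an open handle, and there is no separate close() call to forget.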

Onto smart pointers - a lot of the time, we just create objects on the stack. For instance (and stealing an example from another answer):

void foo() {
    std::string str;
    // Do cool things to or using str
}

This works fine - but what if we want to return str? We could write this:

std::string foo() {
    std::string str;
    // Do cool things to or using str
    return str;
}

So, what's wrong with that? Well, the return type is std::string, which means we're returning by value. This means that we copy str and actually return the copy. This can be expensive, and we might want to avoid the cost of copying it. Therefore, we might come up with the idea of returning by reference or by pointer.

std::string* foo() {
    std::string str;
    // Do cool things to or using str
    return &str;
}

Unfortunately, this code doesn't work. We're returning a pointer to str, but str was created on the stack, so it will be destroyed once we exit foo(). In other words, by the time the caller gets the pointer, it's useless (and arguably worse than useless, since using it could cause all sorts of funky errors).

So, what's the solution? We could create str on the heap using new - that way, when foo() is completed, str won't be destroyed.

std::string* foo() {
    std::string* str = new std::string();
    // Do cool things to or using str
    return str;
}

Of course, this solution isn't perfect either. The reason is that we've created str, but we never delete it. This might not be a problem in a very small program, but in general, we want to make sure we delete it. We could just say that the caller must delete the object once they have finished with it. The downside is that the caller has to manage memory, which adds extra complexity, and might get it wrong, leading to a memory leak, i.e. not deleting the object even though it is no longer required.

This is where smart pointers come in. The following example uses shared_ptr - I suggest you look at the different types of smart pointers to learn what you actually want to use.

shared_ptr<std::string> foo() {
    // shared_ptr's pointer constructor is explicit, so "= new ..." would not compile
    shared_ptr<std::string> str(new std::string());
    // Do cool things to or using str
    return str;
}

Now, shared_ptr will count the number of references to str. For instance

shared_ptr<std::string> str = foo();
shared_ptr<std::string> str2 = str;

Now there are two references to the same string. Once there are no remaining references to str, it will be deleted. As such, you no longer have to worry about deleting it yourself.
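The counting can be observed directly with use_count(). The following is only a sketch: make_string() is a hypothetical stand-in for foo(), and the two helper functions exist purely to report the count.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical factory standing in for foo() from the answer.
std::shared_ptr<std::string> make_string() {
    return std::make_shared<std::string>("hello");
}

// Returns the owner count while two handles refer to the same string.
long owners_while_shared() {
    std::shared_ptr<std::string> str = make_string();
    std::shared_ptr<std::string> str2 = str;  // second owner, same string
    assert(str.get() == str2.get());
    return str.use_count();
}

// Returns the owner count after one of the two handles lets go.
long owners_after_reset() {
    std::shared_ptr<std::string> str = make_string();
    std::shared_ptr<std::string> str2 = str;
    str2.reset();  // str2 gives up its share; the string stays alive
    return str.use_count();
}
```

Here owners_while_shared() returns 2 and owners_after_reset() returns 1. The string is deleted only when the count drops to zero, which in both helpers happens when the function returns.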

Quick edit: as some of the comments have pointed out, this example isn't perfect for (at least!) two reasons. Firstly, due to the implementation of strings, copying a string tends to be inexpensive. Secondly, due to what's known as named return value optimisation, returning by value may not be expensive since the compiler can do some cleverness to speed things up.
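One way to see this for yourself is a type that counts its own copies and moves. This is a sketch with hypothetical names (Counted, make_counted), assuming a C++11 or later compiler: returning a local by value uses the move constructor or is elided entirely, so no copy takes place.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical type that counts how often it is copied or moved.
struct Counted {
    Counted() = default;
    Counted(const Counted& other) : text(other.text) { ++copies; }
    Counted(Counted&& other) noexcept : text(std::move(other.text)) { ++moves; }
    std::string text;
    static int copies;
    static int moves;
};
int Counted::copies = 0;
int Counted::moves = 0;

Counted make_counted() {
    Counted c;
    c.text = "some long string that would be costly to copy";
    return c;  // moved or elided, never copied
}

// Returns how many copies were made while returning a local by value.
int copies_made_by_return() {
    Counted::copies = 0;
    Counted::moves = 0;
    Counted result = make_counted();
    assert(!result.text.empty());
    // Elision may remove the moves as well, so only an upper bound is certain.
    assert(Counted::moves <= 2);
    return Counted::copies;
}
```

copies_made_by_return() returns 0. How many moves happen depends on the compiler and the language version, which is why the sketch only checks an upper bound.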

So, let's try a different example using our File class.

Let's say we want to use a file as a log. This means we want to open our file in append only mode:

File file("/path/to/file", File::append);
// The exact semantics of this aren't really important,
// just that we've got a file to be used as a log

Now, let's set our file as the log for a couple of other objects:

void setLog(Foo & foo, Bar & bar) { // non-const: setLogFile modifies foo and bar
    File file("/path/to/file", File::append);
    foo.setLogFile(file);
    bar.setLogFile(file);
}

Unfortunately, this example ends horribly - file will be closed as soon as this method ends, meaning that foo and bar now have an invalid log file. We could construct file on the heap, and pass a pointer to file to both foo and bar:

void setLog(Foo & foo, Bar & bar) { // non-const: setLogFile modifies foo and bar
    File* file = new File("/path/to/file", File::append);
    foo.setLogFile(file);
    bar.setLogFile(file);
}

But then who is responsible for deleting file? If neither deletes file, then we have both a memory and a resource leak. We don't know whether foo or bar will finish with the file first, so we can't expect either to delete the file themselves. For instance, if foo deletes the file before bar has finished with it, bar now has an invalid pointer.

So, as you may have guessed, we could use smart pointers to help us out.

void setLog(Foo & foo, Bar & bar) {
    // shared_ptr's pointer constructor is explicit, so "= new ..." would not compile
    shared_ptr<File> file(new File("/path/to/file", File::append));
    foo.setLogFile(file);
    bar.setLogFile(file);
}

Now, nobody needs to worry about deleting file - once both foo and bar have finished and no longer have any references to file (probably due to foo and bar being destroyed), file will automatically be deleted.
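A self-contained version of this example might look like the sketch below. File, Foo and Bar are hypothetical stand-ins (the real classes are not shown in the answer), and open_count exists only to make the lifetime visible.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for the answer's File, Foo and Bar.
struct File {
    explicit File(std::string p) : path(std::move(p)) { ++open_count; }
    ~File() { --open_count; }
    File(const File&) = delete;
    File& operator=(const File&) = delete;
    std::string path;
    static int open_count;  // number of File objects currently alive
};
int File::open_count = 0;

struct Foo {
    void setLogFile(std::shared_ptr<File> f) { log = std::move(f); }
    std::shared_ptr<File> log;
};

struct Bar {
    void setLogFile(std::shared_ptr<File> f) { log = std::move(f); }
    std::shared_ptr<File> log;
};

void setLog(Foo& foo, Bar& bar) {
    std::shared_ptr<File> file = std::make_shared<File>("/path/to/file");
    foo.setLogFile(file);
    bar.setLogFile(file);
    assert(file.use_count() == 3);  // the local handle, foo.log and bar.log
}  // the local handle goes away here; foo and bar keep the File alive
```

After setLog(foo, bar), File::open_count stays at 1 until both foo.log and bar.log have been reset or destroyed, and only then drops to 0.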

Stratton answered 27/12, 2008 at 16:57 Comment(8)
It should be noted that many string implementations are implemented in terms of a reference counted pointer. These copy-on-write semantics make returning a string by value really inexpensive.Richert
Even for the ones that are not, many compilers implement NRV optimization which would take care of the overhead. In general, I find shared_ptr rarely useful - just stick with RAII and avoid shared ownership.Farmelo
returning a string isn't a good reason for using smart pointers really. return value optimization can easily optimize out the return, and c++1x move semantics will eliminate a copy altogether (when used correctly). Show some real world example (for example when we share the same resource) instead :)Bimonthly
I think your conclusion early on about why Java can't do this lacks clarity. The easiest way to describe this limitation in Java or C# is that there is no way to allocate on the stack. C# allows stack allocation through a special keyword; however, you lose type safety.Woodyard
@Nemanja Trifunovic: By RAII in this context you mean returning copies/creating objects on the stack? That doesn't work if you have return/accept objects of types that can be subclassed. Then you have to use a pointer to avoid slicing the object, and I'd argue that a smart pointer is often better than a raw one in those cases.Satrap
@Michael, So, do you mean that smart pointers utilize RAII? Or is there any other way to implement smart pointers without RAII?Ethnarch
This is kind of misleading, because "if we forget to do this, we're holding onto the file longer than we need to" is actually holding the file until the app terminates. Which is most probably much longer than one needs to :)Hyatt
@Michael, I think you wrote The reason is that we've created str, but we never delete it when you dynamically allocated memory inside a function and returned copy of pointer to that memory. I think we still could deallocate that memory outside of the function through the copy of pointer, because copy of pointer and the pointer inside the function would point to the same location on heapChairmanship
B
149

RAII

This is a strange name for a simple but awesome concept. A better name is Scope-Bound Resource Management (SBRM). The idea is that you often allocate resources at the beginning of a block and need to release them when the block exits. Exiting the block can happen through normal control flow, by jumping out of it, or even through an exception. Covering all these cases by hand makes the code more complicated and redundant.

Just an example doing it without SBRM:

void o_really() {
     resource * r = allocate_resource();
     try {
         // something, which could throw. ...
     } catch(...) {
         deallocate_resource(r);
         throw;
     }
     if(...) { return; } // oops, forgot to deallocate
     deallocate_resource(r);
}

As you see there are many ways we can get pwned. The idea is that we encapsulate the resource management into a class. Initialization of its object acquires the resource ("Resource Acquisition Is Initialization"). At the time we exit the block (block scope), the resource is freed again.

struct resource_holder {
    resource_holder() {
        r = allocate_resource();
    }
    ~resource_holder() {
        deallocate_resource(r);
    }
    resource * r;
};

void o_really() {
     resource_holder r;
     // something, which could throw. ...
     if(...) { return; }
}
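For reference, a compilable variant of the holder could look like this. The resource type and the allocate/deallocate functions are hypothetical stand-ins, the mode parameter replaces the elided if(...) condition, and live_resources is only there to show that nothing leaks. Copying is disabled because a copied holder would deallocate the same resource twice.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for the answer's resource API.
struct resource { int id; };
int live_resources = 0;  // how many resources are currently allocated

resource* allocate_resource() { ++live_resources; return new resource{42}; }
void deallocate_resource(resource* r) { --live_resources; delete r; }

struct resource_holder {
    resource_holder() : r(allocate_resource()) {}
    ~resource_holder() { deallocate_resource(r); }
    // A copy would deallocate the same resource twice, so copying is disabled.
    resource_holder(const resource_holder&) = delete;
    resource_holder& operator=(const resource_holder&) = delete;
    resource* r;
};

// Leaves by early return (0), by exception (1), or by falling off the end.
void o_really(int mode) {
    resource_holder holder;
    assert(holder.r != nullptr);
    if (mode == 0) { return; }
    if (mode == 1) { throw std::runtime_error("something went wrong"); }
}
```

Whichever way o_really() is left, live_resources is back at 0 afterwards, without a single explicit deallocate_resource call in the function body.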

That is nice if you have classes of your own that are not solely for the purpose of allocating/deallocating resources; allocation is then just an additional concern on the way to getting their job done. But as soon as you only want to allocate/deallocate resources, the above becomes unwieldy: you have to write a wrapper class for every sort of resource you acquire. To ease that, smart pointers allow you to automate the process:

shared_ptr<Entry> create_entry(Parameters p) {
    shared_ptr<Entry> e(Entry::createEntry(p), &Entry::freeEntry);
    return e;
}

Normally, smart pointers are thin wrappers around new / delete that just happen to call delete when the resource they own goes out of scope. Some smart pointers, like shared_ptr allow you to tell them a so-called deleter, which is used instead of delete. That allows you, for instance, to manage window handles, regular expression resources and other arbitrary stuff, as long as you tell shared_ptr about the right deleter.
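As a sketch of how a deleter is attached, assume a C-style API with an open and a close function (Handle, open_handle and close_handle are made-up names standing in for window handles and the like). shared_ptr takes the deleter as a constructor argument, while unique_ptr carries its type as a template argument.

```cpp
#include <cassert>
#include <memory>

// Hypothetical C-style API standing in for window handles, regex objects, etc.
struct Handle { int value; };
int open_handles = 0;  // how many handles are currently open

Handle* open_handle(int value) { ++open_handles; return new Handle{value}; }
void close_handle(Handle* h) {
    assert(h != nullptr);
    --open_handles;
    delete h;
}

// shared_ptr stores the deleter next to the reference count.
std::shared_ptr<Handle> make_shared_handle(int value) {
    return std::shared_ptr<Handle>(open_handle(value), &close_handle);
}

// unique_ptr makes the deleter type part of the pointer type.
using UniqueHandle = std::unique_ptr<Handle, void (*)(Handle*)>;

UniqueHandle make_unique_handle(int value) {
    return UniqueHandle(open_handle(value), &close_handle);
}
```

In both cases close_handle is called instead of delete once the last owner is gone, so open_handles returns to 0 when the smart pointers go out of scope.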

There are different smart pointers for different purposes:

unique_ptr

is a smart pointer which owns an object exclusively. It's not in Boost, but it has been part of the C++ standard library since C++11 (header <memory>). It's non-copyable but supports transfer of ownership. Some example code (C++11):

Code:

unique_ptr<plot_src> p(new plot_src); // now, p owns
unique_ptr<plot_src> u(move(p)); // now, u owns, p owns nothing.
unique_ptr<plot_src> v(u); // error, trying to copy u

vector<unique_ptr<plot_src>> pv; 
pv.emplace_back(new plot_src); 
pv.emplace_back(new plot_src);

Unlike auto_ptr, unique_ptr can be put into a container, because since C++11 containers can hold non-copyable (but movable) types, such as streams and unique_ptr.
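A small sketch of unique_ptr inside a vector, assuming C++11: plot_src here is a hypothetical one-member type (the real one is not shown), and take_first() illustrates that moving out of an element leaves a null pointer behind.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical payload type; the answer's plot_src is not shown.
struct plot_src {
    explicit plot_src(int i) : id(i) {}
    int id;
};

std::vector<std::unique_ptr<plot_src>> make_sources(int count) {
    std::vector<std::unique_ptr<plot_src>> pv;
    for (int i = 0; i < count; ++i) {
        // Create the owner first, then move it into the container.
        pv.push_back(std::unique_ptr<plot_src>(new plot_src(i)));
    }
    return pv;  // the vector is moved or elided, never copied
}

// Takes ownership of the first element, leaving a null pointer in its slot.
std::unique_ptr<plot_src> take_first(std::vector<std::unique_ptr<plot_src>>& pv) {
    assert(!pv.empty());
    return std::move(pv.front());
}
```

After take_first(pv), the returned pointer owns the object and pv.front() compares equal to nullptr; the remaining elements are deleted when the vector is destroyed.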

scoped_ptr

is a boost smart pointer which is neither copyable nor movable. It's the perfect thing to be used when you want to make sure pointers are deleted when going out of scope.

Code:

void do_something() {
    scoped_ptr<pipe> sp(new pipe);
    // do something here...
} // when going out of scope, sp will delete the pointer automatically. 

shared_ptr

is for shared ownership. Therefore, it's both copyable and movable. Multiple smart pointer instances can own the same resource. As soon as the last smart pointer owning the resource goes out of scope, the resource is freed. A real-world example from one of my projects:

Code:

shared_ptr<plot_src> p(new plot_src(&fx));
plot1->add(p)->setColor("#00FF00");
plot2->add(p)->setColor("#FF0000");
// if p now goes out of scope, the src won't be freed, as plot1 and 
// plot2 both still have references. 

As you see, the plot source (function fx) is shared, but each plot has a separate entry, on which we set the color. There is also a weak_ptr class, used when code needs to refer to a resource owned by a shared_ptr without owning it. Instead of passing a raw pointer, you should then create a weak_ptr. To use the resource you convert the weak_ptr back to a shared_ptr: if no shared_ptr owns the resource anymore, the shared_ptr constructor throws an exception (std::bad_weak_ptr), while lock() returns an empty shared_ptr.
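The two ways of getting from a weak_ptr back to the resource can be sketched as follows. The function names are made up for illustration; lock(), expired() and std::bad_weak_ptr are standard.

```cpp
#include <cassert>
#include <memory>

// Reads the value through the weak_ptr, or returns fallback if the object is gone.
int read_or(const std::weak_ptr<int>& weak, int fallback) {
    // lock() yields an owning shared_ptr, or an empty one if nothing owns the object.
    if (std::shared_ptr<int> strong = weak.lock()) {
        return *strong;
    }
    return fallback;
}

// Returns true if constructing a shared_ptr from the weak_ptr throws.
bool promotion_throws(const std::weak_ptr<int>& weak) {
    try {
        std::shared_ptr<int> strong(weak);  // throws std::bad_weak_ptr when expired
        assert(strong);                     // promotion succeeded, so it is non-null
        return false;
    } catch (const std::bad_weak_ptr&) {
        return true;
    }
}
```

While a shared_ptr still owns the int, read_or() returns its value and promotion_throws() returns false. Once the last owner has been reset, read_or() returns the fallback and promotion_throws() returns true. Note that a weak_ptr does not contribute to use_count().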

Bimonthly answered 27/12, 2008 at 22:30 Comment(5)
As far as I know non-copyable objects are not good to use at all in stl containers as they rely on value semantics - what happens if you want to sort that container? sort does copy elements...Aliphatic
C++0x containers will be changed so that they respect move-only types like unique_ptr, and sort will be changed likewise.Bimonthly
Do you remember where you first heard the term SBRM? James is trying to track it down.Vat
which headers or libraries should I include to use these? any further readings on this?Tetrapod
One advice here: if there is an answer to a C++ question by @litb, it is the right answer (no matter the votes or the answer flagged as "correct")...Antananarivo
I
33

The premise and reasons are simple, in concept.

RAII is the design paradigm to ensure that variables handle all needed initialization in their constructors and all needed cleanup in their destructors. This reduces all initialization and cleanup to a single step.

C++ does not require RAII, but it is increasingly accepted that using RAII methods will produce more robust code.

The reason that RAII is useful in C++ is that C++ intrinsically manages the creation and destruction of variables as they enter and leave scope, whether through normal code flow or through stack unwinding triggered by an exception. That's a freebie in C++.

By tying all initialization and cleanup to these mechanisms, you are assured that C++ will take care of this work for you as well.

Talking about RAII in C++ usually leads to the discussion of smart pointers, because pointers are particularly fragile when it comes to cleanup. When managing heap-allocated memory acquired from malloc or new, it is usually the responsibility of the programmer to free or delete that memory before the pointer is destroyed. Smart pointers will use the RAII philosophy to ensure that heap allocated objects are destroyed any time the pointer variable is destroyed.
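The scope-driven destruction described above can be made visible with a small tracing type. This is only a sketch with hypothetical names (Tracer, event_log, work): each object logs its construction and destruction, so the log shows that destructors run in reverse order of construction, both on a normal exit and when an exception unwinds the stack.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

std::string event_log;  // records construction (+) and destruction (-) events

// Hypothetical helper that logs its own lifetime.
struct Tracer {
    explicit Tracer(char name) : name_(name) {
        event_log += '+';
        event_log += name_;
    }
    ~Tracer() {
        assert(!event_log.empty());  // construction was logged before destruction
        event_log += '-';
        event_log += name_;
    }
    char name_;
};

void work(bool fail) {
    Tracer a('a');
    Tracer b('b');
    if (fail) {
        throw std::runtime_error("boom");  // b and a are still destroyed
    }
    Tracer c('c');
}
```

Starting from an empty log, work(false) leaves "+a+b+c-c-b-a" in event_log. A call to work(true) whose exception is caught by the caller leaves "+a+b-b-a": c is never constructed, and b and a are cleaned up during unwinding.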

Inky answered 27/12, 2008 at 17:12 Comment(1)
In addition - pointers are the most common application of RAII - you'll likely allocate thousands of times more pointers than any other resource.Chinchin
A
8

A smart pointer is an application of RAII. RAII means "resource acquisition is initialization". A smart pointer acquires a resource (memory) before use and then releases it automatically in its destructor. Two things happen:

  1. We allocate memory before we use it, always, even when we don't feel like it -- it's hard to do it any other way with a smart pointer. If this didn't happen, you would try to access NULL memory, resulting in a crash (very painful).
  2. We free memory even when there's an error. No memory is left hanging.

For instance, another example is network socket RAII. In this case:

  1. We open the network socket before we use it, always, even when we don't feel like it -- it's hard to do it any other way with RAII. If you try doing this without RAII you might open an empty socket for, say, an MSN connection. Then a message like "lets do it tonight" might not get transferred, users will not get laid, and you might risk getting fired.
  2. We close the network socket even when there's an error. No socket is left hanging, as this might prevent the response message "sure ill be on bottom" from getting back to the sender.

Now, as you can see, RAII is a very useful tool in most cases as it helps people to get laid.

C++ source code for smart pointers is widely available on the net, including in the answers above.

Angelitaangell answered 27/12, 2008 at 17:8 Comment(0)
E
2

Boost has a number of these including the ones in Boost.Interprocess for shared memory. It greatly simplifies memory management, especially in headache-inducing situations like when you have 5 processes sharing the same data structure: when everyone's done with a chunk of memory, you want it to automatically get freed & not have to sit there trying to figure out who should be responsible for calling delete on a chunk of memory, lest you end up with a memory leak, or a pointer which is mistakenly freed twice and may corrupt the whole heap.

Eby answered 27/12, 2008 at 17:22 Comment(0)
R
0
void foo()
{
   std::string bar;
   //
   // more code here
   //
}

No matter what happens, bar is going to be properly destroyed once the scope of the foo() function has been left behind.

Internally std::string implementations often use reference counted pointers. So the internal string only needs to be copied when one of the copies of the string changes. Therefore a reference counted smart pointer makes it possible to only copy something when necessary.

In addition, the internal reference counting makes it possible that the memory will be properly deleted when the copy of the internal string is no longer needed.

Richert answered 27/12, 2008 at 16:23 Comment(4)
void f() { Obj x; } Obj x gets deleted by means of stack frame creation/destruction (unwinding)... it's not related to ref counting.Wench
The reference counting is a feature of the internal implementation of the string. RAII is the concept behind object deletion when the object goes out of scope. The question was about RAII and also smart pointers.Richert
"No matter what happens"--what happens if an exception is thrown before the function returns?Attendance
Which function is returned? If an exception is thrown in foo, then bar is deleted. The default constructor of bar throwing an exception would be an extraordinary event.Richert
