Situations where Cell or RefCell is the best choice

When would you be required to use Cell or RefCell? It seems like there are many other type choices that would be suitable in place of these, and the documentation warns that using RefCell is a bit of a "last resort".

Is using these types a "code smell"? Can anyone show an example where using these types makes more sense than using another type, such as Rc or even Box?

Visser answered 14/6, 2015 at 15:16 Comment(4)
Rc and Box solve different classes of problems: they are used when the size of an object is unknown or too large to store inline, while Cell and RefCell provide interior mutability, in order to work around inherited mutability. - Headsail
@FrancisGagné I'm a little unclear on what "inherited mutability" means or why it is important or a problem. Can you clarify? - Visser
What about inspecting all places where they are used in a good code base, e.g. the compiler itself? - Canikin
I think Rc<RefCell<T>> is a code smell, yes. I won't go so far as to call Rc and RefCell smelly by themselves, but they're often used together, and I think most of the time when they're used together there are better approaches available. If the state of your program doesn't really need to be a full-blown graph, then it's best to refactor things into a simple ownership tree. But if the state of your program really is a graph, then it probably contains cycles, and Rc<RefCell<T>> is going to cause memory leaks and panics. See jacko.io/object_soup.html. - Samoyed
56

It is not entirely correct to ask when Cell or RefCell should be used over Box and Rc because these types solve different problems. Indeed, more often than not RefCell is used together with Rc in order to provide mutability with shared ownership. So yes, use cases for Cell and RefCell are entirely dependent on the mutability requirements in your code.

Interior and exterior mutability are explained very nicely in the official Rust book, in the chapter dedicated to mutability. External mutability is closely tied to the ownership model, and when we say that something is mutable or immutable we usually mean external mutability. Another name for external mutability is inherited mutability, which probably explains the concept more clearly: this kind of mutability is defined by the owner of the data and inherited by everything you can reach from the owner. For example, if a variable holding a struct is mutable, so are all fields of that struct:

struct Point { x: u32, y: u32 }

// the variable is mutable...
let mut p = Point { x: 10, y: 20 };
// ...and so are fields reachable through this variable
p.x = 11;
p.y = 22;

let q = Point { x: 10, y: 20 };
q.x = 33;  // compilation error

Inherited mutability also defines which kinds of references you can get out of the value:

{
    let px: &u32 = &p.x;  // okay
}
{
    let py: &mut u32 = &mut p.y;  // okay, because p is mut
}
{
    let qx: &u32 = &q.x;  // okay
}
{
    let qy: &mut u32 = &mut q.y;  // compilation error since q is not mut
}

Sometimes, however, inherited mutability is not enough. The canonical example is a reference-counted pointer, called Rc in Rust. The following code is entirely valid:

use std::rc::Rc;

{
    let x1: Rc<u32> = Rc::new(1);
    let x2: Rc<u32> = x1.clone();  // create another reference to the same data
    let x3: Rc<u32> = x2.clone();  // and yet another
}  // here all references are destroyed and the memory they were pointing at is deallocated

At first glance it is not clear how mutability is related to this, but recall that reference-counted pointers are called that because they contain an internal reference counter, which is modified when a reference is duplicated (clone() in Rust) and when it is destroyed (goes out of scope in Rust). Hence Rc has to modify itself even though it is stored inside a non-mut variable.
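To make the hidden counter visible, here is a small sketch using Rc::strong_count from the standard library. The function name observe_counts is made up for this illustration; the point is only that none of the bindings is mut, yet the count changes.

```rust
use std::rc::Rc;

/// Returns the strong count seen before cloning, after cloning,
/// and after the clone has been dropped.
/// (`observe_counts` is a hypothetical helper for this example.)
fn observe_counts() -> (usize, usize, usize) {
    let a = Rc::new(1u32); // `a` is not declared `mut`
    let before = Rc::strong_count(&a);

    let b = Rc::clone(&a); // only needs `&a`, yet the shared counter goes up
    let during = Rc::strong_count(&a);

    drop(b); // dropping a handle decrements the counter again
    let after = Rc::strong_count(&a);

    (before, during, after)
}

fn main() {
    let (before, during, after) = observe_counts();
    println!("{} -> {} -> {}", before, during, after);
}
```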

This is achieved via internal mutability. There are special types in the standard library, the most basic of them being UnsafeCell, which allow one to work around the rules of external mutability and mutate something even if it is stored (transitively) in a non-mut variable.

Another way to say that something has internal mutability is that this something can be modified through a &-reference - that is, if you have a value of type &T and you can modify the state of T which it points at, then T has internal mutability.
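A minimal sketch of that definition, assuming a counter stored in a Cell: the function bump (a name invented here) receives only a &-reference and still changes the value behind it.

```rust
use std::cell::Cell;

// Takes only a shared reference, yet changes the value it points at.
// (`bump` and `bump_twice` are hypothetical names for this example.)
fn bump(counter: &Cell<u32>) {
    counter.set(counter.get() + 1);
}

fn bump_twice() -> u32 {
    let counter = Cell::new(0); // not `mut`
    let first = &counter;
    let second = &counter; // two shared references may coexist
    bump(first);
    bump(second);
    counter.get()
}

fn main() {
    println!("{}", bump_twice());
}
```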

For example, Cell is typically used with Copy data (its get method copies the value out, so it requires Copy), and its contents can be mutated even if the Cell is stored in a non-mut location:

use std::cell::Cell;

let c: Cell<u32> = Cell::new(1);  // c is not declared mut
c.set(2);
assert_eq!(c.get(), 2);

RefCell can contain non-Copy data and can hand out &mut access to its contained value (through the guard returned by borrow_mut); the absence of aliasing is checked at runtime. This is all explained in detail on their documentation pages.
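As a rough illustration of the RefCell side, here is a sketch with a non-Copy Vec<String> inside a RefCell. The helper names append and build_log are assumptions made for this example; borrow_mut and into_inner are the real RefCell methods.

```rust
use std::cell::RefCell;

// Appends through a shared reference; `borrow_mut` hands out a guard that
// dereferences to `&mut Vec<String>` and is released at the end of the statement.
// (`append` and `build_log` are hypothetical names for this example.)
fn append(log: &RefCell<Vec<String>>, entry: &str) {
    log.borrow_mut().push(entry.to_string());
}

fn build_log() -> Vec<String> {
    let log = RefCell::new(Vec::new()); // not `mut`
    append(&log, "start");
    append(&log, "stop");
    log.into_inner() // consume the RefCell and return the Vec
}

fn main() {
    println!("{:?}", build_log());
}
```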


As it turns out, in the overwhelming majority of situations you can get by with external mutability alone. Most existing high-level Rust code is written that way. Sometimes, however, internal mutability is unavoidable or makes the code much clearer. One example, the implementation of Rc, is already described above. Another is when you need shared mutable ownership (that is, you need to access and modify the same value from different parts of your code); this is usually achieved via Rc<RefCell<T>>, because it can't be done with references alone. Yet another example is Arc<Mutex<T>>, Mutex being another internal-mutability type, one that is also safe to use across threads.
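For the thread-safe variant, a minimal sketch might look like the following. The function parallel_sum and its worker scheme are invented for illustration; Arc, Mutex and thread::spawn are used as in the standard library.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Each worker thread adds its own number to a shared total.
// (`parallel_sum` is a hypothetical helper for this example.)
fn parallel_sum(workers: u32) -> u32 {
    let total = Arc::new(Mutex::new(0u32)); // not `mut`, shared by all threads

    let handles: Vec<_> = (1..=workers)
        .map(|n| {
            let total = Arc::clone(&total); // one owning handle per thread
            thread::spawn(move || {
                // `lock` gives exclusive access for the duration of the statement
                *total.lock().unwrap() += n;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let result = *total.lock().unwrap();
    result
}

fn main() {
    println!("{}", parallel_sum(4));
}
```

The single-threaded Rc<RefCell<T>> follows the same shape, with borrow_mut in place of lock.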

So, as you can see, Cell and RefCell are not replacements for Rc or Box; they solve the problem of giving you mutability where it is not allowed by default. You can write your code without using them at all, and if you get into a situation where you need them, you will know it.

Cell and RefCell are not a code smell. They are described as a "last resort" only because RefCell moves the checking of the mutability and aliasing rules from the compiler to runtime. Normally you can't have two &muts pointing to the same data at the same time, and the compiler enforces this statically. With RefCell you can ask the same RefCell for as many &muts as you like, except that if a second one is requested while the first is still alive, it will panic, enforcing the aliasing rules at runtime. Panics are arguably worse than compilation errors because you only discover the mistakes behind them when the code runs. Sometimes, however, the static analysis in the compiler is too restrictive, and you do need to "work around" it.
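The runtime check can be observed without actually panicking by using try_borrow_mut, which returns an Err where borrow_mut would panic. This is only a sketch; the two function names are made up for the example.

```rust
use std::cell::RefCell;

// While one mutable borrow is alive, a second request is refused.
// (`second_writer_is_refused` is a hypothetical name for this example.)
fn second_writer_is_refused() -> bool {
    let cell = RefCell::new(5i32);
    let first = cell.borrow_mut();
    // `borrow_mut` would panic here; `try_borrow_mut` reports the conflict instead
    let refused = cell.try_borrow_mut().is_err();
    drop(first);
    refused
}

// Once the first borrow has been released, a new one is granted.
fn writer_after_release_succeeds() -> bool {
    let cell = RefCell::new(5i32);
    {
        let mut first = cell.borrow_mut();
        *first += 1;
    } // `first` is dropped here, which releases the borrow
    let granted = cell.try_borrow_mut().is_ok();
    granted
}

fn main() {
    println!("{}", second_writer_is_refused());
    println!("{}", writer_after_release_succeeds());
}
```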

Heedless answered 14/6, 2015 at 16:46 Comment(1)
The chapter on mutability was a good thing to revisit for this. The important part to draw from this is that Cell / RefCell allow you to "emulate field-level mutability". It's similar to having a struct's field marked as mut, if that were possible. Thanks for the detailed answer, examples, and relevant documentation links! - Visser
15

No, Cell and RefCell aren't "code smells". Normally, mutability is inherited, that is, you can mutate a field or a part of a data structure if and only if you have exclusive access to the whole data structure, and hence you can opt into mutability at that level with mut (i.e., foo.x inherits its mutability or lack thereof from foo). This is a very powerful pattern and should be used whenever it works well (which is surprisingly often). But it's not expressive enough for all code everywhere.

Box and Rc have nothing to do with this. Like almost all other types, they respect inherited mutability: you can mutate the contents of a Box if you have exclusive, mutable access to the Box (because that means you have exclusive access to the contents, too). Conversely, you generally cannot get a &mut to the contents of an Rc because by its nature Rc is shared (i.e. there can be multiple Rcs referring to the same data); Rc::get_mut only succeeds while no other Rc or Weak points to the same data.
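A short sketch of both halves of that paragraph: a Box is mutable exactly when its binding is, while an Rc only gives out &mut (via the standard Rc::get_mut) as long as it is not shared. The function names are invented for this example.

```rust
use std::rc::Rc;

// A Box follows inherited mutability: a `mut` binding makes the contents mutable.
// (`box_follows_binding` and `rc_get_mut_results` are hypothetical names.)
fn box_follows_binding() -> i32 {
    let mut boxed = Box::new(1);
    *boxed += 1;
    *boxed
}

// Returns whether `Rc::get_mut` succeeded while the Rc was unique,
// and whether it succeeded once a second Rc existed.
fn rc_get_mut_results() -> (bool, bool) {
    let mut only = Rc::new(1);
    let while_unique = Rc::get_mut(&mut only).is_some();

    let other = Rc::clone(&only);
    let while_shared = Rc::get_mut(&mut only).is_some();
    drop(other);

    (while_unique, while_shared)
}

fn main() {
    println!("{}", box_follows_binding());
    println!("{:?}", rc_get_mut_results());
}
```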

One common case for Cell or RefCell is that you need to share mutable data between several places. Having two &mut references to the same data is not allowed (and for good reason!). However, sometimes you need several places to be able to mutate the same value, and the cell types make that possible safely through shared references.

This could be done via the common combination of Rc<RefCell<T>>, which allows the data to stick around for as long as anyone uses it and allows everyone (but only one at a time!) to mutate it. Or it could be as simple as &Cell<i32> (even if the cell is wrapped in a more meaningful type). The latter is also commonly used for internal, private, mutable state like reference counts.
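As an illustration of private, mutable bookkeeping behind &self, here is a sketch with a made-up Sensor type whose read counter lives in a Cell. Everything about Sensor is hypothetical; only Cell::get and Cell::set come from the standard library.

```rust
use std::cell::Cell;

// Hypothetical type: the name is ordinary data, the read counter is
// internal state that changes even through `&self`.
struct Sensor {
    name: String,
    reads: Cell<u32>,
}

impl Sensor {
    fn new(name: &str) -> Self {
        Sensor { name: name.to_string(), reads: Cell::new(0) }
    }

    // Logically a read-only accessor, so it takes `&self`,
    // but it still records that it was called.
    fn name(&self) -> &str {
        self.reads.set(self.reads.get() + 1);
        &self.name
    }
}

fn count_reads() -> u32 {
    let sensor = Sensor::new("thermo"); // not `mut`
    sensor.name();
    sensor.name();
    sensor.name();
    sensor.reads.get()
}

fn main() {
    println!("{}", count_reads());
}
```

This is roughly what "a field marked as mut" would look like if the language had such a thing: only reads is mutable, the rest of the struct stays immutable.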

The documentation actually has several examples of where you'd use Cell or RefCell. A good example is Rc itself. When an Rc is cloned, the reference count must be increased, but the reference count is shared between all Rcs, so, with inherited mutability alone, this couldn't possibly work. Rc practically has to use a Cell.
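To see why the counter needs a Cell, here is a toy sketch of handle counting. It is not how Rc is actually implemented (the real one owns its allocation and uses unsafe code); Shared and Handle are invented types that only borrow the shared state.

```rust
use std::cell::Cell;

// Hypothetical shared state: every handle sees the same counter.
struct Shared {
    live_handles: Cell<usize>,
}

// Hypothetical handle that only holds a shared reference.
struct Handle<'a> {
    shared: &'a Shared,
}

impl<'a> Handle<'a> {
    fn new(shared: &'a Shared) -> Self {
        // Incrementing through `&Shared` is only possible because of the Cell.
        shared.live_handles.set(shared.live_handles.get() + 1);
        Handle { shared }
    }
}

impl<'a> Clone for Handle<'a> {
    fn clone(&self) -> Self {
        Handle::new(self.shared)
    }
}

impl<'a> Drop for Handle<'a> {
    fn drop(&mut self) {
        self.shared.live_handles.set(self.shared.live_handles.get() - 1);
    }
}

// Returns the counter after creating one handle, after cloning it,
// and after dropping both.
fn count_trace() -> (usize, usize, usize) {
    let shared = Shared { live_handles: Cell::new(0) };
    let first = Handle::new(&shared);
    let one = shared.live_handles.get();
    let second = first.clone();
    let two = shared.live_handles.get();
    drop(second);
    drop(first);
    let zero = shared.live_handles.get();
    (one, two, zero)
}

fn main() {
    println!("{:?}", count_trace());
}
```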

A good guideline is to try writing as much code as possible without cell types, but to use them when it hurts too much without them. In some cases there is a good solution without cells, and with experience you'll be able to find those where you previously missed them, but there will always be things that just aren't possible without them.

Mediocrity answered 14/6, 2015 at 16:36 Comment(1)
This does not answer the question. "One common case of Cell or RefCell is that you need to share mutable data between several places": yes, but when is this? - Sorrells
15

Suppose you want or need to create some object of the type of your choice and dump it into an Rc.

use std::rc::Rc;
let x = Rc::new(5i32);

Now, you can easily create another Rc that points to the exact same object and therefore memory location:

let y = x.clone();
let yval: i32 = *y;

Since in Rust you may never have a mutable reference to a memory location while any other reference to it exists, the value behind these Rc containers cannot be modified for as long as it is shared.

So, what if you wanted to be able to modify those objects and have multiple Rc pointing to one and the same object?

This is the issue that Cell and RefCell solve. The solution is called "interior mutability". In the case of RefCell, it means that Rust's aliasing rules are enforced at runtime instead of at compile time.

Back to our original example:

use std::cell::RefCell;
use std::rc::Rc;
let x = Rc::new(RefCell::new(5i32));
let y = x.clone();

To get a mutable reference to your type, you use borrow_mut on the RefCell.

let mut yval = x.borrow_mut();  // the guard must be bound as mut to write through it
*yval = 45;

If the value your Rcs point to is already borrowed, either mutably or immutably, the borrow_mut function will panic, thereby enforcing Rust's aliasing rules.
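Putting the pieces of this answer together, a complete sketch could look like this. The function shared_update is a name made up here, and try_borrow_mut is used in place of borrow_mut at the end so the conflict shows up as an Err rather than a panic.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Writes through one Rc, reads through both, then checks that a mutable
// borrow is refused while a reader is still alive.
// (`shared_update` is a hypothetical helper for this example.)
fn shared_update() -> (i32, i32, bool) {
    let x = Rc::new(RefCell::new(5i32));
    let y = Rc::clone(&x);

    {
        let mut yval = y.borrow_mut();
        *yval = 45;
    } // the mutable borrow ends here

    let seen_through_x = *x.borrow();
    let seen_through_y = *y.borrow();

    let reader = x.borrow();
    // `y.borrow_mut()` would panic at this point
    let conflict = y.try_borrow_mut().is_err();
    drop(reader);

    (seen_through_x, seen_through_y, conflict)
}

fn main() {
    println!("{:?}", shared_update());
}
```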

Rc<RefCell<T>> is just one example of where RefCell is useful; there are many other legitimate uses. But the documentation is right: if there is another way, use it, because the compiler cannot help you reason about RefCells.
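One such other use, sketched under the assumption that you want a cache behind a method that takes &self: the Squarer type and its fields are invented for illustration, while RefCell, Cell and HashMap are used as in the standard library.

```rust
use std::cell::{Cell, RefCell};
use std::collections::HashMap;

// Hypothetical memoizing helper: callers see a read-only `square(&self, ..)`,
// while the cache and the miss counter change behind the scenes.
struct Squarer {
    cache: RefCell<HashMap<u64, u64>>,
    misses: Cell<u32>,
}

impl Squarer {
    fn new() -> Self {
        Squarer { cache: RefCell::new(HashMap::new()), misses: Cell::new(0) }
    }

    fn square(&self, n: u64) -> u64 {
        // The shared borrow ends with this statement, before any `borrow_mut`.
        let cached = self.cache.borrow().get(&n).copied();
        if let Some(hit) = cached {
            return hit;
        }
        let value = n * n;
        self.misses.set(self.misses.get() + 1);
        self.cache.borrow_mut().insert(n, value);
        value
    }
}

fn cached_squares() -> (u64, u64, u32) {
    let squarer = Squarer::new(); // not `mut`
    let first = squarer.square(12); // computed and stored
    let second = squarer.square(12); // served from the cache
    (first, second, squarer.misses.get())
}

fn main() {
    println!("{:?}", cached_squares());
}
```

Keeping each borrow confined to a single statement, as above, is the simplest way to stay clear of the runtime panics.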

Scrawl answered 14/6, 2015 at 16:44 Comment(0)
