What is a "fat pointer"?

I've read the term "fat pointer" in several contexts already, but I'm not sure what exactly it means and when it is used in Rust. The pointer seems to be twice as large as a normal pointer, but I don't understand why. It also seems to have something to do with trait objects.

Minorite answered 2/9, 2019 at 9:56 Comment(4)
The term itself is not Rust-specific, BTW. Fat pointer generally refers to a pointer that stores some extra data besides just the address of the object being pointed to. If the pointer contains some tag bits and, depending on those tag bits, the pointer sometimes isn't a pointer at all, it is called a tagged pointer representation. (E.g. on many Smalltalk VMs, pointers that end with a 1 bit are actually 31/63-bit integers, since pointers are word-aligned and thus never end in 1.) The HotSpot JVM calls its fat pointers OOPs (Object-Oriented Pointers). – Crutcher
Just a suggestion: when I post a Q&A pair I normally write a small note explaining that it is a self-answered question, and why I decided to post it. Have a look at the footnote in the question here: https://mcmap.net/q/145468/-selecting-null-what-is-the-reason-behind-selectall-null-in-d3/5768908 – Crake
@GerardoFurtado I initially posted a comment here explaining exactly that, but it has since been removed (not by me). But yes, I agree, often such a note is useful! – Minorite
@Jörg W Mittag 'ordinary object pointers', in fact. – Penny
M
221

The term "fat pointer" is used to refer to references and raw pointers to dynamically sized types (DSTs) – slices or trait objects. A fat pointer contains a pointer plus some information that makes the DST "complete" (e.g. the length).

Most commonly used types in Rust are not DSTs but have a fixed size known at compile time. These types implement the Sized trait. Even types that manage a heap buffer of dynamic size (like Vec<T>) are Sized, as the compiler knows the exact number of bytes a Vec<T> instance will take up on the stack. There are currently four different kinds of DSTs in Rust.


Slices ([T] and str)

The type [T] (for any T) is dynamically sized (so is the special "string slice" type str). That's why you usually only see it as &[T] or &mut [T], i.e. behind a reference. This reference is a so-called "fat pointer". Let's check:

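use std::mem::size_of;

dbg!(size_of::<&u32>());      // reference to a sized type
dbg!(size_of::<&[u32; 2]>()); // reference to an array (also sized)
dbg!(size_of::<&[u32]>());    // reference to a slice (a DST)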

This prints (with some cleanup):

size_of::<&u32>()      = 8
size_of::<&[u32; 2]>() = 8
size_of::<&[u32]>()    = 16

So a reference to a normal type like u32 is 8 bytes in size (on a 64-bit target), as is a reference to an array [u32; 2]. Neither of those types is a DST. But because [u32] is a DST, a reference to it is twice as large. For slices, the additional data that "completes" the DST is simply the length, so the representation of &[u32] is something like this:

struct SliceRef {
    ptr: *const u32, // address of the first element
    len: usize,      // number of elements, not bytes
}
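To make the pointer-plus-length picture concrete, here is a minimal sketch that pulls the two halves out of a slice reference with the standard as_ptr and len methods. The helper name slice_parts is hypothetical, and the size check is written in terms of usize so it does not assume a 64-bit target.

```rust
use std::mem::size_of;

/// Hypothetical helper: splits a slice reference into its two conceptual
/// halves, the data pointer and the element count.
fn slice_parts(s: &[u32]) -> (*const u32, usize) {
    (s.as_ptr(), s.len())
}

fn main() {
    let array = [10u32, 20, 30, 40];
    let whole: &[u32] = &array;
    let tail: &[u32] = &array[1..];

    let (whole_ptr, whole_len) = slice_parts(whole);
    let (tail_ptr, tail_len) = slice_parts(tail);

    // Both slices view the same buffer, but each reference carries its own
    // start address and its own length.
    assert_eq!(whole_len, 4);
    assert_eq!(tail_len, 3);
    assert_eq!(tail_ptr, whole_ptr.wrapping_add(1));

    // A slice reference occupies two machine words: pointer + length.
    assert_eq!(size_of::<&[u32]>(), 2 * size_of::<usize>());
    println!("whole: {whole_len} elements, tail: {tail_len} elements");
}
```

Sub-slicing never copies the elements; it only produces a new fat pointer with a different address and length.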

Trait objects (dyn Trait)

When using traits as trait objects (i.e. type erased, dynamically dispatched), these trait objects are DSTs. Example:

trait Animal {
    fn speak(&self);
}

struct Cat;
impl Animal for Cat {
    fn speak(&self) {
        println!("meow");
    }
}

use std::mem::size_of;

dbg!(size_of::<&Cat>());        // reference to a sized type
dbg!(size_of::<&dyn Animal>()); // reference to a trait object (a DST)

This prints (with some cleanup):

size_of::<&Cat>()        = 8
size_of::<&dyn Animal>() = 16

Again, &Cat is only 8 bytes large because Cat is a normal type. But dyn Animal is a trait object and therefore dynamically sized. As such, &dyn Animal is 16 bytes large.

In the case of trait objects, the additional data that completes the DST is a pointer to the vtable (the vptr). I cannot fully explain the concept of vtables and vptrs here, but they are used to call the correct method implementation in this virtual dispatch context. The vtable is a static piece of data that contains a function pointer for each trait method, along with the size, alignment and destructor of the concrete type. With that, a reference to a trait object is basically represented as:

struct TraitObjectRef {
    data_ptr: *const (),
    vptr: *const (),
}

(This is different from C++, where the vptr for abstract classes is stored within the object. Both approaches have advantages and disadvantages.)
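As a rough illustration of what the vptr buys you, the sketch below passes two different concrete types through the same &dyn Animal parameter. It is a variation on the example above, not the same code: speak returns a string instead of printing, and the Dog type and the describe function are made up for this sketch. Both the method call and std::mem::size_of_val are resolved at runtime from the fat pointer alone.

```rust
use std::mem::{size_of, size_of_val};

trait Animal {
    fn speak(&self) -> &'static str;
}

struct Cat; // zero-sized type

struct Dog {
    age: u64, // 8 bytes of data
}

impl Animal for Cat {
    fn speak(&self) -> &'static str {
        "meow"
    }
}

impl Animal for Dog {
    fn speak(&self) -> &'static str {
        if self.age < 1 { "yip" } else { "woof" }
    }
}

/// Receives only a fat pointer; the concrete type has been erased.
fn describe(animal: &dyn Animal) -> (&'static str, usize) {
    // The method is found through the vptr, and so is the size of the
    // concrete value behind the data pointer.
    (animal.speak(), size_of_val(animal))
}

fn main() {
    assert_eq!(describe(&Cat), ("meow", 0));
    assert_eq!(describe(&Dog { age: 3 }), ("woof", 8));

    // Data pointer + vptr: two machine words.
    assert_eq!(size_of::<&dyn Animal>(), 2 * size_of::<usize>());
}
```

Because the vptr travels with the reference rather than inside the object, Cat can stay zero-sized and still be used as a trait object.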


Custom DSTs

It's actually possible to create your own DSTs by having a struct where the last field is a DST. This is rather rare, though. One prominent example is std::path::Path.

A reference or pointer to the custom DST is also a fat pointer. The additional data depends on the kind of DST inside the struct.
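A small sketch of such a custom DST, assuming a made-up Packet struct whose last field is generic and may be unsized. A reference to the sized form Packet<[u8; 3]> can be coerced to the unsized form Packet<[u8]>, at which point the reference becomes a fat pointer carrying the slice length.

```rust
use std::mem::size_of;

/// Hypothetical custom DST: the last field is allowed to be unsized.
struct Packet<T: ?Sized> {
    id: u8,
    payload: T,
}

/// Takes a fat pointer to the unsized form; the payload length comes from
/// the pointer's metadata, not from the struct itself.
fn payload_len(packet: &Packet<[u8]>) -> usize {
    packet.payload.len()
}

fn main() {
    let sized = Packet { id: 7, payload: [1u8, 2, 3] };

    // Unsizing coercion: &Packet<[u8; 3]> becomes &Packet<[u8]>.
    let unsized_ref: &Packet<[u8]> = &sized;

    assert_eq!(unsized_ref.id, 7);
    assert_eq!(payload_len(unsized_ref), 3);

    // The reference to the sized form is thin, the other one is fat.
    assert_eq!(size_of::<&Packet<[u8; 3]>>(), size_of::<usize>());
    assert_eq!(size_of::<&Packet<[u8]>>(), 2 * size_of::<usize>());
}
```

Here the inner DST is a slice, so the extra word is a length; with a dyn Trait as the last field it would be a vptr instead.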


Exception: Extern types

In RFC 1861, the extern type feature was introduced. Extern types are also DSTs, but pointers to them are not fat pointers. Or more exactly, as the RFC puts it:

In Rust, pointers to DSTs carry metadata about the object being pointed to. For strings and slices this is the length of the buffer, for trait objects this is the object's vtable. For extern types the metadata is simply (). This means that a pointer to an extern type has the same size as a usize (ie. it is not a "fat pointer").

But if you are not interacting with a C interface, you probably won't ever have to deal with these extern types.




Above, we've seen the sizes for immutable references. Fat pointers work the same for mutable references, immutable raw pointers and mutable raw pointers:

size_of::<&[u32]>()       = 16
size_of::<&mut [u32]>()   = 16
size_of::<*const [u32]>() = 16
size_of::<*mut [u32]>()   = 16
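The numbers above assume a 64-bit target. A sketch of the same check written portably, using a hypothetical is_fat helper that compares against two machine words, could look like this. It also includes Box<[u32]> and &str, which are fat for the same reason.

```rust
use std::mem::size_of;

/// Hypothetical helper: true if pointer type `P` occupies two machine words.
fn is_fat<P>() -> bool {
    size_of::<P>() == 2 * size_of::<usize>()
}

fn main() {
    // Every pointer kind to a slice carries the length alongside the address.
    assert!(is_fat::<&[u32]>());
    assert!(is_fat::<&mut [u32]>());
    assert!(is_fat::<*const [u32]>());
    assert!(is_fat::<*mut [u32]>());
    assert!(is_fat::<&str>());
    assert!(is_fat::<Box<[u32]>>());

    // Pointers to sized types stay one word wide.
    assert!(!is_fat::<&u32>());
    assert!(!is_fat::<&[u32; 2]>());
    assert!(!is_fat::<Box<u32>>());
}
```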
Minorite answered 2/9, 2019 at 9:56 Comment(5)
I'd like to see some more information on these "advantages and disadvantages" between the ways C++ and Rust store the vtbl and vptr. – Postmeridian
@Postmeridian Explaining this in my answer would be fairly off-topic. Some information on the topic: here and here. You could also consider asking here on SO, if no such question exists. – Minorite
I read that String is also a pointer with a length and a capacity. Does this make String a fat pointer as well? – Ballocks
@Ballocks I see where you're coming from, but not exactly. First, String also has a capacity field. But more importantly, if we say "a fat pointer/reference is a pointer/reference to a DST", then again, no for String. The pointer in there is basically *mut u8, but u8 is not a DST. Of course, semantically, you are kind of right. One can think like that. However, in the language lawyer sense, the answer is "no". – Minorite
@Postmeridian A "fat pointer" isn't really a pointer, just a struct with pointer(s) inside it. In C++ everything is manual: you can put the vtable in your objects, or pass around a trait like Rust (this is popular in libraries stemming from C), but you always have to define the "fat pointer" struct yourself! – Nitty
A
2

Expanding on @Lukas's answer regarding vtables (virtual tables): vtables are used with trait objects. A vtable contains pointers to the trait's methods as implemented by one concrete type. For example:

struct Sheep {}
struct Cow {}

trait Animal {
    fn method(&self) -> String;
}

impl Animal for Sheep {
    fn method(&self) -> String {
        "baaaaah!".to_string()
    }
}

impl Animal for Cow {
    fn method(&self) -> String {
        "moooooo!".to_string()
    }
}

fn random_animal1(random_string: &str) -> &dyn Animal {
    match random_string {
        "sheep" => &Sheep {},
        "cow" => &Cow {},
        _ => panic!(),
    }
}

fn random_animal2(random_string: &str) -> Box<dyn Animal> {
    match random_string {
        "sheep" => Box::new(Sheep {}),
        "cow" => Box::new(Cow {}),
        _ => panic!(),
    }
}

Callers of these functions cannot tell at compile time which concrete type implementing Animal they will get back. The vtable is therefore used to determine at runtime which method implementation to call, based on the actual type of the object.

Whenever a concrete type is used as a trait object (the dyn keyword), Rust generates a vtable for that combination of type and trait. This enables polymorphism without knowing the specific type at the call site. Trait objects come in a boxed form, Box<dyn Trait>, and a reference form, &dyn Trait.

Both forms combine a data pointer (ptr) with a vtable pointer (vptr). The major difference is ownership: a boxed trait object is allocated on the heap and owned by the Box, while &dyn Trait is a borrowed reference that does not own the underlying object. The object it references must be owned by something else and is only borrowed temporarily.
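The ownership difference can be sketched as follows. This reuses the Sheep and Cow types from above, but the functions boxed_animal and call_method are hypothetical, and the lookup returns an Option instead of panicking on an unknown name.

```rust
use std::mem::size_of;

trait Animal {
    fn method(&self) -> String;
}

struct Sheep;
struct Cow;

impl Animal for Sheep {
    fn method(&self) -> String {
        "baaaaah!".to_string()
    }
}

impl Animal for Cow {
    fn method(&self) -> String {
        "moooooo!".to_string()
    }
}

/// Owning trait object: the value lives on the heap and is freed when the
/// Box is dropped.
fn boxed_animal(name: &str) -> Option<Box<dyn Animal>> {
    match name {
        "sheep" => Some(Box::new(Sheep)),
        "cow" => Some(Box::new(Cow)),
        _ => None,
    }
}

/// Borrowing trait object: the caller keeps ownership of the value.
fn call_method(animal: &dyn Animal) -> String {
    animal.method()
}

fn main() {
    let owned = boxed_animal("cow").expect("known animal");
    // Borrow the boxed value as &dyn Animal; the Box still owns it.
    assert_eq!(call_method(&*owned), "moooooo!");

    let sheep = Sheep; // owned by this stack frame
    assert_eq!(call_method(&sheep), "baaaaah!");

    // Both forms are fat pointers of the same size.
    assert_eq!(size_of::<Box<dyn Animal>>(), size_of::<&dyn Animal>());
}
```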

Atony answered 3/10, 2023 at 8:33 Comment(0)
