I've always known that reference type variables are stored in the heap while value type variables are stored in the stack.
This is only partially true in Swift. In general, Swift makes no guarantees about where objects and values are stored, except that:
- Reference types have a stable location in memory, so that all references to the same object point to exactly the same place, and
- Value types are not guaranteed to have a stable location in memory, and can be copied arbitrarily as the compiler sees fit
This technically means that object types can be stored on the stack if the compiler knows that an object is created and destructed within the same stack frame with no escaping references to it, but in practice, you can basically assume that all objects are allocated on the heap.
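The two guarantees above can be observed from code without knowing where anything is actually stored. The following is a minimal sketch; `Box` and `Pair` are hypothetical types invented for illustration:

```swift
// Hypothetical reference type, for illustration only.
final class Box {
    var value: Int
    init(value: Int) { self.value = value }
}

// Hypothetical value type, for illustration only.
struct Pair {
    var first: Int
    var second: Int
}

let box1 = Box(value: 1)
let box2 = box1                   // copies the reference, not the object
box2.value = 42                   // visible through box1 as well
let sameObject = (box1 === box2)  // identity comparison: same object

let pair1 = Pair(first: 1, second: 2)
var pair2 = pair1                 // copies the value
pair2.first = 99                  // pair1 is unaffected
```

Here `box1` and `box2` refer to one stable object, while `pair1` and `pair2` are independent values that the compiler is free to place wherever it likes.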
For value types, the story is a little more complicated:
- Unless a location-based reference is required of a value (e.g., taking a reference to a struct with &), a struct may be located entirely in registers: operating on small structs may place its members in CPU registers so it never even lives in memory. (This is especially the case for small, possibly short-lived value types like Ints and Doubles, which are guaranteed to fit in registers.)
- Large value types can actually get heap-allocated: although this is an implementation detail of Swift that theoretically could change in the future, a struct stored in a protocol-typed (existential) value is boxed on the heap once it is larger than 3 machine words (e.g., larger than 12 bytes on a 32-bit machine, or 24 bytes on a 64-bit machine), because it no longer fits in the container's inline buffer. A plain local struct of that size can still live on the stack. This doesn't conflict with the value-ness of a value type: it can still be copied arbitrarily as the compiler wishes, and the compiler does a really good job of avoiding unnecessary allocations where it can.
So where are ints, doubles, strings, etc. kept when they are defined inside a class, aka a reference type?
This is an excellent question that gets at the heart of what a value type is. One way to think of the storage of a value type is inline, wherever it needs to be. Imagine a
struct Point {
    var x: Double
    var y: Double
}
structure, which is laid out in memory. Ignoring the fact that Point itself is a struct for a second, where are x and y stored relative to Point? Well, inline wherever Point goes:
┌───────────┐
│   Point   │
├─────┬─────┤
│  x  │  y  │
└─────┴─────┘
When you need to store a Point, the compiler ensures that you have enough space to store both x and y, usually one immediately following the other. If a Point is stored on the stack, then x and y are stored on the stack, one after the other; if Point is stored on the heap, then x and y live on the heap as part of Point. Wherever Swift places a Point, it always ensures you have enough space, and when you assign to x and y, they are written to that space. It doesn't terribly matter where that is.
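The inline layout can be inspected with the standard library's `MemoryLayout`. This is a sketch rather than a guarantee: the exact numbers are an implementation detail, and the values in the comments are what you would expect on current toolchains, where a `Double` occupies 8 bytes.

```swift
struct Point {
    var x: Double
    var y: Double
}

// Total space needed to hold a Point: two Doubles, back to back.
let pointSize = MemoryLayout<Point>.size                 // 16

// Byte offsets of each stored property within a Point.
let xOffset = MemoryLayout<Point>.offset(of: \Point.x)   // Optional(0)
let yOffset = MemoryLayout<Point>.offset(of: \Point.y)   // Optional(8)
```

Note that none of this says whether a given `Point` is on the stack, on the heap, or in registers; it only describes the shape of the space it needs.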
And when Point is part of another object? e.g.
class Location {
    var name: String
    var point: Point
}
Then Point is also laid out inline wherever it is stored, and its values are laid out inline as well:
┌──────────────────────┐
│       Location       │
├──────────┬───────────┤
│          │   Point   │
│   name   ├─────┬─────┤
│          │  x  │  y  │
└──────────┴─────┴─────┘
In this case, when you create a Location object, the compiler ensures that there's enough space to store a String and two Doubles, and lays them out one after another. Where that is, again, doesn't matter, but in this case, it's all on the heap (because Location is a reference type, which happens to contain values).
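A small sketch of what this means in practice, assuming an initializer is added to `Location` so that it compiles (the initializer and the variable names are illustrative additions):

```swift
struct Point {
    var x: Double
    var y: Double
}

final class Location {
    var name: String
    var point: Point   // stored inline, inside the Location object

    // Assumed initializer, added so the class compiles.
    init(name: String, point: Point) {
        self.name = name
        self.point = point
    }
}

let home = Location(name: "Home", point: Point(x: 1, y: 2))
let alias = home            // second reference to the same object
alias.point.x = 10          // writes into the object's inline Point

var copied = home.point     // copies the two Doubles out of the object
copied.y = 99               // does not touch home.point
```

The `Point` lives wherever its `Location` lives, so mutating it through any reference changes the one shared object, while reading it out produces an independent copy.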
As for the other way around, object storage has two components:
- The variable you use to access the object, and
- The actual storage for the object
Let's say that we changed Point from being a struct to being a class. Where before, Location stored the contents of Point directly, now, it only stores a reference to their actual storage in memory:
┌──────────────────────┐      ┌───────────┐
│       Location       │ ┌───▶│   Point   │
├──────────┬───────────┤ │    ├─────┬─────┤
│   name   │  point  ──┼─┘    │  x  │  y  │
└──────────┴───────────┘      └─────┴─────┘
Before, when Swift laid out space to create a Location, it was storing one String and two Doubles; now, it stores one String and one pointer to a Point. Unlike in languages like C or C++, you don't actually need to be aware of the fact that Location.point is now a pointer, and it doesn't actually change how you access the object; but under the hood, the size and "shape" of Location has changed.
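The change in "shape" can be made visible by measuring a container that holds the point either as a struct or as a class. This is a sketch with hypothetical names (`PointValue`, `PointObject`, `HolderOfValue`, `HolderOfObject`); the holders are structs only because `MemoryLayout` of a class type reports the size of the reference rather than of the object.

```swift
struct PointValue {
    var x: Double
    var y: Double
}

final class PointObject {
    var x: Double
    var y: Double
    init(x: Double, y: Double) {
        self.x = x
        self.y = y
    }
}

struct HolderOfValue { var point: PointValue }    // stores x and y inline
struct HolderOfObject { var point: PointObject }  // stores one reference

let inlineSize = MemoryLayout<HolderOfValue>.size
let referenceSize = MemoryLayout<HolderOfObject>.size
let pointerSize = MemoryLayout<UnsafeRawPointer>.size

// Access syntax is identical in both cases.
let a = HolderOfValue(point: PointValue(x: 1, y: 2)).point.x
let b = HolderOfObject(point: PointObject(x: 1, y: 2)).point.x
```

The first holder needs room for two `Double`s, while the second needs only a pointer's worth of space, even though `point.x` is written the same way for both.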
The same goes for storing all other reference types, including closures. A variable holding a closure is largely just a pointer to some metadata for the closure, and a way to execute the closure's code (though the specifics of this are out of scope for this answer):
┌───────────────────────────────┐     ┌───────────┐
│           MyStruct            │     │  closure  │
├─────────┬─────────┬───────────┤ ┌──▶│  storage  │
│  prop1  │  prop2  │  closure ─┼─┘   │  + code   │
└─────────┴─────────┴───────────┘     └───────────┘
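One observable consequence of the closure living behind a reference: copying the struct copies its plain properties, but both copies still point at the same closure storage. A minimal sketch, where `makeCounter` is a hypothetical helper and the property types are chosen for illustration:

```swift
struct MyStruct {
    var prop1: Int
    var prop2: Int
    var closure: () -> Int
}

// Hypothetical helper: returns a closure that captures `count`.
func makeCounter() -> () -> Int {
    var count = 0
    return {
        count += 1
        return count
    }
}

let original = MyStruct(prop1: 1, prop2: 2, closure: makeCounter())
var copy = original
copy.prop1 = 100                     // only the copy's inline value changes

let firstCall = original.closure()   // 1
let secondCall = copy.closure()      // 2: same captured `count`
```

`prop1` diverges between the two structs because it is stored inline, while the captured `count` is shared because each struct holds only a reference to the closure's storage.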