I will try to give a different perspective. In Rust there is a general convention: if you have a variable of some type T, it means that you own the data associated with T. If you have a variable of type &T, then you don't own the data.
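A minimal sketch of that convention, using two hypothetical helper functions (consume and inspect are illustrative names, not anything from the standard library):

```rust
/// Takes `String` by value: ownership moves into the function,
/// and the heap buffer is freed when `s` goes out of scope here.
fn consume(s: String) -> usize {
    s.len()
}

/// Takes `&String`: the caller keeps ownership, we only borrow.
fn inspect(s: &String) -> usize {
    s.len()
}

fn main() {
    let owned = String::from("hello");

    // Borrowing can be repeated; `owned` stays usable afterwards.
    assert_eq!(inspect(&owned), 5);
    assert_eq!(inspect(&owned), 5);

    // Passing by value moves the String into `consume`.
    assert_eq!(consume(owned), 5);

    // `owned` has been moved; uncommenting the next line is a compile error.
    // println!("{owned}");
}
```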
Now let's consider a heap-allocated string. According to this convention, there should be a non-reference type that represents ownership of the allocation. And indeed such a type exists: String.
There is also a different kind of string: &'static str. These strings are not owned by anyone: exactly one instance of the string is placed inside the compiled binary file, and only pointers are passed around. There is no allocation and no deallocation, hence no ownership. In a sense, static strings are owned by the compiler, not by the programmer. This is why String cannot be used to represent a static string.
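To make the difference concrete, here is a small sketch. The names GREETING, greeting and owned_greeting are made up for illustration. The literal is handed out as a pointer into the binary, while turning it into a String allocates and copies the bytes:

```rust
// A string literal has type `&'static str`; its bytes are part of the
// compiled binary, so nothing is allocated or freed at runtime.
const GREETING: &str = "hello, world!";

/// Hands out the static string: only a pointer and a length are returned.
fn greeting() -> &'static str {
    GREETING
}

/// Produces an owned copy: this allocates on the heap and copies the bytes.
fn owned_greeting() -> String {
    String::from(GREETING)
}

fn main() {
    let a = greeting();
    let b = owned_greeting();

    // Same text...
    assert_eq!(a, b);
    // ...but stored in different places: `b` points at its own heap copy.
    assert_ne!(a.as_ptr(), b.as_ptr());
}
```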
Alright, so why not use &String to represent a static string? Imagine a world where the following code is valid Rust:
let s: &'static String = "hello, world!";
This might look fine, but implementation-wise, this is suboptimal: String itself has a pointer to the actual data, so &String has to be basically a pointer to a pointer. This violates the zero-cost abstraction principle: why introduce an extra level of indirection when the compiler statically knows the address of "hello, world!"?
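The double indirection can be observed directly. In this sketch (addresses is a hypothetical helper), the address of the String value and the address of its text are two different locations, and &String itself is a single thin pointer:

```rust
use std::mem::size_of;

/// Returns (address of the `String` value, address of its text bytes).
fn addresses(r: &String) -> (usize, usize) {
    // First hop: where the String value itself lives.
    let struct_addr = r as *const String as usize;
    // Second hop: where the String's data pointer leads.
    let data_addr = r.as_str().as_ptr() as usize;
    (struct_addr, data_addr)
}

fn main() {
    let s = String::from("hello, world!");
    let (struct_addr, data_addr) = addresses(&s);

    // The reference points at the String, which in turn points at the bytes.
    assert_ne!(struct_addr, data_addr);

    // `&String` is a thin pointer: one machine word.
    assert_eq!(size_of::<&String>(), size_of::<usize>());
}
```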
Even if the compiler were somehow smart enough to decide that the extra pointer is not needed here (which would lead to a bunch of other problems), String itself still contains three pointer-sized fields (8 bytes each on a 64-bit target):
- Data pointer;
- Data length;
- Allocation capacity - the total size of the allocation, which lets us know how much free space there is after the data.
However, when we are talking about static strings, capacity makes zero sense: static strings are read-only.
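The three fields can be checked with a short sketch. Note the assumption: the exact layout of String is not a documented guarantee, so the size check below reflects what current standard library implementations do rather than a promise of the language. make_buffer is a hypothetical helper.

```rust
use std::mem::size_of;

/// Builds a String whose capacity is larger than its length.
fn make_buffer() -> String {
    let mut s = String::with_capacity(32);
    s.push_str("hi");
    s
}

fn main() {
    // Pointer + length + capacity: three machine words
    // (an observation about current implementations, not a guarantee).
    assert_eq!(size_of::<String>(), 3 * size_of::<usize>());

    let s = make_buffer();
    // Length counts the bytes actually in use...
    assert_eq!(s.len(), 2);
    // ...while capacity is the size of the whole allocation.
    assert!(s.capacity() >= 32);
}
```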
So, in the end, when the compiler sees &'static String, we actually want it to store only a data pointer and length - otherwise, we are paying for what we will never use, which is against the zero-cost abstraction principle. This looks like arcane wizardry that we want from the compiler: the variable type is &String but the variable itself is anything but a reference to String.
To make this work, we actually need a different type, not &String, that only holds a data pointer and length. And here it is: &str! It is better than &String in a number of ways:
- Does not have an excessive level of indirection - only one pointer;
- Does not store capacity, which would be meaningless in many contexts;
- No black magic: we define str as a variable-sized type (the data itself), so &str is just a reference to the data.
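A short sketch of these points: &str is two machine words (data pointer plus length), and a function that accepts &str works for both static literals and borrowed Strings, since &String coerces to &str. first_word is a hypothetical helper used only for illustration.

```rust
use std::mem::size_of;

/// Accepts any string slice, regardless of where the bytes live.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

fn main() {
    // Data pointer + length, and no capacity field.
    assert_eq!(size_of::<&str>(), 2 * size_of::<usize>());

    // A static string: the bytes live in the binary.
    let literal: &'static str = "hello, world!";
    // An owned string: the bytes live on the heap.
    let owned = String::from("goodbye, world!");

    assert_eq!(first_word(literal), "hello,");
    // `&owned` is a `&String`, which coerces to `&str` automatically.
    assert_eq!(first_word(&owned), "goodbye,");
}
```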
Now you might wonder: why not introduce str instead of &str? Remember the convention: having str would imply that you own the data, which you don't. Hence &str.
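As a rough illustration, str on its own has no size known at compile time, so it always sits behind some kind of pointer. A borrowed one is &str. If you genuinely do own the bytes, an owning pointer such as Box<str> expresses that. byte_len and into_owned_str are hypothetical helpers.

```rust
use std::mem::size_of_val;

/// The size of a `str` is only known at runtime, through the reference.
fn byte_len(s: &str) -> usize {
    size_of_val(s)
}

/// Copies the bytes into a heap allocation that the caller then owns.
fn into_owned_str(s: &str) -> Box<str> {
    Box::from(s)
}

fn main() {
    // A bare `str` cannot be a local variable because it is unsized;
    // uncommenting the next line is a compile error.
    // let s: str = *"hello";

    assert_eq!(byte_len("hello"), 5);

    // Owning the data: `Box<str>` frees its allocation when dropped.
    let boxed = into_owned_str("hello");
    assert_eq!(&*boxed, "hello");
}
```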
&str means it's a reference (AKA pointer). A typical compiler will put all string constants (multiple bytes) in the binary. What gets passed around are just the pointers to this data. That's the same in C, where you normally don't have a char[32] but a char* as the type of your variable. – Cloakroom
str is the only primitive that does not implement Copy. So for integer types, bool, etc. it simply makes no sense to have them be a reference or have a lifetime, since they can be owned by the context by copying them from the executable without runtime issues. This is not the case with str, however: since it is not cheap to clone, you'll almost always want a reference to it or make it a full-blown string that can be properly modified. – Dippy
Copy, including slices and mutable references, and arrays and tuples if the elements of those don't implement it. – Leonteen