I'm just asking why Rust decided to use `&str` for string literals instead of `String`. Isn't it possible for Rust to just automatically convert a string literal to a `String` and put it on the heap instead of putting it into the stack?
To understand the reasoning, consider that Rust wants to be a systems programming language. In general, this means that it needs to (among other things) (a) be as efficient as possible and (b) give the programmer full control over allocations and deallocations of heap memory. One use case for Rust is embedded programming, where memory is very limited.
Therefore, Rust does not want to allocate heap memory where this is not strictly necessary. String literals are known at compile time and can be written into the read-only data section of an executable/library (`.rodata` on ELF targets), so they don't consume stack or heap space.
Now, given that Rust does not want to allocate the values on the heap, it is basically forced to treat string literals as `&str`: `String`s own their values and can be moved and dropped, but how do you drop a value that is in `.rodata`? You can't really do that, so `&str` is the perfect fit.
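As a minimal sketch of that point (the function name `greeting` is made up for illustration), a literal can be handed out as `&'static str` and copied freely, because copying only duplicates the reference and nothing ever needs to be dropped:

```rust
// Hypothetical helper: returns a literal without allocating anything.
fn greeting() -> &'static str {
    "hello"
}

fn main() {
    let a: &'static str = greeting();
    let b = a; // copies only the (pointer, length) pair; `a` stays usable
    assert_eq!(a, b);
    assert_eq!(a.as_ptr(), b.as_ptr()); // both refer to the same bytes
    println!("{a} {b}");
}
```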
Furthermore, treating string literals as `&str` (or, more accurately, `&'static str`) has all the advantages and none of the disadvantages. They can be used in multiple places, can be shared without worrying about using heap memory, and never have to be deleted. Also, they can be converted to owned `String`s at will, so having them available as `String` is always possible, but you only pay the cost when you need to.
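A short sketch of "pay only when you need to", assuming a hypothetical helper `with_extension` that needs an owned, growable string. The literal is borrowed everywhere else, and the allocation happens only inside the helper:

```rust
// Hypothetical helper: builds an owned String only at the point where one is needed.
fn with_extension(base: &str, ext: &str) -> String {
    let mut owned = base.to_owned(); // the heap allocation happens here
    owned.push('.');
    owned.push_str(ext);
    owned
}

fn main() {
    let literal: &'static str = "config";
    let path = with_extension(literal, "toml");
    assert_eq!(path, "config.toml");
    assert_eq!(literal, "config"); // the literal itself is unchanged
}
```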
`.rodata` cannot be dropped? Could Rust just pretend it is dropped and carry on or would that cause problems? (Edit: I'm actually wondering why `str` exists at all, and string literals seem to be an important part of the answer.) – Groschen
`.rodata` (though it might also be difficult cross-platform) and then avoid the drop, but it's making the implementation much more complicated. The types `String` and `str` have their equivalent in `Vec<T>` and `[T]`. With Rust's model of ownership and shared references you really need something like `&str`, not just because of string literals. – Parham
`String` owns and can modify its data, not really something you want to do with a string literal / thing in `.rodata`. – Parham
`String`? I don't really know much about reverse engineering either, but I always want the program to be harder to reverse engineer. – Edora
To create a `String`, you have to:
- reserve a place on the heap (allocate), and
- copy the desired content from a read-only location to the freshly allocated area.
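The two steps can be spelled out by hand. This is only an illustrative sketch (the helper name `literal_to_string` is an assumption, and `to_owned()` is not necessarily implemented this way), but it does the same work:

```rust
// Hypothetical helper spelling out what turning a literal into a String involves.
fn literal_to_string(literal: &'static str) -> String {
    let mut s = String::with_capacity(literal.len()); // 1. reserve heap space
    s.push_str(literal); // 2. copy the bytes out of the read-only data
    s
}

fn main() {
    let literal = "foo";
    let owned = literal_to_string(literal);
    assert_eq!(owned, literal);
    // Same contents, but two separate copies at different addresses.
    assert_ne!(owned.as_ptr(), literal.as_ptr());
}
```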
If a string literal like `"foo"` did both, every string would effectively be stored twice: once inside the executable as the read-only string, and a second time on the heap. You could no longer simply refer to the original read-only data stored in the executable.
`&str` literals give you access to the most efficient string data: the one present in the executable image on startup, put there by the compiler along with the instructions that make up the program. The data it points to is not stored on the stack; what is stack-allocated is just the pointer/size pair, as is the case with any Rust slice.
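To make the pointer/size pair visible, here is a small sketch that checks the size of a `&str` against two machine words. The printed address will vary from run to run and platform to platform:

```rust
use std::mem::size_of;

fn main() {
    let s: &str = "foo";
    // A `&str` is a pointer plus a length: two machine words.
    assert_eq!(size_of::<&str>(), 2 * size_of::<usize>());
    // Same size as any other slice reference.
    assert_eq!(size_of::<&str>(), size_of::<&[u8]>());
    assert_eq!(s.len(), 3);
    println!("ptr = {:p}, len = {}", s.as_ptr(), s.len());
}
```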
Making `"foo"` desugar into what is now spelled `"foo".to_owned()` would make it slower and less space-efficient, and would likely require another syntax to get a non-allocating `&str`. After all, you don't want `x == "foo"` to allocate a string just to throw it away immediately. Languages like Python alleviate this by making their strings immutable, which allows them to cache strings mentioned in the source code. In Rust, mutating a `String` is often the whole point of creating it, so that strategy wouldn't work.
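As it stands, comparing against a literal needs no temporary `String`. A minimal sketch, with `is_foo` as a made-up name:

```rust
// Hypothetical check: compares against a literal without building a temporary String.
fn is_foo(x: &str) -> bool {
    x == "foo"
}

fn main() {
    let owned = String::from("foo");
    assert!(is_foo(&owned)); // `&String` derefs to `&str`
    assert!(owned == "foo"); // a `String` can be compared with a `&str` directly
    assert!(!is_foo("bar"));
}
```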
`String` is guaranteed to refer to heap-allocated data. You can even convert it to `Box<str>` and `Vec<u8>` without reallocation. "I would assume that to be possible as long as the String is not mut." - there is no such thing as a non-mut `String` - as long as you own it, you can always make it mut. – Tattan
`let mut s = s` reuses the same memory location, so it indeed effectively makes the original `String` mut (instead of e.g. copying the data). So things will break if the non-mut `String` could refer to read-only data, something I didn't expect, thanks again. – Groschen
`String` is bound to be implemented as a triple of (pointer, capacity, length), and moving it just copies those three values and marks the old ones as dead, so the compiler doesn't try to `Drop` them. – Tattan
`str` in place, to see what would happen if those are called on a string literal. Apparently these do exist, e.g. `make_ascii_uppercase()`, which requires a `&mut str`. But the only way I've been able to create a `&mut str` is to copy a `&str` through a heap-allocated structure like a `String` or a `Box`, because `let mut s = s;` does not work on `&str`: 'it is behind an & reference'. So it does seem impossible to call such modify-in-place functions on a string literal. –
Groschen
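A sketch of the observation in the last comment, assuming a hypothetical helper `shout`: the in-place method needs a `&mut str`, which is obtained here by first copying the literal into an owned `String`.

```rust
// Hypothetical helper: copies the text to the heap, then uppercases the copy in place.
fn shout(literal: &str) -> String {
    let mut owned = literal.to_owned();
    owned.as_mut_str().make_ascii_uppercase(); // needs `&mut str`
    owned
}

fn main() {
    let literal = "hello";
    // literal.make_ascii_uppercase(); // does not compile: `literal` is a shared `&str`
    assert_eq!(shout(literal), "HELLO");
    assert_eq!(literal, "hello"); // the read-only original is untouched
}
```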