When I pass a string to a function, is a pointer to the string's contents passed, or is the entire string passed to the function on the stack like a struct would be?
A reference is passed; however, it's not technically passed by reference. This is a subtle, but very important distinction. Consider the following code:
void DoSomething(string strLocal)
{
    strLocal = "local";
}
void Main()
{
    string strMain = "main";
    DoSomething(strMain);
    Console.WriteLine(strMain); // What gets printed?
}
There are three things you need to know to understand what happens here:
- Strings are reference types in C#.
- They are also immutable, so any time you do something that looks like you're changing the string, you aren't. A completely new string gets created, the reference is pointed at it, and the old one is left for the garbage collector once nothing else refers to it.
- Even though strings are reference types, strMain isn't passed by reference. It's a reference type, but the reference itself is passed by value. Any time you pass a parameter without the ref keyword (not counting out and in parameters), you've passed something by value.
So that must mean you're...passing a reference by value. Since it's a reference type, only the reference was copied onto the stack. But what does that mean?
Passing reference types by value: You're already doing it
C# variables are either reference types or value types. C# parameters are either passed by reference or passed by value. Terminology is a problem here; these sound like the same thing, but they're not.
If you pass a parameter of ANY type, and you don't use the ref keyword, then you've passed it by value. If you've passed it by value, what you really passed was a copy. But if the parameter was a reference type, then the thing you copied was the reference, not whatever it was pointing at.
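To make the copy semantics concrete, here is a minimal sketch that passes a value type and a reference type by value. PointStruct, PointClass, and the two Move helpers are hypothetical names invented for this illustration; they do not appear in the original code.

```csharp
using System;

// Hypothetical value type: passing it by value copies the whole struct.
public struct PointStruct
{
    public int X;
}

// Hypothetical reference type: passing it by value copies only the reference.
public class PointClass
{
    public int X;
}

public static class PassByValueDemo
{
    // Works on a private copy of the struct, so the caller's struct is untouched.
    public static void MoveStruct(PointStruct p)
    {
        p.X = 99;
    }

    // Works through a copied reference, so it reaches the caller's object.
    public static void MoveClass(PointClass p)
    {
        p.X = 99;
    }

    public static void Main()
    {
        var s = new PointStruct { X = 1 };
        var c = new PointClass { X = 1 };

        MoveStruct(s);
        MoveClass(c);

        Console.WriteLine(s.X); // 1: the callee changed its own copy
        Console.WriteLine(c.X); // 99: both references point at one object
    }
}
```

In both calls a copy was made; the only difference is what got copied, the struct's data or the reference to the object.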
Here's the first line of the Main method:
string strMain = "main";
We've created two things on this line: a string with the value "main" stored off in memory somewhere, and a reference variable called strMain pointing to it.
DoSomething(strMain);
Now we pass that reference to DoSomething. We've passed it by value, so that means we made a copy. It's a reference type, so that means we copied the reference, not the string itself. Now we have two references that each point to the same value in memory.
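One way to check that claim is to compare the two references from inside the callee. The sketch below is an assumption-laden illustration: SharedReferenceDemo, CalleeSharesObject, and the local function PointsAtSameString are hypothetical names, with the local function standing in for DoSomething.

```csharp
using System;

public static class SharedReferenceDemo
{
    // Returns true when the callee's copy of the reference targets the
    // very same string object as the caller's variable.
    public static bool CalleeSharesObject()
    {
        string strMain = "main";

        // Stand-in for DoSomething: strLocal is a copy of the reference,
        // and strMain is captured so the two can be compared.
        bool PointsAtSameString(string strLocal)
        {
            return object.ReferenceEquals(strLocal, strMain);
        }

        return PointsAtSameString(strMain);
    }

    public static void Main()
    {
        Console.WriteLine(CalleeSharesObject()); // True
    }
}
```

object.ReferenceEquals compares identity rather than contents, so a result of True means no second string was allocated for the call.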
Inside the callee
Here's the top of the DoSomething method:
void DoSomething(string strLocal)
No ref keyword, so strLocal and strMain are two different references pointing at the same value. If we reassign strLocal...
strLocal = "local";
...we haven't changed the stored value; we took the reference called strLocal and aimed it at a brand new string. What happens to strMain when we do that? Nothing. It's still pointing at the old string.
string strMain = "main"; // Store a string, create a reference to it
DoSomething(strMain); // Reference gets copied, copy gets re-pointed
Console.WriteLine(strMain); // The original string is still "main"
Immutability
Let's change the scenario for a second. Imagine we aren't working with strings, but some mutable reference type, like a class you've created.
class MutableThing
{
    public int ChangeMe { get; set; }
}
If you follow the reference objLocal to the object it points to, you can change its properties:
void DoSomething(MutableThing objLocal)
{
    objLocal.ChangeMe = 0;
}
There's still only one MutableThing in memory, and both the copied reference and the original reference still point to it. The properties of the MutableThing itself have changed:
void Main()
{
    var objMain = new MutableThing();
    objMain.ChangeMe = 5;
    Console.WriteLine(objMain.ChangeMe); // it's 5 on objMain
    DoSomething(objMain); // now it's 0 on objLocal
    Console.WriteLine(objMain.ChangeMe); // it's also 0 on objMain
}
Ah, but strings are immutable! There's no ChangeMe property to set. You can't do strLocal[3] = 'H' in C# like you could with a C-style char array; you have to construct a whole new string instead. The only way to change strLocal is to point the reference at another string, and that means nothing you do to strLocal can affect strMain. The value is immutable, and the reference is a copy.
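As a rough sketch of what "construct a whole new string" can look like, the helper below copies the characters out, edits the copy, and allocates a new string from it. ReplaceCharAt and ImmutabilityDemo are hypothetical names used only for this example, and this is just one of several ways to do it.

```csharp
using System;

public static class ImmutabilityDemo
{
    // Hypothetical helper: returns a new string with one character replaced.
    // The source string is never modified.
    public static string ReplaceCharAt(string source, int index, char replacement)
    {
        char[] chars = source.ToCharArray(); // copy the characters out
        chars[index] = replacement;          // edit the copy
        return new string(chars);            // allocate a brand new string
    }

    public static void Main()
    {
        string strMain = "main";
        string changed = ReplaceCharAt(strMain, 3, 'H');

        Console.WriteLine(changed); // maiH
        Console.WriteLine(strMain); // main
    }
}
```

The caller only sees the edit if it stores the returned reference; the original string stays exactly as it was.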
Passing a reference by reference
To prove there's a difference, here's what happens when you pass a reference by reference:
void DoSomethingByReference(ref string strLocal)
{
    strLocal = "local";
}
void Main()
{
    string strMain = "main";
    DoSomethingByReference(ref strMain);
    Console.WriteLine(strMain); // Prints "local"
}
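This time, strMain in Main really does end up pointing at a different string, because ref makes strLocal an alias for the caller's variable instead of a copy of its reference. The original "main" string itself is still not modified.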
So even though strings are reference types, passing them by value means whatever goes on in the callee won't affect the string in the caller. But since they are reference types, you don't have to copy the entire string in memory when you want to pass it around.
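The same rule holds for mutable reference types when the callee reassigns its parameter instead of mutating the object. The following sketch reuses the MutableThing class from above; ReassignDemo, ReplaceByValue, and ReplaceByReference are hypothetical names added for illustration.

```csharp
using System;

public class MutableThing
{
    public int ChangeMe { get; set; }
}

public static class ReassignDemo
{
    // By value: only the callee's copy of the reference is re-pointed.
    public static void ReplaceByValue(MutableThing objLocal)
    {
        objLocal = new MutableThing { ChangeMe = 0 };
    }

    // By reference: objLocal is an alias for the caller's variable.
    public static void ReplaceByReference(ref MutableThing objLocal)
    {
        objLocal = new MutableThing { ChangeMe = 0 };
    }

    public static void Main()
    {
        var objMain = new MutableThing { ChangeMe = 5 };
        var original = objMain;

        ReplaceByValue(objMain);
        Console.WriteLine(objMain.ChangeMe);                          // 5
        Console.WriteLine(object.ReferenceEquals(objMain, original)); // True

        ReplaceByReference(ref objMain);
        Console.WriteLine(objMain.ChangeMe);                          // 0
        Console.WriteLine(object.ReferenceEquals(objMain, original)); // False
    }
}
```

Reassignment by value behaves exactly like the string case; only the ref version swaps the object that the caller's variable refers to.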
Further resources:
- Here is the best article I've read on the difference between reference types and value types in C#, and why a reference type isn't the same as a reference-passed parameter.
- As usual, Eric Lippert also has several excellent blog posts on the subject.
- He has some great stuff on immutability, too.
... ref keyword. To prove that passing by reference makes a difference, see this demo: rextester.com/WKBG5978 – Wulfe
... ref keyword has utility, I was just trying to explain why passing a reference type by value in C# can seem like the "traditional" (i.e. C) notion of passing by reference (and why passing a reference type by reference in C# seems more like passing a reference to a reference by value). – Formaldehyde
... Foo(string bar) could be thought of as Foo(char* bar) whereas Foo(ref string bar) would be Foo(char** bar) (or Foo(char*& bar) or Foo(string& bar) in C++). Sure, it's not how you should think of it every day, but it actually helped me finally understand what is happening under the hood. – Spandex
... string and any other reference type. I can't find anything special in the specification or in Lippert's blog about passing it. As stated by Lippert, there is a third kind of value: references. "We see that references and instances of value types are essentially the same thing as far as their storage is concerned; they go on either the stack, in registers, or the heap depending on whether the storage of the value needs to be short-lived or long-lived." – Melodramatize
... ref, out etc. – Melodramatize
Strings in C# are immutable reference objects. This means that references to them are passed around (by value), and once a string is created, you cannot modify it. Methods that produce modified versions of the string (substrings, trimmed versions, etc.) create modified copies of the original string.
Strings are special cases. Each instance is immutable. When you change the value of a string you are allocating a new string in memory.
So only the reference is passed to your function, but when the string is edited it becomes a new instance and doesn't modify the old instance.
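A small sketch of that behavior: a string "edit" such as Trim hands back a new instance, while a mutable type such as StringBuilder can be changed in place through the copied reference. EditDemo, TrimCopy, and AppendSuffix are hypothetical names; StringBuilder is brought in here only as a contrast.

```csharp
using System;
using System.Text;

public static class EditDemo
{
    // Trim returns a new string; the caller's string is not modified.
    public static string TrimCopy(string text)
    {
        return text.Trim();
    }

    // Append mutates the one StringBuilder that both references point at.
    public static void AppendSuffix(StringBuilder builder)
    {
        builder.Append("!");
    }

    public static void Main()
    {
        string padded = "  main  ";
        string trimmed = TrimCopy(padded);
        Console.WriteLine("[" + padded + "]");  // [  main  ]
        Console.WriteLine("[" + trimmed + "]"); // [main]

        var builder = new StringBuilder("main");
        AppendSuffix(builder);
        Console.WriteLine(builder.ToString());  // main!
    }
}
```

Both arguments are passed by value; the visible difference comes entirely from whether the type offers methods that change the object itself.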
... StringBuilder is a mutable string class that allows fast modification of the string being built without allocating new strings in memory for each modification. – Anew
... Uri (class) and Guid (struct) are also special cases. I do not see how System.String acts like a "value type" any more than other immutable types... of either class or struct origins. – Polypropylene
... Uri & Guid - you can just assign a string-literal value to a string variable. The string appears to be mutable, like an int being reassigned, but it's creating an object implicitly - no new keyword. – Anew
... == comparing them by value) can be easily replicated in user-defined types. I would not describe them as behaving like value types. – Wulfe