Why are C#/.NET strings length-prefixed and null terminated?

After reading "What's the rationale for null terminated strings?" and some similar questions, I found that in C#/.NET, strings are internally both length-prefixed and null-terminated, like the BSTR data type.

Why are strings both length-prefixed and null-terminated instead of, e.g., only length-prefixed?

Varmint answered 9/6, 2011 at 13:24 Comment(1)
Probably only @Eric Lippert is going to be able to answer this one. There are good reasons for doing one or the other (and trade-offs as well). I'm as surprised as you that C# does both. - Supranatural
L
21

Length-prefixed so that computing the length is O(1).

Null-terminated to make marshaling to unmanaged code blazing fast (unmanaged code likely expects null-terminated strings).
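
To make the two properties concrete, here is a minimal C++ sketch of a buffer that carries both a stored length and a trailing terminator. The names PrefixedString, make_prefixed, check_invariant and native_strlen are hypothetical and chosen only for illustration; this is not the CLR's actual definition of System.String, just the same idea in miniature.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical layout: a stored length plus a buffer that always ends in '\0'.
// Illustration only; this is not how the CLR defines System.String.
struct PrefixedString {
    std::size_t length;       // character count, excluding the terminator
    std::vector<char> chars;  // length + 1 bytes; the last one is '\0'
};

// Builds a PrefixedString from the first n bytes of text.
PrefixedString make_prefixed(const char* text, std::size_t n) {
    PrefixedString s{n, std::vector<char>(n + 1, '\0')};
    std::memcpy(s.chars.data(), text, n);  // the terminator slot stays '\0'
    return s;
}

// Asserts that the stored length and the terminator position agree.
void check_invariant(const PrefixedString& s) {
    assert(s.chars.size() == s.length + 1);
    assert(s.chars[s.length] == '\0');
}

// Stand-in for an unmanaged API that expects a null-terminated string.
// It has to scan for the terminator, which is O(n).
std::size_t native_strlen(const char* p) {
    return std::strlen(p);
}
```

Reading s.length costs O(1), while s.chars.data() can be handed to native_strlen as-is, with no copy made just to append a terminator.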

Lankester answered 9/6, 2011 at 13:35 Comment(0)
H
12

Here is an excerpt from Jon Skeet's blog post about strings:

Although strings aren't null-terminated as far as the API is concerned, the character array is null-terminated, as this means it can be passed directly to unmanaged functions without any copying being involved, assuming the inter-op specifies that the string should be marshalled as Unicode.

Hoecake answered 9/6, 2011 at 13:33 Comment(0)
G
4

Most likely, to ensure easy interoperability with COM.

Gauzy answered 9/6, 2011 at 13:31 Comment(0)
A
3

While the length field makes it easy for the framework to determine the length of a string (and lets a string contain characters with a zero value), an awful lot of what the framework (or user programs) must deal with expects null-terminated strings.

Like the Win32 API, for example.

So it's convenient to keep a null terminator at the end of the string data, because it is likely to be needed there quite often anyway.

Note that C++'s std::string class is implemented the same way (in MSVC, anyway), and I'm sure for the same reason: c_str() is often used to pass a std::string to something that wants a C-style string.
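
The std::string comparison can be sketched in a few lines. The helpers c_api_length, check_terminated and lengths_agree are hypothetical names introduced here; std::string::size() and std::string::c_str() are the standard members. The sketch also shows the embedded-zero case mentioned above, where only the stored length gives the full answer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Stand-in for a C API that takes a null-terminated string.
std::size_t c_api_length(const char* p) {
    return std::strlen(p);
}

// Asserts that the buffer returned by c_str() ends in '\0' at index size().
void check_terminated(const std::string& s) {
    assert(s.c_str()[s.size()] == '\0');
}

// True when the stored length matches what a terminator scan reports.
// They differ as soon as the string contains an embedded '\0'.
bool lengths_agree(const std::string& s) {
    return c_api_length(s.c_str()) == s.size();
}
```

For std::string("ab\0cd", 5), size() is 5 but a terminator scan stops after 2 characters. This is why the length field is kept as the authoritative value and the terminator serves only as a convenience for C-style consumers.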

Autopsy answered 9/6, 2011 at 13:44 Comment(0)
T
1

My best guess is that with a length prefix, finding the length takes constant time, O(1), whereas traversing the string to find the terminator takes O(n).

Teresaterese answered 9/6, 2011 at 13:34 Comment(5)
That's the reasoning behind prefixing the string with the length. That's not a reason for additionally using a termination character. - Gauzy
@Daniel Hilgarth: And that is why I did not duplicate the other answers. The question asks for the reasoning on both sides. - Teresaterese
Sorry, I don't understand your comment; come again? The question asks what the reasoning is for using both together, not what the reasoning is for one or the other on its own. - Gauzy
You're right, but I think the question asks why both are used concurrently. Really, only one or the other is required to determine string length. - Supranatural
Yes, I am wondering why both are used together, and not only one of them (specifically, length-prefixed). - Varmint
