What is the type of string literals in C and C++?
F

4

74

What is the type of a string literal in C? Is it char *, const char *, or const char * const?

What about C++?

Fairleigh answered 11/2, 2010 at 15:54 Comment(0)
W
73

In C, the type of a string literal is char[n], an array of char. The type is not const-qualified, but modifying the contents is undefined behavior. Also, two different string literals that have the same content (or enough of the same content) might or might not share the same array elements.

From the C99 standard 6.4.5/5 "String Literals - Semantics":

In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence; for wide string literals, the array elements have type wchar_t, and are initialized with the sequence of wide characters...

It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

In C++, "An ordinary string literal has type 'array of n const char'" (from 2.13.4/1 "String literals"). But there's a special case in the C++ standard that lets string literals convert easily to non-const-qualified pointers (4.2/2 "Array-to-pointer conversion"):

A string literal (2.13.4) that is not a wide string literal can be converted to an rvalue of type “pointer to char”; a wide string literal can be converted to an rvalue of type “pointer to wchar_t”.

As a side note - because arrays in C/C++ convert so readily to pointers, a string literal can often be used in a pointer context, much as any array in C/C++.
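The C++ half of this can be checked at compile time. What follows is only a sketch, assuming a C++11 or later compiler; the helper name `decays_to_pointer` is made up for illustration. It says nothing about C, where the corresponding type would be `char[4]` without the const.

```cpp
#include <cassert>
#include <type_traits>

// A string literal is an lvalue, so decltype yields a reference to its
// array type: three characters plus the terminating zero, all const.
static_assert(std::is_same<decltype("abc"), const char (&)[4]>::value,
              "in C++, \"abc\" has type const char[4]");

// Hypothetical helper: the ordinary array-to-pointer conversion.
const char* decays_to_pointer() {
    const char* p = "abc";   // array of 4 const char converts to const char*
    assert(p[3] == '\0');    // the terminator is part of the array
    return p;
}
```

If the `static_assert` compiles, the compiler agrees on the array type; calling `decays_to_pointer()` just exercises the conversion to a pointer.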


Additional editorializing: what follows is really mostly speculation on my part about the rationale for the choices the C and C++ standards made regarding string literal types. So take it with a grain of salt (but please comment if you have corrections or additional details):

I think that the C standard chose to give string literals non-const types because there was (and is) so much code that expects to be able to use non-const-qualified char pointers that point to literals. The const qualifier was added (if I'm not mistaken) around ANSI standardization time, long after K&R C had been around to accumulate a ton of existing code. If pointers to string literals could then only be assigned to char const* types without a cast, nearly every program in existence would have required changing. Not a good way to get a standard accepted...

I believe the change to C++ that string literals are const qualified was done mainly to support allowing a literal string to more appropriately match an overload that takes a "char const*" argument. I think that there was also a desire to close a perceived hole in the type system, but the hole was largely opened back up by the special case in array-to-pointer conversions.
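The overload-matching point can be illustrated with a small sketch. The two `which` overloads and the helper names are hypothetical, invented here only to show which one a literal selects; a C++11 or later compiler is assumed.

```cpp
#include <cassert>

// Two hypothetical overloads that differ only in constness.
int which(char*)       { return 1; }
int which(const char*) { return 2; }

int pick_for_literal() {
    // The literal has type const char[4], so the const overload is chosen.
    return which("abc");
}

int pick_for_buffer() {
    char buf[] = "abc";   // writable array initialised from the literal
    // buf has type char[4] and decays to char*, so the non-const overload wins.
    return which(buf);
}

void check_overloads() {
    assert(pick_for_literal() == 2);
    assert(pick_for_buffer() == 1);
}
```

With non-const literals, both calls would land on the `char*` overload, which is presumably the kind of mismatch the const-qualified type avoids.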

Annex D of the standard indicates that the "implicit conversion from const to non-const qualification for string literals (4.2) is deprecated", but I think so much code would still break that it'll be a long time before compiler implementers or the standards committee are willing to actually pull the plug (unless some other clever technique can be devised - but then the hole would be back, wouldn't it?).

Washtub answered 11/2, 2010 at 16:41 Comment(0)
S
11

A C string literal has type char [n], where n equals the number of characters + 1 to account for the implicit zero at the end of the string.

The array will be statically allocated; it is not const, but modifying it is undefined behaviour.

If it had pointer type char * or incomplete type char [], sizeof could not work as expected.
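A short sketch of the sizeof point follows. It is written as C++11 to keep the examples on this page in one language, but the sizes come out the same in C, where the literal's type is `char[6]` instead of `const char[6]`. The function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Applied to the literal itself, sizeof sees the whole array,
// terminating zero included.
static_assert(sizeof("hello") == 6, "five characters plus the terminating zero");

// Hypothetical helper: once the literal has decayed to a pointer,
// sizeof reports the size of the pointer, not of the string.
std::size_t size_via_pointer() {
    const char* p = "hello";
    return sizeof(p);
}

void check_sizes() {
    assert(size_via_pointer() == sizeof(const char*));
}
```

The pointer size is platform-dependent, which is why the check compares against `sizeof(const char*)` and not against a fixed number.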

Making string literals const is a C++ idiom and not part of any C standard.

Sigil answered 11/2, 2010 at 16:42 Comment(0)
S
2

For various historical reasons, string literals were always of type char[] in C.

Early on (in C90), it was stated that modifying a string literal invokes undefined behavior.

They didn't ban such modifications though, nor did they make string literals const char[] which would have made more sense. This was for backwards-compatibility reasons with old code. Some old OS (most notably DOS) didn't protest if you modified string literals, so there was plenty of such code around.

C still has this defect today, even in the most recent C standard.

C++ inherited the very same defect from C, but C++ has since fixed it: string literals have type const char[], and the implicit conversion to char* that was kept for compatibility was flagged as deprecated in C++03 and finally removed in C++11.

Severson answered 26/4, 2016 at 7:9 Comment(0)
I
0

They used to be of type char[]. Now they are of type const char[].

Immaculate answered 11/2, 2010 at 15:58 Comment(7)
+1 Pointer .. to what? WHERE?? Ohhh, you say the COMPILER does that magic for me. - Endocardial
If you declare a variable char[] and set it to a string literal, you will get a copy of that literal, which can be modified. - Baese
Just to note, after this change a standard conversion from const char[] to char* was introduced. This was to avoid breaking all the existing code that had functions defined like "int foo(char*)". - Ablaze
@JayConrod char x[]="abc"; is a special case which declares a variable of type char[4] and initializes it as specified. It's a shorthand for char x[4] = {0x61,0x62,0x63,0};, and so that string literal isn't treated as others are. In particular, it won't get placed in some unnamed location by the compiler, as most string literals are. - Pops
Used to be when? When did it change? - Urinal
You don't say if you mean C, but I presume so. The C99 and C2011 standards seem to contradict what you say: C99 6.4.5, §6 says "If the program attempts to modify such an array, the behaviour is undefined"; §7 of C2011 says the same. Of course sane compilers (or those that care for your sanity) will flag this for you if you want. - Outbrave
@Outbrave I think the answerer means C++ (see all other answers). - Kutzer
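The comments above about `char x[] = "abc";` can be sketched as follows. This is only an illustration, assuming C++11 or later; the function name is hypothetical. The same initialization rule applies in C.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: an array initialised from a literal is an ordinary
// object of its own, so writing to it does not touch any string literal.
bool array_copy_is_writable() {
    char x[] = "abc";   // x has type char[4]
    static_assert(sizeof(x) == 4, "three characters plus the terminating zero");
    x[0] = 'X';         // well defined: modifies x
    return std::strcmp(x, "Xbc") == 0;
}

void check_copy() {
    assert(array_copy_is_writable());
}
```

By contrast, writing through a pointer that points at a literal is the case the answers describe as undefined behavior.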
