Clarification of char pointers in C

I'm working through K&R second edition, chapter 5.

On page 87, pointers to character arrays are introduced as:

char *pmessage;
pmessage = "Now is the time";

How does one know that pmessage is a pointer to a character array, and not a pointer to a single character?

To expand, on page 94, the following function is defined:

/* month_name: return the name of the n-th month */
char *month_name(int n)
{
    static char *name[] = {
        "Illegal month",
        "January", "February", "March",
        ...
    };

    return (n < 1 || n > 12) ? name[0] : name[n];
}

If one were simply provided with the function declaration for the above, how could one know whether a single character or a character array is returned?

If one were to assume the return from month_name() points to a character array and iterate over it until a null character ('\0') is encountered, but the return in fact pointed to a single character, is there not the potential for a segmentation fault?

Could somebody please demonstrate the declaration and assignment of a pointer to a single character vs a character array, their usage with functions and identification of which has been returned?

Gragg answered 14/3, 2014 at 11:59 Comment(2)
"How does one know that pmessage is a pointer to a character array, and not a pointer to a single character?" One doesn't. But one knows the function one uses. If one doesn't know what the function one uses returns, how can one expect to create a program that works? - Cozmo
You don't. And more importantly you don't know if it's null-terminated or not. Which causes a few problems now and then...! - Giesser

How does one know that pmessage is a pointer to a character array, and not a pointer to a single character?

You don't. At least, there's no way to tell from the pointer value itself whether it points to a single char or the first element of an array of char. It can be used either way.

You have to rely on context or specify explicitly how the pointer is to be used. For example, scanf uses different conversion specifiers in order to determine whether a pointer is pointing to a single char:

char single_char;
scanf( " %c", &single_char );

or an array of char:

char array_of_char[N];
scanf( "%s", &array_of_char[0] );

Remember that unless it is the operand of the sizeof or unary & operator, or is a string literal being used to initialize a character array in a declaration, an expression of type "N-element array of T" is converted ("decays") to an expression of type "pointer to T", and the value of the expression is the address of the first element of the array. That last line could therefore also be written

scanf( "%s", array_of_char );

Because of that conversion rule, anytime you pass an array expression to a function, what the function actually receives is a pointer value. In fact, the function declarations

void foo( char str[N] );

and

void foo( char str[] );

are equivalent to

void foo( char *str );

All three treat str as a pointer to char.

Anselm answered 14/3, 2014 at 15:22 Comment(0)

So what you have is a string literal which is an array of char with static storage duration:

"Now is the time"

In most contexts an array decays into a pointer to its first element, which is what happens here:

pmessage = "Now is the time";

You need to design and document your interface in such a way that you know what to expect for the input and output. There is no run-time information to tell the nature of what is being pointed to.

For example if we look at the man page of strtok it tells us:

Each call to strtok() returns a pointer to a null-terminated string containing the next token.

and so the programmer knows exactly what to expect and deals with the result accordingly.

In the case where you have a pointer to a single char and instead treat it like a C-style string, you will have undefined behavior because you will be accessing memory out of bounds. A segmentation fault is one possibility, but being undefined just means the result is unpredictable.

Impearl answered 14/3, 2014 at 12:6 Comment(0)

What does

char *pmessage;
pmessage = "Now is the time";  

mean?
char *pmessage; means that you declared pmessage as a pointer to char.
pmessage = "Now is the time"; means that pmessage now points to the first character of the string literal "Now is the time".

When you return pmessage from a function, a pointer to the first character of the string literal is returned.
If you print *pmessage with the %c specifier, it prints N; if you print pmessage with %s, it prints the entire string.

printf("%c\n", *pmessage);     // 'N' will be printed
printf("%s\n", pmessage);      // "Now is the time" will be printed
Nam answered 14/3, 2014 at 12:6 Comment(0)

Strange as it seems, C trusts the intelligence of the programmer. If I see a function such as:

/* month_name: return the name of the n-th month */
char *month_name(int n)
{
    static char *name[] = {
        "Illegal month",
        "January", "February", "March",
        ...
    };

    return (n < 1 || n > 12) ? name[0] : name[n];
}

I look at the documentation and read that it returns a NUL-terminated string that is a pointer to statically allocated memory. That is enough for me to treat the return value as I should.

If the creator of the function changes the return value in a future release to a different kind of string, they had better shout it out loud, change the function name, or otherwise make it very clear; anything less is very bad behavior on their part.

If I on the other hand fail to treat the return value correctly even though it's documented properly, well I wouldn't be so intelligent then and perhaps cut out to be a Java developer.

Lastly, if the function is not documented, find its owner and burn his house. [1]

[1] Only if it's in a library that has been released! Don't burn people's houses as soon as they start coding a library! ^_^

Cozmo answered 14/3, 2014 at 12:8 Comment(8)
@ShafikYaghmour, probably by a Java developer ;) - Cozmo
Shafik's answer provides a detailed response to the questions posed from both the perspective of somebody writing a function and consuming somebody else's, whereas your answer is as much concerned with demonstrating C elitism as it is with answering the question. His statement, "You need to design and document your interface in such a way that you know what to expect for the input and output. There is no run-time information to tell the nature of what is being pointed to." is key information that your answer lacks. You imply as much, but intimate that seeking clarification suggests inability. - Gragg
@retrodev, I don't deny the quality of shafik's answer. But just to be clear, I do not want to at all imply that seeking clarification suggests inability. The fact that C trusts the intelligence of the programmer is not meant to be an insult or anything. It's really the way it is. C exposes every little detail of your hardware to you. That's one reason many people are afraid of it and the same reason it's perfect for system or embedded programming. C assumes that you know what you are doing. - Cozmo
Sometimes, some things are hard to control because of human error and C helps you with that; that's why you have a rather loose type-checking system. However, I have never heard of or experienced confusion between pointers to arrays and pointers to a single element (which could be viewed as a one-element array), or in particular confusion between C-strings and pointers to single characters. Since almost no one had this confusion, there was no reason to add extra syntax in C to guard against this kind of mistake. - Cozmo
@Shahbaz, It wasn't your opening point I objected to, but "...I wouldn't be so intelligent then and perhaps cut out to be a Java developer." I started my career before C was written and unfortunately this kind of elitism has been present throughout; it's a detriment to the industry. There have been lots of "hard" low-level languages in the past 40 years and they all provide differing features. Therein lies the problem, C behaves this way, but there's no way of knowing that in advance, irrespective of how much experience one has. In this respect, your comments are more useful than your answer. - Gragg
To summarise, ignoring documentation has nothing to do with being a Java developer or otherwise, so it's not a useful comment. That's why I down-voted. - Gragg
I may have been a little mean in that, but there was a hidden meaning in it. A language that does such checks, such as Java, wants to be sure you as the programmer don't mess up, at the cost of runtime overhead. That's why, arguably, writing Java code requires less ... attention. If you make such a mistake in Java or Scala, for example, you may quickly get an exception if not a compile error. So if you tend not to read documentation carefully, chances are that just fixing your compile errors makes things work. - Cozmo
Which is by the way an interesting property! But then again, C trusts you are attentive and doesn't impose such overhead to compensate for your lack of attention. - Cozmo
