How to get the string size in bytes?

6

34

As the title implies, my question is how to get the size of a string in C. Is it good to use sizeof if I've declared it (the string) in a function without malloc in it? Or, if I've declared it as a pointer? What if I initialized it with malloc? I would like to have an exhaustive response.

Eskill answered 21/2, 2013 at 11:0 Comment(0)
53

You can use strlen. The length is determined by the terminating null character, so the string you pass must be properly null-terminated.

If you want the size of the memory buffer that contains your string, and you have a pointer to it:

  • If it is a dynamic array (created with malloc), you cannot get its size from the pointer alone: sizeof applied to the pointer yields the size of the pointer, not of the allocation, so you have to keep track of the size yourself. (check this)
  • If it is a static array, you can use sizeof to get its size, as long as the array declaration itself is in scope.
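
One way to work around the malloc limitation, as a rough sketch: keep the requested size next to the pointer. The struct buffer type and the buffer_from_string helper are hypothetical names made up for this illustration; they are not part of the standard library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type: a heap block plus the size we asked malloc for. */
struct buffer {
    char  *data;      /* NULL if allocation failed or the string did not fit */
    size_t capacity;  /* bytes allocated, 0 on failure */
};

/* Allocate `capacity` bytes and copy `s` into them if it fits. */
struct buffer buffer_from_string(const char *s, size_t capacity)
{
    struct buffer b = { NULL, 0 };
    size_t needed = strlen(s) + 1;   /* characters plus the terminating '\0' */

    if (needed > capacity)
        return b;                    /* string would not fit */

    b.data = malloc(capacity);
    if (b.data != NULL) {
        memcpy(b.data, s, needed);
        b.capacity = capacity;       /* remembered here, since sizeof cannot recover it */
    }
    return b;
}
```

With this, b.capacity is the buffer size, strlen(b.data) is the string length, and sizeof b.data is still only the size of a pointer. The caller is responsible for calling free(b.data).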

If you are confused about the difference between dynamic and static arrays, check this.

Basia answered 21/2, 2013 at 11:4 Comment(10)
Actually size is strlen()+1 (1 for terminating character)Cheep
@0x69 The size of a string is typically defined as excluding the null-terminator.Lundeen
Does the char type in C always have a size of one byte?Eskill
@Eskill, yes. C99 standard, section 6.5.3.4: When applied to an operand that has type char, unsigned char, or signed char, (or a qualified version thereof) the result is 1.Basia
"If it is dynamic array(created with malloc), it is impossible to get it size, since compiler doesn't know what pointer is pointing at." Wrong. Store the size and you can get the size. Additionally, the compiler knows exactly what pointer is pointing at.Revolute
@modifiablelvalue what do you mean by store the size?Eskill
@modifiablelvalue The implication was - it's impossible to get the size directly from the dynamic array. Of course you can store the size, but that's not the point.Lundeen
@artaxerxe, Dukeling: char (*fubar)[size] = malloc(size); printf("The size of the array pointed to by fubar is %zu\n", sizeof *fubar);Revolute
@artaxerxe: C defines that a char is 1 byte, but it doesn't define that a byte is 8 bits. So, while it's true that char is always a byte as far as C concerned, it isn't necessarily an octet, and much of the world outside the C standard thinks "byte" means "octet". C implementations on which char is larger than 8 bits are in the minority, but include a lot of DSPs. A lot of "portable" code won't run on DSPs anyway for various reasons, so it's for you to decide whether you want to assume that a byte is 8 bits. The value CHAR_BIT tells you for certain.Casaba
@Dukeling Hm... Then copy strlen() bytes with memcpy to other (non-empty) buffer and print it. What you will see will be your new string + garbage. This is argument to me that string size in bytes is strlen()+1. (However string length is strlen()).Cheep
20

Use strlen to get the length of a null-terminated string.

sizeof returns the size of the array, not the length of the string. If it's a pointer (char *s), not an array (char s[]), it won't work, since it will return the size of the pointer (usually 4 bytes on 32-bit systems). An array passed to a function is converted to a pointer to its first element (and arrays cannot be returned by value at all), so inside the called function you lose the ability to use sizeof to check the size of the array.
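
A small sketch of that difference, assuming nothing beyond standard C. The three function names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Inside the callee the argument is a pointer, so sizeof measures the pointer. */
size_t size_seen_by_callee(const char *s)
{
    return sizeof s;          /* sizeof(const char *), commonly 4 or 8 */
}

/* strlen still works in the callee, because it scans for the terminating '\0'. */
size_t length_seen_by_callee(const char *s)
{
    return strlen(s);
}

/* In the scope where the array is declared, sizeof sees the whole array. */
size_t size_seen_by_owner(void)
{
    char s[] = "stuff";       /* 5 characters plus '\0' */
    return sizeof s;          /* 6 */
}
```

Calling size_seen_by_callee("stuff") gives the pointer size regardless of the string, while length_seen_by_callee("stuff") gives 5.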

So using sizeof on a statically defined array returns what you want only if the string spans the entire array (e.g. char s[] = "stuff"). In that case it is also faster, since it doesn't need to loop through the string to find the null terminator. Because the array includes the null terminator as its last character, you need to subtract 1. If the string doesn't span the entire array, sizeof won't return what you want.

An alternative to all this is actually storing the size of the string.

Lundeen answered 21/2, 2013 at 11:3 Comment(4)
Note that sizeof doesn't actually give you the length of a string. For a string literal, sizeof includes the null terminator. For an array of char, sizeof gives you the number of elements in the array (which is an unpredictable amount larger than a string length of the array's content)Greensickness
It might make sense to point out that strlen and sizeof return two fundamental different things.Aquila
The size of a pointer is not always 4 bytes on a 32-bit system. Consider if CHAR_BIT is 32-bits. In addition to that, consider if a 16-bit OS and compiler lives on that 32-bit system. CHAR_BIT may still be 32 bits on 16-bit OS and hardware. The size of the pointer is a decision made by the compiler, NOT the OS or hardware. If the compiler chooses to use the same size as the OS or hardware, then that is the compiler's choice. Additionally, different pointers may have different sizes.Revolute
@modifiablelvalue Changed "always" to "usually".Lundeen
12

While sizeof works for this specific type of string:

char str[] = "content";
int charcount = sizeof str - 1; // -1 to exclude terminating '\0'

It does not work if str is a pointer (sizeof returns the size of the pointer, usually 4 or 8) or an array with a specified length (sizeof returns the byte count of the whole array, which for char equals the specified length, regardless of the string stored in it).
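
To make the array cases concrete, here is a minimal sketch. The function names and the array length 100 are arbitrary choices for the example.

```c
#include <stddef.h>
#include <string.h>

/* Array sized by its initializer: sizeof - 1 matches the string length. */
size_t implicit_length_array(void)
{
    char str[] = "content";      /* 7 characters plus '\0' = 8 bytes */
    return sizeof str - 1;       /* 7 */
}

/* Array with a specified length: sizeof reports the declared size instead. */
size_t explicit_length_array(void)
{
    char str[100] = "content";   /* remaining bytes are zero-filled */
    return sizeof str - 1;       /* 99, unrelated to the string */
}

/* strlen gives the string length in either case. */
size_t with_strlen(void)
{
    char str[100] = "content";
    return strlen(str);          /* 7 */
}
```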

Just use strlen().

Quetzalcoatl answered 21/2, 2013 at 11:13 Comment(2)
in your example, since str is a type name shouldn't it be parenthesized?Psychologize
@Psychologize str is variable name. It is an array so sizeof str will return the size of entire array in bytes. When array is declared like above, the array size is exactly the size of the string literal including terminating '\0'. And sizeof has higher precedence than -, so sizeof str does not need any parenthesis, though adding them for clarity would not be a bad thing here, I admit.Quetzalcoatl
5

If you use sizeof, then char *str and char str[] give different answers. For char str[] initialized from a string literal, sizeof gives the size of the array, which is the length of the string plus the string terminator; for char *str, it gives the size of the pointer (which depends on the compiler and platform).

Disavow answered 5/7, 2016 at 5:53 Comment(0)
2

I like to use:

(strlen(string) + 1 ) * sizeof(char)

This gives you the buffer size in bytes needed to hold the string, including the terminator. It can be useful together with snprintf():

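const char* message = "%s, World!";
/* The buffer is sized from the format string (11 bytes), so the output below is truncated to "Hello, Wor". */
char* string = malloc((strlen(message) + 1) * sizeof(char));
snprintf(string, (strlen(message) + 1) * sizeof(char), message, "Hello");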

Cheers! Function: size_t strlen (const char *s)
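
If the goal is a buffer large enough for the formatted result rather than for the format string, one possible approach (a sketch, not necessarily what the answer intended) is to let snprintf measure the output first. C99 allows calling it with a NULL buffer and size 0, in which case it returns the number of characters that would have been written. The make_greeting helper is a hypothetical name for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd "<name>, World!" string,
 * or NULL on failure. The caller must free the result. */
char *make_greeting(const char *name)
{
    /* First pass: measure only. The return value excludes the '\0'. */
    int len = snprintf(NULL, 0, "%s, World!", name);
    if (len < 0)
        return NULL;                 /* encoding error */

    size_t size = (size_t)len + 1;   /* add room for the terminator */
    char *out = malloc(size);
    if (out != NULL)
        snprintf(out, size, "%s, World!", name);   /* second pass: write */
    return out;
}
```

For make_greeting("Hello") the measured length is 13, so 14 bytes are allocated and the full text fits.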

Dingess answered 30/12, 2015 at 3:42 Comment(6)
(1) strlen() always returns the length in bytes, multiplying it by 1 adds nothing; (2) There may be no relation between the size of the buffer and the length of the string it contains; (3) Have you tried compiling any of that code?Stratfordonavon
(1) Didn't multiply by one (2) I'm allocating bytes for a pointer so i use the string length * sizeof a char (inbytes) (3) I use it all the time.Dingess
if you want to build up a rep, you need to do it by providing good answers. The latest edit at least removes the previous errors, but now it's not even an answer, it's a comment. All the other answers both refer to strlen() and give some meaningful explanation, this just isn't an upvote-worthy answer.Stratfordonavon
I'll happily remove the down vote if you leave a decent answer. If you leave a good answer, I'll happily up vote it. Your original code wouldn't compile on any C compiler, embedded or otherwise.Stratfordonavon
It doesn't contain the offending code from your original answer, so it's irrelevant to the point. 1/2 is always zero. Stop wasting my time with this nonsense.Stratfordonavon
@PaulGriffiths better?Dingess
0

There are two ways of finding the size of a string in bytes:

1st Solution:

# include <iostream>
# include <cctype>
# include <cstring>
using namespace std;

int main()
{
    char str[] = {"A lonely day."};
    cout<<"The string bytes for str[] is: "<<strlen(str);
    return 0;
}

2nd Solution:

# include <iostream>
# include <cstring>
using namespace std;

int main()
{
    char str[] = {"A lonely day."};
    cout<<"The string bytes for str[] is: "<<sizeof(str);
    return 0;
}

The two solutions produce different outputs; the explanation follows.

The 1st solution uses strlen and based on cplusplus.com,

The length of a C string is determined by the terminating null-character: A C string is as long as the number of characters between the beginning of the string and the terminating null character (without including the terminating null character itself).

That explains why the 1st Solution prints the length of the string (13), while the 2nd Solution prints a value that is one larger.

The 2nd Solution uses sizeof to find the size of the string in bytes. Based on this SO answer (modified):

sizeof("f") must return 2: one byte for the 'f' and one for the terminating '\0' (the terminating null character).

That is why the output is 14: 13 bytes for the characters of the string and one for '\0'.


Conclusion:

To get the string length from the 2nd Solution, you must use sizeof(str)-1.


References:

  1. Sizeof string literal
  2. https://cplusplus.com/reference/cstring/strlen/?kw=strlen
Pseudonymous answered 17/7, 2022 at 11:9 Comment(0)
