Is it appropriate to set a value to a "const char *" in the header file

3

18

I have seen people use two methods to declare and define a const char *.

Method 1: The header file contains the following definition:

extern const char* COUNTRY_NAME_USA = "USA";

Method 2:
The header file contains the following declaration:

extern const char* COUNTRY_NAME_USA;

The cpp file has the below definition:

extern const char* COUNTRY_NAME_USA = "USA";
  1. Is method 1 wrong in some way?
  2. What is the difference between the two?
  3. I understand the difference between "const char * const var" and "const char * var". If a "const char * const var" is declared and defined in the header, as in method 1, does that make sense?
Dim answered 21/5, 2010 at 4:23 Comment(0)
36

The first method is indeed wrong, since it defines an object COUNTRY_NAME_USA with external linkage in the header file. Once that header file is included in more than one translation unit, the One Definition Rule (ODR) is violated. Each translation unit compiles on its own, but the build will typically fail at link time with a multiple-definition error.

The second method is the correct one. The keyword extern is optional in the definition though, i.e. in the cpp file you can just do

const char* COUNTRY_NAME_USA = "USA";

assuming the declaration from the header file precedes this definition in this translation unit.
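As a minimal sketch of the second method, the block below puts the header's declaration and the cpp file's definition into one translation unit so it can be compiled on its own. The file names in the comments and the helper is_usa are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstring>

// --- would live in the header (hypothetical countries.h) ---
// Declaration only: no initializer, so no storage is allocated here.
extern const char* COUNTRY_NAME_USA;

// --- would live in exactly one .cpp file (hypothetical countries.cpp) ---
// Definition: `extern` is not needed because the declaration above
// has already been seen in this translation unit.
const char* COUNTRY_NAME_USA = "USA";

// Hypothetical helper used only to exercise the variable.
bool is_usa(const char* name) {
    assert(name != nullptr);
    return std::strcmp(name, COUNTRY_NAME_USA) == 0;
}
```

Any other .cpp file that includes the header sees only the declaration and refers to the single object defined here.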

Also, I'd guess that since the object name is capitalized, it is probably intended to be a constant. If so, then it should be declared/defined as const char* const COUNTRY_NAME_USA (note the extra const).

Finally, taking that last detail into account, you can just define your constant as

const char* const COUNTRY_NAME_USA = "USA"; // no `extern`!

in the header file. Since it is a constant now, it has internal linkage by default, meaning that there is no ODR violation even if the header file is included in several translation units. In this case you get a separate COUNTRY_NAME_USA lvalue in each translation unit (while with the extern method you get one for the entire program). Only you know which of the two you need in your case.
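A small sketch of the header-only variant, assuming the definition below is the line that sits in the shared header. The helper country_name_length is hypothetical and is there only so the constant gets used.

```cpp
#include <cassert>
#include <cstring>

// Imagine this line in a header included by many .cpp files.
// The pointer itself is const, so at namespace scope it has internal
// linkage: every translation unit gets its own private copy and the
// linker never sees two competing definitions.
const char* const COUNTRY_NAME_USA = "USA";

// COUNTRY_NAME_USA = "US";  // would not compile: the pointer is const

// Hypothetical helper: length of the country name.
std::size_t country_name_length() {
    assert(COUNTRY_NAME_USA != nullptr);
    return std::strlen(COUNTRY_NAME_USA);
}
```

Dropping the second const would turn this back into a definition with external linkage, which is the situation the first method runs into.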

Narcosynthesis answered 21/5, 2010 at 5:59 Comment(6)
But what if i have the below to include the header file only once, will i still have linker time errors : #ifndef MY_HDR_FILE_H_ #define MY_HDR_FILE_H_ #endif , if i dont have 2 const in my declarationDim
@sud: I'm talking about including the header file into different translation units. Your #ifndef guards have absolutely nothing to do with it, cannot prevent this from happening and will make no difference whatsoever. You will still get the same linker errors with the first variant.Narcosynthesis
@Andrey : Thanks a lot for the clarification. You mentioned that having "const char* const COUNTRY_NAME_USA = "USA";" in a header file is fine, but that each translation unit will get its own lvalue. Does that mean there will be efficiency issues at runtime? I mean, will the executable be bigger since each translation unit has its own copy? Also, while debugging, will I be able to see the address of the char pointer in the core dump?Dim
@sud: If you declare your COUNTRY_NAME_USA as a constant (the last example), then technically each translation unit will get its own copy of COUNTRY_NAME_USA pointer, but not necessarily its own copy of "USA" literal. This depends on the implementation, but normally I'd expect: 1) the entire program gets only one "USA" literal, 2) the COUNTRY_NAME_USA pointer gets optimized away (since it is a constant). So, there shouldn't be any performance issues. Virtually always it is perfectly OK to define small constants (pointers, integers etc.) in the header files.Narcosynthesis
With a linker that does constant folding (most do if you enable optimization), only one copy of the string's character data will end up in the final executable.Dodd
@Andre : In your suggestion "const char* const COUNTRY_NAME_USA = "USA";" you mentioned there will be as many lvalues as there are translation units. Are you saying that there will be multiple memory locations for COUNTRY_NAME_USA at runtime? If so, when we look at core dumps, how can we know which char * I am using? That could mean we have unnecessary char* copies depending on how many cpp files include this header file. In that case I could just go with a #define.Dim
11

What's the point?

If you want to lookup strings (that could be localized), this would be best:

namespace CountryNames {
    const char* const US = "USA";
}

Since the pointer is const, it now has internal linkage and won't cause multiple definitions. Most linkers will also combine redundant constants, so you won't waste space in the executable.

If you want to compare strings by pointer equality though, the above isn't portable because the pointers will only be equal if the linker performs the constant-folding optimization. In that case declaring an extern pointer in the header file is the way to go (and it again should be const if you don't intend to retarget it).
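To illustrate the portability point, here is a hedged sketch of comparing by content instead of by address. The function same_country is a hypothetical helper, not something from the question; it uses std::strcmp so the result does not depend on whether the linker merged identical literals.

```cpp
#include <cassert>
#include <cstring>

namespace CountryNames {
    const char* const US = "USA";
}

// Hypothetical helper: compares the characters, not the addresses, so it
// gives the same answer no matter which translation unit's copy of the
// pointer (or of the literal) the caller happens to hold.
bool same_country(const char* a, const char* b) {
    assert(a != nullptr && b != nullptr);
    return std::strcmp(a, b) == 0;
}
```

A call such as same_country(CountryNames::US, user_input) stays correct even when user_input points to a different buffer holding the same characters, which a plain == on the pointers would report as unequal.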

Dodd answered 21/5, 2010 at 4:38 Comment(5)
Sorry i missed adding the 'extern' in my Question at the beginning in all my code above. I intended to add extern but i forgot. Now after i added extern, which way should i go ? method-1 or method-2Dim
Thanks a lot for the explanation. If i use "const char* const US = "USA"; in my hdr file. If my linker does not combine redundant constants, at run time will i be able to debug using the core dump with the address of the pointer 'US' ?Dim
Yes, any good debugger will be able to show the target of pointers into the constant data area. Note that all of these options use a string literal so the actual data is stored in the same place for all of them, so debugging works the same way with all of them.Dodd
From your answers it is now clear to me that the rvalue (which holds the string "USA") will mostly be stored in one place. But my concern is about the lvalue. Will there be multiple memory locations for this pointer named 'US' if multiple translation units include this header? In that case we are wasting memory. If I have 100 cpp files I will have 100 x 4 bytes of memory wasted in RAM. Then should I go for a #define instead?Dim
Unless you take the address of the pointer, the compiler can simply substitute its value (basically, the offset into the constant data area, this will be adjusted during linking into a true pointer/address) everywhere it appears. In this case there won't be any storage allocated for the pointer at all.Dodd
5

If you must have global variables, normal practice is to declare them in a .h file and define them in one (and only one) .cpp file.

In a .h file;

extern int x;

In a .cpp file;

int x=3;

I have used int (the most fundamental basic type perhaps?) rather than const char * as in your example because the essence of your problem doesn't depend on the type of variable.

The basic idea is that you can declare a variable multiple times, so each .cpp file that includes the .h file declares the variable, and that is fine. But you only define it once. The definition is the statement where you assign the variable's initial value (with an =). You don't want definitions in .h files, because then if the .h file is included by multiple .cpp files, you'll get multiple definitions. If you have multiple definitions of one variable, there is a problem at link time because the linker wants to assign the address of the variable and cannot reasonably do that if there are multiple copies of it.
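The declare-many, define-once rule can be seen even inside a single file. This is only a sketch: in a real project the extern line would come from the header and the definition would sit in one .cpp file. The function check_initial_value is a hypothetical helper.

```cpp
#include <cassert>

extern int x;   // declaration: "an int called x exists somewhere"
extern int x;   // repeating a declaration is allowed
int x = 3;      // definition: allocates storage, must appear exactly once

// Hypothetical sanity check: every use of x refers to the one definition above.
void check_initial_value() {
    assert(x == 3);
}
```

Adding a second `int x = 3;` to this file would be rejected by the compiler as a redefinition; spreading the two definitions across different .cpp files moves the same error to link time.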

Additional information added later to try and ease Sud's confusion;

Try to reduce your problem to its minimal parts to understand it better;

Imagine you have a program that comprises three .cpp files. To build the program each .cpp is compiled separately to create three object files, then the three object files are linked together. If the three .cpp files are as follows (example A, good organization);

file1.cpp

extern int x;

file2.cpp

extern int x;

file3.cpp

extern int x;

Then the files will compile and link together without problem (at least as far as the variable x is concerned). There is no problem because each file is only declaring variable x. A declaration is simply stating that there is a variable out there somewhere that I may (or may not) use.

A better way of achieving the same thing is the following (example A, better organization);

header.h

extern int x;

file1.cpp

#include "header.h"

file2.cpp

#include "header.h"

file3.cpp

#include "header.h"

This is effectively exactly the same: for each of the three compilations, the compiler sees the same text as before when it processes the .cpp file (or translation unit, as the experts call it), because the #include directive simply pulls in text from another file. Nevertheless this is an improvement on the earlier example, simply because we only have our declaration in one file, not in multiple files.

Now consider another working example (example B, good organization);

file1.cpp

extern int x;

file2.cpp

extern int x;

file3.cpp

extern int x;
int x=3;

This will work fine as well. All three .cpp files declare x and one actually defines it. We could go ahead and add more code within functions in any of the three files that manipulates variable x and we wouldn't get any errors. Again we should use a header file so that the declaration only goes into one physical file (example B, better organization).

header.h

extern int x;

file1.cpp

#include "header.h"

file2.cpp

#include "header.h"

file3.cpp

#include "header.h"
int x=3;

Finally consider an example that just wouldn't work (example C, doesn't work);

file1.cpp

int x=3;

file2.cpp

int x=3;

file3.cpp

int x=3;

Each file would compile without problems. The problem occurs at link time because now we have defined three separate int x variables. They have the same name and are all globally visible. The linker's job is to pull all the objects required for a single program into one executable. Globally visible objects must have a unique name, so that the linker can put a single copy of the object at one defined address (place) in the executable and allow all the other objects to access it at that address. The linker cannot do its job with global variable x in this case and so will choke out an error instead.

As an aside, giving the different definitions different initial values doesn't address the problem. Preceding each definition with the keyword static does address the problem because now the variables are not globally visible, but rather visible only within the .cpp file that they are defined in.
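As a rough sketch of the static variant, the block below shows what each of the three .cpp files could contain. It is a single file here so it can be compiled alone; bump_x is a hypothetical helper added for illustration.

```cpp
#include <cassert>

// If this line appeared in file1.cpp, file2.cpp and file3.cpp, each file
// would get its own private x with internal linkage, and the program
// would link. Note that the three x variables would be independent:
// changing one does not change the others.
static int x = 3;

// Hypothetical helper that modifies this file's private copy only.
int bump_x() {
    assert(x >= 3);
    return ++x;
}
```

This removes the link error, but it also means the files no longer share one variable, so it is only a fix when sharing was never intended.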

If you put a global variable definition into a header file, nothing essential has changed (example C, header organization not helpful in this case);

header.h

int x=3;  // Don't put this in a .h file, causes multiple definition link error

file1.cpp

#include "header.h"

file2.cpp

#include "header.h"

file3.cpp

#include "header.h"

Phew, I hope someone reads this and gets some benefit from it. Sometimes the questioner is crying out for a simple explanation in terms of basic concepts not an advanced computer scientist's explanation.

Molder answered 21/5, 2010 at 5:8 Comment(6)
But what if i have the below to include the header file only once, will i still have linker time errors : #ifndef MY_HDR_FILE_H_ #define MY_HDR_FILE_H_ #endifDim
Include guards prevent the header file from being parsed more than once per compilation unit. Distinct .cpp files will still process the header file multiple times.Dodd
@Dim I sense you are a little confused about the role of header files, and how compilation and linking actually works. So I have added more explanation of these important basic ideas in the hope that this will help you understand your problem and its solution better.Molder
@Bill : Yes i did read it, and i am amazed that a person who does not even know me has written such a long explanation to make me understand. Thanks a lot for clearing lot of my technical doubts.Dim
@Dim No problem, you're welcome, that's what stackoverflow is all about. If you have not used all your upvote quota, you can acknowledge my help best by upvoting my answer. An upvote simply acknowledges the answer is helpful, you can upvote multiple answers to one question!Molder
IMHO this answer is a positive role model for other SO answers. Very often the question is asked at one level of experience, and the answers assume a much higher level. Answers pitched at the wrong level are correct, but incomplete.Ephebe
