How can I concatenate twice with the C preprocessor and expand a macro as in "arg ## _ ## MACRO"?

I am trying to write a program where the names of some functions are dependent on the value of a certain macro variable with a macro like this:

#define VARIABLE 3
#define NAME(fun) fun ## _ ## VARIABLE

int NAME(some_function)(int a);

Unfortunately, the macro NAME() turns that into

int some_function_VARIABLE(int a);

rather than

int some_function_3(int a);

so this is clearly the wrong way to go about it. Fortunately, the number of different possible values for VARIABLE is small, so I can simply do an #if VARIABLE == n and list all the cases separately, but is there a clever way to do it?

Venture answered 29/9, 2009 at 0:3 Comment(5)
Are you sure you don't want to use function pointers instead?Georgena
@Jurily - Function pointers work at runtime, preprocessor works at (before) compile time. There is a difference, even if both can be used for the same task.Executory
The point is that what it is used in is a fast computational geometry library.. which is hardwired for a certain dimension. However, sometimes someone would want to be able to use it with a few different dimensions (say, 2 and 3) and so one would need an easy way to generate code with dimension-dependent function and type names. Also, the code is written in ANSI C so the funky C++ stuff with templates and specialization is not applicable here.Venture
Voting to reopen because this question is specific about recursive macro expansion and https://mcmap.net/q/24832/-what-are-the-applications-of-the-preprocessor-operator-and-gotchas-to-consider is a generic "what it is good for". The title of this question should be made more precise.Matter
I wish this example had been minimized: the same happens on #define A 0 \n #define M a ## A: having two ## is not the key.Matter

Standard C Preprocessor

$ cat xx.c
#define VARIABLE 3
#define PASTER(x,y) x ## _ ## y
#define EVALUATOR(x,y)  PASTER(x,y)
#define NAME(fun) EVALUATOR(fun, VARIABLE)

extern void NAME(mine)(char *x);
$ gcc -E xx.c
# 1 "xx.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "xx.c"





extern void mine_3(char *x);
$

Two levels of indirection

In a comment to another answer, Cade Roux asked why this needs two levels of indirection. The flippant answer is because that's how the standard requires it to work; you tend to find you need the equivalent trick with the stringizing operator too.

Section 6.10.3 of the C99 standard covers 'macro replacement', and 6.10.3.1 covers 'argument substitution'.

After the arguments for the invocation of a function-like macro have been identified, argument substitution takes place. A parameter in the replacement list, unless preceded by a # or ## preprocessing token or followed by a ## preprocessing token (see below), is replaced by the corresponding argument after all macros contained therein have been expanded. Before being substituted, each argument’s preprocessing tokens are completely macro replaced as if they formed the rest of the preprocessing file; no other preprocessing tokens are available.

In the invocation NAME(mine), the argument is 'mine'; it is fully expanded to 'mine'; it is then substituted into the replacement string:

EVALUATOR(mine, VARIABLE)

Now the macro EVALUATOR is discovered, and the arguments are isolated as 'mine' and 'VARIABLE'; the latter is then fully expanded to '3', and substituted into the replacement string:

PASTER(mine, 3)

The operation of this is covered by other rules (6.10.3.3 'The ## operator'):

If, in the replacement list of a function-like macro, a parameter is immediately preceded or followed by a ## preprocessing token, the parameter is replaced by the corresponding argument’s preprocessing token sequence; [...]

For both object-like and function-like macro invocations, before the replacement list is reexamined for more macro names to replace, each instance of a ## preprocessing token in the replacement list (not from an argument) is deleted and the preceding preprocessing token is concatenated with the following preprocessing token.

So, the replacement list contains x followed by ## and also ## followed by y; so we have:

mine ## _ ## 3

and eliminating the ## tokens and concatenating the tokens on either side combines 'mine' with '_' and '3' to yield:

mine_3

This is the desired result.


If we look at the original question, the code was (adapted to use 'mine' instead of 'some_function'):

#define VARIABLE 3
#define NAME(fun) fun ## _ ## VARIABLE

NAME(mine)

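The argument to NAME is clearly 'mine'. Because fun is followed by ## in the replacement list, the argument is substituted without being macro-expanded first, though that makes no difference here since 'mine' is not a macro. More importantly, VARIABLE is not a parameter of NAME at all; it is an ordinary token in the replacement list, so it is not expanded before the ## operators are applied.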
Following the rules of 6.10.3.3, we find:

mine ## _ ## VARIABLE

which, when the ## operators are eliminated, maps to:

mine_VARIABLE

exactly as reported in the question.


Traditional C Preprocessor

Robert Rüger asks:

Is there any way to do this with the traditional C preprocessor which does not have the token pasting operator ##?

Maybe, and maybe not — it depends on the preprocessor. One of the advantages of the standard preprocessor is that it has this facility which works reliably, whereas there were different implementations for pre-standard preprocessors. One requirement is that when the preprocessor replaces a comment, it does not generate a space as the ANSI preprocessor is required to do. The GCC (6.3.0) C Preprocessor meets this requirement; the Clang preprocessor from XCode 8.2.1 does not.

When it works, this does the job (x-paste.c):

#define VARIABLE 3
#define PASTE2(x,y) x/**/y
#define EVALUATOR(x,y) PASTE2(PASTE2(x,_),y)
#define NAME(fun) EVALUATOR(fun,VARIABLE)

extern void NAME(mine)(char *x);

Note that there is no space after the comma in EVALUATOR(fun,VARIABLE); that is important because, if present, the space is copied to the output, and you end up with mine_ 3 as the name, which is not syntactically valid, of course. (Now, please can I have my hair back?)

With GCC 6.3.0 (running cpp -traditional x-paste.c), I get:

# 1 "x-paste.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "x-paste.c"





extern void mine_3(char *x);

With Clang from XCode 8.2.1, I get:

# 1 "x-paste.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 329 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "x-paste.c" 2





extern void mine _ 3(char *x);

Those spaces spoil everything. I note that both preprocessors are correct; different pre-standard preprocessors exhibited both behaviours, which made token pasting an extremely annoying and unreliable process when trying to port code. The standard with the ## notation radically simplifies that.

There might be other ways to do this. However, this does not work:

#define VARIABLE 3
#define PASTER(x,y) x/**/_/**/y
#define EVALUATOR(x,y) PASTER(x,y)
#define NAME(fun) EVALUATOR(fun,VARIABLE)

extern void NAME(mine)(char *x);

GCC generates:

# 1 "x-paste.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "x-paste.c"





extern void mine_VARIABLE(char *x);

Close, but no dice. YMMV, of course, depending on the pre-standard preprocessor that you're using. Frankly, if you're stuck with a preprocessor that is not cooperating, it would probably be simpler to arrange to use a standard C preprocessor in place of the pre-standard one (there is usually a way to configure the compiler appropriately) than to spend much time trying to work out a way to do the job.

Champaigne answered 29/9, 2009 at 0:23 Comment(4)
Yep, this solves the problem. I knew the trick with two levels of recursion -- I had to play with stringification at least once -- but didn't know how to do this one.Venture
Is there any way to do this with the traditional C preprocessor which does not have the token pasting operator ##?Elfland
@RobertRüger: it doubles the length of the answer, but I've added information to cover cpp -traditional. Note that there isn't a definitive answer — it depends on the preprocessor you've got.Champaigne
Thank you very much for the answer. This is totally great! In the meantime I also found another, slightly different solution. See here. It also has the problem that it doesn't work with clang though. Luckily that's not an issue for my application ...Elfland

Use:

#define VARIABLE 3
#define NAME2(fun,suffix) fun ## _ ## suffix
#define NAME1(fun,suffix) NAME2(fun,suffix)
#define NAME(fun) NAME1(fun,VARIABLE)

int NAME(some_function)(int a);

Honestly, you don't want to know why this works. If you know why it works, you'll become that guy at work who knows this sort of thing, and everyone will come ask you questions. =)

Biquadratic answered 29/9, 2009 at 0:17 Comment(2)
Could you explain why it needs two levels of indirection. I had an answer with one level of redirection but I deleted the answer because I had to install C++ into my Visual Studio and then it wouldn't work.Before
I want to become that guy at work who knows this sort of thing. :)Detribalize

Plain-English explanation of the EVALUATOR two-step pattern

I haven't fully understood every word of the C standard, but I think this is a reasonable working model for how the solution shown in Jonathan Leffler's answer works, explained a little more verbosely. Let me know if my understanding is incorrect, hopefully with a minimal example that breaks my theory.

For our purposes, we can think of macro expansion as happening in three steps:

  1. (prescan) Macro arguments are replaced:
    • if they are part of concatenation (A ## B) or stringification (#A), they are replaced exactly as written in the macro call, without being expanded
    • otherwise, they are first fully expanded, and only then replaced
  2. Stringification and concatenation happen
  3. All defined macros are expanded, including macros generated by concatenation

Step-by-step example without indirection

main.c

#define CAT(x) pref_ ## x
#define Y a

CAT(Y)

and expand it with:

gcc -E main.c

we get:

pref_Y

because:

Step 1: Y is the macro argument of CAT.

x appears in a concatenation pref_ ## x. Therefore, Y gets pasted as is without expansion giving:

pref_ ## Y

Step 2: concatenation happens and we are left with:

pref_Y

Step 3: any further macro replacement happens. But pref_Y is not any known macro, so it is left alone.

We can confirm this theory by actually adding a definition to pref_Y:

#define CAT(x) pref_ ## x
#define Y a
#define pref_Y asdf

CAT(Y)

and now the result would be:

asdf

because on Step 3 above pref_Y is now defined as a macro, and therefore expands.

Step-by-step example with indirection

If we use the two step pattern however:

#define CAT2(x) pref_ ## x
#define CAT(x) CAT2(x)
#define Y a

CAT(Y)

we get:

pref_a

Step 1: CAT is evaluated.

CAT(x) is defined as CAT2(x), so argument x of CAT at the definition does not appear in a concatenation: the concatenation only happens after CAT2 is expanded, which is not seen in this step.

Therefore, Y is fully expanded before being replaced, going through steps 1, 2, and 3, which we omit here because it trivially expands to a. So we put a in CAT2(x) giving:

CAT2(a)

Step 2: there is no stringification or concatenation to be done

Step 3: expand all existing macros. We have the macro CAT2(a) and so we go on to expand that.

Step 3.1: the argument x of CAT2 appears in a concatenation pref_ ## x. Therefore, paste the input a as is, giving:

pref_ ## a

Step 3.2: concatenate:

pref_a

Step 3.3: expand any further macros. pref_a is not any macro, so we are done.

GCC argument prescan documentation

GCC's documentation on the matter is also worth a read: https://gcc.gnu.org/onlinedocs/cpp/Argument-Prescan.html

Bonus: how those rules prevent nested calls from going infinite

Now consider:

#define f(x) (x + 1)

f(f(a))

which expands to:

((a + 1) + 1)

instead of going infinite.

Let's break it down:

Step 1: the outer f is called with argument x = f(a).

In the definition of f, the argument x is not part of a concatenation in the definition (x + 1) of f. Therefore it is first fully expanded before being replaced.

Step 1.1: we fully expand the argument x = f(a) according to steps 1, 2, and 3, giving x = (a + 1).

Now back in Step 1, we take that fully expanded x argument equaling (a + 1), and put it inside the definition of f giving:

((a + 1) + 1)

Steps 2 and 3: not much happens, because we have no stringification or concatenation and no more macros to expand.

Matter answered 3/11, 2020 at 11:20 Comment(0)
