/* ... */
It’s nice to make EPOLL_EVENTS an enum, since doing so aids in
debugging and is more elegant. But it’s also nice for programs to be
able to check for the availability of a particular flag with
#ifdef EPOLLPRI. So the <sys/epoll.h> header solves both problems,
taking advantage of the fact that cpp mostly doesn’t expand macros
recursively. After these definitions, the token EPOLLIN will expand to
itself once and then stop expanding, so it’s effectively equivalent to
an enum that also supports #ifdef.
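As a minimal sketch of the same pattern outside of <sys/epoll.h>, the following uses hypothetical names (DEMO_EVENTS, DEMO_IN, DEMO_PRI, have_demo_pri) that are assumptions made for illustration. Each enumerator is shadowed by a macro that expands to itself, so the name is both a real enum value and visible to #ifdef.

```cpp
// Hypothetical names throughout; this only mirrors the pattern described
// for EPOLL_EVENTS and is not copied from any system header.
enum DEMO_EVENTS {
  DEMO_IN = 0x1,
#define DEMO_IN DEMO_IN    // expands to itself once, then stops
  DEMO_PRI = 0x2,
#define DEMO_PRI DEMO_PRI
};

// Because DEMO_PRI is also a macro, feature checks with #ifdef work.
#ifdef DEMO_PRI
constexpr bool have_demo_pri = true;
#else
constexpr bool have_demo_pri = false;
#endif

static_assert(have_demo_pri);
static_assert(DEMO_IN == 0x1 && DEMO_PRI == 0x2);
```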
To prevent recursion, cpp associates a bit with every macro that has
been defined. The bit reflects whether the macro is currently being
replaced with its substitution list, so let’s call it the
replacing bit. Cpp furthermore associates a bit with each token
in the input stream, signifying that the token can never be
macro-expanded. Let’s call the latter bit the unavailable bit.
Initially, the replacing and unavailable bits are all clear.
As cpp processes each input token T, it decides whether to set T’s
unavailable bit and whether or not to macro-expand T as follows:
If T is the name of a macro for which the replacing bit is true, cpp
sets the unavailable bit on token T. Note that even if T is not in a
context where it could be macro-expanded (because it’s a function-like
macro not followed by “(”), cpp still sets the unavailable bit.
Moreover, once the unavailable bit has been set on an input token, it
is never cleared.
If T is the name of an object-like macro and T’s unavailable bit is
clear, then T is expanded.
If T is the name of a function-like macro, T’s unavailable bit is
clear, and T is followed by (, then T is expanded. Note, however, that
if T is called with an invalid number of arguments, then the program is
ill-formed.
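The rules above can be observed with a small sketch. The names LEVEL and TWICE are assumptions made for illustration. The first macro mentions its own name, so the inner occurrence is left alone during rescan. The second shows that a function-like macro name not followed by ( is treated as an ordinary identifier.

```cpp
// Hypothetical names (LEVEL, TWICE) chosen only for this illustration.

constexpr int LEVEL = 1;     // ordinary constant, declared before the macro
#define LEVEL (LEVEL + 1)    // object-like macro that mentions its own name

// LEVEL expands once to (LEVEL + 1); the inner LEVEL is seen while the
// macro is being replaced, so it is not expanded again and names the constant.
static_assert(LEVEL == 2);

#define TWICE(x) ((x) * 2)
constexpr int TWICE = 21;    // TWICE is not followed by '(' so it is not expanded

// The outer TWICE( ... ) is expanded; the argument TWICE is not followed
// by '(' and stays a plain identifier, giving ((TWICE) * 2).
static_assert(TWICE(TWICE) == 42);
```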
If cpp decides not to macro-expand T, it simply adds T to the current
output token list. Otherwise, it expands T in two phases.
1. When T is a function-like macro, cpp scans all of the arguments
supplied to T and performs macro expansion on them. It scans arguments
the same as normal token processing, but instead of placing output
tokens in the main preprocessor output, it builds a replacement token
list for each of T’s arguments. It also remembers the original,
non-macro-expanded arguments for use with # and ##.
2. Cpp takes T’s substitution list and, if T had arguments, replaces
any occurrences of parameter names with the corresponding argument
token lists computed in step 1. It also performs stringification and
pasting as indicated by # and ## in the substitution list. It then
logically prepends the resulting tokens to the input list. Finally, cpp
sets the replacing bit to true on the macro named T.
With the replacing bit true, cpp continues processing input as usual
from the tokens it just added to the input list. This may result in more
macro expansions, so it is sometimes called the rescan phase. Once cpp
has consumed all tokens generated from the substitution list, it clears
the replacing bit on the macro named T.
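The difference between the expanded arguments and the original ones used by # and ## can be seen in a short sketch. The macro names VALUE, STR, XSTR, CAT, and XCAT and the constant VALUE1 are assumptions made for illustration. The direct forms see the raw argument, while the extra level of indirection lets the argument be macro-expanded first.

```cpp
#include <string_view>

// Hypothetical names chosen only for this illustration.
#define VALUE 42
#define STR(x) #x            // '#' uses the original, unexpanded argument
#define XSTR(x) STR(x)       // x is macro-expanded before STR sees it
#define CAT(a, b) a##b       // '##' also uses the unexpanded arguments
#define XCAT(a, b) CAT(a, b) // a and b are macro-expanded first

constexpr int VALUE1 = 7;    // the identifier that CAT(VALUE, 1) pastes together

static_assert(std::string_view(STR(VALUE)) == "VALUE");
static_assert(std::string_view(XSTR(VALUE)) == "42");
static_assert(CAT(VALUE, 1) == 7);    // pastes VALUE and 1 into VALUE1
static_assert(XCAT(VALUE, 1) == 421); // pastes 42 and 1 into 421
```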
Let’s look at a simple example:
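A definition consistent with the walkthrough that follows is sketched here. The exact body of FL is an assumption, inferred from the expansion ((((5)+1))+1) given in the text.

```cpp
// Assumed definition, reconstructed from the expansion described in the text.
#define FL(x) ((x)+1)

// FL(FL(5)): the inner call is expanded while scanning the outer call's
// argument, so the whole expression becomes ((((5)+1))+1).
static_assert(FL(FL(5)) == 7);
```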
In phase 1, the argument of the outer macro, namely “FL(5),” gets
expanded to the token list ((5)+1), yielding FL(((5)+1)). Expanding the
outer FL macro substitutes this argument for the parameter x in the
substitution list, producing ((((5)+1))+1). The result should be
reasonably intuitive. The one thing to note is that because expansion of
the inner FL happened in phase 1, FL’s replacing bit was clear and no
tokens ever needed their unavailable bits set.
Now let’s look at a more interesting example:
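The walkthrough below can be checked with a sketch along these lines. The definition of ID and the input ID(ID)(ID)(X) are assumptions inferred from the text (a macro with a parameter named arg whose substitution list is just arg). STR, XSTR, and same_tokens are hypothetical helpers that turn the expansion into a string and compare it while ignoring spaces, since the exact spacing of stringified output can vary between preprocessors.

```cpp
#include <cstddef>
#include <string_view>

#define ID(arg) arg                 // assumed definition, inferred from the text
#define STR(...) #__VA_ARGS__       // helper: stringify without expanding
#define XSTR(...) STR(__VA_ARGS__)  // helper: expand first, then stringify

// Helper (hypothetical): compare two strings, skipping over space characters.
constexpr bool same_tokens(std::string_view a, std::string_view b) {
  std::size_t i = 0, j = 0;
  for (;;) {
    while (i < a.size() && a[i] == ' ') ++i;
    while (j < b.size() && b[j] == ' ') ++j;
    if (i == a.size() || j == b.size()) return i == a.size() && j == b.size();
    if (a[i++] != b[j++]) return false;
  }
}

// Only the first ID( ... ) call is expanded. The ID it produces is marked
// unavailable during rescan, so the remaining tokens are left as they are.
static_assert(same_tokens(XSTR(ID(ID)(ID)(X)), "ID(ID)(X)"));
```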
Consider the first part of the token sequence, namely ID(ID). We start
in phase 1 by macro-expanding the inner ID, but since it’s a
function-like macro not followed by (, cpp decides not to expand it.
Hence, cpp replaces arg with ID in the outer ID’s substitution list, and
pushes the result onto the input list. Then it sets macro ID’s replacing
bit and proceeds to phase 2 (rescan). Upon processing the first token,
ID, cpp will set its unavailable bit (since ID has replacing true) and
not expand it. Finally, cpp will clear ID’s replacing bit, but at this
point there is nothing left to expand because the third ID is not
followed by (.
When we expand F(), note that F_AGAIN is not followed by (, so it does
not get expanded as a macro. Sure, one step later, PARENS gets expanded
to (), but at this point cpp has already output the token F_AGAIN, so
it’s too late to decide to expand it. Hence, the output of F(), namely
f F_AGAIN ()(), may contain an unexpanded macro call, but the
unavailable bits are clear on all tokens.
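Definitions consistent with this paragraph might look like the sketch below. The bodies of PARENS, F_AGAIN, and F are assumptions inferred from the stated output f F_AGAIN ()(). STR, XSTR, and same_tokens are hypothetical helpers for checking the expansion at compile time without depending on exact spacing.

```cpp
#include <cstddef>
#include <string_view>

// Assumed definitions, inferred from the output described in the text.
#define PARENS ()
#define F_AGAIN() F
#define F() f F_AGAIN PARENS()

#define STR(...) #__VA_ARGS__       // helper: stringify without expanding
#define XSTR(...) STR(__VA_ARGS__)  // helper: expand first, then stringify

// Helper (hypothetical): compare two strings, skipping over space characters.
constexpr bool same_tokens(std::string_view a, std::string_view b) {
  std::size_t i = 0, j = 0;
  for (;;) {
    while (i < a.size() && a[i] == ' ') ++i;
    while (j < b.size() && b[j] == ' ') ++j;
    if (i == a.size() || j == b.size()) return i == a.size() && j == b.size();
    if (a[i++] != b[j++]) return false;
  }
}

// F_AGAIN is followed by PARENS rather than '(' when cpp examines it, so it
// is output unexpanded; only afterwards does PARENS become ().
static_assert(same_tokens(XSTR(F()), "f F_AGAIN ()()"));
```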
Now consider what happens when we call ID(F()). Well, first we expand
the argument F() to f F_AGAIN ()(). Then we are done, so we clear F’s
replacing bit. Next, ID substitutes f F_AGAIN ()() for arg in its
substitution list (namely the single token arg). So the preprocessor
sets ID’s replacing bit and rescans f F_AGAIN ()(), causing F_AGAIN and
then F to expand. But of course the same PARENS trick prevents the
second F_AGAIN from getting expanded.
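The effect of each extra rescan can be sketched as follows. The macro bodies are assumptions inferred from the text, and STR, XSTR, and same_tokens are hypothetical helpers for comparing expansions while ignoring spaces. Each additional ID wrapper forces one more rescan and therefore one more level of the deferred recursion.

```cpp
#include <cstddef>
#include <string_view>

// Assumed definitions, inferred from the text.
#define PARENS ()
#define F_AGAIN() F
#define F() f F_AGAIN PARENS()
#define ID(arg) arg

#define STR(...) #__VA_ARGS__       // helper: stringify without expanding
#define XSTR(...) STR(__VA_ARGS__)  // helper: expand first, then stringify

// Helper (hypothetical): compare two strings, skipping over space characters.
constexpr bool same_tokens(std::string_view a, std::string_view b) {
  std::size_t i = 0, j = 0;
  for (;;) {
    while (i < a.size() && a[i] == ' ') ++i;
    while (j < b.size() && b[j] == ' ') ++j;
    if (i == a.size() || j == b.size()) return i == a.size() && j == b.size();
    if (a[i++] != b[j++]) return false;
  }
}

// One rescan through ID expands the pending F_AGAIN () once more.
static_assert(same_tokens(XSTR(ID(F())), "f f F_AGAIN ()()"));
// A second ID wrapper adds a second rescan and one more f.
static_assert(same_tokens(XSTR(ID(ID(F()))), "f f f F_AGAIN ()()"));
```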
Note that we’ve tweaked EXPAND so that it handles macros that output
commas by simply using __VA_ARGS__ instead of a named arg.
The bulk of the work happens in FOR_EACH_HELPER(macro, a1, ...), which
applies macro to argument a1, and then uses __VA_OPT__ to recurse if the
remaining arguments are not empty. Just as in the previous section, it
uses the PARENS trick to enable recursion. The only catch, of course, is
that we have to keep re-scanning the macro, which is why the FOR_EACH
macro wraps FOR_EACH_HELPER in the EXPAND macro we saw before. For good
measure, FOR_EACH also uses __VA_OPT__ to handle the case of an empty
argument list.
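Putting the description together, a sketch of these macros might look like the following. The names FOR_EACH, FOR_EACH_HELPER, EXPAND, and PARENS come from the text. FOR_EACH_AGAIN, the EXPAND1 through EXPAND4 helpers, the number of rescan levels, and the ADD macro used to exercise the loop are assumptions made for illustration. It needs a preprocessor that supports __VA_OPT__ (C++20).

```cpp
#define PARENS ()

// Each level rescans its arguments several times; the nesting depth is an
// assumption and bounds how many list elements can be processed.
#define EXPAND(...) EXPAND4(EXPAND4(EXPAND4(EXPAND4(__VA_ARGS__))))
#define EXPAND4(...) EXPAND3(EXPAND3(EXPAND3(EXPAND3(__VA_ARGS__))))
#define EXPAND3(...) EXPAND2(EXPAND2(EXPAND2(EXPAND2(__VA_ARGS__))))
#define EXPAND2(...) EXPAND1(EXPAND1(EXPAND1(EXPAND1(__VA_ARGS__))))
#define EXPAND1(...) __VA_ARGS__

// Produces nothing at all when the argument list is empty.
#define FOR_EACH(macro, ...) \
  __VA_OPT__(EXPAND(FOR_EACH_HELPER(macro, __VA_ARGS__)))

// Applies macro to a1, then defers a recursive call if arguments remain.
#define FOR_EACH_HELPER(macro, a1, ...) \
  macro(a1)                             \
  __VA_OPT__(FOR_EACH_AGAIN PARENS (macro, __VA_ARGS__))

// Expanded only on a later rescan, once PARENS has turned into ().
#define FOR_EACH_AGAIN() FOR_EACH_HELPER

// Hypothetical per-element macro used to exercise the loop.
#define ADD(x) + (x)

static_assert((0 FOR_EACH(ADD, 1, 2, 3)) == 6);  // 0 + (1) + (2) + (3)
static_assert((0 FOR_EACH(ADD)) == 0);           // empty list adds nothing
```

The compile-time sums are only a convenient way to confirm that every element was visited; any macro taking one argument could be passed in place of ADD.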
Will I use this in production code? I’m thinking about it. In my
first decade of C++ programming, I used to think that being a good C++
programmer was all about showing how clever you are. Now as a wise old
senior faculty member, I know that being a good C++ programmer is all
about showing restraint. You need to know both how to be clever
and when to be clever. So let’s do the cost-benefit analysis,
starting with the alternative approaches:
You could manually maintain separate enum declarations and
pretty-printer/scanner functions, with the risk that they could get out
of sync.
You could generate the C++ code using another program, but this
complicates the build process and typically doesn’t make the code any
more readable. C++ isn’t a great language for generating text, and if
you use perl or python or bash, the code won’t necessarily be more
transparent to other C++ programmers.
So I think the FOR_EACH approach is actually a net win over options 2
and 3. The most restrained option, which you should always be
considering in C++, is number 1.
How much are we paying in complexity for the use of FOR_EACH? It’s
definitely tricky to understand how FOR_EACH works if you don’t know how
cpp works. It’s also, unfortunately, hard to figure out how cpp works. I
was unable to understand the C++ language specification for macro
replacement until I’d already understood how cpp works.
https://en.cppreference.com/ doesn’t get into nearly the level of detail
necessary. On the other hand, I now have this blog post I can reference
in my source code, so writing this post is actually part of deciding
whether or not I want to use the trick. Of course, others should feel
free to do the same… I place all the cpp macros in this blog post in the
public domain.
FOR_EACH is also far from the grossest use of macros I’ve seen. It
doesn’t even use token pasting (##) to synthesize new tokens that you
can’t textually search for. Even though the implementation is tricky to
understand, it’s at least short. More importantly, the interface to
FOR_EACH is quite intuitive. For a multi-line C macro, I think MAKE_ENUM
is fairly readable. And once you employ FOR_EACH in one place, you can
potentially amortize the complexity over other uses of the macro.
Whatever you think of the trade-offs, this much is certain: the
introduction of __VA_OPT__
makes FOR_EACH