Thursday 18 May 2017

Macro detection overload - Special Access Macros

This article talks about how one macro can call another macro if it exists, but if not, it can fallback to a default macro. This can be useful for behaviour vectoring or interception, debugging macros, etc.

Preferring Special Access Macros, but falling back to default macros

It would be nice if a C pre-processor macro can tell if another macro is defined.

It can... in a way. If you know in advance the name of the other macro you are looking it, you can do some comparison tricks as explained by Paul Fultz II in his C Preprocessor tricks, tips, and idioms.

But what if you don't know in advance what it will be called?

What if you are composing a combination of C macros and functions, where the developer can override a special case of the default behaviour for reasons of efficiency?

Maybe you have a cross-bar / plug-board, with data from a variety of sources, to undergo a variety of processing and be sent to a variety of destinations. Maybe you have a general route from a to b but sometimes want to drop in a specific optimisation.

Maybe you want to call a macro process(data, handler) where handler is the name of a macro or function used to emit the slowly generically processed data. Maybe process itself is actually an argument passed to you.

e.g.
#define work(process, data, handler) process(data, handler)
...
work(mangle, input, archive);
work(mangle, input, deliver);
A conceptual mangle might be defined (using GCC / clang statement expressions):
#define mangle(handler, data) ({
  for(int x, int i=0; ITERATATE_DATA(&x, i, data); i++) {
     handler(x);
  }
})
but perhaps it's not efficient for mangle to call archive for each processed input element, and we want to have a specialised function to mangle the input for archive instead.

After all there is a big difference between dest[index++]=c; and http_post_value(dest, c); when being called on every piece of data!

Of course the archive handler is simply a macro name and can't be passed as a parameter to a function (this isn't C++ templates you know...).

But that's OK, we write a specific function to mangle the input for the archive output, by calling the archive macro inside the function:
void mangle_archive(void* data) {
  void* buffer = bulk_mangle_to_buffer(data);
  archive(buffer);
  mangle_free(buffer);
}
That's nice.

But we'd like work(mangle, input, archive) to expand to call function mangle_archive(data) but we'd still like work(mangle, input, deliver) to expand to call the generic macro mangle(data, deliver) because it's not worth writing a more efficient handler to that. (But we might write one later, and if we did, we would want it to just work).

Can we do this? The answer is yes!

Whoa!

Now the difficulty is that (unlike in Paul's example) that we don't know in advance all the different handlers or processes that the user might have, so we can't go declaring in advance a load of macros like COMPARE_foo in advance. (Furthermore this is an ugly requirement to put on the author.)

However, in our case we can require that the special macro body (with which to override the default behaviour) should be entirely enclosed in parenthesis.

So while mangle_archive may well ultimately be a function, we will also define a macro wrapper thus:
#define mangle_archive(...) (mangle_archive(__VA_ARGS__))
Or something like that. As long as it is a macro whose body is in parenthesis and whose name is formed by joining mangle with archive. How does this help?

We have mangle and archive as input parameters, and if we can construct the name mangle_archive from mangle and archive, and then attempt to invoke it (as a macro, with the right number of parameters) it would result in (mangle_archive(...)) which (as the more enlightened will excitedly note) is an expression in parenthesis.

As anybody who has been reading Fultz2Heathcote and Gustedt will know, a well formed expression contained within parenthesis can be made to disappear entirely if prefixed by something such as the name of a macro which was defined thus:
#define VANISH(...)
e.g., with the right amount of coaxing:
VANISH (mangle_archive(__VA_ARGS__))
is quite easily persuaded to be nothing at all. And we can detect nothing at all. (See Jen's piece Detect empty macro arguments).

Clearly, if mangle_archive were not defined, we would have:
VANISH mangle_archive(__VA_ARGS__) which clearly won't be persuaded to vanish, and if mangle_archive were defined to something (not entirely) enclosed in parenthesis, we might have:
VANISH something (not entirely) enclosed in parenthesis
which clearly won't vanish either.

So to detect if the developer has defined a valid overriding macro, we simply attempt to invoke the macro with a VANISHing prefix and then use standard mechanisms already discovered to see if anything is left. (Using EVAL, naturally).

The actual trick is that rather than
#define VANISH(...)
we leave a comma (thanks Jens)
#define MAGIC_EAT_PARENTHESIS(...) ,
And put this at the start of an argument list to be followed by
MAGIC_MAKE_NAME(FUNC, IMPLEMENTATION),FUNC
We then grab the second argument with
#define MAGIC_DETECT_MACRO_(_, RESULT, ...) RESULT
and whether that is the concatenation of the process and handler, or just the process depends on whether or not an extra comma was inserted which would offset the first argument to become the second argument.

The resultant call to the overriding macro is called with the same arguments as would be supplied to the default macro, including the name of the handler. Of course, the handler is implicit now, being defined in the overriding function body, so the overriding macro can dispose of this (perhaps after a static assert) before calling the overriding function.
The simplified full solution is here.

MAGIC_DISPATCH takes at least two or three arguments, which are:

  1. default process name
  2. output handler, which when joined to the default process name becomes a potential overriding process name
  3. other arguments to pass to whichever of the processes is selected, should include at least the handler as the default macro will need to know where to send the data.
#define MAGIC_MAKE_NAME_(a, b) a##_##b
#define MAGIC_MAKE_NAME(a, b) MAGIC_MAKE_NAME_(a, b)
#define MAGIC_EMPTY()
#define MAGIC_DEFER(id) id MAGIC_EMPTY()
#define MAGIC_EVAL(...) __VA_ARGS__
#define MAGIC_EAT_PARENTHESIS(...) ,
#define MAGIC_DETECT_MACRO_(_, RESULT, ...) RESULT
#define MAGIC_DETECT_MACRO(...) MAGIC_DETECT_MACRO_(__VA_ARGS__)
#define MAGIC_DISPATCH_(FUNC, IMPLEMENTATION, ...) \
        MAGIC_DEFER(MAGIC_DETECT_MACRO) \
              (MAGIC_EAT_PARENTHESIS MAGIC_MAKE_NAME(FUNC, IMPLEMENTATION)(__VA_ARGS__) \
               MAGIC_MAKE_NAME(FUNC, IMPLEMENTATION),FUNC)\
        (__VA_ARGS__)

#define MAGIC_DISPATCH(...) MAGIC_EVAL(MAGIC_DISPATCH_(__VA_ARGS__))

#define work(process, data, handler) MAGIC_DISPATCH(process, handler, handler, data)
#define mangle_archive(handler, ...) (STATIC_ASSERT(#handler == "archive"), do_mangle_archive(__VA_ARGS__))

work(mangle, data, archive); // This will be overridden with mangle_archive
work(mangle, data, deliver); // This will call the default mangle
giving the output:
(STATIC_ASSERT("archive" == "archive"), do_mangle_archive(data));
mangle(deliver, data);
A last minute entry:
#define mangle_deliver(handler, data) (crunch_data(data))
work(mangle, data, deliver); // this will be overridden with mangle_deliver
gives:
(crunch_data(data));
If this looks too theoretical, consider that handler could be a list whose first member is the macro to call and whose other members could be trailing (or leading) arguments that are inserted into the process invocation. Such a handler would be decomposed by the invoker work to form a correct argument set for MAGIC_DISPATCH.

In such decomposition, these macros may prove useful.
#define MAGIC_LIST(...) (__VA_ARGS__)
#define MAGIC_UNLIST(...) __VA_ARGS__ 
#define MAGIC_UNLIST_FIRST(FIRST, ...) FIRST
though a more powerful level of EVAL may be required.

Much thanks to Paul Fultz II (C Preprocessor tricks, tips, and idioms), Jonathan Heathcote ( C Pre-Processor Magic), and particularly Jens Gustedt (P99 Macros for Emulation of C11), for documenting their work, which I have read, confused over, and slept on for many a night.