Monday, 28 January 2013

gensym and unwind-protect for plain C

I wanted some simple C (I mean gcc's C, of course) version of unwind-protect, a clean-up block that was guaranteed to execute when scope for function finished. And a nice way to express it.

How does this look?

void do_test3() {
  int y=3;
  printf("hello\n");
  scope_cleanup {
    printf("cleaning up %d\n", y);
  }
  { int z=2;
    scope_cleanup { printf("inner cleanup %d\n", z); }
  }
  printf("bye\n");
}

main() {
  do_test3();
}

And when run it emits:

hello
inner cleanup 2
bye
cleaning up 3

Well darn tootin'! And you can have more than one in a function or scope too, without conflict!

How 'tis done

It's only half the story, but of course the heavy lifting is done with gcc's non-standard __attribute((cleanup(...))) which can be appended to any variable definition causing the function whose name is at ... to be called (with a pointer to the variable) when the variable goes out of scope.

Anyone who has used this feature will know how useful (though cumbersome) it is; e.g.

void free_ptr(void* ptr) {
  free(*(void**)ptr);
}
...
void* ptr=malloc(MAX_BUFFER) __attribute((cleanup(free_ptr)));

or

void close_ptr(void* ptr) {
  close(*(int*)ptr);
}
...
int fd =open(path, O_RDONY) __attribute((cleanup(close_ptr)));

But what I really want is to be able to specify is not a function but a clean-up block of code, and not tied to any particular variable, but which is executed when the scope end - like unwind-protect and such,

Something like that can be implemented using this method, but I was looking up nested function definitions in C today (I don't remember why) and I noticed that nested functions can be pre-declared using auto which made it possible to define a macro that would expand to this:

auto void cleanup (void* x);
int cleanup_var __attribute((cleanup(cleanup)));
void cleanup(void* x) ...

Of course the ... isn't part of the definition, it is the point at which a block can be expressed; which will be incorporated into the body of a nested function hastily called cleanup in this example.

Without pre-declaring the cleanup method, the cleanup method must be defined before the variable is declared, making it ugly having to pass the code block as a macro parameter.

gensym

Clearly it is easily possible to over-use the variable name cleanup_var and the function name cleanup simply by doing this more than once per function; what is needed is a way to generate unique symbol names in the C preprocessor.

It isn't possible, sadly (not really! got you there!) because the pre-processor cannot manage counters. But, (thinks I) we all know that a macro expands out to a single line and so one pre-processor symbol that always changes but remains the same for an expansion is __LINE__ (and "that be so" says you)
* see also __COUNTER__

So we have these definitions for gensym

#define GENSYM_CONCAT(name, salt) name ## salt
#define GENSYM2(name, salt) GENSYM_CONCAT(name, salt)
#define GENSYM(name) GENSYM2(name, __LINE__)

And this definition for cleanup (for those that couldn't work it out)

#define scope_cleanup auto void GENSYM(__cleanup__) (void* x); \
int GENSYM(__cleanup_var__) __attribute((cleanup(GENSYM(__cleanup__)))); \
void GENSYM(__cleanup__)(void* x)

of course, I am rather abusing the meaning of gensym by having it return the same name each time... the proper way would be to call gensym once and pass it's result to another macro but that would increase the risk of shadowing existing definitions, for which reason we are using gensym in the first place.

The original example expands out something like:

void do_test3() {
  int y=3;
  printf("hello\n");
  auto void __cleanup_29 (void* x); int __cleanup_var_33 __attribute((cleanup(__cleanup_29))); void __cleanup_29(void* x) {
    printf("cleaning up %d\n", y);
  }
  { int z=2;
    auto void __cleanup_33 (void* x); int __cleanup_var_33 __attribute((cleanup(__cleanup_33))); void __cleanup_33(void* x) { printf("inner cleanup %d\n", z); }
  }
  printf("bye\n");
}


3 comments:

  1. Of course because I use __LINE__ to construct "unique" identifiers it means you can't use this cleanup macro more than once inside another macro expansion.

    I should provide a version that lets the writer specify an alternate __cleanup__ prefix for such cases

    ReplyDelete
  2. Since I wrote this I came across another construct for returning anonymous function pointers: http://www.velocityreviews.com/forums/t730159-lambdas-in-gcc.html

    #define lambda(return_type, body_and_args) \
    ({ \
    return_type __fn__ body_and_args \
    __fn__; \
    })


    ReplyDelete
  3. But this is perhaps more portable

    https://github.com/Leushenko/C99-Lambda

    ReplyDelete