Monday 16 September 2013

Why are setjmp /volatile hacks STILL needed?

We all know the old gotcha of setjmp (but I'm going to recount it anyway) which is that when longjmp is executed (and setjmp appears to complete again, but this time with a non-zero result) some (or all) of the original register set will be restored.

Any local variables which were (possibly temporarily) stored in the local registers reserved for such use could appear to revert back to their previous values until they are re-freshed from the stack frame.

And so all pre-existing local variables that would be accessed after the longjmp should be declared volatile so that they will not be fetched from the registers.

Well... why is there no special #pragma or other option attached to the setjmp function indicating a register clobber-list to the compiler that all the local registers could have been modified and it should not depend on cached values any more?

Why do we need such a pragma? Why can't we #define setjmp to be something like:

#define setjmp(env) (setjmp(env) + __asm__ volatile ("mov %%eax,0" : : : "memory" "ebx" "ecx" "etc..." ));

I'm sure that example is insufficient, but I also think that some working method could be contrived, so I'll work on it...

and then...


OK, having tried; I realise that first problems are

  • not stopping the extra return from setjmp from using stale values in restored registers
  • that the correct values from the registers will not have been copied back into the stack frame, but will somewhere be pushed on the stack by a called function where we can't retrieve them

But the worst problem is that the optimiser can remove operations that ought to have been noticeable:

#include 
#include 

int getint() {
  static int i=3;

  i+=2;

  return i;
}

int main(void) {
  static jmp_buf j;
  volatile int x;
  int y;
  x = getint();
  y = getint();
  if (setjmp(j) == 1) {
    asm volatile("nop" : : : "memory", "ecx", "edx");
    printf("%d %d\n", x, y);
    return 0;
  }
  y++;
  x++;
//  printf("%d %d\n", x, y); // Second Printf
  longjmp(j, 1);
}

Unless the second printf is uncommented, the line for y++ is not emitted even at optimisation -O1
The reason is that y is not used after that point, even though the longjmp (acting somewhat like a goto) might have the effect of returning to a point where the y is used.

If I replace the longjmp with a goto then y++ is emitted in the code.

The first lesson: Use volatile in the way everyone says you should; all variables to be accessed in the second return of setjmp should be volatile.

The second lession, some contrivance of:
blah: if (setjmp(j) == 1) {
...
  if (never_true()) goto blah; else longjmp(j, 1);

generates the right code too by recognizing the effect of the goto rather than the longjmp however this cannot be sensibly managed as there can easily be multiple setjmp in a function.

So I still search for some means to cause a block of code to commit all temporary registers back to the stack frame before calling another function, and a means to cause all temporary registers to be flushed on second return from setjmp (although this may be covered by the first case).

further reading


I read http://stackoverflow.com/a/7734313 that:
The returns_twice attribute tells the compiler that a function may return more than one time. The compiler will ensure that all registers are dead before calling such a function and will emit a warning about the variables that may be clobbered after the second return from the function. Examples of such functions are setjmp and vfork. The longjmp-like counterpart of such function, if any, might need to be marked with the noreturn attribute.
So in fact gcc at least does evict any registers before calling setjmp but that doesn't prevent local variables being cached later on and not comitted to the stack frame before longjmp is called.

So I really just need a way to mark that functions might call longjmp and that temporary registers should be evicted before calling; but I know deep down that this is rubbish as longjmp might be called even from a signal handler.

3 comments:

  1. Wow:

    "You cannot portably store the result of setjmp() into a variable. The C standard says: "_An invocation of the setjmp macro shall appear only in one of the following contexts: the entire controlling expression of a selection or iteration statement; one operand of a relational or equality operator with the other operand an integer constant expression, with the resulting expression being the entire controlling expression of a selection or iteration statement; the operand of a unary ! operator with the resulting expression being [...]; or the entire expression of an expression statement [...]." – Jonathan Leffler Nov 7 '09 at 20:53"

    http://stackoverflow.com/questions/1692814/exception-handling-in-c-what-is-the-use-of-setjmp-returning-0#comment1569548_1692864

    ReplyDelete
  2. I just learned that C++ exceptions are rubbish anyway so I should stop worrying about my hacked up ones not being "as good"

    Exceptions are good, it seems, only for properly attributing and reporting faults, and then aborting

    http://yosefk.com/c++fqa/exceptions.html#fqa-17.1

    ReplyDelete
  3. This comment has been removed by a blog administrator.

    ReplyDelete