Monday 16 September 2013

Why are setjmp /volatile hacks STILL needed?

We all know the old setjmp gotcha (but I'm going to recount it anyway): when longjmp is executed, setjmp appears to return a second time, this time with a non-zero result, and some (or all) of the original register set is restored.

Any local variables that were held (possibly temporarily) in registers could therefore appear to revert to their previous values until they are refreshed from the stack frame.

And so every pre-existing local variable that will be accessed after the longjmp should be declared volatile, so that it is always read from memory rather than from a register.

Well... why is there no special #pragma or other annotation attached to the setjmp function, something like a register clobber list, telling the compiler that all the local registers could have been modified and that it should no longer depend on cached values?

Why do we need such a pragma? Why can't we #define setjmp to be something like:

#define setjmp(env) (setjmp(env) + __asm__ volatile ("mov %%eax,0" : : : "memory" "ebx" "ecx" "etc..." ));

I'm sure that example is insufficient, but I also think that some working method could be contrived, so I'll work on it...

and then...

OK, having tried, I realise that the first problems are:

  • the extra return from setjmp is not stopped from using stale values in the restored registers
  • the correct values from the registers will not have been copied back into the stack frame; they will have been pushed somewhere on the stack by a called function, where we can't retrieve them

But the worst problem is that the optimiser can remove operations that ought to have been noticeable:


#include <setjmp.h>
#include <stdio.h>

int getint(void) {
  static int i=3;
  return i;
}

int main(void) {
  static jmp_buf j;
  volatile int x;
  int y;
  x = getint();
  y = getint();
  if (setjmp(j) == 1) {
    asm volatile("nop" : : : "memory", "ecx", "edx");
    printf("%d %d\n", x, y);
    return 0;
  }
  y++; // the increment that the optimiser drops
//  printf("%d %d\n", x, y); // Second Printf
  longjmp(j, 1);
}

Unless the second printf is uncommented, the code for y++ is not emitted, even at optimisation level -O1.
The reason is that y is not used after that point, even though the longjmp (acting somewhat like a goto) has the effect of returning to a point where y is used.

If I replace the longjmp with a goto then y++ is emitted in the code.

The first lesson: use volatile the way everyone says you should; every variable that will be accessed after the second return of setjmp should be volatile.

The second lesson: some contrivance such as
blah: if (setjmp(j) == 1) {
    ...
  }
  ...
  if (never_true()) goto blah; else longjmp(j, 1);

generates the right code too, because the compiler recognises the effect of the goto where it does not recognise that of the longjmp. However, this cannot be managed sensibly, as a function can easily contain multiple setjmp calls.

So I still search for some means to cause a block of code to commit all temporary registers back to the stack frame before calling another function, and a means to cause all temporary registers to be flushed on second return from setjmp (although this may be covered by the first case).

further reading

I read that:
The returns_twice attribute tells the compiler that a function may return more than one time. The compiler will ensure that all registers are dead before calling such a function and will emit a warning about the variables that may be clobbered after the second return from the function. Examples of such functions are setjmp and vfork. The longjmp-like counterpart of such function, if any, might need to be marked with the noreturn attribute.
So gcc, at least, does evict all registers before calling setjmp, but that doesn't prevent local variables from being cached later on and not committed to the stack frame before longjmp is called.

So I really just need a way to mark that functions might call longjmp and that temporary registers should be evicted before calling; but I know deep down that this is rubbish as longjmp might be called even from a signal handler.

Thursday 12 September 2013

Can all for-loops be transformed to while-loops?

I previously wrote on using for as a brace-less scope, using a trick by Jens Gustedt, but I wanted any break statement used within that scope to propagate, so that it would take effect in an enclosing loop or switch statement.

I found a method that works for gcc, but it makes use of its statement expressions (compound statements in parentheses, a GNU extension).


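#include <stdio.h>

int main(void) {
  int a;
  for (a=1; a<=2; a++) {
    printf("Main context a=%d\n", a);

    /* gcc only: the break in the statement expression leaves the a-loop */
    for (int o = 0; o >=0; ({ if (o == 1) { printf("Detected break\n"); break; } }) )
      for (int i=0; !o && (o=1), i==0; o=-1, i=-1 ) { printf("Inner context\n"); break; }
  }
  return 0;
}

Output: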

Main context a=1
Inner context
Detected break

This shows that the break statement in the inner context was propagated to take effect in the top-level loop, by means of the break statement in the statement expression of the second loop.

That's nice, and it is the intended effect, but a for-loop of this form

for ( expression-1 ; expression-2 ; expression-3 ) statement;

is meant to be equivalent to this while loop:

expression-1 ;
while ( expression-2 ) {
  statement
  expression-3 ;
}

In my case, expression-3 consisted of ({ if (o == 1) { printf("Detected break\n"); break; } }) and the break statement took effect in the containing scope, no doubt because it was not part of the statement of its associated for-loop.

But it would transform into this while-loop:

expression-1 ;
while ( expression-2 ) {
  statement
  printf("Detected break\n"); break ;
}

Can there be any doubt that in this while-loop, the break statement would terminate the loop itself and not any containing scope? I don't think so.

Therefore gcc statement expressions in for-loops open the door to high-class trickery which cannot be achieved the normal way.

using sed to split a stream into 2 streams

An expensive file listing operation needs to invoke an action on the listed files.

xargs is normally the candidate for that, but what about when there are multiple file types with varied actions?

Normally I would pipe into a bash scriptlet like this

... | while read -r file ; do if [ "$(expr "$file" : "$pattern")" = "0" ] ; then ... ; else ... ; fi ; done

but it lacks the bulk appeal of xargs which can reduce the number of command invocations by thousands of times for a large file list.

So here I make use of sed and bash's >( ... ) process substitution, which opens a subshell and substitutes a magic filename referring to a file descriptor that writes to the input of the subshell. (The substituted filename is typically something like /dev/fd/63.) The newline after the substitution is needed because sed's w command takes the rest of the line as the filename; in a terminal session it can be entered with ^V ^J. It is also essential that there are no spaces between the ' and >( or between the ) and ', otherwise the sed script will be presented to sed as multiple arguments instead of one.

... | sed -e '/\.ko$/{w'>( xargs strip --strip-debug )'
;d}' | xargs strip

This allows kernel objects to have only their debug symbols stripped, while other objects are stripped entirely.

An alternative would be to use tee and a separate grep

... | tee >( grep '\.ko$' | xargs strip --strip-debug ) | grep -v '\.ko$' | xargs strip