Thursday 27 October 2022

Is Rust sacrificing 3U safety for stability?

My original post is too malign or pejorative or something, apparently imputing dark motives when I mean only to highlight the ostensible appearance. So I try again.

[This rant is a work in Progress and will receive regular updates]

TL;DR: This is a general call for prioritisation of breaking changes over preservation of unexpected undetectable unsafe or undefined behaviour in Rust, and for full transparency with respect to such faults. (And maybe there is and I haven't seen it).

Will Rust sacrifice 3U safety for stability?

3U safety refers to safety from unexpected undetectable unsafe behaviours (or, if you prefer, unexpected undetectable undefined behaviours).

The utility of C and other ancient software development tools harks back to those ages when they could be compared to that dog walking on its hind legs: The wonder isn't that it does it well, but that it does it at all. And in comparison to what preceded them, they did it well.

But it's no longer enough.

The instability of such tools has been a major hindrance to the development and maintenance of secure software.

Youthfully optimistic C developers blithely consider C to be a portable macro assembler. The experience that comes with time teaches them that it isn't. 

C compilers have been secretly throwing away increasing chunks of code for decades, and not just in simple cases like throwing away sub-statement code by remembering that a variable is already in a register, or throwing away whole statements like function calls that zero a sensitive buffer (because it wasn't read-from again), any new compiler edition might throw away whole blocks of code that you inserted for security checks, to fix bugs -- and not tell you. 

This is not an attack on C compiler authors or maintainers but is to illustrate (due to insufficient congruence of priorities) the shifting sands on which much software has been, and is being, built.

Will Rust go the same way?

Stability or Speed?

In a much-contested view (especially by compiler authors) D. J. Bernstein controversially wrote in 2015:

...gcc and clang both feel entitled to arbitrarily change the behavior of "undefined" programs. Pretty much every real-world C program is "undefined" according to the C "standard", and new compiler "optimizations" often produce new security holes in the resulting object code, as illustrated by

and many other examples. Crypto code isn't magically immune to this--- one can easily see how today's crypto code audits will be compromised by tomorrow's compiler optimizations, even if the code is slightly too complicated for today's compilers to screw up.

If you want the short version of the dispute, C compilers take advantage of what to developers is an uncontrollably increasing collection of what is termed "undefined behaviour" in order to make speed optimizations because... speed is obviously what you want, and has been one of the main metrics in the compiler wars. (Smallness of output being the other main metric).

It plays out like this: There is an ever-growing bunch of rules that developers must adhere to, and if they don't, the compiler can do what it likes, and it's your fault. When we say do what it likes we mean things like: ignore your code. And compiler writers stick to that like glue, even when it makes it exceedingly difficult to write secure software, and when the introduction of new undefined behaviour makes existing secure software suddenly become insecure when compiled with a new compiler. 

You have the responsibility to look at the compiler changes list and put all the right flags in your makefile to get the old behaviour. Flags that the old compiler won't recognize. How often have you seen a makefile that assembled CFLAGS flags based on the compiler version?

Now the language designers and compiler authors are right because they make the rules, but the fight to produce correct and safe software becomes more and more like a fight, and every edition of the compiler tools behaves more and more like a fully automatic foot-gun - all the better to shoot yourself in the foot with, and the bullets you fire at security problems are optimised away.

Clearly, compiler authors and compiler users have conflicting views on what the compiler should do. And if you don't believe me, read this sorry tale of the assertions that were optimized away.

I don't exaggerate. New tools will produce worse code. And the compiler doesn't warn you. Because in your makefile you forgot to turn on the new warning flag that didn't exist when you wrote the makefile.

Simply recompiling previously working, secure code with a newer version of the compiler can introduce security vulnerabilities. While the new behaviour can be disabled with a flag, existing makefiles do not have that flag set, obviously. And since no warning is produced, it is not obvious to the developer that the previously reasonable behaviour has changed.

I think I want a compiler that warns me when it throws away lines of code that I took the trouble to write and debug, but what do I know? It's really hard. But so is having your code silently yanked away.

Consider what happens when the default code optimisation settings change in the next release of the compiler. As one chap puts it:

Value range propagation now assumes that the this pointer of C++ member functions is non-null. This eliminates common null pointer checks but also breaks some non-conforming code-bases (such as Qt-5, Chromium, KDevelop). As a temporary work-around -fno-delete-null-pointer-checks can be used. Wrong code can be identified by using -fsanitize=undefined.

http://blog.fefe.de/?ts=a9de792d 

A more even-handed treatment of what compiler authors are trying to achieve is given here: https://gist.github.com/rygorous/e0f055bfb74e3d5f0af20690759de5a7 

It all comes down to a persistent and ever-escalating mismatch of expectations. The scarcely known obligation upon software developers is to be aware of the increasing instances of the increasing definitions of undefined behaviour that exist in existing debugged safe and working, and of the changes to default optimisations, so that with each compiler release they must remove the last year's best practice and replace it with this year's best practice.

And our current state is where:

...the root cause of approximately 70% of security vulnerabilities that Microsoft fixes and assigns a CVE (Common Vulnerabilities and Exposures) are due to memory safety issues. ​

This is despite mitigations including intense code review, training, static analysis, and more.​

While many experienced programmers can write correct systems-level code, it’s clear that no matter the amount of mitigations put in place, it is near impossible to write memory-safe code using traditional systems-level programming languages at scale.

we-need-a-safer-systems-programming-language

Let's expand upon those mitigations. Everyone is compiling C with -Werror and -Wall, but it's not enough. Many look harder by paying good time and money for Coverity, Klocwork, Black Duck, Code Sonar, etc, and trying every C compiler they can get their hands on for maximum warnings, and the information on every change that they can make.

If the compiler authors insist that they are correct (as they may) then the software developers need new tools. For me, and for the author of that previous quote, Rust is that tool.

From C, we have had one long round of compiler-imposed instability in pursuit of speed, and another more recent self-imposed round of instability for safety.

So to answer the question heading this section: 

If you want stability (I mean such as for security bugs to stay fixed) the answer is clear: do not update your compiler tools.

If you want speed, you need the latest compiler tools. Your programs will be very fast, but you will not know what they are doing. Of course, there are ways to find out, but the answer will vary from one compiler edition to the next.

In the question of stability or speed, the C compiler maintainers chose speed for too many decades.

What about Rust?

Stability or Safety?

Don't think that Rust isn't optimising your code. It is. The most popular Rust implementation is based on Clang/LLVM and has various bugs originating in the optimising compiler backend, throwing away code that Rust wants to keep.

But Rust makes certain guarantees of safety, in the same way that C promises to throw away apparently arbitrary multi-line chunks of code. 

You can see the failure list of what are termed Unsound Bugs: Today I see 65 open and 307 closed. If I exclude C bugs from that list then I see 10 open and 86 closed. (I maybe misusing the bug filter system).





No comments:

Post a Comment