Sex, Drugs & Compiler Construction

 

The PHP strtod() denial of service bug

posted by andreas in Computer Languages

This rant is dedicated to my favourite gcc moronmaintainer, Andrew Pinski.

Hang on a second, you will say. Andrew Pinski? Not Rasmus Lerdorf? It's a PHP bug, and Rasmus Lerdorf is the lead developer there, right?

Well, PHP does have a bad street rep when it comes to security, and for a reason. However, in this instance, my verdict is "not guilty". This is a gcc bug. Let me show you why.

Precise reading and printing of floating point numbers is surprisingly complex. Complex enough so that people publish papers about their approach (William D. Clinger, "How to Read Floating Point Numbers Accurately", Proc. ACM SIGPLAN '90; Guy L. Steele, Jr. and Jon L. White, "How to Print Floating-Point Numbers Accurately", Proc. ACM SIGPLAN '90). Complex enough that real-world code doing it spans almost 2700 lines of C code, and complex enough that everybody is happy that a certain David M. Gay, working at the AT&T Research Lab, wrote and published code to do this in 1991.

This code is well-analyzed, well-tested and used pervasively. It speaks well of Rasmus and the other PHP maintainers that the code handling floats is this code, and not something hand-rolled with dubious precision. To get an idea of just how pervasive this code is, ask Google Code Search for the author's address. It's everywhere: Android libc, gcc libio, gcc java runtime, newlib libc, Mono, Apple's libc, mozilla, etc., etc.

Now, you might ask yourself, if everybody's using this code, why does it fail? I was wondering what exactly went wrong there, having once tried to implement a float reader for a programming language other than C. A look at the PHP bug report enlightened me. What's going on is that there is a loop approximating the correct float value for strtod(), and it contains a comparison of the old and new approximation, to see whether it has reached the termination point. Of course, when dmg wrote that code back in 1991, he thought long and hard about the semantics of IEEE double floats (and even VAX and IBM floats, which behave a bit differently). However, the thing that goes wrong is that the x86 floating point coprocessor actually uses 80-bit float precision internally, and only rounds back to IEEE double precision when a value is stored from a register to memory, and, drumroll, gcc exposes this!
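To give you an idea of the shape of such a loop, here is a toy sketch. It is emphatically not David Gay's code: it just sums a simple series until the new approximation equals the old one. The function name sum_until_stable and the max_steps safety cap are my own inventions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy convergence loop (NOT the strtod() code): sums 1 + 1/2 + 1/4 + ...
 * until the new approximation equals the old one.
 * max_steps is a hypothetical safety cap so the loop can never spin forever.
 * If steps is non-NULL, the number of additions performed is stored there.
 */
double sum_until_stable(int max_steps, int *steps)
{
    double sum = 0.0;   /* old approximation */
    double term = 1.0;
    int n = 0;

    assert(max_steps > 0);

    while (n < max_steps) {
        double next = sum + term;   /* new approximation */
        n++;
        if (next == sum)            /* old == new: converged */
            break;
        sum = next;
        term /= 2.0;
    }
    if (steps != NULL)
        *steps = n;
    return sum;
}
```

The termination test is a plain floating point equality, so how many steps the loop takes depends on how wide the intermediate sum is kept. If one side of the comparison lives in an 80-bit register and the other was rounded on its way through memory, a loop of this shape can behave differently from what its author reasoned out for 64-bit doubles.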

To be more precise, what happens is that the gcc optimizer skips stores and loads when values are used again in subsequent computations. This in itself would be fine; all optimizers do that. However, all compilers but gcc make sure that before doing any comparisons on floating point values, these values are rounded back to 64 bits, in order to enforce IEEE semantics. I encourage you to go read the gcc bug report. Their argument for not doing it, and I'm not making this up, guys, is that this is "the result of excess precision in the FPU". And then they go on to give workarounds that are either extremely messy (setting a certain bit in the FPU control register) or just plain wrong (-ffloat-store doesn't help in all situations). And as the next step, about a hundred or so similar bug reports filed over the course of ten years, every single one representing a developer who painfully hunted a bug for days, are closed with a laconic "duplicate".
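If you want to know whether your own build is exposed to this "excess precision", C99 at least lets you ask. The following is a minimal sketch using the standard FLT_EVAL_METHOD, LDBL_MANT_DIG and DBL_MANT_DIG macros from <float.h>; the two function names are made up for this post.

```c
#include <assert.h>
#include <float.h>

/* Describes how the compiler says it evaluates floating point expressions,
   based on the C99 macro FLT_EVAL_METHOD. */
const char *describe_eval_method(void)
{
#ifdef FLT_EVAL_METHOD
    switch (FLT_EVAL_METHOD) {
    case 0:  return "each operation is rounded to its own type";
    case 1:  return "float and double are evaluated as double";
    case 2:  return "everything is evaluated as long double";
    default: return "indeterminable or implementation-defined";
    }
#else
    return "FLT_EVAL_METHOD is not provided by this compiler";
#endif
}

/* How many more mantissa bits long double carries than double.
   A positive value means wider intermediates are at least possible. */
int extra_mantissa_bits(void)
{
    /* The C standard requires long double to be at least as precise as double. */
    assert(LDBL_MANT_DIG >= DBL_MANT_DIG);
    return LDBL_MANT_DIG - DBL_MANT_DIG;
}
```

Case 2 is the one this post is about: doubles are silently computed with the wider long double format, and what you get back depends on when the compiler decides to round.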

Scattered within the closed duplicates, there are people trying to raise meaningful arguments. My favourite one, which really drives the point home, was given by Jonathan Knispel:

"Calling a trivial function twice with an identical parameter may return a value less than, equal to or greater than itself (depending on register allocation in the context of the calls). This is a really nasty piece of nondeterministic behaviour. Weren't compilers and "high level" languages invented to get around exactly this kind of hardware dependency?"

Exactly, Jonathan, exactly. Ponder that thought for a moment, dear reader. Writing the same expression twice in gcc is not guaranteed to produce identical results. The gcc maintainers know and don't care about your pain.
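Here is a sketch of what Jonathan describes, together with the usual defensive idiom of pushing each result through a volatile double so that it has to be stored and reloaded at 64 bits. The names third, round_to_double and compare_two_calls are hypothetical, and the volatile trick is a commonly used workaround, not something the language promises to be free or pretty.

```c
#include <assert.h>

/* A hypothetical "trivial function". */
double third(double x)
{
    return x / 3.0;
}

/* Storing through a volatile object forces the value out to memory,
   so what comes back has been rounded to a real 64-bit double. */
static double round_to_double(double x)
{
    volatile double stored = x;
    return stored;
}

/* Calls fn twice on the same argument and compares the two results.
   Returns -1, 0 or 1. The naive form, fn(x) == fn(x), is the one that
   may be evaluated with one side still at register precision. */
int compare_two_calls(double (*fn)(double), double x)
{
    double first;
    double second;

    assert(fn != 0);

    first = round_to_double(fn(x));
    second = round_to_double(fn(x));
    return (first > second) - (first < second);
}
```

With both results rounded to the same format before the comparison, compare_two_calls(third, 0.1) can only say "equal". That you need a helper like this to get there is the whole complaint.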

But given that this bug has been known for ten years, given that reason has failed to change anything about this situation, given that the single instance of this bug in the PHP strtod() function led to a denial of service situation that forced millions of users to upgrade their installation, given that the very same bug in the very same code lingers everywhere, the time for politeness and reason is over. Dear gcc maintainers: you are a bunch of fucking ignorant morons. And Andrew Pinski, you are the worst of them all. Your arrogance has cost millions, and will continue to do so. Please get your collective heads out of your asses, and go fix this bug, before it does more damage.

Update: it turns out that gcc 4.5 actually provides a -fexcess-precision=standard switch that does the sensible thing.
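As far as I understand it, "the sensible thing" here means following the ISO C99 rules, under which an assignment or a cast strips any extra range and precision from a value. A minimal sketch of what that buys you, with a made-up function name and assuming the file is compiled as C with that switch:

```c
#include <assert.h>

/* Returns 1 if a stored sum compares equal to the same sum recomputed.
   Under C99 rules, both the assignment and the cast round to double,
   so the comparison is between two genuine 64-bit values. */
int sum_equals_stored(double a, double b)
{
    double stored;

    /* NaN never compares equal to itself, so rule it out up front. */
    assert(a == a && b == b);

    stored = a + b;                     /* assignment: rounded to double */
    return stored == (double)(a + b);   /* cast: rounded to double */
}
```

Without those rules being honoured, the right-hand side may still carry its 80-bit register value when it meets the rounded left-hand side, and the function can report that a sum is not equal to itself.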

On the other hand, maybe even that was a waste of breath. Maybe we'd all just rather do the smart thing, and move to more sensible compilers. There's clang for those who want and need a feature-complete C/C++/ObjC compiler, but can tolerate a few bugs, and there's CompCert for those hackers who can make do with a subset of C, but have low tolerance for compiler bugs.

And of course, there's a whole range of interesting compilers and languages that are not C.