Undefined Behavior Begone!
Written by Harry Fairhead   
Wednesday, 02 April 2025

C++ guru Herb Sutter has a new take on taming the UB monsters in C++, but there is a sense in which the monster is of our own creation and slaying it isn't essential - just tell it to begone.

The demon inside C++, and to a lesser extent in C, is undefined behavior or UB. As you probably know, UB is just a language construct whose meaning the standard leaves undefined. If you write such code then, so the argument goes, you have just written a non-program, because UB should never occur in a runnable program.

This of course is complete nonsense.

We seem to be the victims of a misunderstanding of an initial attempt at machine independence, one that has been commandeered by a group of programmers intent on something quite different from creating runnable programs in C/C++.

Back in the days when C was being formalized, it was thought to be OK to leave some constructs unspecified so as to let the hardware the program was running on "decide" what they meant. For example, the representation of negative numbers wasn't fixed by the language (strictly, the standard made it implementation-defined rather than undefined), but if you did some arithmetic you expected to get the right answer whether the machine used ones' complement, two's complement or sign-magnitude. However, many bitwise operations give different results depending on the representation, and some of these, such as left-shifting a negative value, are UB.

This is fine because generally C programmers know the architecture of the target machine and will adjust what they do according to how the UB pans out. The point is that the behavior is only undefined in the specification or standard, it isn't undefined in any sense in the real world. The hardware always makes UB very defined.
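To make the representation point concrete, here is a minimal sketch that models the three classic encodings in 8 bits. The helper names are hypothetical and exist only for illustration. Arithmetic agrees across all three, but the raw bit patterns, and hence any bitwise trick applied to them, do not.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not from any library: each returns the 8-bit pattern
// that one representation would use for a small value v in -127..127.
constexpr std::uint8_t twos_complement(int v) {
    // Conversion to an unsigned type is defined as reduction modulo 256.
    return static_cast<std::uint8_t>(v);
}

constexpr std::uint8_t ones_complement(int v) {
    // Negative values are the magnitude with every bit flipped.
    return v >= 0 ? static_cast<std::uint8_t>(v)
                  : static_cast<std::uint8_t>(~static_cast<unsigned>(-v));
}

constexpr std::uint8_t sign_magnitude(int v) {
    // Negative values are the magnitude with the top (sign) bit set.
    return v >= 0 ? static_cast<std::uint8_t>(v)
                  : static_cast<std::uint8_t>(0x80u | static_cast<unsigned>(-v));
}

// The familiar "is it odd?" idiom x & 1, applied to the raw bit pattern.
constexpr bool low_bit_set(std::uint8_t bits) {
    return (bits & 1u) != 0;
}
```

For -5 the three patterns are 0xFB, 0xFA and 0x85, so low_bit_set reports true, false and true respectively. The odd number -5 looks even on a ones' complement machine, which is exactly the kind of thing a C programmer who knows the target is expected to allow for.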

Now the problem is that a very large group of programmers misunderstood UB. The compiler writers, in particular, noticed that if UB was really intended to be undefined then they could treat it as meaning anything, so any program that contained UB could be compiled to anything - or nothing even - and all in the spirit of optimization. This is simply crazy, and deep down we all know that it is a joke.
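The standard illustration of this, sketched here with hypothetical function names, is an overflow test written in terms of signed arithmetic. On two's complement hardware INT_MAX + 1 wraps to a negative number, so the programmer expects false. Because signed overflow is UB, an optimizing compiler is allowed to assume it never happens and reduce the whole function to "return true".

```cpp
#include <cassert>
#include <climits>

// UB when x == INT_MAX. The machine would wrap; the optimizer is permitted
// to assume overflow cannot occur and fold this to `return true`.
bool next_is_larger_ub(int x) {
    return x + 1 > x;
}

// Well-defined alternative: ask the question without ever overflowing.
bool next_is_larger_checked(int x) {
    return x < INT_MAX;
}
```

Both functions agree for every input except INT_MAX, which is the one input the first version was presumably written to detect.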

Of course, UB in C++ is a little more subtle as many C++ programmers don't know the architecture of the machine that they are targeting and hence UB should be avoided, but when it occurs it should still be left to the machine to decide what happens, not the compiler.

The solution is obvious - get rid of UB by finding all instances of it and renaming it "machine defined" or something similar.

This is so simple that everyone seems to think it's impossible and indeed Sutter's new post starts off from this position. He notes that there has been a lot of progress in removing UB from the language libraries. He credits the use of constexpr to detect UB at compile time rather than runtime - but only if the code is determined at compile time and much (most?) isn't. Then there is a list of UBs that have been eliminated from language features - uninitialized variables, bounds-checked access to data types and so on.
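As a small sketch of the constexpr point, assuming nothing beyond the standard language rules: an expression that would hit UB is not a constant expression, so forcing compile-time evaluation turns the UB into a compile error. The function name add is just an example.

```cpp
#include <cassert>
#include <climits>

constexpr int add(int a, int b) {
    return a + b;
}

// Evaluated at compile time with no UB, so this is accepted.
static_assert(add(2, 3) == 5, "plain addition is a valid constant expression");

// Uncommenting the next line should make the build fail, because signed
// overflow is UB and UB is not allowed inside a constant expression:
// constexpr int bad = add(INT_MAX, 1);
```

The catch is the one already noted: the same call with values only known at run time compiles without complaint and the overflow is back to being UB.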

Not much of this is new, but his final point offers some hope:

there are proposers and volunteers to

  • systematically catalog language UB,
  • specify a way to eliminate the UB (make it illegal, or well-defined including where necessary with a run-time check such as a bounds check),
  • make that elimination happen preferably all the time where it’s efficient enough (as C++26 is doing for uninitialized local variables) or else under a named group that’s easy to opt into (profile name, or contract label name), and
  • realizing that different UB cases need to be addressed in different ways, and we’re willing to put in the effort… no magic wand, Just Engineering.
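The "well-defined with a run-time check" option already exists in the standard library, which gives a feel for what it costs. In this sketch get_or is a hypothetical helper; std::vector::at is the real checked accessor and throws std::out_of_range, whereas operator[] with a bad index is UB.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: return v[i] if i is in range, otherwise a fallback.
// v.at(i) performs the bounds check; v[i] would be UB for i >= v.size().
int get_or(const std::vector<int>& v, std::size_t i, int fallback) {
    try {
        return v.at(i);
    } catch (const std::out_of_range&) {
        return fallback;
    }
}
```

With a three-element vector {10, 20, 30}, get_or(v, 1, -1) gives 20 and get_or(v, 3, -1) gives -1 rather than reading past the end.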

So at last we have a sane solution to an insane problem. There never should have been UB in C or C++ and now the compiler writers and language standards writers need to get the work done. The machine has no undefined behavior - programming, up to hardware problems and issues of timing, is always deterministic.

My final word: read Sutter's post. It has a lot more subtlety, humor and information than my take on the matter.


More Information

Crate-training Tiamat, un-calling Cthulhu: Taming the UB monsters in C++

Related Articles

C Undefined Behavior - Depressing and Terrifying (Updated)

C Pointer Declaration And Dereferencing

C23 - What We Have To Suffer

GCC Gets An Award From ACM And A Blast From Linus        

Last Updated ( Wednesday, 02 April 2025 )