Celebrating the 30th anniversary of the first C++ compiler: let’s find bugs in it

Cfront is a C++ compiler that came into existence in 1983 and was developed by Bjarne Stroustrup. At that time the language was known as “C with Classes”. Cfront had a complete parser and symbol tables, and built a tree for each class, function, etc. Cfront was based on CPre. Cfront defined the language until circa 1990. Many of the obscure corner cases in C++ are related to the limitations of the Cfront implementation, because Cfront translated C++ into C. In short, Cfront is a sacred artifact for a C++ programmer, so I just couldn’t help checking such a project.


Introduction

The idea to check Cfront occurred to me after reading an article devoted to the 30th anniversary of the first release of this compiler: “30 YEARS OF C++”. I contacted Bjarne Stroustrup to get the source code of Cfront. For some reason I thought getting the code would be a great hassle, but it turned out to be quite easy. The source code is open, available to everybody, and can be found here: http://www.softwarepreservation.org/projects/c_plus_plus/


I’ve decided to check the first commercial version of Cfront, released in October 1985, as it’s this version that turned 30 this year.

Bjarne warned me that checking Cfront could be troublesome:

Please remember this is *very* old software designed to run on a 1MB 1MHz machine, and also used on original PCs (640KB). It was also done by one person (me) as only part of my full time job.

Indeed, checking such a project as is turned out to be impossible. At that time, for instance, a simple dot (.) was used instead of a double colon (::) to separate a class name from a function name. For example:

inline Pptr type.addrof() { return new ptr(PTR,this,0); }

Our PVS-Studio analyzer wasn’t ready for this, so I had to ask a colleague to look through the code and correct such spots manually. It really helped, although there were still some troubles: on some fragments the analyzer got quite confused and refused to do the analysis. Nevertheless, I did manage to check the project.

I should say right away that I haven’t found anything critical. I think there are 3 reasons why PVS-Studio hasn’t found serious bugs:

  1. The project size is small. It’s just 100 KLOC in 143 files.
  2. The code is of high quality.
  3. The PVS-Studio analyzer didn’t understand some fragments of the code.

“Talk is cheap. Show me the code” (c) Linus Torvalds

So, enough talking. I guess the readers are here to see at least one error of THE Stroustrup. Let’s have a look at the code.

Fragment 1.

typedef class classdef * Pclass;

#define PERM(p) p->permanent=1

Pexpr expr.typ(Ptable tbl)
{
  ....
  Pclass cl;
  ....
  cl = (Pclass) nn->tp;
  PERM(cl);
  if (cl == 0) error('i',"%k %s'sT missing",CLASS,s);
  ....
}

PVS-Studio warning: V595 The ‘cl’ pointer was utilized before it was verified against nullptr. Check lines: 927, 928. expr.c 927

The ‘cl’ pointer can be equal to NULL, as the if (cl == 0) check indicates. The trouble is that the pointer gets dereferenced before this check, inside the PERM macro.

So if we expand the macro, we get:

cl = (Pclass) nn->tp;
cl->permanent=1;
if (cl == 0) error('i',"%k %s'sT missing",CLASS,s);
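The usual remedy for this pattern is to move the check ahead of the first dereference. Below is a minimal sketch of that ordering in modern C++. The classdef layout, the mark_permanent() helper, and the error text are assumptions made for illustration; they are not Cfront code, and the real error() call may well terminate instead of returning.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for Cfront's classdef; only the field used here.
struct classdef {
    int permanent = 0;
};
typedef classdef* Pclass;

// Same idea as the original PERM macro, but the argument is parenthesized
// and the body is wrapped so that it behaves like a single statement.
#define PERM(p) do { (p)->permanent = 1; } while (0)

// Hypothetical helper: checks the pointer first, dereferences second.
// Returns true if the class was marked permanent, false if cl was null.
bool mark_permanent(Pclass cl) {
    if (cl == nullptr) {
        std::fprintf(stderr, "internal error: class type missing\n");
        return false;
    }
    PERM(cl);  // safe: cl is known to be non-null here
    return true;
}
```

Calling mark_permanent() with a valid pointer sets the flag; calling it with a null pointer reports the problem without touching memory.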

Fragment 2.

The same here. The pointer was dereferenced, and only then was it checked:

Pname name.normalize(Pbase b, Pblock bl, bit cast)
{
  ....
  Pname n;
  Pname nn;
  TOK stc = b->b_sto;
  bit tpdf = b->b_typedef;
  bit inli = b->b_inline;
  bit virt = b->b_virtual;
  Pfct f;
  Pname nx;
  if (b == 0) error('i',"%d->N.normalize(0)",this);
  ....
}

PVS-Studio warning: V595 The ‘b’ pointer was utilized before it was verified against nullptr. Check lines: 608, 615. norm.c 608

Fragment 3.

int error(int t, loc* lc, char* s ...)
{
  ....
  if (in_error++)
    if (t!='t' || 4<in_error) {
      fprintf(stderr,"\nUPS!, error while handling error\n");
      ext(13);
    }
  else if (t == 't')
    t = 'i';
  ....
}

PVS-Studio warning: V563 It is possible that this ‘else’ branch must apply to the previous ‘if’ statement. error.c 164

I am not sure whether there is an error here, but the code is formatted misleadingly. An ‘else’ binds to the closest ‘if’, so the code may not execute the way its indentation suggests. Formatted the way the compiler parses it, the code looks like this:

if (in_error++)
  if (t!='t' || 4<in_error) {
    fprintf(stderr,"\nUPS!, error while handling error\n");
    ext(13);
  } else if (t == 't')
    t = 'i';
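To see that the two readings really differ, here is a small sketch that mirrors only the control flow of this fragment, with explicit braces for each interpretation. The function names are hypothetical, and returning 'X' stands in for the fprintf() and ext(13) calls; which of the two behaviors was intended is not something the code itself tells us.

```cpp
#include <cassert>

// How the compiler parses the original: 'else' belongs to the inner 'if'.
char as_parsed(int in_error, char t) {
    if (in_error++) {
        if (t != 't' || 4 < in_error) {
            return 'X';  // stands in for the message and ext(13)
        } else if (t == 't') {
            t = 'i';
        }
    }
    return t;
}

// What the indentation suggests: 'else' belongs to the outer 'if'.
char as_indented(int in_error, char t) {
    if (in_error++) {
        if (t != 't' || 4 < in_error) {
            return 'X';  // stands in for the message and ext(13)
        }
    } else if (t == 't') {
        t = 'i';
    }
    return t;
}
```

With in_error equal to 0 and t equal to 't', as_parsed() leaves t unchanged while as_indented() turns it into 'i'. With in_error equal to 1, it is the other way round. Writing the braces out removes the ambiguity for the reader.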

Fragment 4.

extern
genericerror(int n, char* s)
{
  fprintf(stderr,"%s\n",
          s?s:"error in generic library function",n);
  abort(111);
  return 0;
};

PVS-Studio warning: V576 Incorrect format. A different number of actual arguments is expected while calling ‘fprintf’ function. Expected: 3. Present: 4. generic.c 8

Note the format string: “%s\n”. It contains a single specifier, so the string will be printed, but the ‘n’ variable won’t be used.
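If the intent was to report ‘n’ as well (an assumption; the source does not say), the format string needs one specifier per argument. The sketch below shows that pairing. The format_generic_error() helper and the “(code %d)” wording are invented for illustration, and the message is built into a string so the result is easy to inspect.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical variant of genericerror(): every argument after the format
// string has a matching specifier (%s for the text, %d for the number).
std::string format_generic_error(int n, const char* s) {
    char buf[256];
    std::snprintf(buf, sizeof buf, "%s (code %d)",
                  s ? s : "error in generic library function", n);
    return buf;
}
```

The alternative fix, if ‘n’ was never meant to be printed, is simply to drop it from the fprintf() call.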

Miscellaneous:

Unfortunately (or maybe not) I won’t be able to show you anything else that could look like real errors. The analyzer issued some warnings which could be worth looking at, but they are not really serious. For example, the analyzer didn’t like some global variable names:

extern int Nspy, Nn, Nbt, Nt, Ne, Ns, Nstr, Nc, Nl;

PVS-Studio warning: V707 Giving short names to global variables is considered to be bad practice. It is suggested to rename ‘Nn’ variable. cfront.h 50

Another example: to print pointer values with the fprintf() function, Cfront uses the “%i” specifier. Modern C and C++ have “%p” for this. But as far as I understand, there was no “%p” 30 years ago, and the code was totally correct.
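For comparison, this is roughly how a pointer is printed today. The pointer_to_text() helper is a hypothetical name; the exact text produced by “%p” is implementation-defined, so the sketch makes no claim about what it looks like.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats a pointer with %p. The specifier expects a void*, hence the cast.
std::string pointer_to_text(const void* p) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%p", const_cast<void*>(p));
    return buf;
}
```

Unlike “%i”, this stays correct on platforms where a pointer is wider than an int.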

Thought-provoking observations

This pointer

My attention was drawn by the fact that the ‘this’ pointer used to be used differently. A couple of examples:

expr.expr(TOK ba, Pexpr a, Pexpr b)
{
  register Pexpr p;

  if (this) goto ret;
  ....
  this = p;
  ....
}

inline toknode.~toknode()
{
  next = free_toks;
  free_toks = this;
  this = 0;
}

As you see, it wasn’t forbidden to change the value of ‘this’. Now it’s not only prohibited to change the pointer; comparing ‘this’ to null has also lost any sense, because calling a member function through a null pointer is undefined behavior. (Still Comparing “this” Pointer to Null?)
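Assigning to ‘this’ was the early way of taking control over where an object lives; today that job is done by class-specific operator new and operator delete. Here is a minimal sketch of a free list in that style, loosely echoing the toknode destructor above. The TokNode and FreeBlock types and their layout are assumptions for illustration, not a reconstruction of Cfront’s data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// A recycled block of memory, linked into a free list.
struct FreeBlock {
    FreeBlock* next;
};

// Head of the free list (the name echoes Cfront's free_toks).
static FreeBlock* free_toks = nullptr;

// Hypothetical node type with its own allocation functions.
struct TokNode {
    int tok = 0;
    TokNode* next = nullptr;

    // Reuse a recycled block if one is available, else ask the global heap.
    static void* operator new(std::size_t size) {
        if (size == sizeof(TokNode) && free_toks != nullptr) {
            FreeBlock* block = free_toks;
            free_toks = block->next;
            return block;
        }
        return ::operator new(size);
    }

    // Instead of releasing the memory, push it onto the free list.
    static void operator delete(void* p) noexcept {
        if (p == nullptr) return;
        free_toks = new (p) FreeBlock{free_toks};
    }
};

static_assert(sizeof(TokNode) >= sizeof(FreeBlock),
              "a recycled node must be able to hold a free-list link");

// Small demonstration: storage of a deleted node is handed out again.
bool storage_is_recycled() {
    TokNode* first = new TokNode;
    void* address = first;
    delete first;
    TokNode* second = new TokNode;
    bool same = (static_cast<void*>(second) == address);
    delete second;
    return same;
}
```

The object never touches ‘this’; the allocation policy sits entirely in the two static functions.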

This is the place for paranoia

I’ve also come across an interesting fragment where nothing is taken for granted. I liked this code:

/* this is the place for paranoia */
if (this == 0) error('i',"0->Cdef.dcl(%d)",tbl);
if (base != CLASS) error('i',"Cdef.dcl(%d)",base);
if (cname == 0) error('i',"unNdC");
if (cname->tp != this) error('i',"badCdef");
if (tbl == 0) error('i',"Cdef.dcl(%n,0)",cname);
if (tbl->base != TABLE) error('i',"Cdef.dcl(%n,tbl=%d)",
                              cname,tbl->base);
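The same habit carries over to modern code: verify the “impossible” preconditions in one place before doing any real work. Below is a minimal sketch of that idea. The Kind, Table, and ClassDef types and the paranoia_check() function are hypothetical simplifications, and returning a message instead of calling error() is a choice made here only to keep the example easy to test.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for Cfront's node kinds.
enum Kind { CLASS, TABLE };

struct Table {
    Kind base;
};

struct ClassDef {
    Kind base;
};

// Returns nullptr when every invariant holds, otherwise a description of
// the first violated one. Pointer checks come before any dereference.
const char* paranoia_check(const ClassDef* cd, const Table* tbl) {
    if (cd == nullptr)      return "null class definition";
    if (cd->base != CLASS)  return "class definition has wrong kind";
    if (tbl == nullptr)     return "null table";
    if (tbl->base != TABLE) return "table has wrong kind";
    return nullptr;
}
```

A caller would run paranoia_check() at the top of the declaration routine and treat any non-null result as an internal error.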

Bjarne Stroustrup’s commentaries

  • Cfront was bootstrapped from Cpre, but it was a complete rewrite. There wasn’t a line of Cpre code in Cfront
  • The use-before-test-of-0 bug is of course bad, but curiously, the machine and OS I mostly used (DEC and research Unix) had page zero write protected, so that bug could not have been triggered without being caught.
  • The if-then-else bug (or not) is odd. I read the source, it’s not just misformatted, it’s incorrect; but curiously, that doesn’t matter: the only difference is a slight difference in the error message used before terminating. No wonder I did not spot it.
  • Yes, I should have used more readable names. I hadn’t counted on having other people maintain this program for years (and I’m a poor typist).
  • Yes, there were no %p then
  • Yes, the rules for “this” changed
  • The paranoia test is in the compiler’s main loop. My thought was that if anything went wrong with the software or hardware, one of those tests was likely to fail. At least once, it caught the effect of a bug in the code generator used to build Cfront. I think all significant programs should have a “paranoia test” against “impossible” errors.

Conclusion:

It’s really hard to overestimate the significance of Cfront. It influenced the development of a whole sphere of programming and gave the world the everlasting C++ language, which continues to develop. I am really grateful to Bjarne for all the work he has done in creating and developing C++. Thank you. For my part, I was really glad to dig into the code of this wonderful compiler.

I thank all our readers for their attention, and wish you fewer bugs.

By Andrey Karpov, Bjarne Stroustrup
