Big Brother is helping you

Once again I have been convinced that programmers write programs absolutely carelessly, so that their programs work not because of their skill but by chance and thanks to the care of the Microsoft or Intel compiler developers. Yes, they are the ones who really care and put crutches under our lopsided programs when necessary.

Here is a byte-breaking story of the CString class and its daughter, the Format function.


Pray, pray for compilers and their developers! They put so much effort into making our programs work despite their many flaws and even errors. At the same time, their work is hard and invisible. They are the noble knights of coding and the guardian angels of us all.

I knew that Microsoft has a department responsible for keeping new versions of its operating systems as compatible as possible with old applications. Their database contains more than 10,000 of the most popular obsolete programs that must work in new versions of Windows. It is thanks to these efforts that I recently managed to play Heroes of Might and Magic II (a game from 1996) under 64-bit Windows Vista without problems. I think the game can be launched under Windows 7 as well. Alexey Pahunov has written interesting notes (in Russian) on the topic of compatibility [1, 2, 3].

However, it seems that there are also other departments whose business is to help our horrible C/C++ code work and keep on working. But let me start this story from the very beginning.

I am involved in the development of PVS-Studio, a tool for analyzing application source code. Friends, this is not an advertisement. I mention it because we have begun collecting the most interesting typical errors and teaching the tool to diagnose them.

Many errors are related to using the ellipsis in programs. Here is a theoretical reference:

There are functions whose definition cannot specify the number and types of all acceptable parameters. In this case, the list of formal parameters ends with an ellipsis (...), which means “and, perhaps, some more arguments”. For instance: int printf(const char* ...);

One of these unpleasant yet easily diagnosed errors is passing an object of a class type, instead of a pointer to a string, into a function with a variable number of arguments. Here is an example of this error:
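
To make the reference above concrete, here is a minimal sketch of a function that takes an ellipsis. The name sum_ints is hypothetical and is not taken from any library. Note that the callee has no way to verify what was actually passed: it trusts the count and reads every argument as an int.

```cpp
#include <cstdarg>

// Hypothetical helper: sums `count` int arguments passed through "...".
// Nothing checks that the caller really passed `count` values of type int.
int sum_ints(int count, ...)
{
    va_list args;
    va_start(args, count);            // start reading after the last named parameter
    int total = 0;
    for (int i = 0; i < count; ++i)
        total += va_arg(args, int);   // each argument is blindly read as an int
    va_end(args);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) works, while sum_ints(3, 1.5, 2, 3) still compiles and goes wrong silently, which is exactly the kind of error discussed here.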

wchar_t buf[100];
std::wstring ws(L"12345");
swprintf(buf, L"%s", ws);

This code will produce total rubbish in the buffer or crash the program. Certainly, in a real program the code will be more complicated, so please do not write comments on my post telling me that the GCC compiler, unlike Visual C++, will check the arguments and warn you. Format strings may come from resources or other functions, and then the compiler cannot check anything. The diagnosis, however, is simple in this case: a class object is passed into a string formatting function, and that causes an error.

The correct version of this code looks as follows:

wchar_t buf[100];
std::wstring ws(L"12345");
swprintf(buf, L"%s", ws.c_str());
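
The snippets above use the older Visual C++ form of swprintf, without a buffer size and with %s meaning a wide string. As a hedged sketch for other compilers, the same fix can be written with the standard signature, which takes the buffer size, and with %ls, which denotes a wide string everywhere. The function name format_wide is an assumption made for this illustration.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical helper: formats a std::wstring through the standard swprintf.
// The pointer from c_str() is what the ellipsis receives, never the object itself.
std::wstring format_wide(const std::wstring& ws)
{
    wchar_t buf[100];
    const int written =
        std::swprintf(buf, sizeof(buf) / sizeof(buf[0]), L"%ls", ws.c_str());
    if (written < 0)                  // formatting error or buffer too small
        return std::wstring();
    return std::wstring(buf, static_cast<std::size_t>(written));
}
```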

Because you can pass anything at all into functions with a variable number of arguments, practically every book on C++ programming recommends against using them. Instead, they suggest safe mechanisms such as boost::format. Recommendations aside, there is a lot of code in the world with various printfs, sprintfs and CString::Formats, and we will have to live with it for a long time. That is why we implemented a diagnostic rule to detect such dangerous constructs.

Let’s do a little theory and see what is wrong with the code given above. It is incorrect for two reasons.

  1. The argument does not correspond to the specified format. Since we specify %s, we must pass a pointer to a string. In theory, we could write our own sprintf function that knows an object of the std::wstring class was passed to it and prints it correctly. However, that is also impossible because of the second reason.
  2. Only a POD type can be an argument for the ellipsis “...”, while std::wstring is not a POD type.
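
As an illustration of what a “safe mechanism” means, here is a small sketch that uses std::wostringstream from the standard library as a stand-in for boost::format, so that the example has no external dependencies. The function name describe is hypothetical. The point is that the output operator is chosen at compile time for each argument type, so passing a std::wstring object is perfectly fine.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: type-safe formatting with a stream instead of an ellipsis.
// operator<< is resolved per argument type, so no format specifier can mismatch.
std::wstring describe(const std::wstring& name, int count)
{
    std::wostringstream out;
    out << L"You have " << count << L" " << name;
    return out.str();
}
```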

Theoretical reference on POD types:

POD is an abbreviation of “Plain Old Data”. The following types are POD types:

  1. all predefined arithmetic types (including wchar_t and bool);
  2. types defined with the enum keyword;
  3. pointers;
  4. POD structures (struct or class) and POD unions which meet the following requirements:
    1. do not contain user-defined constructors, destructors or a copy assignment operator;
    2. do not have base classes;
    3. do not contain virtual functions;
    4. do not contain protected or private non-static data members;
    5. do not contain non-static data members of non-POD types (or arrays of such types), or references.

Accordingly, the std::wstring class is not a POD type, since it has constructors, a base class and so on.

If you pass an object that is not a POD type to an ellipsis, the behavior is undefined. Thus, at least theoretically, there is no way to correctly pass an object of the std::wstring type as an ellipsis argument.

The same must hold for the Format function of the CString class. This is an incorrect version of the code:
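
The list above can be checked by the compiler. The sketch below assumes a C++11 compiler and approximates “POD” as “trivial and standard-layout”; this is close to, but not identical with, the older rules listed above (for example, C++11 tolerates private data members in some cases). The trait is_pod_like and the three sample structs are hypothetical names invented for this illustration.

```cpp
#include <string>
#include <type_traits>

// C++11 approximation of "POD": the type is trivial and has standard layout.
template <typename T>
struct is_pod_like
    : std::integral_constant<bool,
          std::is_trivial<T>::value && std::is_standard_layout<T>::value> {};

struct Point { int x; int y; };                     // plain aggregate
struct WithCtor { WithCtor() {} int x; };           // user-provided constructor
struct WithVirtual { virtual void f() {} int x; };  // virtual function

static_assert(is_pod_like<int>::value, "arithmetic types are POD");
static_assert(is_pod_like<const wchar_t*>::value, "pointers are POD");
static_assert(is_pod_like<Point>::value, "a plain struct is POD");
static_assert(!is_pod_like<WithCtor>::value, "a user constructor breaks POD-ness");
static_assert(!is_pod_like<WithVirtual>::value, "virtual functions break POD-ness");
static_assert(!is_pod_like<std::wstring>::value, "std::wstring is not POD");
```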

CString s;
CString arg(L"OK");
s.Format(L"Test CString: %s\n", arg);

This is the correct version of the code:

s.Format(L"Test CString: %s\n", arg.GetString());

Or, as MSDN suggests [4], we may explicitly apply the LPCTSTR cast operator implemented in the CString class to get a pointer to the string. Here is an example of correct code from MSDN:

CString kindOfFruit = "bananas";
int howmany = 25;
printf("You have %d %s\n", howmany, (LPCTSTR)kindOfFruit);
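
Why must the cast be explicit? A conversion operator is applied automatically only when the compiler knows the target parameter type, and an ellipsis has none. The sketch below shows this with MiniString, a hypothetical stand-in for CString that can be built without MFC; it is not the real class, only an illustration of the mechanism.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for CString: owns text and converts to const char*.
class MiniString
{
public:
    explicit MiniString(const char* text) : data_(text) {}
    operator const char*() const { return data_.c_str(); }
private:
    std::string data_;
};

// The parameter type is known, so the conversion operator is applied implicitly.
std::string via_fixed_parameter(const char* text) { return text; }

// Through "..." no conversion takes place, so it has to be requested by hand.
std::string via_ellipsis(const MiniString& s)
{
    char buf[64];
    std::snprintf(buf, sizeof(buf), "You have %d %s", 25,
                  static_cast<const char*>(s));
    return buf;
}
```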

So, everything seems clear and transparent. It is also clear how to make a rule: we will detect mistakes made when using functions with a variable number of arguments.

We did this, and I was shocked by the result. It turned out that most developers never think about these issues and write code like the following without the slightest doubt:
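
The following is not how PVS-Studio works; it is only a rough compile-time approximation of the same idea, assuming a C++11 compiler. A variadic template wrapper can refuse any argument that is not a scalar (arithmetic type, enum or pointer) before the call ever reaches the ellipsis. The names all_scalar and checked_snprintf are hypothetical. The sketch does not check that the arguments match the format specifiers.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <type_traits>

// True only if every type in the pack is a scalar (arithmetic, enum, pointer).
template <typename... Ts>
struct all_scalar : std::true_type {};

template <typename T, typename... Rest>
struct all_scalar<T, Rest...>
    : std::integral_constant<bool,
          std::is_scalar<T>::value && all_scalar<Rest...>::value> {};

// Hypothetical wrapper: rejects class objects at compile time, then forwards
// the (now harmless) arguments to the real variadic function.
template <typename... Args>
int checked_snprintf(char* buf, std::size_t size, const char* fmt, Args... args)
{
    static_assert(all_scalar<Args...>::value,
                  "class objects must not be passed to a printf-style function");
    return std::snprintf(buf, size, fmt, args...);
}
```

With this wrapper, passing fruit.c_str() compiles, while passing the std::string object fruit itself stops the build with the message from static_assert.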

class CRuleDesc
{
  CString GetProtocol();
  CString GetSrcIp();
  CString GetDestIp();
  CString GetSrcPort();
  CString GetIpDesc(CString strIp);
  ...
};

CString CRuleDesc::GetRuleDesc()
{
  CString strDesc;
  strDesc.Format(
    _T("%s all network traffic from <br>%s "
       "on %s<br>to %s on %s <br>for the %s"),
    GetAction(), GetSrcIp(), GetSrcPort(),
    GetDestIp(), GetDestPort(), GetProtocol());
  return strDesc;
}

CString strText;
CString _strProcName(L"");
strText.Format(_T("%s"), _strProcName);


CString m_strDriverDosName;
CString m_strDriverName;
...
m_strDriverDosName.Format(
  _T("\\\\.\\%s"), m_strDriverName);


CString __stdcall GetResString(UINT dwStringID);
...
_stprintf(acBuf, _T("%s"),
  GetResString(...));


// I think you understand
// that we may give you such examples endlessly.

Some developers do think about it but then forget. That is why code like this looks so touching:

CString sAddr;
CString m_sName;
CString sTo = GetNick( hContact );
...
sAddr.Format(...,
  sTo, (LPCTSTR)m_sName);

We collected so many examples like this in the projects we checked with PVS-Studio that I cannot understand how this can be. And still, everything works. I became convinced of it after writing a test program and trying various ways of using CString.

What is the reason? It seems to me that the compiler developers could no longer stand the endless questions “Why don’t sloppily written programs that use CString work?” and the accusations that the compiler is bad and cannot handle strings. So they secretly held a sacred rite of exorcism and drove the evil out of CString. They made the impossible possible: they implemented the CString class in such a crafty way that you may pass it to functions like printf and Format.

It was done quite intricately, and those who want to know how can read the source code of the CStringT class. I will not go into details and will stress only one important thing. A special implementation of CString is not enough, since passing a non-POD type theoretically causes undefined behavior. So the Visual C++ developers, together with the Intel C++ developers, made it so that the undefined behavior always turns out to be a correct result :) After all, correct program operation may well be a subset of undefined behavior. :)

I am also starting to wonder about some strange things in the compiler’s behavior when it builds 64-bit programs. I suspect that the compiler developers deliberately make program behavior not theoretical but practical (i.e. working) in those simple cases where they recognize some pattern. The clearest example is a loop pattern. Here is an example of incorrect code:
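
For the curious, here is a hedged sketch of the general trick, under the assumption (based on my reading of the ATL headers, not on anything stated above) that the only data member of the string class is a pointer to the characters, with the length and reference count kept elsewhere. ThinString is a hypothetical miniature, not the real CStringT. If the object is exactly one pointer wide, its bytes are indistinguishable from a plain string pointer, which is what a printf-style function expects for %s.

```cpp
#include <cstring>

// Hypothetical miniature of the idea: the sole data member is the pointer
// to the characters, so the object occupies exactly one pointer.
class ThinString
{
public:
    explicit ThinString(const wchar_t* text) : chars_(text) {}
    const wchar_t* GetString() const { return chars_; }
private:
    const wchar_t* chars_;
};

static_assert(sizeof(ThinString) == sizeof(const wchar_t*),
              "the object must be exactly one pointer wide");

// Reinterprets the object's bytes as a pointer, the way a printf-style callee
// effectively does when it fetches a %s argument.
const wchar_t* bytes_as_pointer(const ThinString& s)
{
    const wchar_t* p = nullptr;
    std::memcpy(&p, &s, sizeof(p));
    return p;
}
```

The sketch deliberately uses memcpy instead of actually pushing the object through an ellipsis, so it stays within defined behavior while showing why the real thing happens to work.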

size_t n = BigValue;
for (unsigned i = 0; i < n; i++) { ... }

Theoretically, if n is larger than UINT_MAX, an infinite loop must occur. But it does not occur in the Release version, since a 64-bit register is used for the variable “i”. Of course, if the code is a bit more complicated, the infinite loop will occur, but at least in some cases the program will be lucky. I wrote about this in the article “A 64-bit horse that can count” [5].

I used to think that this unexpectedly lucky behavior of a program is determined only by the specifics of optimization in Release versions. But now I am not so sure. Perhaps it is a conscious attempt to make an incorrect program work at least sometimes. Certainly, I do not know whether the cause lies in optimization or in the care of Big Brother, but it is a good occasion to philosophize, isn’t it? :) Well, and the one who knows will hardly tell us. :)

I am sure there are other cases where the compiler stretches out a hand to crippled programs. If I encounter something interesting, I will tell you.
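
The theory can be spelled out in a few lines. The sketch below uses hypothetical helper names: it shows that an unsigned counter wraps around to zero after UINT_MAX, states when the loop from the example can terminate at all, and gives the corrected loop in which the counter has the same type as the bound.

```cpp
#include <climits>
#include <cstddef>

// Unsigned arithmetic wraps around: UINT_MAX + 1 becomes 0.
unsigned next_unsigned(unsigned i) { return i + 1; }

// `for (unsigned i = 0; i < n; i++)` can finish only if n fits into unsigned,
// because i takes no values other than 0..UINT_MAX.
bool unsigned_loop_terminates(std::size_t n)
{
    return n <= static_cast<std::size_t>(UINT_MAX);
}

// Corrected loop: the counter is a size_t, just like the bound.
std::size_t count_iterations(std::size_t n)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; i++)
        ++count;
    return count;
}
```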

May your code never glitch!

Author: Andrey Karpov


  1. Alexey Pahunov’s Russian blog. Backward compatibility is serious.
  2. Alexey Pahunov’s Russian blog. AppCompat.
  3. Alexey Pahunov’s Russian blog. Is Windows 3.x live?
  4. MSDN. CString Operations Relating to C-Style Strings. Topic: Using CString Objects with Variable Argument Functions.
  5. Andrey Karpov. A 64-bit horse that can count.
