The most dangerous function in the C/C++ world

After checking hundreds of C/C++ projects of various types, I can claim: memset() is the most inefficient and dangerous function. Most errors that I see in projects are related to the usage of this particular memset() function. I understand that my conclusion is probably neither a revolutionary one, nor an extremely useful one, but I think our readers would be interested to find out why I have come to it.

image1 (28)

Examples

I’ve decided to enumerate some typical errors. Looking at them, I think you’ll agree that there is something wrong with memset() function. It kind of attracts evil.

To begin with, let’s brush up on how this function is declared:

void * memset ( void * ptr, int value, size_t num );

  • ptr – Pointer to the block of memory to fill.
  • value – Value to be set. The value is passed as an int, but the function fills the block of memory using the unsigned char conversion of this value.
  • num – Number of bytes to be set to the value. ‘size_t’ is an unsigned integral type.

Example N1 (ReactOS project)

void
Mapdesc::identify( REAL dest[MAXCOORDS][MAXCOORDS] )
{
  memset( dest, 0, sizeof( dest ) );
  for( int i=0; i != hcoords; i++ )
    dest[i][i] = 1.0;
}

This error occurred because in C and C++ you cannot pass arrays by value. The argument ‘dest’ is nothing more than an ordinary pointer. That’s why the sizeof() operator evaluates the size of the pointer, not the array.

At first glance, it has nothing to do with memset(). But on the other hand, this function will fill with zeroes only 4 or 8 bytes (exotic architectures don’t count). We really have a bug here, and it got there when the memset() function was called.

Example N2 (Wolfenstein 3D project)

typedef struct cvar_s {
  char *name;
  ...
  struct cvar_s *hashNext;
} cvar_t;

void Cvar_Restart_f( void ) {
  cvar_t  *var;
  ...
  memset( var, 0, sizeof( var ) );
  ...
}

A similar bug. It most likely occurred because of the carelessness of a programmer. ‘var’ variable is a pointer here, which means that memset() will zero out only a part of the structure. But in practice, only ‘name’ member will be zeroed out.

Example N3 (SMTP Client project)

void MD5::finalize () {
  ...
  uint1 buffer[64];
  ...
  // Zeroize sensitive information
  memset (buffer, 0, sizeof(*buffer));
  ...
}

A very common error pattern that only few programmers are aware of. The thing is that memset() function will be removed by the compiler. The buffer is no longer used after the memset() call. And the compiler removes the function call for the sake of optimization. In terms of C/C++ language it doesn’t have any impact on the program performance. The fact that the private information will remain in the memory, will not affect the operation of the program.

It is neither an error of the compiler, nor my imagination. The compiler really removes the memset() calls. And every time I write about this vulnerability error, I get e-mails from people who start arguing with me. I am quite tired of replying to such letters. Therefore, I ask those who are still in doubt to read these materials first before starting a new round of discussion.

Example N4 (Notepad++ project)

#define CONT_MAP_MAX 50
int _iContMap[CONT_MAP_MAX];
...
DockingManager::DockingManager()
{
  ...
  memset(_iContMap, -1, CONT_MAP_MAX);
  ...
}

It is often forgotten that the third argument of memset() function is not the number of elements, but the buffer size in bytes. This is exactly what happened in the code fragment given above. As a result, only one-quarter of the buffer will be filled (on condition that the size of the ‘int’ type is 4 bytes).

Example N5 (Newton Game Dynamics project)

dgCollisionCompoundBreakable::dgCollisionCompoundBreakable(....)
{
  ...
  dgInt32 faceOffsetHitogram[256];
  dgSubMesh* mainSegmenst[256];
  ...
  memset(faceOffsetHitogram, 0, sizeof(faceOffsetHitogram));
  memset(mainSegmenst, 0, sizeof(faceOffsetHitogram));
  ...
}

Here we definitely see a typo. Most likely someone was too lazy to do the memset() function call twice. The string was duplicated. In one fragment the ‘faceOffsetHitogram’ was replaced with ‘mainSegmenst’, but in the other case the programmer forgot to do it.

It turns out that sizeof() doesn’t evaluate the size of the array, filled with zeros. We may think – “What does it have in common with the memset() function?” But it is this function that will work incorrectly.

Example N6 (CxImage project)

static jpc_enc_tcmpt_t *tcmpt_create(....)
{
  ...
  memset(tcmpt->stepsizes, 0,
    sizeof(tcmpt->numstepsizes * sizeof(uint_fast16_t)));
  ...
}

There is an extra sizeof() operator. It would be correct to evaluate in such a way:

tcmpt->numstepsizes * sizeof(uint_fast16_t)

But instead of it we had an additional sizeof() and some rubbish as a result.

sizeof(tcmpt->numstepsizes * sizeof(uint_fast16_t))

Here the sizeof() operator evaluates the size of the size_t type. Exactly this expression has exactly this type.

I know that you probably want to make an objection. It is not the first time that the error is related to the sizeof () operator, i.e. the programmer makes an error evaluating the buffer size. However, the cause of these errors is still memset() function. It works in such a way that doing these evaluations you can easily make an error.

Example N7 (project WinSCP)

TForm * __fastcall TMessageForm::Create(....)
{
  ....
  LOGFONT AFont;
  ....   
  memset(&AFont, sizeof(AFont), 0);
  ....
}

Memset() function absorbs everything. That’s why it is all right if you confuse the 2nd and the 3rd argument. This is exactly what happened here. This function fills 0 bytes.

Example N8 (Multi Theft Auto project)

Here is another similar error. Win32 API developers were joking when they were writing such a macro:

#define RtlFillMemory(Destination,Length,Fill) \
  memset((Destination),(Fill),(Length))

According to the meaning, it’s like an alternative to the memset(). But you have to be careful. Note that the 2nd and 3rd argument change their places.

Sometimes when people start using RtlFillMemory(), they treat it as memset(), and think that they have the same parameters. But as a result they get more bugs.

#define FillMemory RtlFillMemory
LPCTSTR __stdcall GetFaultReason ( EXCEPTION_POINTERS * pExPtrs )
{
  ....
  PIMAGEHLP_SYMBOL pSym = (PIMAGEHLP_SYMBOL)&g_stSymbol ;
  FillMemory ( pSym , NULL , SYM_BUFF_SIZE ) ;
  ....
}

NULL is nothing but a 0. That’s why the memset() function filled 0 bytes.

Example N9 (IPP Samples project)

I think you understand that I can provide a large list of the errors we’ve found. However, it won’t be very interesting, because it’s boring to look at the same errors, most of which you’ve already heard about. But let’s look at just one more case.

Although some of the errors given above were found in the C++ code, they don’t have anything to do with C++. In other words, these programming errors are related to the C language style.

The following error is connected with improper use of the memset() in a C++ program. The example is quite a long one, so you don’t have to look too thoroughly at it. Read the description below and everything will become clear.

class _MediaDataEx {
  ...
  virtual bool TryStrongCasting(
    pDynamicCastFunction pCandidateFunction) const;
  virtual bool TryWeakCasting(
    pDynamicCastFunction pCandidateFunction) const;
};

Status VC1Splitter::Init(SplitterParams& rInit)
{
  MediaDataEx::_MediaDataEx *m_stCodes;
  ...
  m_stCodes = (MediaDataEx::_MediaDataEx *)
    ippsMalloc_8u(START_CODE_NUMBER*2*sizeof(Ipp32s)+
                  sizeof(MediaDataEx::_MediaDataEx));
  ...
  memset(m_stCodes, 0, 
    (START_CODE_NUMBER*2*sizeof(Ipp32s)+
    sizeof(MediaDataEx::_MediaDataEx)));
  ...
}

Memset() function is used to initialize an array consisting of class objects. The biggest trouble is that the class has virtual functions. Thereafter, the memset() function zeroes out not only the class fields, but the pointer to the virtual methods chart (vptr) as well. What it will lead to is a good question, but there is nothing positive in coding in such a way. It’s no good using the classes like this.

Conclusion

As you can see, the memset() function has an extremely tricky interface. This function provokes way more bugs than the other ones. Be careful!

Note

Related article “memset is Evil“.

By Andrey Karpov

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s