Don’t do the compiler’s job

Consider the code fragment, taken from MySQL project. The code contains an error that analyzer diagnoses in the following way: V525. The code containing the collection of similar blocks. Check items ‘0’, ‘1’, ‘2’, ‘3’, ‘4’, ‘1’, ‘6’ in lines 680, 682, 684, 689, 691, 693, 695.

static int rr_cmp(uchar *a,uchar *b)
{
  if (a[0] != b[0])
    return (int) a[0] - (int) b[0];
  if (a[1] != b[1])
    return (int) a[1] - (int) b[1];
  if (a[2] != b[2])
    return (int) a[2] - (int) b[2];
  if (a[3] != b[3])
    return (int) a[3] - (int) b[3];
  if (a[4] != b[4])
    return (int) a[4] - (int) b[4];
  if (a[5] != b[5])
    return (int) a[1] - (int) b[5];     <<<<====
  if (a[6] != b[6])
    return (int) a[6] - (int) b[6];
  return (int) a[7] - (int) b[7];
}

Explanation

This is a classic error, related to copying fragments of code (Copy-Paste). Apparently, the programmer copied the block of code “if (a[1] != b[1]) return (int) a[1] – (int) b[1];”. Then he started changing the indices and forgot to replace “1” with “5”. This resulted in the comparison function occasionally returning an incorrect value; this issue is going to be difficult to notice. And it’s really hard to detect since all the tests had not revealed it before we scanned MySQL with PVS-Studio.

когда ищешь ошибку в коде.gif

Correct code

if (a[5] != b[5])
  return (int) a[5] - (int) b[5];

Recommendation

Although the code is neat and easy-to-read, it didn’t prevent the developers from overlooking the error. You can’t stay focused when reading code like this because all you see is just similar looking blocks, and it’s hard to concentrate the whole time.

These similar blocks are most likely a result of the programmer’s desire to optimize the code as much as possible. He “unrolled the loop” manually. It wasn’t a good idea in this case.

Firstly, It’s hard to say that the programmer really achieved anything with it. Modern compilers are pretty smart, and are very good at automatic loop unrolling if it can help improve program performance.

Secondly, the bug appeared in the code because of this attempt to optimize the code. If you write a simpler loop, there will be less chance of making a mistake.

I’d recommend rewriting this function in the following way:

static int rr_cmp(uchar *a,uchar *b)
{
  for (size_t i = 0; i < 7; ++i)
  {
    if (a[i] != b[i])
      return a[i] - b[i]; 
  }
  return a[7] - b[7];
}

Advantages:

  • The function is easier to read and comprehend.
  • You are much less likely to make a mistake writing it.

I am quite sure that this function will work no slower than its longer version.

So, my advice would be – write simple and understandable code. As a rule, simple code is usually correct code. Don’t try to do the compiler’s job – unroll loops, for example. The compiler will most definitely do it well without your help. Doing such fine manual optimization work would only make sense in some particularly critical code fragments, and only after the profiler has already estimated those fragments as problematic.

Written by Andrey Karpov.

This error was found with PVS-Studio static analysis tool.

One thought on “Don’t do the compiler’s job

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s