Avoid using multiple small #ifdef blocks

The fragment is taken from the CoreCLR project. The error is detected by the following diagnostic: V522 Dereferencing of the null pointer ‘hp’ might take place.

heap_segment* gc_heap::get_segment_for_loh (size_t size
#ifdef MULTIPLE_HEAPS
                                           , gc_heap* hp
#endif //MULTIPLE_HEAPS
                                           )
{
#ifndef MULTIPLE_HEAPS
    gc_heap* hp = 0;
#endif //MULTIPLE_HEAPS
    heap_segment* res = hp->get_segment (size, TRUE);
    if (res != 0)
    {
#ifdef MULTIPLE_HEAPS
        heap_segment_heap (res) = hp;
#endif //MULTIPLE_HEAPS
  ....
}

Explanation

We believe that #ifdef/#endif constructs are evil – an unavoidable evil, unfortunately. They are necessary and we have to use them. So we won’t urge you to stop using #ifdef, there’s no point in that. But we do want to ask you to be careful to not “overuse” it.


Perhaps, many of you have seen code literally stuffed with #ifdefs. It’s especially painful to deal with code where #ifdef is repeated every ten lines, or even more often. Such code is usually system-dependent, and you can’t do without using #ifdef in it. That doesn’t make you any happier, though.

See how difficult it is to read the code sample above! And reading code is what programmers do as their basic activity. Yes, we do mean it. We spend much more time reviewing and studying existing code than writing new code. That’s why code which is hard to read reduces our efficiency so much, and leaves more chance for new errors to sneak in.

Getting back to our code fragment, the error is a null pointer dereference, and it occurs when the MULTIPLE_HEAPS macro is not defined. To make it easier for you, let’s expand the macros:

heap_segment* gc_heap::get_segment_for_loh (size_t size)
{
  gc_heap* hp = 0;
  heap_segment* res = hp->get_segment (size, TRUE);
  ....

The programmer declared the hp variable, initialized it to NULL, and dereferenced it right off. If MULTIPLE_HEAPS hasn’t been defined, we’ll get into trouble.

Correct code

As of 12.04.2016 this error is still present in CoreCLR, even though a colleague of mine reported it in the article “25 Suspicious Code Fragments in CoreCLR”. So we aren’t sure how best to fix it.

Since hp == nullptr in this configuration, the ‘res’ variable should be initialized to some other value, too, but we don’t know what value exactly. So we’ll have to do without the fix this time.

Recommendations

Eliminate small #ifdef/#endif blocks from your code – they make it really hard to read and understand! Code with “woods” of #ifdefs is harder to maintain and more prone to mistakes.

There is no recommendation to suit every possible case – it all depends on the particular situation. Anyway, just remember that #ifdef is a source of trouble, so you must always strive to keep your code as clear as possible.

Tip N1. Try to do without #ifdef.

#ifdef can sometimes be replaced with constants and an ordinary if statement. Compare the following two code fragments. A variant with macros:

#define DO 1

#ifdef DO
static void foo1()
{
  zzz();
}
#endif //DO

void F()
{
#ifdef DO
  foo1();
#endif // DO
  foo2();
}

This code is hard to read; you don’t even feel like reading it. Bet you’ve skipped it, haven’t you? Now compare it to the following:

const bool DO = true;

static void foo1()
{
  if (!DO)
    return;
  zzz();
}

void F()
{
  foo1();
  foo2();
}

It’s much easier to read now. Some may argue the code has become less efficient since there is now a function call and a check in it. But we don’t agree with that. First, modern compilers are pretty smart and you are very likely to get the same code without any extra checks and function calls in the release version. Second, the potential performance losses are too small to be bothered about. Neat and clear code is more important.

Tip N2. Make your #ifdef blocks larger.

If we were to write the get_segment_for_loh() function, we wouldn’t use a number of #ifdefs there; we’d make two versions of the function instead. True, there’d be a bit more text then, but the functions would be easier to read, and edit too.

Again, some may argue that it’s duplicated code, and since they have lots of lengthy functions with #ifdef in each, having two versions of each function may cause them to forget about one of the versions when fixing something in the other.

Hey, wait! Why are your functions lengthy in the first place? Move the common logic into separate auxiliary functions. Then both versions of each function will become shorter, and you will easily spot any differences between them.

We know this tip is not a cure-all. But do think about it.
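
To make Tip N2 more concrete, here is a minimal sketch of the two-versions layout. It is not a fix for CoreCLR and does not use its real types: Heap, Segment, acquire_segment(), the_only_heap() and get_segment_for_large_object() are all hypothetical names invented for this illustration. The shared logic goes into one helper, and a single large #ifdef block holds the two short versions of the function.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for illustration; these are not the CoreCLR types.
struct Heap;

struct Segment {
    std::size_t size = 0;
    Heap* owner = nullptr;   // only filled in by the multi-heap version
};

struct Heap {
    Segment storage;         // one segment is enough for the sketch
};

// Shared logic lives in one helper with no #ifdef inside it.
inline Segment* acquire_segment(Heap& heap, std::size_t size) {
    assert(size > 0 && "caller must request a non-empty segment");
    heap.storage.size = size;
    return &heap.storage;
}

// One large conditional block instead of several small ones.
#ifdef MULTIPLE_HEAPS

// Multi-heap build: the caller says which heap to use.
inline Segment* get_segment_for_large_object(std::size_t size, Heap* hp) {
    if (hp == nullptr) return nullptr;
    Segment* res = acquire_segment(*hp, size);
    res->owner = hp;
    return res;
}

#else

// Single-heap build: there is exactly one heap, so no pointer is needed.
inline Heap& the_only_heap() {
    static Heap heap;
    return heap;
}

inline Segment* get_segment_for_large_object(std::size_t size) {
    return acquire_segment(the_only_heap(), size);
}

#endif // MULTIPLE_HEAPS
```

With this layout the single-heap version has no pointer that could be null, and the difference between the two versions fits on one screen.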

Tip N3. Consider using templates – they might help.
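
As one possible reading of Tip N3, the sketch below selects between two implementations with a bool template parameter instead of sprinkling #ifdef through the function bodies. All the names here (Heap, SegmentSource, kMultipleHeaps, DefaultSegmentSource) are our own assumptions made up for the example, not anything from CoreCLR.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type for illustration only.
struct Heap {
    std::size_t allocated = 0;
};

// Primary template: the single-heap flavour owns its one heap.
template <bool MultipleHeaps>
class SegmentSource {
public:
    std::size_t take(std::size_t size) {
        assert(size > 0);
        heap_.allocated += size;
        return heap_.allocated;
    }
private:
    Heap heap_;
};

// Specialization: the multi-heap flavour works on a heap supplied by the caller.
template <>
class SegmentSource<true> {
public:
    explicit SegmentSource(Heap& heap) : heap_(&heap) {}
    std::size_t take(std::size_t size) {
        assert(size > 0);
        heap_->allocated += size;
        return heap_->allocated;
    }
private:
    Heap* heap_;  // set from a reference in the constructor, so never null
};

// The build configuration is consulted in exactly one place.
#ifdef MULTIPLE_HEAPS
constexpr bool kMultipleHeaps = true;
#else
constexpr bool kMultipleHeaps = false;
#endif

using DefaultSegmentSource = SegmentSource<kMultipleHeaps>;
```

A side benefit of this approach is that both flavours are always seen by the compiler, so a syntax error in the less frequently built configuration does not go unnoticed.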

Tip N4. Take your time and think it over before using #ifdef. Maybe you can do without it? Or maybe you can do with fewer #ifdefs, and keep this “evil” in one place?
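
Here is a small sketch of what keeping the “evil” in one place could look like. The only conditional block defines a constant, and the rest of the code is ordinary and unconditional. The names kPathSeparator and join_path() are hypothetical and exist only for this example; _WIN32 is the usual predefined macro on Windows compilers.

```cpp
#include <cassert>
#include <string>

// The only place that consults the platform macro.
#ifdef _WIN32
constexpr char kPathSeparator = '\\';
#else
constexpr char kPathSeparator = '/';
#endif

// Everything below is plain code without any #ifdef.
inline std::string join_path(const std::string& dir, const std::string& file) {
    assert(!file.empty());
    if (dir.empty()) return file;
    if (dir.back() == kPathSeparator) return dir + file;
    return dir + kPathSeparator + file;
}
```

If another platform-specific detail appears later, it goes next to kPathSeparator rather than into the body of join_path().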

Written by Andrey Karpov.

This error was found with the PVS-Studio static analysis tool.
