Every developer knows what a debugger, a version control system, or unit tests are. Yet not all developers are familiar with static code analysis, even though it is becoming an integral part of the software development cycle. I would like to offer a short introduction for anyone interested in modern development trends.
The earlier an error is detected, the cheaper it is to fix. That is why methodologies such as TDD (test-driven development) have emerged, in which tests are written before the functions they exercise are implemented.
Another methodology that catches errors at an early stage is code review. Roughly speaking, after writing code, a developer shows it to a colleague, who checks it. Of course, this informal approach is dated. A full-fledged code review is a whole process, which is well described, for example, in Steve McConnell's book “Code Complete”. By the way, anyone who calls themselves a good developer simply must read this book.
This is where the code review methodology starts to let us down. More precisely, the methodology still works well, but it is becoming more expensive. When was the last time a group of four developers honestly read your fresh functions, gave recommendations, and then gathered again to look at the revised code? Have you ever reviewed code that way, even once?
The reason for the high cost is the growth in the amount of code and the disproportionately fast growth in the effort a person needs to analyze it. As a project grows, its complexity and the number of errors grow non-linearly.
20-30 years ago, you could simply review all of the code, but now that is unacceptably difficult and expensive. To illustrate, here are two numbers:
- The number of lines of code in the first C++ compiler (Cfront 1.0): 85 KLOC
- The number of lines of code in the modern Clang compiler (excluding LLVM): 1700 KLOC
This is where static code analysis comes to the rescue. The idea is that a program, not a person, performs the code review. Of course, a program will check code less thoroughly than four pairs of attentive eyes. Unfortunately, those attentive eyes are not always available: there is not enough time, and there are not enough developers, to read the code. That makes static analyzers a well-justified alternative.
Yes, there are many bugs that static analyzers cannot find. They are complicated expert systems, not artificial intelligence. On the other hand, they never get tired and always have time to check your code. Moreover, they embody a huge amount of knowledge about error patterns and can detect an error that a developer may not even know about. So sometimes static analysis even outperforms the developer.
Let me give an example for C++ developers. Suppose we have a loop organized with iterators. If the container is modified inside the loop, the iterators may become invalid, and using them afterward is an error: the program has undefined behavior. Programming books warn about this. The C++11 standard gave developers the range-based for loop. It is a less familiar construct, and as a result there is a good chance that someone will erase elements from the container inside the body of such a loop.
We must not write such code. To implement a range-based for loop, the compiler uses the very same iterators; they are simply hidden from view. After items are removed from the container, those iterators become invalid.
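As a minimal sketch of the pattern described above (a hypothetical illustration, not the fragment from any real project), the broken loop is kept in a comment because executing it is undefined behavior. The function names `removeEmpty` and `removeEmptyLoop` are invented for this example; they show two common safe alternatives for `std::vector`.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// BROKEN pattern, kept as a comment because executing it is undefined behavior:
//
//   for (const auto& name : names)
//       if (name.empty())
//           names.erase(std::find(names.begin(), names.end(), name));
//
// erase() invalidates the hidden iterator that the range-based for is advancing.

// Safe option 1: the erase-remove idiom. The container's size changes only
// once, after std::remove_if has finished its pass over the elements.
void removeEmpty(std::vector<std::string>& names) {
    names.erase(std::remove_if(names.begin(), names.end(),
                               [](const std::string& s) { return s.empty(); }),
                names.end());
    // Postcondition: no empty strings are left.
    assert(std::none_of(names.begin(), names.end(),
                        [](const std::string& s) { return s.empty(); }));
}

// Safe option 2: an explicit iterator loop that continues from the
// iterator returned by erase() instead of reusing the invalidated one.
void removeEmptyLoop(std::vector<std::string>& names) {
    for (auto it = names.begin(); it != names.end(); ) {
        if (it->empty())
            it = names.erase(it);  // erase returns the next valid iterator
        else
            ++it;
    }
}
```

Both functions leave the non-empty strings in their original order. The first is usually the shorter choice for sequence containers; the second generalizes to containers such as `std::list` or `std::map`, where `erase` also returns the next valid iterator.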
A static code analyzer knows this error pattern and detects it. An error of exactly this kind was found with the PVS-Studio analyzer in the code of the ClickHouse DBMS, which Yandex develops to meet the needs of Yandex.Metrica. You can read more about it in the article: https://pvs-studio.com/en/blog/posts/cpp/0529/.
There are many static code analyzers for various programming languages; a web search will help you find and get to know them. I suggest paying attention to one popular tool: PVS-Studio. It is a powerful static analyzer that detects bugs and potential vulnerabilities in C, C++, and C# code. It runs on Windows and Linux. It is a paid product, but free licensing options exist.
The tool is good at finding null pointer dereferences, undefined behavior, 64-bit errors, and so on. It is particularly good at spotting typos and copy-paste mistakes. People often claim that such errors can be found in 5 seconds. So the authors even created a resource to tease those who say so.
Skeptics are invited to find errors (which PVS-Studio finds) not in 5 seconds but in a full 60. You can try it yourself: https://pvs-studio.com/en/blog/posts/0280/. A word of warning: the test does not work properly on mobile devices and requires a computer mouse.
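To give a flavor of the copy-paste typos mentioned above, here is a small invented example (the `Point3` type and both functions are hypothetical and not taken from any real project). The last comparison in `sameBuggy` was duplicated from the previous one and never updated, so the `z` coordinate is silently ignored. Analyzers commonly report this kind of code because the same sub-expression appears on both sides of `&&`.

```cpp
#include <cassert>

struct Point3 {
    int x, y, z;
};

// Buggy: the third comparison was copy-pasted and still checks .y,
// so two points that differ only in z are reported as equal.
bool sameBuggy(const Point3& a, const Point3& b) {
    return a.x == b.x && a.y == b.y && a.y == b.y;
}

// Fixed: every coordinate is compared exactly once.
bool sameFixed(const Point3& a, const Point3& b) {
    return a.x == b.x && a.y == b.y && a.z == b.z;
}

// Small self-check showing how the two versions differ.
void demonstrateTypo() {
    const Point3 p{1, 2, 3};
    const Point3 q{1, 2, 4};   // differs from p only in z
    assert(sameBuggy(p, q));   // the typo hides the difference
    assert(!sameFixed(p, q));  // the corrected version sees it
}
```

The code compiles cleanly and looks plausible at a glance, which is exactly why such mistakes tend to survive a quick human review.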
Static analysis does not compete with other methods of fighting errors; it complements them. A static analysis tool's report may resemble compiler warnings, but the analysis behind it is of a much higher quality. That power is worth paying for. Here is an analogy: Paint and GIMP exist, yet Photoshop and CorelDRAW are in great demand. Specialized tools not only perform deep code analysis but also provide many supporting mechanisms for working with warnings.
For those who want to learn more, a web search for ‘static code analysis’ is a good starting point.