If there’s one thing C is known and (in)famous for, it’s the ease of shooting yourself in the foot with it. And there’s indeed no denying that the freedom C offers comes with the price of making it our own responsibility to tame and keep the language under control. On the bright side, since the language’s flaws are so well known, we have a wide selection of tools available that help us to eliminate the most common problems and blunders that could come back to bite us further down the road. The catch is, we have to really want it ourselves, and actively listen to what the tools have to say.
We often look at this from a security point of view and focus on exploitable vulnerabilities, which you may not see as a valid threat or something you need to worry about in your project. And you are probably right about that: not every flaw in your code will lead to attackers taking over your network or burning down your house. The far more likely consequences are a lot more mundane and boring. But that doesn’t mean you shouldn’t care about them.
Buggy, unreliable software is the number one cause for violence against computers, and whether you like it or not, people will judge you by your code quality. Just because Linus Torvalds wants to get off Santa’s naughty list doesn’t mean the technical field will suddenly become less critical or lose its hostility, and at a time when it has never been easier to share your work with the world, reliable, high-quality code will prevail and make you stand out from the masses.
To be fair, it’s a different story with some quick-and-dirty hacks, or one-off proof of concept projects put together over the weekend. We can easily get away with a lot more in those cases. But once we step out of the initial prototype phase and aim for longer-term development with growing complexity, we shouldn’t have to worry about the fundamental, easily preventable quirks and pitfalls of either the language or programming itself. Static code analysis can help us find and counter them, and again, since it’s such a common problem, not only with C, we have a large variety of tools at our disposal.
The one tool that often gets overlooked here, however, is the compiler itself, in particular its warnings. With open source in mind, let’s see how we can utilize gcc and clang to increase our code quality by listening to what they have to say about our code.
Compiler Warnings
Unlike compiler errors, which indicate actual language violations that prevent the compiler from fully processing the source code, warnings indicate that something doesn’t seem quite right with our code even though it may be syntactically correct. Compiler warnings are like a good friend eager to give us well-meant advice. Sometimes they warn of a deeper problem, but sometimes they’re superficial. We have essentially three options to deal with warnings: ignore them, hide them, or fix the actual issue in place.
Based on the introduction to this article, it may seem like this won’t be about the first two options, but the reality is a lot less black-and-white. With C running on anything from tiny microcontrollers to large server farms, not every warning is relevant or applicable to our situation, and while focusing on the right warnings can save us from misery, others might just waste our time.
Compiler Flags For Warnings
By default, compilers may not be too talkative about warnings, and mostly worry about immediate code problems such as division by zero or suspicious pointer conversions. It’s up to us to enable more warnings by passing flags to the compiler, and both gcc and clang offer a long (and mostly compatible) list of such flags.
To enable a given warning, we pass -Wflag, and to disable it, we pass -Wno-flag. If we are particularly serious about a warning, -Werror=flag turns it into an error, forcing the compiler to abort its job. Enabling every possible flag on its own seems like tedious work, and it is. The good news is that both gcc and clang offer the additional meta flags -Wall and -Wextra, each of which enables a separate set of common warning flags. The bad news is that these names are misleading: gcc -Wall is far from enabling all warnings, and even gcc -Wall -Wextra still leaves out some of them. The same is true for clang, but it offers an additional -Weverything flag that really does warn about everything, giving us both an opt-in and opt-out approach.
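The same per-warning control is also available from inside a source file. As a minimal sketch, both gcc and clang understand the GCC diagnostic pragma, whose ignored, warning, and error forms roughly mirror -Wno-flag, -Wflag, and -Werror=flag for a region of code. The function name doubled() is made up for illustration:

```c
/* Hypothetical helper, used only to illustrate scoped warning control. */

#pragma GCC diagnostic push
/* Local counterpart of -Wno-unused-variable: applies until the matching pop. */
#pragma GCC diagnostic ignored "-Wunused-variable"

int doubled(int value)
{
    int scratch = value;    /* intentionally unused, would warn under -Wall */

    return value * 2;
}

/* Restore whatever was requested on the command line. */
#pragma GCC diagnostic pop
```

Wrapping the whole function between push and pop keeps the exception local, so the rest of the file is still checked with the flags given on the command line.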
But enough with theory, time to look at some bad code:
#include <stdio.h>

enum count {NONE, ONE, OTHER};

int main(int argc, char **argv)
{
    int ret;
    enum count count = argc - 1;  // [1] signedness silently changed

    switch (count) {
        case NONE:
            printf("%s\n", argc); // [2] integer printed as string
            ret = 1;
            // [3] missing break
        case ONE:
            printf(argv[1]);      // [4] format string is not a string literal
            ret = 2;
            break;
    }   // [5] enum value OTHER not considered
        // [6] no default case

    return ret;                   // [7] possibly uninitialized value
}
Style and taste aside, and depending on your own definition and philosophy, we could address around seven issues in these few lines, with passing an int parameter to a %s format possibly being the most obvious one. Let’s take a closer look.
Some of these issues may not cause any harm in our specific case, for example turning a signed integer into an unsigned one ([1]), but in other scenarios they might. Printing an integer when a string is expected, on the other hand, is undefined behavior and most likely ends in a segfault ([2]). Because of the missing break statement ([3]), execution continues into the next case and passes argv[1] to printf() even when there’s no value set for it. Looking back at pointer-to-pointer arrangements, argv[argc] is NULL, which causes another instance of undefined behavior here. Accepting unsanitized user input is a bad idea on its own, and passing it to printf() as the format opens the door to format string attacks ([4]). Not handling every enum value ([5]) and leaving out a default case in the switch ([6]) results in ret being returned uninitialized when we have more than one command line argument ([7]).
By default, on x86_64, clang 7.0.0 warns us about three of them ([2], [4], and [5]), while gcc 8.2.1 waves us happily through without any warning whatsoever. We clearly have to step up our warning flag game here. Let’s see how the meta flags -Wall, -Wextra, and clang’s -Weverything handle it. Note that -Wextra rarely makes sense on its own and is better considered an addition to -Wall.
These are some sobering results. Unfortunately, clang doesn’t care about implicit fall through (i.e., a missing break inside a switch) in C, only when compiling C++ code, and it generally prefers complete enum handling over default cases. With gcc, on the other hand, it’s obvious we need to take a closer look at the manual. Here’s how we can get almost all warnings:
$ gcc -Wall -Wextra -Wformat-security -Wswitch-default
Well, now that we finally have the warnings printed, it is time to get rid of them again.
Fixing your Warnings
After eliminating all the warnings, our main() function could look like this:
int main(int argc, char **argv)
{
    int ret = -1;
    enum count count = (enum count) (argc - 1);

    switch (count) {
        case NONE:
            printf("%d\n", argc);
            ret = 1;
            break;
        case ONE:
            printf("%s\n", argv[1]);
            ret = 2;
            break;
        case OTHER:
            // do nothing
            break;
    }

    return ret;
}
Careful though: assigning an unchecked integer to an enum is usually not the best idea, even more so when it’s derived from argc. Also, in this example, the missing break in the switch was a real mistake, but in other cases you might want to purposely fall through a case. Replacing the break with a /* fall through */ comment (or similar spelling variations) will keep gcc happy, and will remind you and fellow developers that this is intentional.
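Both caveats can be sketched in a few lines. The helper names count_from_argc() and score() below are hypothetical and not part of the original example. The first maps argc to the enum through explicit range checks instead of a cast, and the second contains one deliberate fall through marked with the comment gcc recognizes:

```c
enum count {NONE, ONE, OTHER};

/* Hypothetical helper: map argc to the enum through explicit range checks. */
enum count count_from_argc(int argc)
{
    if (argc <= 1)
        return NONE;
    if (argc == 2)
        return ONE;

    return OTHER;
}

/* Hypothetical scoring function with one deliberate fall through. */
int score(enum count count)
{
    int ret = 0;

    switch (count) {
    case OTHER:
        ret += 10;
        /* fall through */
    case ONE:
        ret += 1;
        break;
    case NONE:
        break;
    }

    return ret;
}
```

Note that the comment has to sit directly in front of the next case label to be picked up, and that every enum value is listed, so neither the switch nor the conversion leaves anything for the compiler to guess.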
Sometimes it’s a bit trickier to get rid of the warnings though. Take the following example that involves a function pointer:
// callback function pointer
void (*some_callback)(int, void *);

// callback implementation
void dummy_callback(int i, void *ptr)
{
    // do something with i but not with ptr
}
With -Wall -Wextra, the compiler will complain that the ptr parameter is unused. But what do you do if you simply have no use for it, yet the callback declaration insists on having it? Well, you can either cast the parameter to void, or dive into the world of compiler attributes. Here’s what either one looks like:
void dummy_callback(int i, void *ptr)
{
    (void) ptr;
    // do something with i but not with ptr
}

void dummy_callback(int i, void *ptr __attribute__((unused)))
{
    // do something with i but not with ptr
}
Which one you choose is up to you; the compiler will generate the same code either way. But let’s stick with the attributes for now.
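To see the two variants side by side in something that can actually be compiled and called, here is a small sketch. The names cast_callback(), attr_callback(), run_callback(), and last_value are assumptions made for this illustration; only some_callback comes from the example above:

```c
#include <stddef.h>

/* callback function pointer, as in the example above */
void (*some_callback)(int, void *);

/* Records the last value a callback saw, so the effect is observable. */
int last_value;

/* variant 1: cast the unused parameter to void */
void cast_callback(int i, void *ptr)
{
    (void) ptr;
    last_value = i;
}

/* variant 2: mark the parameter with the unused attribute */
void attr_callback(int i, void *ptr __attribute__((unused)))
{
    last_value = i;
}

/* Hypothetical helper: install a callback, invoke it, report what it stored. */
int run_callback(void (*callback)(int, void *), int i)
{
    some_callback = callback;
    some_callback(i, NULL);

    return last_value;
}
```

Both callbacks have the same signature and can be installed interchangeably, so the choice is purely about how the intent is documented in the source.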
Compiler Attributes
We just saw that we can use special compiler attributes to individually suppress warnings without disabling them completely. At the risk of sounding indecisive, we can also use attributes to get new warnings from the compiler. Yes, we just got rid of them, so why would we want to get new ones? Simple: once we have a warning culture established, we can use it to inform other developers (including our future selves) how we want certain functions or data treated.
Let’s say we write a module for a bigger software project where one specific function returns a value that decides the fate of the whole module. We’d want to make sure that whoever calls that function won’t get away with ignoring that value. Using __attribute__((warn_unused_result)), we can let the compiler be the messenger.
int specific_function(void) __attribute__((warn_unused_result));
...
specific_function();            // compiler warns about unused return value
(void) specific_function();     // nice try, gcc still warns (clang accepts the cast)
int ret = specific_function();  // now we're good
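As a slightly more concrete sketch of where this attribute pays off, consider a parser whose return value is the only way to tell success from failure. The functions parse_port() and port_or_default() are hypothetical names chosen for this illustration:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical API: returns 0 on success, -1 if text is not a valid port. */
int parse_port(const char *text, int *port) __attribute__((warn_unused_result));

int parse_port(const char *text, int *port)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);

    /* Reject overflow, empty input, and trailing garbage. */
    if (errno != 0 || end == text || *end != '\0')
        return -1;
    if (value < 1 || value > 65535)
        return -1;

    *port = (int) value;
    return 0;
}

/* Hypothetical caller that honors the contract by branching on the result. */
int port_or_default(const char *text, int fallback)
{
    int port;

    if (parse_port(text, &port) != 0)
        return fallback;

    return port;
}
```

A caller that drops the result of parse_port() would go on to read an uninitialized port, which is exactly the kind of mistake the attribute is meant to flag at compile time.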
In a similar way, we can use two attributes to control our pointers, especially when it comes to possible NULL pointers, and generate even more warnings.
// declare that the first parameter mustn't be NULL
void some_function(char *, int) __attribute__((nonnull(1)));

// declare that return value can't be NULL
char *another_function(void) __attribute__((returns_nonnull));
While this allows some compiler optimization, it also defines a contract for the developer: NULL is not a valid option for this parameter, and you don’t have to worry about NULL being returned.
some_function(NULL, 0xff); // warning that parameter is NULL

if (another_function() == NULL) {
    // warning that check is always false
}
At the same time, if we actually returned NULL in another_function(), we’d get a warning about it, too. Not that we can really enforce anything with it; in the end, these are just warnings without any consequences.
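For completeness, here is a sketch of what implementations that keep both promises might look like. The functions count_spaces() and level_name() are hypothetical stand-ins and not the ones declared above:

```c
#include <stddef.h>

/* Hypothetical functions; the names are not from the examples above. */

/* The first parameter mustn't be NULL. */
size_t count_spaces(const char *text) __attribute__((nonnull(1)));

/* The return value is never NULL. */
const char *level_name(int level) __attribute__((returns_nonnull));

size_t count_spaces(const char *text)
{
    size_t spaces = 0;

    /* No NULL check here: the attribute lets the compiler assume it away. */
    for (; *text != '\0'; text++) {
        if (*text == ' ')
            spaces++;
    }

    return spaces;
}

const char *level_name(int level)
{
    static const char *const names[] = {"debug", "info", "error"};

    /* Fall back to a string literal instead of NULL for unknown levels. */
    if (level < 0 || level >= (int) (sizeof(names) / sizeof(names[0])))
        return "unknown";

    return names[level];
}
```

The interesting design decision is in level_name(): once the attribute is in place, every return path has to produce a real string, so the out-of-range case gets a placeholder rather than NULL.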
Adding Consequences
If you are really serious about warnings, you can decide to turn all of them into errors with the -Werror flag. You might want to reconsider that though: not every warning is fatal and needs to be addressed immediately. Sometimes you just want to get your idea out of your head and into the editor, see how that turns out, and leave the clean-up for later. A useful approach is to split your build configuration: leave warnings as warnings during development, but tighten the rules once you build a release or merge to master.
Whether it really has to be -Werror is for you to decide, and there’s also the option to turn only individual warnings into errors. Let’s say we want to be strict about our NULL related attributes: -Wnonnull will enable the warnings, while -Werror=nonnull will enable the warnings and treat them as errors. Note that -Werror=flag implicitly sets -Wflag, so we don’t need to worry whether a warning is enabled or not: as soon as we turn it into an error, it will be there.
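If only one file should be this strict, the promotion can also be requested from within the source. The following is a sketch under the assumption that the GCC diagnostic pragma (understood by both gcc and clang) is acceptable in your code base; text_length() and greeting_length() are hypothetical names:

```c
#include <stddef.h>

/* In-source counterpart of -Werror=nonnull for the rest of this file. */
#pragma GCC diagnostic error "-Wnonnull"

/* Hypothetical function whose first parameter mustn't be NULL. */
size_t text_length(const char *text) __attribute__((nonnull(1)));

size_t text_length(const char *text)
{
    size_t length = 0;

    while (text[length] != '\0')
        length++;

    return length;
}

/* Hypothetical caller that passes a valid pointer and therefore compiles. */
size_t greeting_length(void)
{
    /* text_length(NULL) here would now stop the build instead of warning. */
    return text_length("hello");
}
```

Whether such a rule belongs in the source or in the build system is a matter of taste; the command line flag has the advantage of being easy to vary between development and release builds.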
Where to go from here
Unfortunately there are some shortcomings in our last examples. While the compiler will detect the scenarios demonstrated above, the checks are easily circumvented, whether purposely or by accident:
int ret = specific_function();  // return check ends here
                                // whether you actually use its value doesn't matter

char *ptr = NULL;
some_function(ptr, 0xff);       // only explicit NULL parameters raise warnings

ptr = another_function();
if (ptr == NULL) {
    // no more warning that it will always be false
    ...
}
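Since the compiler only catches the literal cases, pointers of unknown origin still need a check at run time. One possible arrangement, sketched here with the hypothetical functions name_length() and checked_name_length(), is to keep that check in the caller rather than in the attributed function itself:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical function whose first parameter mustn't be NULL. */
size_t name_length(const char *name) __attribute__((nonnull(1)));

size_t name_length(const char *name)
{
    return strlen(name);
}

/*
 * Hypothetical wrapper: the NULL check lives in the caller, because a check
 * inside name_length() could be optimized away due to the attribute.
 * Returns -1 for a NULL pointer, the length otherwise.
 */
long checked_name_length(const char *name)
{
    if (name == NULL)
        return -1;

    return (long) name_length(name);
}
```

This keeps the contract of the attributed function intact while still guarding the one place where untrusted pointers enter.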
Does that make listening to warnings futile? Absolutely not, and we almost always end up with better code if we actively eliminate them. However, don’t let the absence of warnings give you a false sense of security, either. Compiler warnings are our first line of defense, and they can help us improve our code quality, but they are no magic bullet. Making your code stable and reliable takes more than one tool. But then again, we have plenty more available for C, and next time we will have a look at more static code analysis using lint and its friends.