Vault7: CIA Hacking Tools Revealed
Owner: User #26968069
C Coding Conventions
Introduction
This style guide is intended to help standardize the code generated by the various teams around the guidelines used throughout Google and the general development community. Code of a consistent format leads to greater legibility.
When extending an existing project that does not already follow this guide, strive for consistency. Speak the local dialect. Aim for uniformity in approach, and make the job of reading your code easier for the next person (who may be future you).
This guide is not intended to be a book to be thrown at others during code review. Instead, if there is a question about how something should be formatted, this guide will be the objective reference on what code "should" look like. Much of the text here is from the Google C++ Style Guide. It has been adapted to suit our needs and preferences. This guide is not holy canon. When in doubt, do what makes the most sense for your particular project or use case.
Header Files
In general, every .c file should have an associated .h file. There are some common exceptions, such as unit tests and small .c files containing just a main() function.
Correct use of header files can make a huge difference to the readability, size and performance of your code.
The following rules will guide you through the various pitfalls of using header files.
Self-contained Headers
Header files should be self-contained and end in .h. Files that are meant for textual inclusion, but are not headers, should end in .inc. Separate -inl.h headers are disallowed.
All header files should be self-contained. In other words, users and refactoring tools should not have to adhere to special conditions in order to include the header. Specifically, a header should have header guards, should include all other headers it needs, and should not require any particular symbols to be defined.
There are rare cases where a file is not meant to be self-contained, but instead is meant to be textually included at a specific point in the code. Examples are files that need to be included multiple times or platform-specific extensions that essentially are part of other headers. Such files should use the file extension .inc.
If a template or inline function is declared in a .h file, define it in that same file. The definitions of these constructs must be included into every .c file that uses them, or the program may fail to link in some build configurations. Do not move these definitions to separate -inl.h files.
The #define Guard
All header files should have #define guards to prevent multiple inclusion. The format of the symbol name should be __<FILE>_H.
To guarantee uniqueness, they should be based on the full path in a project's source tree. For example, the file foo/src/bar/baz.h in project foo should have the following guard:
#ifndef __BAZ_H
#define __BAZ_H
...
#endif // __BAZ_H
Forward Declarations
You should #include files for their function declarations rather than attempting to avoid an include by copy-pasting a forward declaration.
Definition
A "forward declaration" is a declaration of a class, function, or template without an associated definition. When using a function declared in a header file, always #include that header.
Pros
- Avoids conflicts when code is refactored or prototypes are changed.
- It can be difficult to determine whether a forward declaration or a full #include is needed for a given piece of code, particularly when implicit conversion operations are involved. In extreme cases, replacing an #include with a forward declaration can silently change the meaning of code.
Cons
- Unnecessary #includes force the compiler to open more files and process more input.
- They can also force your code to be recompiled more often, due to changes in the header.
Please see Names and Order of Includes for rules about when to #include a header.
Inline Functions
Define functions inline only when they are small, say, 10 lines or fewer.
Definition
You can declare functions in a way that allows the compiler to expand them inline rather than calling them through the usual function call mechanism.
Pros
Inlining a function can generate more efficient object code, as long as the inlined function is small. Feel free to inline accessors and mutators, and other short, performance-critical functions.
Cons
Overuse of inlining can actually make programs slower. Depending on a function's size, inlining it can cause the code size to increase or decrease. Inlining a very small accessor function will usually decrease code size while inlining a very large function can dramatically increase code size. On modern processors smaller code usually runs faster due to better use of the instruction cache.
Decision
A decent rule of thumb is to not inline a function if it is more than 10 lines long.
Another useful rule of thumb: it's typically not cost effective to inline functions with loops or switch statements (unless, in the common case, the loop or switch statement is never executed).
It is important to know that functions are not always inlined even if they are declared as such; for example, virtual and recursive functions are not normally inlined. Usually recursive functions should not be inline.
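As a minimal sketch of the rule of thumb above, a short accessor is a reasonable candidate for inlining. The type packet_t and the function packet_payload_length are hypothetical names used only for illustration, and the sketch assumes total_length is never smaller than header_length.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical packet header, used only for this illustration.
typedef struct _packet {
    uint16_t header_length; // Length of the header in bytes.
    uint16_t total_length;  // Length of header plus payload in bytes.
} packet_t;

// Returns the payload length of a packet. The body is well under 10 lines
// and contains no loops or switch statements, so inlining it is cheap.
// Assumes total_length >= header_length.
static inline size_t
packet_payload_length(const packet_t *p_packet) {
    return (size_t)(p_packet->total_length - p_packet->header_length);
}
```

Because the function is declared static inline, it can live directly in the .h file that declares packet_t.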
Function Parameter Ordering
When defining a function, parameter order is: inputs, then outputs.
Parameters to C/C++ functions are either input to the function, output from the function, or both. Input parameters are usually values or const references, while output and input/output parameters will be non-const pointers. When ordering function parameters, put all input-only parameters before any output parameters. In particular, do not add new parameters to the end of the function just because they are new; place new input-only parameters before the output parameters.
This is not a hard-and-fast rule. Parameters that are both input and output (often classes/structs) muddy the waters, and, as always, consistency with related functions may require you to bend the rule.
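The ordering can be sketched with a hypothetical function, stats_sum, that takes two input-only parameters followed by one output parameter. The name and the error convention (0 on success, -1 on failure) are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

// Computes the sum of the values in an array.
// Inputs (p_values, count) come first; the output (p_sum) comes last.
// Returns 0 on success and -1 if either pointer is NULL.
int
stats_sum(const uint32_t *p_values, size_t count, uint64_t *p_sum) {
    uint64_t total = 0;
    size_t i = 0;

    if (p_values == NULL || p_sum == NULL) {
        return -1;
    }
    for (i = 0; i < count; i++) {
        total += p_values[i];
    }
    *p_sum = total;
    return 0;
}
```

If a new input such as a starting offset were added later, it would go before p_sum rather than at the end of the list.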
Names and Order of Includes
Use standard order for readability and to avoid hidden dependencies: related header, C library, other libraries' .h, your project's .h.
All of a project's header files should be listed as descendants of the project's source directory without use of the UNIX directory shortcuts . (the current directory) or .. (the parent directory). For example, awesome-project/src/base/logging.h should be included as:
#include "base/logging.h"
In dir/foo.c or dir/foo_test.c, whose main purpose is to implement or test the stuff in dir2/foo2.h, order your includes as follows:
- dir2/foo2.h.
- C system files.
- Other libraries' .h files.
- Your project's .h files.
With the preferred ordering, if dir2/foo2.h omits any necessary includes, the build of dir/foo.c or dir/foo_test.c will break. Thus, this rule ensures that build breaks show up first for the people working on these files, not for innocent people in other packages.
dir/foo.c and dir2/foo2.h are usually in the same directory (e.g. base/basictypes_test.c and base/basictypes.h), but may sometimes be in different directories too.
Within each section the includes should be ordered alphabetically. Note that older code might not conform to this rule and should be fixed when convenient.
You should include all the headers that define the symbols you rely upon (except in cases of forward declaration). If you rely on symbols from bar.h, don't count on the fact that you included foo.h which (currently) includes bar.h: include bar.h yourself, unless foo.h explicitly demonstrates its intent to provide you the symbols of bar.h. However, any includes present in the related header do not need to be included again in the related .c file (i.e., foo.c can rely on foo.h's includes).
For example, the includes in awesome-project/src/foo/internal/fooserver.c might look like this:
#include "foo/server/fooserver.h"
#include <sys/types.h>
#include <unistd.h>
#include "base/basictypes.h"
#include "base/commandlineflags.h"
#include "foo/server/bar.h"
Exception
Sometimes, system-specific code needs conditional includes. Such code can put conditional includes after other includes. Of course, keep your system-specific code small and localized. Example:
#include "foo/public/fooserver.h"
#include "base/port.h" // For LANG_CXX11.
#ifdef LANG_CXX11
#include <initializer_list>
#endif // LANG_CXX11
Scoping
Local Variables
Declare a function's variables at the function scope, and initialize variables in the declaration.
In particular, initialization should be used instead of declaration and assignment, e.g.:
int i;
i = MAGIC_CONSTANT; // Bad – initialization separate from declaration.
int j = 0; // Good – declaration has initialization.
char *v = NULL; // Good – v starts initialized.
Static and Global Variables
Avoid global variables whenever possible. When globals are necessary, give preference to a single structure with multiple members in lieu of many individual variables.
Have a clear, single point of initialization and destruction.
Do not rely on program exit to implicitly free memory or otherwise clean up after global or static variables.
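One way to follow these rules is sketched below: the module's global state lives in a single structure with one initialization function and one destruction function. The names logger_state_t, g_logger, logger_init, and logger_destroy are hypothetical and chosen only for this example.

```c
#include <stdlib.h>
#include <string.h>

// Hypothetical module state, grouped in one structure instead of
// several independent globals.
typedef struct _logger_state {
    char *prefix;  // Heap-allocated copy of the prefix; NULL when uninitialized.
    int verbosity; // 0 = quiet; larger values log more.
} logger_state_t;

static logger_state_t g_logger = { NULL, 0 };

// Single point of initialization. p_prefix must not be NULL.
// Returns 0 on success and -1 if memory cannot be allocated.
int
logger_init(const char *p_prefix, int verbosity) {
    size_t length = strlen(p_prefix) + 1;
    char *p_copy = malloc(length);

    if (p_copy == NULL) {
        return -1;
    }
    memcpy(p_copy, p_prefix, length);
    free(g_logger.prefix); // Release any earlier prefix; free(NULL) is a no-op.
    g_logger.prefix = p_copy;
    g_logger.verbosity = verbosity;
    return 0;
}

// Single point of destruction; safe to call more than once.
void
logger_destroy(void) {
    free(g_logger.prefix);
    g_logger.prefix = NULL;
    g_logger.verbosity = 0;
}
```

A caller would pair logger_init() at startup with logger_destroy() before exit, rather than relying on the operating system to reclaim the memory.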
Miscellaneous
Integer Types
When declaring a variable, use a precise-width type (e.g. int16_t). If your variable represents a value that could ever be greater than or equal to 2^31 (2GiB), use a 64-bit type such as int64_t. Keep in mind that even if your value won't ever be too large for an int, it may be used in intermediate calculations which may require a larger type. When in doubt, choose a larger type.
Definition
C does not specify exact sizes for its integer types, only minimum ranges. Typically people assume that short is 16 bits, int is 32 bits, long is 32 bits and long long is 64 bits.
Pros
- Uniformity of declaration.
- Clarity of intent and expected data size.
Cons
- The sizes of integral types in C can vary based on compiler and architecture.
Decision
<stdint.h> defines types like int16_t, uint32_t, int64_t, etc. You should always use those in preference to short, unsigned long long and the like, when you need a guarantee on the size of an integer. Of the C integer types, only int should be used. When appropriate, you are welcome to use standard types like size_t and ptrdiff_t. If your platform defines unsigned types with a prefix "u_" create a typedef for each type with just "u" (no underscore).
We use int very often, for integers we know are not going to be too big, e.g., loop counters. You may use plain old int for such things. You should assume that an int is at least 32 bits, but don't assume that it has more than 32 bits. If you need a 64-bit integer type, use int64_t or uint64_t.
For integers we know can be "big", use int64_t.
You should use unsigned integer types such as uint32_t, when the value should never be negative.
If your code is a container that returns a size, be sure to use a type that will accommodate any possible usage of your container. When in doubt, use a larger type rather than a smaller type.
Use care when converting integer types. Integer conversions and promotions can cause non-intuitive behavior.
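The following sketch illustrates one such non-intuitive case: operands narrower than int are promoted to int before arithmetic, so the intermediate result can be larger than the operand type can hold. The function names are hypothetical.

```c
#include <stdint.h>

// Both operands are promoted to int before the addition, so a + b can
// exceed 255. Converting the result back to uint8_t keeps only the low
// 8 bits of the sum.
uint8_t
add_truncating(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);
}

// Storing the same intermediate result in a wider type preserves it.
uint16_t
add_widening(uint8_t a, uint8_t b) {
    return (uint16_t)(a + b);
}
```

With a = 200 and b = 100, the intermediate sum is 300: add_truncating() yields 44 (300 modulo 256), while add_widening() yields 300.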
64-bit Portability
Code should be 64-bit and 32-bit friendly. Bear in mind problems of printing, comparisons, and structure alignment.
- printf() specifiers for some types are not cleanly portable between 32-bit and 64-bit systems. C99 defines some portable format specifiers. Unfortunately, MSVC 7.1 does not understand some of these specifiers and the standard is missing a few.
- Remember that sizeof(void *) is not necessarily equal to sizeof(int).
- Use the LL or ULL suffixes as needed to create 64-bit constants.
- If you really need different code on 32-bit and 64-bit systems, use #ifdef _LP64 to choose between the code variants. (But please avoid this if possible, and keep any such changes localized.)
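As a sketch of the printing point above, the C99 header <inttypes.h> provides format macros such as PRId64, and %zu formats a size_t. The function format_counts is a hypothetical name; older compilers that lack C99 support may not accept these specifiers.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

// Formats a 64-bit offset and a size into p_buffer using C99 portable
// format specifiers. Returns the value returned by snprintf().
int
format_counts(char *p_buffer, size_t buffer_size, int64_t offset, size_t length) {
    return snprintf(p_buffer, buffer_size,
                    "offset=%" PRId64 " length=%zu",
                    offset, length);
}
```

Calling format_counts(buffer, sizeof(buffer), 5000000000LL, 12) writes "offset=5000000000 length=12" on both 32-bit and 64-bit builds; note the LL suffix on the constant.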
Preprocessor Macros
Be very cautious with macros. Prefer inline functions, enums, and const variables to macros.
Macros mean that the code you see is not the same as the code the compiler sees. This can introduce unexpected behavior, especially since macros have global scope.
Instead of using a macro to inline performance-critical code, use an inline function. Do not use a macro to "abbreviate" a long variable name. Instead of using a macro to conditionally compile code ... well, don't do that at all (except, of course, for the #define guards to prevent double inclusion of header files). It makes testing much more difficult.
Macros can do things these other techniques cannot, and you do see them in the codebase, especially in the lower-level libraries. And some of their special features (like stringifying, concatenation, and so forth) are not available through the language proper. But before using a macro, consider carefully whether there's a non-macro way to achieve the same result.
The following usage pattern will avoid many problems with macros; if you use macros, follow it whenever possible:
- Be cautious when defining macros in a .h file.
- #define macros in a more local scope, when practical. For highly localized macros, #undef them right after.
- Do not just #undef an existing macro before replacing it with your own; instead, pick a name that's likely to be unique.
- Avoid using ## to generate function/member/variable names.
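One concrete reason to prefer an inline function over a macro is that a function-like macro may evaluate its argument more than once. The sketch below uses hypothetical names (DEMO_SQUARE, demo_square, next_value) to show the difference.

```c
// Macro version: the argument expression appears twice in the expansion,
// so any side effects in it happen twice.
#define DEMO_SQUARE(x) ((x) * (x))

// Inline function version: the argument is evaluated exactly once.
static inline int
demo_square(int x) {
    return x * x;
}

// Counts how many times next_value() has been called.
static int g_call_count = 0;

// Hypothetical helper with a side effect: increments the call counter
// and returns a fixed value.
static int
next_value(void) {
    g_call_count++;
    return 3;
}
```

Both DEMO_SQUARE(next_value()) and demo_square(next_value()) produce 9, but the macro calls next_value() twice while the function calls it once.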
0 and NULL
Use 0 for integers, 0.0 for reals, NULL for pointers, and '\0' for chars.
sizeof
Prefer sizeof(varname) to sizeof(type).
Use sizeof(varname) when you take the size of a particular variable. sizeof(varname) will update appropriately if someone changes the variable type either now or later. You may use sizeof(type) for code unrelated to any particular variable, such as code that manages an external or internal data format where a variable of an appropriate type is not convenient.
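A minimal sketch of the preferred form, using a hypothetical record_t type: the size is taken from the variable being cleared, so the call stays correct if the type of the variable later changes.

```c
#include <stdint.h>
#include <string.h>

// Hypothetical record type, used only for this illustration.
typedef struct _record {
    uint32_t id;
    uint8_t payload[16];
} record_t;

// Zeroes a record. sizeof(*p_record) follows the declared type of the
// variable, unlike sizeof(record_t), which would have to be updated by hand.
void
record_clear(record_t *p_record) {
    memset(p_record, 0, sizeof(*p_record));
}
```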
Naming
The most important consistency rules are those that govern naming. The style of a name immediately informs us what sort of thing the named entity is: a type, a variable, a function, a constant, a macro, etc., without requiring us to search for the declaration of that entity. The pattern-matching engine in our brains relies a great deal on these naming rules.
Naming rules are pretty arbitrary, but we feel that consistency is more important than individual preferences in this area, so regardless of whether you find them sensible or not, the rules are the rules.
General Naming Rules
Function names, variable names, and filenames should be descriptive; eschew abbreviation.
Give as descriptive a name as possible, within reason. Do not worry about saving horizontal space as it is far more important to make your code immediately understandable by a new reader. Do not use abbreviations that are ambiguous or unfamiliar to readers outside your project, and do not abbreviate by deleting letters within a word. The exception to this rule is loop iterator variables. In those cases, i, iter and the like are acceptable.
int price_count_reader; // No abbreviation.
int num_errors; // "num" is a widespread convention.
int num_dns_connections; // Most people know what "DNS" stands for.
int n; // Meaningless.
int nerr; // Ambiguous abbreviation.
int n_comp_conns; // Ambiguous abbreviation.
int wgc_connections; // Only you know what this abbreviation stands for.
int pc_reader; // Lots of things can be abbreviated "pc".
int cstmr_id; // Deletes internal letters.
File Names
Filenames should be all lowercase with words separated by underscores (_). Follow the convention that your project uses.
C files should end in .c and header files should end in .h. Files that rely on being textually included at specific points should end in .inc (see also the section on self-contained headers).
Do not use filenames that already exist in /usr/include, such as db.h.
In general, make your filenames very specific. For example, use http_server_logs.h rather than logs.h. A very common case is to have a pair of files called, e.g., foo_bar.h and foo_bar.c, defining a class called FooBar.
Inline functions must be in a .h file. If your inline functions are very short, they should go directly into your .h file.
Type Names
Type names are all lowercase, and end with a "_t" suffix. Structures should include a typedef to remove the need to include the struct keyword throughout the code. This typedef is typically done as part of the structure definition, e.g.:
typedef struct _foo {
char *name;
} foo_t;
Variable Names
The names of variables and data members are all lowercase, with underscores between words. For instance: a_local_variable, a_struct_data_member.
For variables of pointer types, prefix the name with "p_". Additionally, use "pp_" for pointer-to-pointer types. If you need three layers of indirection, consider restructuring your code.
For variables with an ambiguous unit type (e.g. time, distance), include the unit of measure as the final word in the variable name, e.g.
uint32_t delay_seconds; // OK - Includes unit of measure
size_t *p_length; // OK - Uses p_ for pointer type
uint32_t distance; // Bad - No unit of measure. Is this meters, feet, or furlongs?
Common Variable Names
For example:
char *table_name; // OK - uses underscore.
char *tablename; // OK - all lowercase.
char *tableName; // Bad - mixed case.
Struct Data Members
Data members of structs, both static and non-static, are named like ordinary nonmember variables. Avoid repeating the data type in struct member names.
typedef struct _table {
size_t row_length; // OK - uses underscore, all lowercase
char *table_name; // Bad - repeats data type in member name
} table_t;
Global Variables
Global variables should be rare, but if you use one, prefix it with g_ to easily distinguish it from local variables.
Constant Names
Constants defined with preprocessor macros should be all uppercase, separated by underscores. Additionally, consider prefixing the name of defined values with the subsystem or module for which the constant is relevant, e.g.:
#define FOO_MAX_VALUE 32
Function Names
Functions follow rules similar to variable names. They are all lowercase, and separated by underscores. When defining non-static, non-utility functions, include the subsystem or module as a prefix to the function name to avoid conflicts with common function names, e.g.:
int foo_measure_string(char *);
Enumeration Names
Enumeration types should follow the general rules for types. Type names are all lowercase, separated by underscores. They should also include an in-band typedef. Values within an enum should be named according to the rules for macros: all uppercase and separated by underscores.
typedef enum _foo_error {
OK = 0,
OUT_OF_MEMORY = 1,
MALFORMED_INPUT = 2,
} foo_error_t;
Macro Names
Macros should be named with all capitals and underscores, e.g.:
#define ROUND ...
#define PI_ROUNDED 3.0
Comments
Though a pain to write, comments are absolutely vital to keeping our code readable. The following rules describe what you should comment and where. But remember: while comments are very important, the best code is self-documenting. Giving sensible names to types and variables is much better than using obscure names that you must then explain through comments.
When writing your comments, write for your audience: the next contributor who will need to understand your code. Be generous — the next one may be you!
Comment Style
Use either the // or /* */ syntax, as long as you are consistent.
You can use either the // or the /* */ syntax; however, // is much more common. Be consistent with how you comment and what style you use where.
File Comments
Every file should have a comment at the top describing its contents.
Generally a .h file will describe the classes that are declared in the file with an overview of what they are for and how they are used. A .c file should contain more information about implementation details or discussions of tricky algorithms. If you feel the implementation details or a discussion of the algorithms would be useful for someone reading the .h, feel free to put it there instead, but mention in the .c that the documentation is in the .h file.
Do not duplicate comments in both the .h and the .c. Duplicated comments diverge.
Function Comments
Declaration comments describe use of the function; comments at the definition of a function describe operation.
Function Declaration
Every function declaration should have comments immediately preceding it that describe what the function does and how to use it. These comments should be descriptive ("Opens the file") rather than imperative ("Open the file"); the comment describes the function, it does not tell the function what to do. In general, these comments do not describe how the function performs its task. Instead, that should be left to comments in the function definition.
Types of things to mention in comments at the function declaration:
- What the inputs and outputs are.
- If the function allocates memory that the caller must free.
- Whether any of the arguments can be a null pointer.
- If there are any performance implications of how a function is used.
- If the function is re-entrant. What are its synchronization assumptions?
Do not be unnecessarily verbose or state the completely obvious.
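A declaration comment covering the points listed above might look like the following sketch. The function text_copy_prefix is hypothetical; a definition is included only so that the example is complete.

```c
#include <stdlib.h>
#include <string.h>

// Returns a newly allocated copy of the first length bytes of p_source,
// followed by a terminating '\0'.
//
// p_source must not be NULL and must point to at least length readable
// bytes. The caller owns the returned buffer and must release it with
// free(). Returns NULL if memory cannot be allocated. Re-entrant: uses
// no shared state.
char *
text_copy_prefix(const char *p_source, size_t length);

char *
text_copy_prefix(const char *p_source, size_t length) {
    char *p_copy = malloc(length + 1);

    if (p_copy == NULL) {
        return NULL;
    }
    memcpy(p_copy, p_source, length);
    p_copy[length] = '\0';
    return p_copy;
}
```

The declaration comment states what the function does and what the caller is responsible for; it says nothing about how the copy is made.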
Function Definitions
If there is anything tricky about how a function does its job, the function definition should have an explanatory comment. For example, in the definition comment you might describe any coding tricks you use, give an overview of the steps you go through, or explain why you chose to implement the function in the way you did rather than using a viable alternative. For instance, you might mention why it must acquire a lock for the first half of the function but why it is not needed for the second half.
Note you should not just repeat the comments given with the function declaration, in the .h file or wherever. It's okay to recapitulate briefly what the function does, but the focus of the comments should be on how it does it.
Variable Comments
In general the actual name of the variable should be descriptive enough to give a good idea of what the variable is used for. In certain cases, more comments are required.
Struct Members
Each member should have a comment describing what it is used for. If the variable can take sentinel values with special meanings, such as a null pointer or -1, document this.
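For illustration, a hypothetical connection_t structure with documented sentinel values could be commented as follows.

```c
#include <stddef.h>

// Hypothetical connection record, used only for this illustration.
typedef struct _connection {
    int socket_fd;    // Open socket descriptor, or -1 when closed.
    const char *host; // Peer host name; NULL until one is assigned.
    int retry_limit;  // Maximum reconnect attempts; 0 disables retries.
} connection_t;

// Returns 1 if the connection holds an open descriptor, 0 otherwise.
int
connection_is_open(const connection_t *p_connection) {
    return p_connection->socket_fd != -1;
}
```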
Global Variables
As with data members, all global variables should have a comment describing what they are and what they are used for.
Implementation Comments
In your implementation you should have comments in tricky, non-obvious, interesting, or important parts of your code.
Explanatory Comments
Tricky or complicated code blocks should have comments before them.
// Divide result by two, taking into account that x
// contains the carry from the add.
for (int i = 0; i < result->size(); i++) {
x = (x << 8) + (*result)[i];
(*result)[i] = x >> 1;
x &= 1;
}
Line Comments
Also, lines that are non-obvious should get a comment at the end of the line. These end-of-line comments should be separated from the code by at least 1 space. Example:
// If we have enough memory, mmap the data portion too.
mmap_budget = MAX(0, mmap_budget - index->length);
if (mmap_budget >= data_size && !mmap_data(mmap_chunk_bytes, mlock))
return; // Error already logged.
If you have several comments on subsequent lines, it can often be more readable to line them up.
Function Parameters
When you pass in a null pointer, or literal integer values to functions, you should consider adding a comment about what they are, or make your code self-documenting by using constants. For example, compare:
uint32_t status = calculate_something(interesting_value,
                                      10,
                                      0,
                                      NULL); // What are these arguments??
versus:
uint32_t status = calculate_something(interesting_value,
                                      10,    // Default base value.
                                      0,     // Not the first time we're calling this.
                                      NULL); // No callback.
Note that you should never describe the code itself. Assume that the person reading the code knows C better than you do, even though he or she does not know what you are trying to do:
// Now go through the b array and make sure that if i occurs,
// the next element is i+1.
... // Geez. What a useless comment.
Punctuation, Spelling and Grammar
Pay attention to punctuation, spelling, and grammar; it is easier to read well-written comments than badly written ones.
Comments should be as readable as narrative text, with proper capitalization and punctuation. In many cases, complete sentences are more readable than sentence fragments. Shorter comments, such as comments at the end of a line of code, can sometimes be less formal, but you should be consistent with your style.
Although it can be frustrating to have a code reviewer point out that you are using a comma when you should be using a semicolon, it is very important that source code maintain a high level of clarity and readability. Proper punctuation, spelling, and grammar help with that goal.
TODO Comments
Use TODO comments for code that is temporary, a short-term solution, or good-enough but not perfect.
TODOs should include the string TODO in all caps. The main purpose is to have a consistent TODO that can be searched to find out how to get more details upon request. If code is being reviewed with TODO, expect to answer why it is still present. Most commonly, this is used to denote where some out-of-scope addition or enhancement will be added.
If your TODO is of the form "At a future date do something" this probably belongs in a ticket.
Formatting
Coding style and formatting are pretty arbitrary, but a project is much easier to follow if everyone uses the same style. Individuals may not agree with every aspect of the formatting rules, and some of the rules may take some getting used to, but it is important that all project contributors follow the style rules so that they can all read and understand everyone's code easily.
Line Length
Each line of text in your code should be at most 100 characters long.
Exceptions
- If a comment line contains an example command or a literal URL longer than 100 characters, that line may be longer than 100 characters for ease of cut and paste.
- A raw-string literal may have content that exceeds 100 characters. Except for test code, such literals should appear near the top of a file.
- An #include statement with a long path may exceed 100 columns.
Line Endings
Use Unix line endings. Windows tools generally tolerate Unix line endings, while Linux systems are less tolerant of Windows line endings. You should configure your editor to save files with Unix line endings.
Spaces vs. Tabs
Use only spaces, and indent 4 spaces at a time.
We use spaces for indentation. Do not use tabs in your code. You should set your editor to emit spaces when you hit the tab key.
Function Declarations and Definitions
Return type on the line previous to the function name, parameters on the same line if they fit. Placing the function name at the margin makes searching for its definition in an unfamiliar code base much simpler. Wrap parameter lists which do not fit on a single line as you would wrap arguments in a function call.
uint32_t
function_name(char *name) {
    ...
}
If you have too much text to fit on one line:
uint32_t
really_long_function_name(char *name,
                          size_t length,
                          uint32_t value) {
    ...
}
Some points to note:
- The open parenthesis is always on the same line as the function name.
- There is never a space between the function name and the open parenthesis.
- There is never a space between the parentheses and the parameters.
- The open curly brace is always at the end of the same line as the last parameter.
- The close curly brace is either on the last line by itself or (if other style rules permit) on the same line as the open curly brace.
- There should be a space between the close parenthesis and the open curly brace.
- All parameters should be named, with identical names in the declaration and implementation.
- All parameters should be aligned if possible.
If some parameters are unused, comment out the variable name in the function definition or use the UNUSED_PARAMETER macro.
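The guide names an UNUSED_PARAMETER macro but does not show its definition. The sketch below assumes one common form, a cast to void, and uses a hypothetical callback named on_event.

```c
#include <stddef.h>

// Assumed definition: the guide does not show one. Casting to void
// marks the parameter as deliberately unused.
#define UNUSED_PARAMETER(x) ((void)(x))

// Hypothetical callback whose signature is fixed by its caller, so the
// context pointer must be accepted even though it is not needed here.
int
on_event(int event_id, void *p_context) {
    UNUSED_PARAMETER(p_context);
    return event_id + 1;
}
```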
Function Calls
Either write the call all on a single line, wrap the arguments at the parenthesis, or start the arguments on a new line indented by four spaces and continue at that 4 space indent. In the absence of other considerations, use the minimum number of lines, including placing multiple arguments on each line where appropriate.
Function calls have the following format:
uint32_t status = do_something(argument1, argument2, argument3);
If the arguments do not all fit on one line, they should be broken up onto multiple lines, with each subsequent line aligned with the first argument. If you wrap any arguments, wrap all arguments, with one argument per line. Do not add spaces after the open paren or before the close paren:
uint32_t status = do_something(averyveryveryverylongargument1,
                               argument2,
                               argument3);
Arguments may optionally all be placed on subsequent lines with a four space indent:
if (...) {
    ...
    ...
    if (...) {
        do_something(
            argument1, // 4 space indent
            argument2,
            argument3,
            argument4);
    }
}
Sometimes arguments form a structure that is important for readability. In those cases, feel free to format the arguments according to that structure:
// Transform the widget by a 3x3 matrix.
widget_transform(x1, x2, x3,
                 y1, y2, y3,
                 z1, z2, z3);
Conditionals
Prefer no spaces inside parentheses. The if and else keywords belong on separate lines.
Include a space between the if and opening parenthesis. Do not include spaces between the conditional statement and parentheses.
if (condition) { // no spaces inside parentheses
    ... // 4 space indent.
} else if (...) { // The else goes on the same line as the closing brace.
    ...
} else {
    ...
}
If statements should always include curly braces, even if there is only a single statement within the conditional.
Loops and Switch Statements
Annotate non-trivial fall-through between cases.
Braces should be included on loops, even for single-statement loops. Empty loop bodies should use {} or continue, but not a single semicolon.
switch (var) {
case 0: // 0 space indent
    ... // 4 space indent
    break;
case 1:
    ...
    break;
default:
    assert(0);
}
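The two rules that the template above does not show, annotated fall-through and an explicit empty loop body, can be sketched as follows. The functions permission_bits and text_length and the role numbering are hypothetical.

```c
#include <stddef.h>

// Returns a bit mask of permissions for a role: 0 = reader, 1 = writer,
// 2 = admin. Unknown roles receive no permissions.
int
permission_bits(int role) {
    int bits = 0;

    switch (role) {
    case 2:
        bits |= 4;
        // Fall through - admins also get writer permissions.
    case 1:
        bits |= 2;
        // Fall through - writers also get reader permissions.
    case 0:
        bits |= 1;
        break;
    default:
        break;
    }
    return bits;
}

// Returns the length of a string. All of the work happens in the loop
// header, so the empty body is written as {} rather than a lone semicolon.
size_t
text_length(const char *p_text) {
    size_t length = 0;

    for (length = 0; p_text[length] != '\0'; length++) {}
    return length;
}
```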
Pointer and Reference Expressions
No spaces around period or arrow. Pointer operators do not have trailing spaces.
The following are examples of correctly-formatted pointer and reference expressions:
x = *p;
p = &x;
x = r.y;
x = r->y;
Note that:
- There are no spaces around the period or arrow when accessing a member.
- Pointer operators have no space after the * or &.
When declaring a pointer variable or argument, you should place the asterisk adjacent to the variable name:
char *c;
Boolean Expressions
When you have a boolean expression that is longer than the standard line length, be consistent in how you break up the lines.
In this example, the logical AND operator is always at the end of the lines:
if (this_one_thing > this_other_thing &&
    a_third_thing == a_fourth_thing &&
    yet_another && last_one) {
    ...
}
Note that when the code wraps in this example, both of the && logical AND operators are at the end of the line. Wrapping all operators at the beginning of the line is also allowed. Feel free to insert extra parentheses judiciously because they can be very helpful in increasing readability when used appropriately.
Return Values
Do not needlessly surround the return expression with parentheses.
Use parentheses in return expr; only where you would use them in x = expr;.
Favor performing any operation prior to the return itself. This aids in debugging at the potential expense of an additional local variable.
return result; // No parentheses in the simple case.
// Parentheses OK to make a complex expression more readable.
return (some_long_condition &&
        another_condition);
return (value); // You wouldn't write var = (value);
return(result); // return is not a function!
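The advice to perform operations before the return can be sketched with a hypothetical function, checksum_fold, that stores its result in a local variable first. The extra local makes the value easy to inspect in a debugger just before the function returns.

```c
#include <stddef.h>
#include <stdint.h>

// Sums the bytes in a buffer, then adds the upper 16 bits of the sum
// to the lower 16 bits.
uint32_t
checksum_fold(const uint8_t *p_data, size_t length) {
    uint32_t sum = 0;
    uint32_t folded = 0;
    size_t i = 0;

    for (i = 0; i < length; i++) {
        sum += p_data[i];
    }
    // Compute the result before the return statement so that it can be
    // examined with a breakpoint on the return line.
    folded = (sum & 0xFFFFu) + (sum >> 16);
    return folded;
}
```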
Preprocessor Directives
The hash mark that starts a preprocessor directive should always be at the beginning of the line.
Even when preprocessor directives are within the body of indented code, the directives should start at the beginning of the line.
Optionally, include an end-of-line comment at an endif noting the #if condition.
// Good - directives at beginning of line
if (lopsided_score) {
#if DISASTER_PENDING // Correct – Starts at beginning of line
    drop_everything();
# if NOTIFY // OK – Spaces after #
    notify_client();
# endif
#endif // DISASTER_PENDING
    back_to_normal();
}
// Bad - indented directives
if (lopsided_score) {
    #if DISASTER_PENDING // Wrong! The "#if" should be at beginning of line
    drop_everything();
    #endif // Wrong! Do not indent "#endif"
    back_to_normal();
}
Horizontal Whitespace
Use of horizontal whitespace depends on location. Never put trailing whitespace at the end of a line.
Loops and Conditionals
if (b) { // Space after the keyword in conditions and loops.
} else { // Spaces around else.
}
while (test) {} // There is usually no space inside parentheses.
// For loops always have a space after the semicolon.
for (i = 0; i < 5; ++i) {
for ( ; i < 5; ++i) {
...
switch (i) {
case 1: // No space before colon in a switch case.
    ...
case 2: break; // Use a space after a colon if there's code after it.
Operators
// Assignment operators always have spaces around them.
x = 0;
// Other binary operators usually have spaces around them, but it's
// OK to remove spaces around factors. Parentheses should have no
// internal padding.
v = w * x + y / z;
v = w*x + y/z;
v = w * (x + z);
// No spaces separating unary operators and their arguments.
x = -5;
++x;
if (x && !y)
...
Casts
// No spaces inside parentheses.
y = (char *)x;
// Optional space after parentheses, but before value
y = (char *) x;
Vertical Whitespace
Minimize use of vertical whitespace.
This is more a principle than a rule: don't use blank lines when you don't have to. In particular, don't put more than one or two blank lines between functions, resist starting functions with a blank line, don't end functions with a blank line, and be discriminating with your use of blank lines inside functions.
The basic principle is: The more code that fits on one screen, the easier it is to follow and understand the control flow of the program. Of course, readability can suffer from code being too dense as well as too spread out, so use your judgement. But in general, minimize use of vertical whitespace.
Some rules of thumb to help when blank lines may be useful:
- Blank lines at the beginning or end of a function very rarely help readability.
- Blank lines inside a chain of if-else blocks may well help readability.
Best Practices
Return Values
Aggressively check the return value from all functions with meaningful return values, especially memory allocation. Validate no errors have occurred, and gracefully recover or terminate when they do.
Consider including an error message at the point of function return when an error is detected. This can double as documentation, and greatly helps track down an error chain, e.g.:
fp = fopen("/path/to/file", "rb");
if (fp == NULL) {
    status = errno; // Save errno before any other call can overwrite it
    DBGERR("Failed to open /path/to/file for reading: %d\n", status);
    goto ErrorExit;
}
...
ErrorExit:
return status;
Use platform error numbers when they are available. Define meaningful constants for your own error codes, and ensure their values do not overlap with platform definitions.
When an error causes a return (rather than retry or other recovery), propagate the original error code as far as possible.
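A minimal sketch of these two points follows. The constant values and both functions are assumptions made for illustration. The C standard requires errno values to be positive, so negative project codes cannot collide with them:

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical project error codes. Negative values cannot overlap errno
// values, which are always positive.
#define S_OK                    0
#define ERR_INVALID_PARAMETER (-1001)
#define ERR_INSUFFICIENT_DATA (-1002)

// Hypothetical low-level reader.
static int32_t
read_u8(const uint8_t *buf, size_t size, size_t offset, uint8_t *out) {
    int32_t status = S_OK;
    if (buf == NULL || out == NULL) {
        return ERR_INVALID_PARAMETER;
    }
    if (offset >= size) {
        status = ERR_INSUFFICIENT_DATA;
        goto ErrorExit;
    }
    *out = buf[offset];
ErrorExit:
    return status;
}

// Hypothetical caller: passes the callee's error code up unchanged.
int32_t
read_pair(const uint8_t *buf, size_t size, uint8_t *first, uint8_t *second) {
    int32_t status = S_OK;
    status = read_u8(buf, size, 0, first);
    if (status != S_OK) {
        goto ErrorExit; // Propagate the original code, do not remap it.
    }
    status = read_u8(buf, size, 1, second);
ErrorExit:
    return status;
}
```

Because read_pair never translates the code it receives, the caller at the top of the chain still sees which low-level check failed.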
Error Handling
Use a single point of exit for functions, and use a goto to reach it in error cases. The one exception is early parameter value checks: when they run before any resource is acquired, they may return an error directly, e.g.:
int32_t
calculate_something(uint8_t *param_1, uint8_t *param_2) {
    int32_t status = S_OK;
    if (param_1 == NULL || param_2 == NULL) {
        return ERR_INVALID_PARAMETER;
    }
    ...
    return status;
}
Free resources and perform cleanup in the exit block. Avoid more than one goto label per function, and do not use goto for other control flow.
At function starts, initialize variables with sentinel values. Check those same sentinel values in error handling blocks to determine whether cleanup must be done, e.g.:
uint8_t *buf = NULL;
...
buf = malloc(100);
...
ErrorExit:
if (buf) {
    free(buf);
}
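Putting the pieces of this section together, a complete function might look like the sketch below. The function, its deliberately wasteful use of two buffers, and the error constants are all hypothetical. They exist only to show sentinel initialization, a single goto label, and cleanup in the exit block:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical error codes for this sketch.
#define S_OK                    0
#define ERR_INVALID_PARAMETER (-1001)
#define ERR_OUT_OF_MEMORY     (-1003)

// Hypothetical example: sets *result to 1 if data reads the same both ways.
int32_t
is_palindrome(const uint8_t *data, size_t len, int *result) {
    int32_t status = S_OK;
    uint8_t *forward = NULL;  // Sentinel: NULL means "not allocated".
    uint8_t *backward = NULL; // Sentinel: NULL means "not allocated".
    size_t i = 0;
    if (data == NULL || result == NULL || len == 0) {
        return ERR_INVALID_PARAMETER; // Early check, nothing to clean up yet.
    }
    forward = malloc(len);
    if (forward == NULL) {
        status = ERR_OUT_OF_MEMORY;
        goto ErrorExit;
    }
    backward = malloc(len);
    if (backward == NULL) {
        status = ERR_OUT_OF_MEMORY;
        goto ErrorExit;
    }
    memcpy(forward, data, len);
    for (i = 0; i < len; ++i) {
        backward[i] = data[len - 1 - i];
    }
    *result = (memcmp(forward, backward, len) == 0);
ErrorExit:
    // Runs on success and failure alike; sentinels say what to release.
    if (backward) {
        free(backward);
    }
    if (forward) {
        free(forward);
    }
    return status;
}
```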
Strings in Programs
Most C code written at OEC is intended for deployment in austere environments, and is at risk of being reverse engineered. While meaningful error messages can greatly aid debugging and determining the root cause of issues, they should not be included in release builds of products, for size and "opsec" reasons.
Wrap error reporting routines in preprocessor macros to ensure they are replaced with no-op code in a release build. Use these macros exclusively to avoid accidentally including a stray print or error message.
Be cautious when implementing the macro to remove these routines. Consider the following:
#define DBGPRINT printf // Bad – Will still include any arguments!
#define DBGPRINT(fmt, ...) printf(fmt, ##__VA_ARGS__) // OK – Includes arguments in the macro definition
DBGPRINT("function_name: %s: %d\n", error_message(), errno); // In the first case, this will still include the string "function_name" and a call to error_message()
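One possible way to arrange the debug and release definitions is sketched below. The RELEASE_BUILD flag, the call counter, and the functions are assumptions for illustration. The sketch also uses the standard C99 form DBGPRINT(...) with __VA_ARGS__ rather than the GNU ##__VA_ARGS__ form shown above, so that it compiles without compiler extensions:

```c
#include <stdio.h>

// Assumption: release builds are compiled with -DRELEASE_BUILD
// (a hypothetical flag; use whatever your build system defines).
#ifdef RELEASE_BUILD
#define DBGPRINT(...) ((void)0) // Arguments are discarded by the preprocessor.
#else
#define DBGPRINT(...) fprintf(stderr, __VA_ARGS__)
#endif

int error_message_calls = 0; // Counts evaluations, for demonstration only.

// Hypothetical stand-in for the error_message() call in the text.
const char *
error_message(void) {
    ++error_message_calls;
    return "disk full";
}

void
report_failure(void) {
    // In a release build this line expands to ((void)0): neither the format
    // string nor the call to error_message() reaches the compiler.
    DBGPRINT("report_failure: %s\n", error_message());
}
```

Because the macro takes its arguments in the definition, removing them is a purely textual operation and does not depend on the optimizer.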
Security Considerations
Integer Overflow
Integer overflows in C occur when the result of an arithmetic operation does not fit in the finite storage of the variable that receives it. When overflowed values are used (e.g. as pointers, offsets, or allocation sizes), security issues (and therefore also reliability issues) result.
It is very easy to get these checks wrong. Checks that rely on overflow behavior such as wraparound are liable to be optimized out by the compiler, because signed integer overflow is undefined behavior in C (see: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475). This section outlines some templates that can be used to construct safe, effective checks.
When using a modern version of GCC, use the arithmetic overflow checking builtins (listed here: https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html). Since the toolchains in use at OEC do not always allow us to use the latest and greatest versions, we must know how to perform these checks safely on our own.
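Where the builtins are available, a thin wrapper along the following lines may be enough. This is a sketch that assumes GCC 5 or later (recent Clang versions provide the same builtins). The wrapper names and error codes are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical error codes for this sketch.
#define S_OK                    0
#define ERR_INVALID_PARAMETER (-1001)
#define ERR_OVERFLOW          (-1004)

// Hypothetical wrapper: *sum = a + b, or ERR_OVERFLOW if it does not fit.
int32_t
checked_add(size_t a, size_t b, size_t *sum) {
    int32_t status = S_OK;
    if (sum == NULL) {
        return ERR_INVALID_PARAMETER;
    }
    // The builtin returns true when the mathematical result overflows.
    if (__builtin_add_overflow(a, b, sum)) {
        status = ERR_OVERFLOW;
    }
    return status;
}

// Hypothetical wrapper: *product = a * b, or ERR_OVERFLOW if it does not fit.
int32_t
checked_mul(size_t a, size_t b, size_t *product) {
    int32_t status = S_OK;
    if (product == NULL) {
        return ERR_INVALID_PARAMETER;
    }
    if (__builtin_mul_overflow(a, b, product)) {
        status = ERR_OVERFLOW;
    }
    return status;
}
```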
Addition / Subtraction
Consider the following code, which processes a length-value protocol in which each item is a 4-byte length followed by content:
int32_t
process_buffer(uint8_t *buffer, size_t buffer_size) {
    int32_t rc = 0;
    size_t offset = 0;
    uint32_t item_len = 0;
    while (offset < buffer_size) {
        // Ensure there is sufficient space to read the next item's length
        // If the amount we want to access (sizeof(uint32_t)) exceeds the amount remaining (buffer_size - offset) then error out
        if (sizeof(uint32_t) > buffer_size - offset) {
            rc = ERR_INSUFFICIENT_DATA;
            goto ErrorExit;
        }
        // Extract 4-byte length from buffer (memcpy, from <string.h>, avoids an unaligned read)
        memcpy(&item_len, buffer + offset, sizeof(item_len));
        // Ensure there is sufficient space to read the stated amount
        // Be sure to include 4-byte length field in space calculations
        if (item_len > buffer_size - offset - sizeof(uint32_t)) {
            rc = ERR_INSUFFICIENT_DATA;
            goto ErrorExit;
        }
        // Pass only this item's content, not the remainder of the buffer
        rc = process_some_data(buffer + offset + sizeof(uint32_t), item_len);
        // ...
        offset += sizeof(uint32_t) + item_len;
    }
ErrorExit:
    return rc;
}
The intuitive method of checking the result of addition is unsafe. The following examples demonstrate different methods for performing overflow checks. All examples in this block are checking for the error state.
if (offset + item_len < offset) { // Bad – Relies on machine-specific overflow behavior and may be optimized out
if (offset + item_len > buffer_size) { // Bad – offset + item_len may overflow, and pass this check
if (item_len > buffer_size - offset) { // OK – We check if the amount we wish to access exceeds actual amount remaining
if (buffer_size - offset < item_len) { // OK – We check if the amount remaining is less than desired amount
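The two OK forms above assume that offset never exceeds buffer_size. Otherwise the subtraction itself wraps around. A reusable check that makes this assumption explicit could look like the following sketch. The function name and error codes are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical error codes for this sketch.
#define S_OK                    0
#define ERR_INSUFFICIENT_DATA (-1002)

// Hypothetical helper: S_OK if item_len bytes starting at offset lie
// entirely inside a buffer of buffer_size bytes.
int32_t
check_range(size_t buffer_size, size_t offset, size_t item_len) {
    int32_t status = S_OK;
    // Guard the subtraction first: buffer_size - offset must not wrap.
    if (offset > buffer_size) {
        status = ERR_INSUFFICIENT_DATA;
        goto ErrorExit;
    }
    // Compare the amount wanted against the amount remaining. No addition
    // is performed, so nothing can overflow.
    if (item_len > buffer_size - offset) {
        status = ERR_INSUFFICIENT_DATA;
        goto ErrorExit;
    }
ErrorExit:
    return status;
}
```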
Multiplication
When the builtins are not available, the reliable way to test for integer overflow in a multiplication is to divide the maximum allowable result by the multiplier and compare the quotient to the multiplicand, or vice versa. If the quotient is smaller than the multiplicand, the product of those two values would overflow. Since that's about as clear as mud, here's an example:
#define SIZE_MAX ((size_t)-1) // It's important that SIZE_MAX be the maximum possible stored value
// In the positive case
if (n > 0 && m > 0 && SIZE_MAX/n >= m) {
    size_t bytes = n * m;
    ... // allocate "bytes" space
}
// Checking for error case
if (n == 0 || m == 0 || SIZE_MAX/n < m) {
    // Set error condition
}
And a negative example:
size_t bytes = n * m;
if (bytes < n || bytes < m) { // Bad – Relies on machine-specific overflow behavior, may be optimized out, and misses wrapped products that are still >= n and m
    ... // Set error condition
}
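The division-based check can be wrapped in a helper so that callers cannot forget it. The sketch below is one way to do so. The function name and error codes are hypothetical, and it takes SIZE_MAX from <stdint.h> (C99) instead of defining it by hand:

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical error codes for this sketch.
#define S_OK                    0
#define ERR_INVALID_PARAMETER (-1001)
#define ERR_OVERFLOW          (-1004)

// Hypothetical helper: *bytes = n * m, or an error if either factor is
// zero or the product would not fit in a size_t.
int32_t
safe_multiply(size_t n, size_t m, size_t *bytes) {
    int32_t status = S_OK;
    if (bytes == NULL || n == 0 || m == 0) {
        return ERR_INVALID_PARAMETER;
    }
    // Divide before multiplying: if SIZE_MAX / n < m, then n * m > SIZE_MAX.
    if (SIZE_MAX / n < m) {
        status = ERR_OVERFLOW;
        goto ErrorExit;
    }
    *bytes = n * m; // Cannot overflow after the check above.
ErrorExit:
    return status;
}
```

A caller would then pass the resulting size to the allocator only when the helper returns S_OK.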
Assembly
Including Assembly in a C Project
Do not directly embed inline assembly in C. Instead, create a separate module with a .s extension, define symbols and link in the assembled object.
Best Practices
Standards for quality should be applied to assembly as consistently as they are to C. Unless each byte is significant (and they may be), continue to check return values, clean up resources and the like.
Nearly every line in assembly source should include a comment as to the intent. Apply the comment guidelines. Note meaning and motivation, not mechanics, unless you are employing some unintuitive trick or machine behavior.
Parting Words
Use common sense and BE CONSISTENT.
If you are editing code, take a few minutes to look at the code around you and determine its style. If their comments have little boxes of stars around them, make your comments have little boxes of stars around them too.
The point of having style guidelines is to have a common vocabulary of coding so people can concentrate on what you are saying, rather than on how you are saying it. We present global style rules here so people know the vocabulary. But local style is also important. If code you add to a file looks drastically different from the existing code around it, the discontinuity throws readers out of their rhythm when they go to read it. Try to avoid this.