Coding Style

Coding style is highly personal, and you should experiment with what works well for you. This document serves as a guide to some “sane” default coding conventions in C. When editing on os1 in vim, you can auto-format your code from normal mode using the \-f shortcut. This is one major advantage that C has over Python: whitespace has no significance, so you can move code around without manually adjusting indentation levels and then run it through a formatter in a single keystroke. Not bad!

This coding style is loosely based on the Linux kernel coding style.

Indentation

Indentation may be 2, 4, or 8 spaces, or one tab.

Do not put multiple statements on a single line. This example is bad:

x = 12; y = 4;

Do not leave whitespace at the end of lines.

Line Length

The preferred limit on the length of a single line is 100 columns. If a statement, expression, or declaration spans multiple lines, use a hanging indent that is not easily confused with indented block-level code.
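As a rough sketch of one acceptable hanging indent (the function and its parameters are hypothetical, chosen only for illustration), continuation lines are pushed further right than the block body so they cannot be mistaken for a nested block:

```c
/**
 * Checks whether a rectangle fits inside a bounding box
 * @param width the rectangle's width
 * @param height the rectangle's height
 * @param max_width the bounding box's width
 * @param max_height the bounding box's height
 * @return 1 if the rectangle fits, 0 otherwise
 */
int fits_in_box(int width, int height,
                int max_width, int max_height)
{
  /* The continued condition is indented deeper than the body below it */
  if (width >= 0 && height >= 0
      && width <= max_width && height <= max_height) {
    return 1;
  }
  return 0;
}
```

Other layouts work too; the point is only that the continuation and the block body sit at visibly different depths.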

Braces

Function blocks begin and end with a single brace in column 0:

void foo(void)
{
  /* do something */
}

Selection and iteration statements have the opening brace at the end of the line and the closing brace at the same indentation level as the surrounding code. The closing brace should appear on its own line, except when it is followed by the continuation of the same statement, such as the while of a do-while loop or an else in an if-statement, like so:

if (x == y) {
  ..
} else if (x > y) {
  ...
} else {
  ....
}

do {
  ...
} while (condition);

while (condition) {
  ...
}

Spacing

Use spaces between control statement keywords, their parenthesized expressions, and the opening brace:

Good:

if (condition) {

while (condition) {

/* etc. */

Bad:

if(condition){

while(condition){

/* etc. */

Do not add spaces around (inside) parenthesized expressions. This example is bad:

s = sizeof ( struct file )

Use one space between most binary/ternary operators and their operands:

= + - > < * / % | & ^ <= >= == != ? :

but no space between these unary operators and their operands:

& * + - ~ ! ++ --

or around the . and -> structure member operators.

The sizeof operator should always be followed by a space. It is not a function; do not mistake it for one!

Good:

++x;
*y = 12 + *x;
y = m * x + b;
int *const x;
sizeof x;
sizeof (char);

Bad:

++ x;
* y=12 + * x;
y=m*x+b;
sizeof(x);
sizeof(char);

This also applies to the asterisk in a pointer declarator, which is intended to mimic exactly its usage as a unary dereference operator, and which binds to the right:

Correct: T *P

Incorrect: T* P and T * P

Explicit parentheses can illustrate this point more clearly:

Correct: T (*P)

Incorrect: T(* P) and T (* P)
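The practical reason for attaching the asterisk to the name shows up when one declaration introduces several identifiers. In this small sketch (the function name is made up for illustration), only p is a pointer; q is a plain int, which is easy to misread if the declaration is written as int* p, q:

```c
/**
 * Reads a value back through a pointer declared alongside a plain int
 * @return the value stored in q, read through p
 */
int read_through_pointer(void)
{
  int *p, q; /* p is an int *, but q is a plain int */

  q = 7;     /* a compiler would warn or reject this if q were a pointer */
  p = &q;
  return *p;
}
```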

Naming

Do not smushwordstogether or UseCamelCase. Abbreviations are perfectly fine, but global variables must be named descriptively. Do not encode the type of a variable in its name (Hungarian Notation), such as szName.

All object-like macros should be named in UPPER_CASE_SNAKE_CASE. Function-like macros that include an argument that is evaluated more than once must be in UPPER_CASE_SNAKE_CASE; otherwise they can be in lower-case.

Good:

#define MAX_ARGS 1024
#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define hi(x) ((x) & 0xF0) // Ok, x evaluated only once
#define array_len(x) (sizeof (x) / sizeof (*(x)))   // sizeof does not evaluate its expression
#define ARRAY_LEN(x) (sizeof (x) / sizeof (*(x)))   // Also ok

Bad:

#define max_args 1024
#define MaxArgs 1024
#define max(a, b) ((a) > (b) ? (a) : (b)) // Arguments evaluated more than once
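The upper-case rule for multiply-evaluating macros exists because the caller needs a visual warning. A minimal sketch of what goes wrong (the function name count_increments is hypothetical) when an argument with a side effect is passed to such a macro:

```c
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/**
 * Counts how many times MAX() increments its first argument
 * @return the number of increments applied to i
 */
int count_increments(void)
{
  int i = 5;
  int j = 3;
  /* Expands to ((i++) > (j) ? (i++) : (j)), so i++ runs twice */
  int largest = MAX(i++, j);

  (void) largest; /* largest is 6 here, not the 5 a function call would give */
  return i - 5;
}
```

Had MAX been an ordinary function, i would have been incremented exactly once.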

ALL enumeration constants should be written in UPPER_CASE_SNAKE_CASE.

Good:

enum color {
  COLOR_RED,
  COLOR_BLUE,
  COLOR_GREEN
};

All other identifiers should be written in snake_case (i.e. lower-case alphanumerics and underscores separating words). This includes function names, variable names, tag names, and typedef names.

Common short variable names/affixes

Loop Counters

i, j, k

Coordinates

x, y, z

Short-lived variables

  • c or ch for a character

  • s or str for strings

  • n for integers

  • p for void *

  • buf or arr for a buffer/array

  • fp for an open FILE * stream

  • fd for an open file descriptor

  • tmp for anything

Functions

Functions should be short and do one thing. A function should generally be less than 50 lines, with its length inversely proportional to its complexity: a logically simple function can be much longer than a complex one.

As a rule of thumb, you should try not to use more than 10 local variables in a function, since that’s about the limit of what most people can keep track of at once.

These limits do not apply to the main() function, which may be fairly complex depending on program structure.

Include parameter names with data types in function declarations/prototypes.

Exiting from Functions

A function may return directly at any point as long as it has only modified local variables. Once a function does anything that changes program state (modifies a global variable, modifies objects passed by reference, successfully allocates memory, etc.), all exit paths must pass through a single centralized cleanup-and-exit at the end of the function.
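One common way to get a single cleanup-and-exit point in C is a goto to a label at the end of the function. The following is a sketch under that assumption, not the only acceptable structure; alloc_pair and its parameters are hypothetical:

```c
#include <stdlib.h>

/**
 * Allocates two zero-filled int buffers of equal length
 * @param first receives the first buffer
 * @param second receives the second buffer
 * @param n the number of elements in each buffer
 * @return 0 on success, -1 on failure
 */
int alloc_pair(int **first, int **second, size_t n)
{
  int result = -1;
  int *a = NULL;
  int *b = NULL;

  if (first == NULL || second == NULL || n == 0) {
    return -1; /* only locals touched so far, so a direct return is fine */
  }

  a = calloc(n, sizeof *a);
  if (a == NULL) {
    goto cleanup;
  }

  b = calloc(n, sizeof *b);
  if (b == NULL) {
    goto cleanup; /* a is allocated now, so it must be released below */
  }

  *first = a;
  *second = b;
  a = NULL; /* ownership has passed to the caller */
  b = NULL;
  result = 0;

cleanup:
  free(a); /* free(NULL) is a no-op */
  free(b);
  return result;
}
```

On success the caller owns both buffers and is responsible for freeing them.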

For functions that return a value indicating success or failure:

  • If the name of a function is an action or an imperative command, the function should return a negative value on failure, and 0 on success.

  • If the name is a predicate, the function should return a boolean (1 = true, 0 = false).

For functions that return pointers to objects, they should return NULL on failure.
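The three return conventions above might look like this in practice. The struct and function names are invented for the example and are not part of any course code:

```c
#include <stddef.h>

struct student {
  char const *name;
  int id;
};

/**
 * Predicate: checks whether an id is usable
 * @param id the id to check
 * @return 1 if the id is valid, 0 otherwise
 */
int is_valid_id(int id)
{
  return id > 0;
}

/**
 * Action: assigns an id to a student
 * @param s the student to modify
 * @param id the new id
 * @return 0 on success, -1 on failure
 */
int set_student_id(struct student *s, int id)
{
  if (s == NULL || !is_valid_id(id)) {
    return -1;
  }
  s->id = id;
  return 0;
}

/**
 * Pointer-returning lookup: finds a student by id
 * @param roster the array of students to search
 * @param count the number of students in the roster
 * @param id the id to search for
 * @return a pointer to the matching student, or NULL if none matches
 */
struct student *find_student(struct student *roster, size_t count, int id)
{
  for (size_t i = 0; i < count; ++i) {
    if (roster[i].id == id) {
      return &roster[i];
    }
  }
  return NULL;
}
```

With these conventions a call site reads naturally: if (is_valid_id(id)) for the predicate, and if (set_student_id(s, id) < 0) for the action.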

Commenting

Every function (except main()) should be documented using doxygen/Javadoc style block comments:

/**
 * ... text ...
 */

At a minimum, a brief description of the function, each of its parameters, and its return value is necessary:

/**
 * Registers a student into the course roster
 * @param name the student's name
 * @param email the student's email
 * @param id the student's 9-digit ONID
 * @return the index of the student in the course roster, or -1 on failure
 */
int register_student(char *name, char *email, int id)
{
    ...
}

Function bodies should generally have few comments; when used, they should describe why you are doing something, not how. Anyone can read your code to see how it works, but they want to know why you are doing what you are doing.

Do not commit/submit commented-out code. We do not accept draft work. Use preprocessor conditional compilation to remove debugging code from the release version of your program, using the NDEBUG macro:

#ifndef NDEBUG
  printf("This is a debugging message!");
#endif

Variable Length Arrays

Do not ever use VLAs (variable length arrays). These are arrays that are declared using a variable for the size of the array, instead of a compile-time constant expression. VLAs are widely considered a mistake in the language and create all kinds of unexpected problems and have weird edge cases, including stack overflows.

An example of a VLA:

int x = 10;
char buf[x];

Use dynamic allocation (malloc(), etc.) when you need dynamically sized arrays.
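A minimal sketch of the dynamic-allocation alternative, assuming the caller takes ownership of the buffer and frees it (make_filled_buffer is a hypothetical helper):

```c
#include <stdlib.h>
#include <string.h>

/**
 * Creates a heap-allocated buffer with every byte set to the same value
 * @param n the number of bytes to allocate
 * @param fill the value to store in each byte
 * @return the new buffer (caller must free it), or NULL on failure
 */
unsigned char *make_filled_buffer(size_t n, unsigned char fill)
{
  unsigned char *buf;

  if (n == 0) {
    return NULL;
  }

  buf = malloc(n); /* instead of the VLA: unsigned char buf[n]; */
  if (buf == NULL) {
    return NULL;   /* unlike a VLA, failure here can be detected */
  }
  memset(buf, fill, n);
  return buf;
}
```

Unlike a VLA, an allocation that is too large fails with a NULL return the program can handle, rather than silently overrunning the stack.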

Declarations

Declaration specifiers should always be written in the following order:

  1. Storage class specifiers

  2. Function specifiers

  3. Type specifiers

  4. Type qualifiers

Type specifiers should be written using the shortest canonical name of the respective type. For example, use short in preference to short int; do not use the signed keyword except for signed char, and don’t switch up the order, such as char signed or int unsigned.
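To make the ordering concrete, here is a short sketch with invented names. Each declaration lists its specifiers in the order given above, and the types use their shortest spelling (unsigned long rather than unsigned long int):

```c
/* storage class, then type specifier, then type qualifier */
static unsigned long const max_student_id = 999999999UL;

/* storage class, then function specifier, then type specifier */
static inline int in_range(unsigned long n, unsigned long lo, unsigned long hi)
{
  return n >= lo && n <= hi;
}

/**
 * Checks whether a number could be a student id
 * @param id the number to check
 * @return 1 if id is between 1 and max_student_id inclusive, 0 otherwise
 */
int is_student_id(unsigned long id)
{
  return in_range(id, 1UL, max_student_id);
}
```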

Types

Use char only for holding ASCII text character data.

Use unsigned char when working with raw bytes of data.

Use int for generic integers that are known not to exceed the range ±32767.

Use float or double for floating point values. Exact ratios should be stored as two integers to avoid rounding errors.
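As an illustration of storing an exact ratio as two integers, the sketch below uses a hypothetical struct ratio. It assumes denominators are nonzero and that every component stays within ±32767, so each cross product fits in a long; it does no reduction to lowest terms:

```c
struct ratio {
  int num;
  int den;
};

/**
 * Adds two ratios exactly
 * @param a the first ratio
 * @param b the second ratio
 * @return the unreduced sum a + b
 */
struct ratio ratio_add(struct ratio a, struct ratio b)
{
  struct ratio sum;

  sum.num = a.num * b.den + b.num * a.den;
  sum.den = a.den * b.den;
  return sum;
}

/**
 * Compares two ratios by cross-multiplication
 * @param a the first ratio
 * @param b the second ratio
 * @return 1 if the ratios are equal, 0 otherwise
 */
int ratio_equal(struct ratio a, struct ratio b)
{
  /* 32767 * 32767 is below 2^31, so the products fit in a long */
  return (long) a.num * b.den == (long) b.num * a.den;
}
```

Here 1/10 + 2/10 compares exactly equal to 3/10, whereas none of those three values can be represented exactly in binary floating point.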

All other types should be used only in specific situations. For example,

  • short and signed char are sometimes used in large arrays of integers with a restricted range to improve cache performance when there is a memory bottleneck. These are generally less efficient than a plain int and should be avoided otherwise.

  • long and long long are sometimes used when a larger range than int guarantees is desired. They are generally less efficient than a plain int and should be avoided otherwise.

  • size_t is used for array indices and loop counters. It’s perfectly fine to use a plain int if you know that it won’t overflow; otherwise use size_t, which is generally less efficient than an int.

  • The fixed width types in stdint.h should generally only be used when working with data where the exact width is relevant to a particular algorithm or format specification. These are used more ubiquitously in embedded systems to control memory usage.
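A typical case where the exact width matters is decoding a binary format. This sketch (read_u32_be is a hypothetical helper) reads a 32-bit big-endian field from raw bytes, using unsigned char for the bytes and uint32_t for the value whose width the format dictates:

```c
#include <stdint.h>

/**
 * Decodes a 32-bit big-endian unsigned integer from raw bytes
 * @param bytes pointer to at least 4 bytes of data
 * @return the decoded value
 */
uint32_t read_u32_be(unsigned char const *bytes)
{
  /* Cast before shifting so each shift happens in a 32-bit unsigned type */
  return (uint32_t) bytes[0] << 24
         | (uint32_t) bytes[1] << 16
         | (uint32_t) bytes[2] << 8
         | (uint32_t) bytes[3];
}
```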

“The Compiler is not Magic”

Do not assume the compiler will fix your mistakes or lazy/poor coding habits. For example, do not call costly functions over and over in a loop to calculate a result you know shouldn’t change. The following is horrifically bad:

for (size_t i = 0; i < strlen(str); ++i)

Instead, compute the length once:

size_t len = strlen(str);
for (size_t i = 0; i < len; ++i)

The first example can easily be thousands of times slower, with a run-time complexity of O(n^2) compared to O(n) in the second example. Unless str is changing length, don’t ever do the first example!

Other common bad practices:

  • Using I/O system calls to read/write data in many small chunks instead of one large request

  • Not checking return values of functions to verify success or handle failure

  • Using absurdly large arrays to avoid dealing with dynamic allocation
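The first two points can be sketched together. The example below uses the buffered stdio function fread() as a stand-in (the same idea applies to read() on a file descriptor): it requests data in one reasonably sized block at a time and checks for a read error before reporting success. The function name and the 4096-byte block size are illustrative assumptions:

```c
#include <stdio.h>

/**
 * Counts the newline characters remaining in a stream
 * @param fp an open stream, positioned where counting should begin
 * @param count receives the number of newlines found
 * @return 0 on success, -1 on a read error
 */
int count_lines(FILE *fp, size_t *count)
{
  char buf[4096];
  size_t total = 0;
  size_t n;

  /* One request per block, not one request per character */
  while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
    for (size_t i = 0; i < n; ++i) {
      if (buf[i] == '\n') {
        ++total;
      }
    }
  }

  /* fread() returns 0 for both end-of-file and error, so check which */
  if (ferror(fp)) {
    return -1;
  }
  *count = total;
  return 0;
}
```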

General Layout

The general layout of a C source file should be:

  1. Feature-test macros

  2. Include directives

  3. Macro definitions

  4. Externally linked object definitions

  5. Internal types

  6. Internal object declarations

  7. Internal function prototypes

  8. Function definitions, starting with main(), if present

The following example illustrates this layout:

my_project/my_program.c
#define _POSIX_C_SOURCE 200809L /* Feature test macro */
/* System headers first */
#include <stdlib.h>
#include <stdio.h>

/* Then project headers */
#include "my_project/global_counter.h"

/* Finally, the header for *this* source file */
#include "my_project/my_program.h"

/* Macro definitions */
#define ARRAY_SIZE 100
#define sum(a, b) ((a) + (b))

/* Define externally linked objects */
int const magic_number = 42;

/* Define internally used types */
enum counter_state { CS_RUNNING, CS_STOPPED };

/* Define internally linked objects */
static int internal_counter;
static enum counter_state state;

/* Prototype all internal functions */
static int increment_internal_counter(void);
static void start_counter(void);
static void stop_counter(void);


/* always put main() first if it is present */
int main(void)
{
  /* ... */
}

/* Next, any externally linked function definitions */
int set_global_counter_to_internal_counter(void)
{
  global_counter = internal_counter;
  return 0;
}

/* Then, internal function definitions */
static int increment_internal_counter(void)
{
  if (state == CS_RUNNING) ++internal_counter;
  if (global_counter < internal_counter) set_global_counter_to_internal_counter();
  return 0;
}


static void start_counter(void)
{
  state = CS_RUNNING;
}

static void stop_counter(void)
{
  state = CS_STOPPED;
}

If a source file defines any externally linked functions (besides main()) these should be prototyped in a separate header and included in the source file. This ensures that there is one consistent function signature shared between users and the implementation. Additionally, any external object definitions should be represented as extern ... declarations in the header, for the same reasons.

my_project/my_program.h
extern int const magic_number;
int set_global_counter_to_internal_counter(void);