Overview

The C programming language can be organized into the following major topics, each of which is closely linked with the others:

Type System

The type system establishes how data is organized and represented through several fundamental types and provides additional methods of constructing derived types.

Object Model

The object model describes the representation of program state as a collection of objects, each containing a value of a particular type.

Identifiers

Identifiers are names which refer to program entities such as objects and functions; C provides mechanisms for introducing identifiers, controlling the contexts in which those identifiers are visible, and linking identifiers in separate locations to refer to the same underlying object.

Expressions

Expressions combine operators and operands according to a set of evaluation rules; when evaluated, they produce values, designate objects, or modify program state.

Abstract Machine Model

The abstract machine model specifies the rules that govern the procedural structure and execution of a C program, and its externally visible behavior, regardless of the environment in which it is executed.

Preprocessor Language

The preprocessor language is a macro-replacement language that consists of a set of directives and macros that are used to manipulate and modify source code before it is compiled.

Standard Library

The standard library is a set of functions, derived types, and objects which provide a uniform interface for a variety of programming tasks across different environments, ensuring portability of programs which rely on these features.

Example

Even in a simple Hello World! program, each of the above concepts can be observed:

1  /* This program prints "Hello World!" to standard output */
2  #include <stdio.h>
3
4  int main(int argc, char *argv[])
5  {
6      puts("Hello World!");
7  }

Type System

Line 4 showcases the C type system. The int and char keywords, abbreviations for integer and character, refer to two of the most important fundamental types. Additionally, two derived types are visible: argv is declared with type char *[] (an array of pointers to char), and main is declared with type int(int, char *[]) (a function which returns an int and takes two parameters, of type int and array of pointers to char). Because argv is a function parameter, its array type is adjusted to the pointer type char **.

Object Model

Line 4 also touches on the object model; the declarations of argc and argv in the function parameter list cause an int object and a char ** object (the pointer type to which the parameter type char *[] is adjusted) to be allocated each time main is executed. These objects are discarded when main returns.

Furthermore, on line 6, the string literal "Hello World!" allocates a char[13] object at the start of the program, filled with the literal characters and a null-terminator ('\0'). This array is destroyed when the program terminates.

Identifiers

Line 4 showcases the declaration syntax of the C language, and includes a declaration for the main function, along with each of its parameters, argc and argv.

The identifiers main, argc, and argv are all declared on line 4. Additionally, the identifier puts is evaluated as part of an expression on line 6; the declaration for puts is hidden in the included stdio.h header file.

Expressions

Line 6 contains the function-call expression puts("Hello World!") which can further be broken down into two sub-expressions, puts and "Hello World!".

The puts expression yields a function designator referring to a function of type int(char const *). The "Hello World!" expression yields an object designator referring to the char[13] array mentioned earlier.

When the function call operator () is evaluated, the function designator that precedes it is implicitly converted to a function pointer of type int (*)(char const *). The object designator for its first argument is implicitly converted to the type char * through array-to-pointer conversion (often called pointer decay), and then to the type char const * through argument conversion.

The overall expression yields the return value of the call to puts, which is unused.

Abstract Machine Model

The abstract machine model designates the main function as the program entry point, and requires that it take either no parameters, or two parameters of the types shown here, int and char*[]. The name argc is short for argument count, and contains the number of command-line arguments to the program. The name argv is short for argument vector, and is an array of pointers, each of which points to the beginning of one command-line argument string.

Within the body of the main function, the expression statement puts("Hello World!"); is procedurally evaluated when main is called. The abstract machine model requires that data referenced in the call to puts is written to standard output by program termination, assuming successful execution.

Preprocessor Language

The preprocessor converts the comment /* ... */ on line 1 into a <space> character, and it directly inserts the contents of stdio.h at the location of the #include directive on line 2.

Standard Library

The inclusion of the stdio.h header file, which is short for “Standard I/O”, exposes many declarations for file input/output related facilities, such as the puts function. All functions declared in standard library headers have external linkage and are defined separately in the system’s implementation of the standard library, which allows them to be referenced here.

C Standard

This course is based on the 1999 edition of the C language standard, called ISO/IEC 9899:1999, or C99, for short. While the C language has been revised since then, most revisions have been minor and/or irrelevant to operating systems, so C99 remains the most widely used major revision of the language. While the C99 standard is available for sale from ISO, a later working draft that incorporates the standard's three technical corrigenda is available online for free and is essentially identical to the corrected standard. This draft is hosted at the JTC1/SC22/WG14 home page as document N1256, and is known as 9899:TC3, where TC3 stands for “Technical Corrigendum 3”.

A digitized version of this standard is available at C99 Language Standard.