Standard C Allocations

The C standard library manages the heap for C programs. It grows and shrinks the heap as the program’s dynamic memory footprint changes, and it organizes the stored data so that it can be accessed quickly and efficiently. In a typical implementation, the heap is divided into regions, called arenas, each optimized for allocations of objects in a particular size range. A linked-list-like structure is typically used to track the allocations, with metadata about each one stored just before the allocated data itself.
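The idea of storing metadata just before the allocated data can be sketched on top of malloc itself. The following is a minimal illustration, not how any particular C library lays out its heap: the names block_header, tagged_alloc, tagged_size, and tagged_free are hypothetical, and the header fields are assumptions chosen for the example. It assumes a C11 compiler for max_align_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping record; real allocators store different fields. */
union block_header {
    struct {
        size_t size;              /* payload size requested by the caller */
        union block_header *next; /* next block in the allocator's list */
    } info;
    max_align_t align;            /* pads the header so the payload stays aligned */
};

/* Allocate n payload bytes with a header placed immediately before them. */
void *tagged_alloc(size_t n)
{
    if (n > SIZE_MAX - sizeof(union block_header))
        return NULL;              /* header + payload would overflow size_t */
    union block_header *h = malloc(sizeof *h + n);
    if (!h)
        return NULL;
    h->info.size = n;
    h->info.next = NULL;          /* unused here; shown only to suggest the list */
    return h + 1;                 /* the payload starts right after the header */
}

/* Recover the recorded size by stepping back from the payload pointer. */
size_t tagged_size(const void *p)
{
    assert(p != NULL);
    const union block_header *h = (const union block_header *)p - 1;
    return h->info.size;
}

/* Release a block obtained from tagged_alloc. */
void tagged_free(void *p)
{
    if (p)
        free((union block_header *)p - 1);
}
```

A caller only ever sees the payload pointer. The header is found again by pointer arithmetic, which is roughly why free needs no size argument.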

In C, objects are allocated onto the heap with the malloc family of functions:

  • void *malloc(size_t n)

  • void *realloc(void *p, size_t n)

  • void *calloc(size_t nmemb, size_t size)

Each returns a pointer to an object of the requested size, suitably aligned for any data type, or a null pointer if the allocation fails. The realloc function takes a pointer to an existing allocated object and attempts to resize it. In some cases this occurs in place, and in others the contents are moved to a different allocated space of the new size. In either case, a pointer to the location of the reallocated object is returned. Finally, calloc zero-fills its allocated data and is typically used for efficiently zero-initializing dynamically allocated arrays, hence the separate nmemb and size arguments.
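The two behaviours described here, zero-filling by calloc and preservation of contents by realloc, can be checked with a small sketch. The helper names count_nonzero_after_calloc and values_survive_realloc are hypothetical and exist only for this illustration. Both assume nonzero counts, since the result of a zero-sized request is implementation-defined.

```c
#include <stddef.h>
#include <stdlib.h>

/* Count the nonzero elements of a freshly calloc'd array of ints.
   Returns 0 when every element is zero, or -1 if the allocation fails. */
long count_nonzero_after_calloc(size_t count)
{
    int *a = calloc(count, sizeof *a);
    if (!a)
        return -1;
    long nonzero = 0;
    for (size_t i = 0; i < count; i++)
        if (a[i] != 0)
            nonzero++;
    free(a);
    return nonzero;
}

/* Fill an array with 0..old_count-1, resize it to new_count elements, and
   report whether the elements common to both sizes kept their values:
   1 if they did, 0 if not, -1 if an allocation failed. */
int values_survive_realloc(size_t old_count, size_t new_count)
{
    int *a = malloc(sizeof *a * old_count);
    if (!a)
        return -1;
    for (size_t i = 0; i < old_count; i++)
        a[i] = (int)i;

    void *tmp = realloc(a, sizeof *a * new_count);
    if (!tmp) {
        free(a);                  /* the original block is still allocated */
        return -1;
    }
    a = tmp;                      /* may or may not be the same address */

    int ok = 1;
    for (size_t i = 0; i < old_count && i < new_count; i++)
        if (a[i] != (int)i)
            ok = 0;
    free(a);
    return ok;
}
```

Whether realloc resizes in place or moves the data is not visible through these checks. Only the values are guaranteed, up to the smaller of the old and new sizes.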

Conventional usage of each is shown below:

/* Allocating a single object */
struct my_struct *my_data = malloc(sizeof *my_data);
if (!my_data) /* error handling */;

/* Allocating an array of objects */
int *int_list = malloc(sizeof *int_list * count);
if (!int_list) /* error handling */;

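/* Allocating a zero-filled array of objects */
free(int_list); /* release the previous array before reusing the pointer */
int_list = calloc(count, sizeof *int_list);
if (!int_list) /* error handling */;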

/* Reallocating an existing allocated array */
void *tmp = realloc(int_list, sizeof *int_list * count);
if (!tmp) /* error handling */;
int_list = tmp;

Notice that this departs from the textbook, which suggests the general form T *d = (T*) malloc(sizeof (T) * count);. Modern practice is to use the form T *d = malloc(sizeof *d * count); instead. Removing the explicit cast and taking the size of the pointed-at object avoids repeating the type of the allocation, which reduces the risk of errors. It also removes the need to know the type at all when the allocation appears later than the pointer declaration (e.g. int *p; /* ... */ p = malloc(sizeof *p);). The rule is also generally easier to follow for more complex allocations, such as those with multiple levels of indirection: compare int *const **x = malloc(sizeof (int *const *) * count); with int *const **x = malloc(sizeof *x * count);.
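As a sketch of how the sizeof *d form reads in practice, the functions below allocate an array of a structure type and an array with several levels of indirection. The names struct sample, alloc_samples, and alloc_table are hypothetical. The overflow check on the multiplication is an addition that the text above does not discuss.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record type, used only for illustration. */
struct sample {
    double time;
    double value;
};

/* Allocate count records; the element type is named only in the declaration. */
struct sample *alloc_samples(size_t count)
{
    struct sample *s;
    if (count > SIZE_MAX / sizeof *s)
        return NULL;              /* sizeof *s * count would overflow */
    s = malloc(sizeof *s * count);
    return s;
}

/* The same rule applied to a pointer with several levels of indirection. */
int *const **alloc_table(size_t count)
{
    int *const **x;
    /* sizeof *x is the size of an int *const *, without spelling the type out.
       sizeof does not evaluate its operand, so x need not be initialized. */
    assert(sizeof *x == sizeof(int *const *));
    if (count > SIZE_MAX / sizeof *x)
        return NULL;
    x = malloc(sizeof *x * count);
    return x;
}
```

If the element type of either pointer changes later, only the declaration needs editing. The size expressions follow automatically.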

Additionally, the realloc function is often used as a complete replacement for malloc: as a special case, if the pointer argument to realloc is a null pointer, the call is equivalent to calling malloc. This simplifies the code by rolling initial allocation and reallocation into one statement, which leaves fewer places for allocation mistakes:

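int *int_list = 0;
size_t count = 0;
for (;;) {
   int x = getint();
   /* On the first pass int_list is null, so this behaves like malloc */
   void *tmp = realloc(int_list, sizeof *int_list * (count + 1));
   if (!tmp) break; /* error handling; int_list is still valid here */
   int_list = tmp;
   int_list[count++] = x;
}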

When an allocated object is no longer needed, it can be discarded with a call to void free(void *p). This is necessary to avoid memory leaks [1]. Notice that in the above examples with realloc, we store the return value of realloc in a temporary variable and handle any reallocation error before overwriting our pointer. This is very important because, when reallocation fails, the original object is left unmodified and must still be freed to prevent a memory leak. Assigning the return value of realloc directly to the pointer would lose our only reference to that allocated object, leaving us unable to free it.
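The temporary-variable pattern and the final call to free can be put together in a small helper. This is a sketch under assumptions: struct int_list, int_list_append, and int_list_free are hypothetical names, and growing by one element per call simply mirrors the loop above rather than recommending a growth strategy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable list of ints; start with both fields zeroed. */
struct int_list {
    int *items;
    size_t count;
};

/* Append x, growing the array by one element.
   Returns 0 on success and -1 on failure. On failure the list is left
   unchanged, so the caller still owns it and can still free it. */
int int_list_append(struct int_list *list, int x)
{
    assert(list != NULL);
    if (list->count >= SIZE_MAX / sizeof *list->items)
        return -1;                /* the size computation would overflow */

    /* When list->items is null, realloc behaves like malloc. */
    void *tmp = realloc(list->items, sizeof *list->items * (list->count + 1));
    if (!tmp)
        return -1;                /* list->items is still valid and allocated */

    list->items = tmp;            /* overwrite the pointer only after success */
    list->items[list->count++] = x;
    return 0;
}

/* Release the array and reset the list to its empty state. */
void int_list_free(struct int_list *list)
{
    assert(list != NULL);
    free(list->items);            /* free(NULL) is a no-op */
    list->items = NULL;
    list->count = 0;
}
```

A caller would initialize the list with struct int_list list = {0};, call int_list_append for each value, and call int_list_free once at the end, whether or not every append succeeded.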