Special Files

Special files are often referred to as “device files”, as that is their primary purpose–to allow processes to interact with hardware and software-emulated devices, using the file system as a medium. In a UNIX system, special files are conventionally all located in the /dev/ directory, and represent things like disk drives, audio devices, physical and virtual terminals, and so on.

A special file doesn’t contain any data in the usual sense. Instead, the file’s inode records two integers, called the major and minor numbers; in traditional UNIX file systems these occupy the space that normally holds the block pointers to the file’s contents. The major number identifies the type of the special file, that is, which device driver it belongs to. The minor number distinguishes the individual special files that belong to a particular driver, in case there are multiple devices of the same type, e.g. multiple hard disks, multiple displays, or multiple teletype terminals.
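The major and minor numbers can be inspected from user space. The following is a minimal sketch, assuming a Linux/glibc system where the major() and minor() macros come from <sys/sysmacros.h> (they are not part of POSIX); device_numbers is a hypothetical helper name.

```c
#include <sys/stat.h>
#include <sys/sysmacros.h>  /* major(), minor(): glibc/Linux, not POSIX */

/* Hypothetical helper: store the major and minor numbers of the device
 * that `path` names. Returns 0 on success, or -1 if `path` cannot be
 * examined or is not a special file. */
int
device_numbers(const char *path, unsigned *major_out, unsigned *minor_out)
{
  struct stat st;

  if (stat(path, &st) != 0)
    return -1;
  if (!S_ISCHR(st.st_mode) && !S_ISBLK(st.st_mode))
    return -1;

  /* For special files, st_rdev encodes both numbers. */
  *major_out = major(st.st_rdev);
  *minor_out = minor(st.st_rdev);
  return 0;
}
```

Calling device_numbers("/dev/null", &maj, &min) on a typical Linux system is expected to report major 1 and minor 3, although the exact assignments are system-specific.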

The contents of special files are generated dynamically by kernel drivers, and each major number is associated with a particular driver. When a process reads from or writes to a special file, the request is dispatched directly to the implementations of those system calls provided by the driver for that type of device. Since special files are implemented by kernel drivers, and creating one directly (with mknod) normally requires elevated privileges, ordinary user-space programs typically cannot create or destroy special files. Some drivers do provide mechanisms for user programs to do so; for example, a process can create a new pseudo terminal by opening the special /dev/ptmx file, which generates a new special file (/dev/pts/N on Linux) corresponding to the new pseudo terminal.
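The dispatch step can be pictured as a table of function pointers indexed by major number. The following is a simplified sketch, not how any particular kernel does it: driver_ops, toy_read, toy_write, TOY_MAJOR, MAX_MAJOR, dispatch_read, and dispatch_write are all hypothetical names, and real kernels use considerably richer structures.

```c
#include <stddef.h>
#include <sys/types.h>  /* ssize_t */

/* Hypothetical per-driver operations table. */
struct driver_ops {
  ssize_t (*read)(unsigned minor_no, void *buf, size_t n);
  ssize_t (*write)(unsigned minor_no, const void *buf, size_t n);
};

/* A toy driver: reads report end-of-file, writes discard their input. */
static ssize_t
toy_read(unsigned minor_no, void *buf, size_t n)
{
  (void)minor_no; (void)buf; (void)n;
  return 0;
}

static ssize_t
toy_write(unsigned minor_no, const void *buf, size_t n)
{
  (void)minor_no; (void)buf;
  return (ssize_t)n;
}

enum { TOY_MAJOR = 1, MAX_MAJOR = 8 };  /* illustrative values only */

static const struct driver_ops toy_ops = { toy_read, toy_write };

/* Drivers indexed by major number; unregistered slots are NULL. */
static const struct driver_ops *const drivers[MAX_MAJOR] = {
  [TOY_MAJOR] = &toy_ops,
};

/* Route a read on a special file to the driver that owns its major number. */
ssize_t
dispatch_read(unsigned major_no, unsigned minor_no, void *buf, size_t n)
{
  if (major_no >= MAX_MAJOR || drivers[major_no] == NULL)
    return -1;  /* no driver registered for this major number */
  return drivers[major_no]->read(minor_no, buf, n);
}

/* Route a write the same way. */
ssize_t
dispatch_write(unsigned major_no, unsigned minor_no, const void *buf, size_t n)
{
  if (major_no >= MAX_MAJOR || drivers[major_no] == NULL)
    return -1;
  return drivers[major_no]->write(minor_no, buf, n);
}
```

The minor number is passed through to the driver, which can use it to tell its individual devices apart.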

There are two basic types of special files, called character special files and block special files. Character special files are so called because they are conceptually accessed one character at a time; they generally do not have persistent data and are usually not seekable. These model event-driven devices that consume and produce streams of data, such as terminals and printers. Block special files, on the other hand, behave a lot like regular files in that they are seekable and act like a contiguous block of data that can be repeatedly read from and written to at arbitrary file offsets. Block special files model stateful devices, most commonly disk drives. Other stateful devices, such as display buffers and external hardware like sensors or control panels that provide continuously updated data rather than streams of events, fit the same model conceptually, although many systems expose them as character special files in practice.
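A program can tell the two types apart with stat() and the POSIX S_ISCHR and S_ISBLK macros. The following is a minimal sketch; special_file_kind is a hypothetical helper name, and its return codes borrow the c and b letters that ls -l uses to mark character and block special files.

```c
#include <sys/stat.h>

/* Hypothetical helper: returns 'c' for a character special file, 'b' for
 * a block special file, '-' for any other kind of file, or '?' if `path`
 * cannot be examined. */
char
special_file_kind(const char *path)
{
  struct stat st;

  if (stat(path, &st) != 0)
    return '?';
  if (S_ISCHR(st.st_mode))
    return 'c';
  if (S_ISBLK(st.st_mode))
    return 'b';
  return '-';
}
```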

There are several important character special files that represent pseudo-devices, such as /dev/null, /dev/zero, and /dev/random. These files are used frequently by users and programs to generate input and discard output via the filesystem. To illustrate what this driver code looks like, let’s examine how a hypothetical driver might implement these special files.

/dev/null

The special /dev/null file is called the “black hole” because it consumes any data that is written to it, and always generates end-of-file (read() returns 0) when read from. It’s often used to discard output of processes and provide empty input to processes. For example, when running a shell command in the background, its output is typically redirected to /dev/null so that it doesn’t clutter the terminal with output.

Implementing reading and writing for /dev/null is trivial:

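#include <sys/types.h>  /* ssize_t */

/* Reading from /dev/null always reports end-of-file. */
ssize_t
read(int fd, void *buf, size_t n)
{
  return 0;
}

/* Writing discards the data but reports that all n bytes were consumed. */
ssize_t
write(int fd, const void *buf, size_t n)
{
  return (ssize_t)n;
}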
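
From user space, this behavior can be observed with the ordinary open(), read(), and write() calls. The following is a minimal sketch; probe_dev_null is a hypothetical helper name.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write `n` bytes to /dev/null, then try to read some
 * back. Stores the return values of write() and read() in `written` and
 * `got`. Returns 0, or -1 if /dev/null cannot be opened. */
int
probe_dev_null(const void *data, size_t n, ssize_t *written, ssize_t *got)
{
  char scratch[16];
  int fd = open("/dev/null", O_RDWR);

  if (fd < 0)
    return -1;

  *written = write(fd, data, n);             /* data is discarded */
  *got = read(fd, scratch, sizeof scratch);  /* end-of-file */
  close(fd);
  return 0;
}
```

After a successful call, *written should equal n and *got should be 0, matching the description of /dev/null above.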

/dev/zero

The /dev/zero special file fills every read request with null bytes, and, like /dev/null, consumes any data written to it. It’s often used to generate dummy files, initialize data storage or pad files, overwrite sensitive data, and provide long inputs to programs under test. Like /dev/null, this one is fairly straightforward; only read() differs, since write() can be the same as for /dev/null:

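#include <string.h>     /* memset */
#include <sys/types.h>  /* ssize_t */

/* Every read is satisfied in full with zero bytes. */
ssize_t
read(int fd, void *buf, size_t n)
{
  memset(buf, 0, n);
  return (ssize_t)n;
}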
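
A user-space program sees this as an endless supply of zero bytes. The following is a minimal sketch; read_dev_zero is a hypothetical helper name.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: read up to `n` bytes from /dev/zero into `buf`.
 * Returns the number of bytes read, or -1 on error. */
ssize_t
read_dev_zero(void *buf, size_t n)
{
  int fd = open("/dev/zero", O_RDONLY);
  ssize_t got;

  if (fd < 0)
    return -1;

  got = read(fd, buf, n);
  close(fd);
  return got;
}
```

Whatever the buffer held beforehand, the bytes that were read should all be zero afterwards.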

/dev/random

The /dev/random file fills every read request with random bytes, and, like the previous files, also consumes data written to it. It is used for similar purposes to /dev/zero, and also as a source of entropy for processes that need unpredictable input. Unlike /dev/null and /dev/zero, /dev/random is usually somewhat complex. The kernel generates secure random data from sources of entropy in the system, such as user input, system clocks, and so on. When a process reads from /dev/random, the kernel fills the request buffer from that generated random data, traditionally blocking until enough data is available, if necessary. Alternatively, the /dev/urandom special file uses a pseudo-random generator that is periodically reseeded from the same high-entropy sources used for /dev/random, so that data is always available immediately, but traditionally with weaker cryptographic guarantees. A toy implementation of the latter, with the C library’s rand() standing in for a properly seeded generator, might be:

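#include <stdlib.h>     /* rand */
#include <sys/types.h>  /* ssize_t */

/* Toy only: rand() is not cryptographically secure. */
ssize_t
read(int fd, void *buf, size_t n)
{
  unsigned char *buf_data = buf;
  for (size_t i = 0; i < n; ++i)
    *buf_data++ = (unsigned char)rand();
  return (ssize_t)n;
}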
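
On the consuming side, a program that wants random bytes can read them from /dev/urandom like any other file. The following is a minimal sketch; fill_random is a hypothetical helper name. It loops because read() may return fewer bytes than requested.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: fill `buf` with `n` bytes read from /dev/urandom.
 * Returns 0 on success, or -1 on error. */
int
fill_random(void *buf, size_t n)
{
  unsigned char *p = buf;
  int fd = open("/dev/urandom", O_RDONLY);

  if (fd < 0)
    return -1;

  while (n > 0) {
    ssize_t got = read(fd, p, n);
    if (got < 0 && errno == EINTR)
      continue;  /* interrupted by a signal: retry */
    if (got <= 0) {
      close(fd);
      return -1;
    }
    p += got;
    n -= (size_t)got;
  }

  close(fd);
  return 0;
}
```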