Additional Concepts

When working with file systems, several special concepts are worth understanding in order to manage files and directories more effectively. These include relative and absolute paths, hidden files, the dot and dot-dot hard links, special device files, and the home directory.

Relative and Absolute Paths

In Unix systems, a path is either relative, meaning it is resolved from the current working directory of a process, or absolute, meaning it is resolved from the root directory of the file system. Absolute paths always start with a forward slash (/), while relative paths do not. So /file refers to “file” in the root directory, while file refers to “file” in the current working directory.
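As a small illustration of the distinction above, the following Python sketch classifies two paths and resolves a relative one against the current working directory. The helper name describe and the scratch directory are assumptions made for this example only.

```python
import os
import tempfile


def describe(path: str) -> str:
    """Classify a path by whether it starts at the root directory."""
    return "absolute" if os.path.isabs(path) else "relative"


# Work inside a scratch directory so the example does not depend on
# where it happens to be run from.
with tempfile.TemporaryDirectory() as scratch:
    previous = os.getcwd()
    os.chdir(scratch)
    try:
        cwd = os.getcwd()
        # A relative path is resolved against the current working directory.
        resolved = os.path.abspath("file")
        print(describe("/file"), describe("file"))
        print(resolved == os.path.join(cwd, "file"))
    finally:
        os.chdir(previous)
```

Changing the working directory changes what a relative path refers to, while an absolute path such as /file always names the same location.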

Hidden Files (Dotfiles)

In Unix, files and directories whose names begin with a dot (.) are treated as hidden. They are typically used for configuration data and are not displayed by default in directory listings. By default, wildcard filename expansion also skips hidden files unless the pattern itself begins with a dot (e.g. .* matches hidden files, but * does not).
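Python's glob module follows the same convention as the shell, so it can be used to sketch the wildcard behavior described above. The file names notes.txt and .config are made up for the example.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch:
    # Create one ordinary file and one hidden file (hypothetical names).
    for name in ("notes.txt", ".config"):
        open(os.path.join(scratch, name), "w").close()

    # glob.escape guards against special characters in the scratch path.
    base = glob.escape(scratch)

    # The raw directory contents include the hidden file.
    everything = sorted(os.listdir(scratch))
    # "*" skips names that begin with a dot.
    star = sorted(os.path.basename(p) for p in glob.glob(os.path.join(base, "*")))
    # A pattern with a leading dot matches hidden names.
    dot_star = sorted(os.path.basename(p) for p in glob.glob(os.path.join(base, ".*")))

print(everything)
print(star)
print(dot_star)
```

Hidden files are hidden only by convention: the directory itself stores them like any other entry, and it is the listing and expansion tools that choose to skip them.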

Dot (.) and Dot-Dot (..)

Recall that most modern file systems do not allow the explicit creation of additional hard links to directories, as this could create circular references, which in turn could leave directory nodes orphaned. This restriction also ensures that every directory has exactly one parent directory (note that the root directory, /, is its own parent!).

Within every directory, two special system-managed hard links are created, called dot (.), which points at the directory, and dot-dot (..), which points at the parent directory. These don’t have any of the issues that arbitrary hard links to directories would have, since they are managed automatically by the system. When a directory is removed, the system deletes these two links before unlinking the directory from its parent directory, ensuring that reference counts are handled correctly.

The . directory doesn’t show up often, but a common use for it is to explicitly refer to a file in the current directory. Sometimes this is necessary because the name of a file matches a reserved word or has some other special meaning. For example, test is also the name of a shell builtin command, and a file name that begins with a dash can be mistaken for a command-line option. Prepending ./ to the file name (./test instead of test) still refers to the same file, but removes the ambiguity.

The .. directory is very useful for navigation, since moving up to a parent directory would otherwise require typing out the entire path to that directory.
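The . and .. entries can be checked from Python by asking whether two paths refer to the same underlying directory. This is a minimal sketch; the directory name child is an assumption made for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as parent:
    child = os.path.join(parent, "child")
    os.mkdir(child)

    # "child/." refers to child itself.
    dot_is_self = os.path.samefile(os.path.join(child, "."), child)
    # "child/.." refers to the directory that contains child.
    dotdot_is_parent = os.path.samefile(os.path.join(child, ".."), parent)

# The root directory is its own parent, so "/.." is still "/".
root_is_own_parent = os.path.samefile("/", "/..")

print(dot_is_self, dotdot_is_parent, root_is_own_parent)
```

os.path.samefile compares device and inode numbers rather than path strings, which is why it can confirm that two differently spelled paths name the same directory.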

Special Device Files

Special device files provide access to hardware devices and other system resources. These files are located in the /dev directory and are implemented as driver code in the kernel. When a program reads from or writes to a device file, the kernel calls the driver’s custom read and write methods to process the data. This allows programs to interact with devices using the same methods used to read and write regular files.

Device files make it easier for developers to access and manage hardware devices and system resources by abstracting the underlying implementation details of the device. Instead of having to write custom code to communicate with each device, developers can use the standard file I/O operations that are familiar to them. This makes it easier to write portable and maintainable code that can work with a wide range of hardware devices and system resources.
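Because device files live in the ordinary file namespace, a program can inspect them with the same stat call it would use for any other file. The sketch below assumes a Linux-like system where /dev/null and /dev/zero exist; the helper name file_kind is made up for the example.

```python
import os
import stat


def file_kind(path: str) -> str:
    """Report the file type recorded in the inode's mode bits."""
    mode = os.stat(path).st_mode
    if stat.S_ISCHR(mode):
        return "character device"
    if stat.S_ISBLK(mode):
        return "block device"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISREG(mode):
        return "regular file"
    return "other"


for path in ("/dev/null", "/dev/zero", "/dev"):
    print(path, "->", file_kind(path))
```

The file type is the only visible difference here: opening, reading, and writing a device file use the same calls as for a regular file, and the kernel routes them to the driver.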

Many device files correspond one-to-one with a piece of physical hardware, but many others are entirely virtual, emulated devices that exist only as driver code, with contents generated on the fly. Several of these virtual devices are frequently encountered by regular users, so they are worth mentioning.

/dev/zero

This device generates an endless stream of null bytes (bytes with the value zero) and discards any bytes written to it. It is useful for initializing files to a certain size, overwriting a file with all zeros, and testing programs with large inputs.
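A short sketch of both uses, assuming a system that provides /dev/zero. The file name blank.bin and the size of 4096 bytes are arbitrary choices for the example.

```python
import os
import tempfile

# Reading: every byte that comes back has the value zero.
with open("/dev/zero", "rb") as zero:
    chunk = zero.read(16)

# Initializing a file to a fixed size by copying zero bytes into it.
SIZE = 4096  # arbitrary size chosen for the example
with tempfile.TemporaryDirectory() as scratch:
    target = os.path.join(scratch, "blank.bin")
    with open("/dev/zero", "rb") as zero, open(target, "wb") as out:
        out.write(zero.read(SIZE))
    final_size = os.path.getsize(target)

print(chunk == bytes(16), final_size)
```

Since the stream never ends, the reader must always decide how many bytes to take; a read with no size limit would never finish.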

/dev/random

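This device generates an endless stream of random bytes, which makes it useful for generating cryptographic keys. Data written to it is not stored as file contents; on Linux it is instead mixed into the kernel’s entropy pool. /dev/random takes entropy from various sources like keyboard and mouse events, disk activity, and other hardware-based random noise. Because of this, it’s a good source of cryptographically secure data, but it has traditionally been slow to read from when the kernel’s entropy estimate runs low and it must wait to accumulate more. On recent Linux kernels (5.6 and later), it blocks only until the entropy pool has been initialized for the first time after boot.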

/dev/urandom

This device works just like /dev/random, except that it never waits for entropy to accumulate: it keeps producing bytes from a cryptographically secure pseudo-random number generator that is seeded from the entropy pool. This makes it the faster source of randomness, and it is often used to generate things like temporary file names or random test inputs for programs. It has long been described as theoretically less secure than /dev/random, but on modern Linux systems its output is generally considered suitable for cryptographic purposes as well once the pool has been initialized. The main caveat is very early in boot, when it may return data before enough entropy has been gathered.
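Reading from the device looks like reading from any other file. The sketch below also shows os.urandom, which Python provides as a portable way to get random bytes from the operating system; how it obtains them varies by platform, so treat the comparison as an illustration rather than a statement about its internals.

```python
import os

# Read 16 random bytes directly from the device file.
with open("/dev/urandom", "rb") as source:
    from_device = source.read(16)

# Ask the operating system for random bytes through Python's portable API.
from_os = os.urandom(16)

# A hex string like this could serve as a temporary name or token.
token = from_os.hex()

print(len(from_device), len(from_os), len(token))
```

In application code, going through os.urandom or the secrets module is usually preferable to opening the device file by hand, since the program then works on systems without /dev/urandom.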

/dev/null

This device always returns EOF (end-of-file) when read from and discards any data written to it. It’s commonly nicknamed the “black hole” device. /dev/null is used very frequently to discard the output of a program or to feed an “empty” file to a program that requires a file to read from.
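Both behaviors are easy to observe from Python, assuming /dev/null is present. The child process at the end is a made-up command used only to show output being discarded.

```python
import subprocess
import sys

# Reading hits end-of-file immediately, so the result is empty.
with open("/dev/null", "rb") as null:
    data = null.read()

# Writing succeeds, but the bytes go nowhere.
with open("/dev/null", "wb") as null:
    accepted = null.write(b"discard me")

# Discard a child process's output; subprocess.DEVNULL refers to the
# platform's null device.
result = subprocess.run(
    [sys.executable, "-c", "print('noise')"],
    stdout=subprocess.DEVNULL,
    check=True,
)

print(data, accepted, result.returncode)
```

The write reports success even though nothing is kept, which is what makes /dev/null a safe drop-in wherever a program insists on an output file.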

/dev/pts/*

The device files in this directory are pseudo-terminal slave (PTS) devices. Various applications, most notably terminal emulators, need to emulate the behavior of an old-school physical terminal, and these devices make that possible. We will learn more about how this works later in the course.

Home Directory (~/)

The tilde (~) is shorthand for the current user’s home directory. This is the default location for storing user-specific files, and regular users have very limited permissions outside of their home directories.

Note that the expansion of ~ into a user’s home directory path is actually handled by the shell before it runs a command, so it’s not a true file path that other system interfaces and utilities understand directly.

A less well-known feature of tilde expansion is that it can locate other users’ home directories when a username is appended after the tilde. For example, ~root expands to the root user’s home directory, which is typically /root. This is often used by system administrators to quickly access other users’ directories without having to navigate through the file system.
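Since the expansion is done by the shell and not by the kernel, programs that accept paths from other sources have to expand the tilde themselves. In Python this is done with os.path.expanduser; the file name notes.txt below is hypothetical.

```python
import os

# Expand "~" the way a shell would, using the HOME environment variable
# or the user database.
home = os.path.expanduser("~")
notes = os.path.expanduser("~/notes.txt")

# Without expansion, "~" is just an ordinary name: this checks for a
# directory literally called "~" inside the current working directory.
literal_exists = os.path.exists("~/notes.txt")

# "~username" looks up that user's home directory. If the user is not
# known, the string is returned unchanged.
root_home = os.path.expanduser("~root")

print(home)
print(notes)
print(literal_exists)
print(root_home)
```

Forgetting this step is a common source of bugs: a configuration value such as ~/notes.txt works when typed in a shell but fails when passed unexpanded to a file-opening call.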

Now, why ~, of all characters? For that we have to go all the way back to the ADM-3A terminal of 1976, an influential terminal that was used by many of the people who developed Unix and the programs that would eventually become part of the operating system. Here is a diagram of the ADM-3A keyboard layout; notice that the Home key in the upper right hand corner doubles as the ~ key:

Figure: ADM-3A Terminal Keyboard Layout