Utilities

In this section, we will introduce three important classes of system utilities: the shell, text editors, and documentation browsers. Learning to use these utilities effectively is the first step to completing work in this class and mastering systems programming.

The Shell

The shell, sh, provides a command-line interface that allows users to interact with the system by entering commands. The shell is the primary interface for many Unix-like operating systems and provides a powerful and flexible way to navigate the file system, run programs, manage processes, and perform various system tasks. There are many implementations of the shell utility, most of them based on the POSIX Shell Command Language, and they often add many additional features and quirks on top. In this course, we will use an extremely popular implementation of the shell called bash[1].

In its most basic form, the shell allows a user to enter commands line by line, hence the term command-line interface (CLI). The shell can also read its input from a file rather than an interactive terminal, and these files are called shell scripts. When a program is executed, there are three key things that the shell provides to it:

Argument List: Every process is invoked with an array of strings, called arguments. By convention, the first argument string is the name of the command or the explicit path to the executable that was used to invoke the process, followed by a sequence of arguments that control the behavior of the program being executed. The shell performs considerable processing of its input to parse each command, including tokenizing, expansion of special tokens, field splitting, pathname expansion, and so on, in order to produce this list of arguments.
Environment Variables: Every process is invoked with a second array of strings, called environment variables, which take the form name=value. Environment variables persist across process creation: a child process automatically inherits a copy of its parent's environment, so the values do not need to be passed explicitly as additional arguments. These variables generally hold global configuration settings, such as localization, terminal dimensions, paths to system resources, and more. The shell language provides mechanisms to manage these environment variables, and many are already set by configurable shell scripts when a user first logs in.
Standard Streams: Processes also inherit open files from their parents, and every process expects to be invoked with three files already open: one for reading, called standard in (stdin), and two for writing, called standard out (stdout) and standard error (stderr). Many programs are designed to read input on standard in, process it, and then output the result to standard out. The standard error stream is meant for diagnostic messages such as errors, but also for any other informative user-directed output that needs to be separated from regular output. The shell language provides mechanisms to chain the standard output of one process into the standard input of another (called piping), to open and close these files before running a program (called redirection), and even to feed explicitly provided input in place of an actual file (called a here-doc).
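
The three items above can be seen in a short script. The following is a minimal sketch, assuming a POSIX-compatible shell such as bash; show_args and GREETING are hypothetical names made up for illustration and are not part of any standard tool.

```shell
# show_args is a hypothetical helper: it prints each argument it receives
# in brackets, one per line, so the result of the shell's parsing is visible.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Argument list: quoting keeps "two words" together as a single argument.
args_out=$(show_args one "two words" three)
echo "$args_out"

# Environment variables: a name=value prefix places GREETING (a made-up
# variable) in the environment of the child process only.
env_out=$(GREETING=hello sh -c 'echo "child sees: $GREETING"')
echo "$env_out"

# Redirection: stderr (file descriptor 2) is sent to /dev/null, so only
# the line written to stdout is captured.
redir_out=$(sh -c 'echo "to stdout"; echo "to stderr" >&2' 2>/dev/null)
echo "$redir_out"

# Piping: the stdout of printf becomes the stdin of sort.
pipe_out=$(printf 'pear\napple\n' | sort)
echo "$pipe_out"

# Here-doc: the text up to the EOF marker is supplied to tr on stdin.
heredoc_out=$(tr 'a-z' 'A-Z' <<EOF
typed inline
EOF
)
echo "$heredoc_out"
```

Each result is captured in a variable first only so that it can be inspected afterwards; in everyday use the commands would simply be typed at the prompt.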

Additionally, the shell offers flow control constructs such as conditional statements, switches, and loops, allowing for very powerful shell scripting capabilities.
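
As a small illustration of these constructs, the sketch below combines a switch (case), a loop (for), and a conditional (if). The function name classify and the file names are hypothetical and were chosen only for the example.

```shell
# classify is a hypothetical function: it maps a file name to a short
# description using a case statement (the shell's switch construct).
classify() {
    case "$1" in
        *.c|*.h) echo "C source" ;;
        *.sh)    echo "shell script" ;;
        *)       echo "other" ;;
    esac
}

# Loop over a fixed list of words; the if statement picks a branch per item.
for name in main.c notes.txt build.sh; do
    kind=$(classify "$name")
    if [ "$kind" = "other" ]; then
        echo "$name: skipped"
    else
        echo "$name: $kind"
    fi
done
```

The patterns in the case statement use the same wildcard syntax as pathname expansion, but here they are matched against a string rather than against files on disk.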

These features make the shell an incredibly powerful and flexible way to interact with and manage a system. A lot of the design of the shell went into making it easy to connect multiple small programs together in different ways to perform complex tasks, exemplifying a core Unix philosophy called DOTADIW – Do One Thing, And Do It Well – the idea that each program should perform just one simple task, and that this should form a sort of “tool box” for users and administrators to build things out of.
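
To make the tool box idea concrete, here is a hedged sketch of a pipeline that finds the most frequent word in a list using four small, general-purpose programs. The word list and the variable names words and top are made up for illustration.

```shell
# Sample input, one word per line (illustrative data only).
words='apple
pear
apple
fig
pear
apple'

# sort groups identical lines together, uniq -c counts each group,
# sort -rn orders the counts from largest to smallest, and head keeps
# only the first line.
top=$(printf '%s\n' "$words" | sort | uniq -c | sort -rn | head -n 1)
echo "$top"
```

None of these programs knows anything about word frequencies; the task is accomplished entirely by how the shell connects them.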

Text Editors

Imagine if, instead of a computer screen, your user interface consisted of a keyboard and a printer. This is not too different from the user experience of early computers that used teletype machines. As you can imagine, this didn’t provide much in the way of immediate feedback through screen updates, like we are used to with modern text editors. Even after the development of electronic terminal screens, the refresh rates and data transmission rates of these screens were very limited, so screen updates had to be carefully managed.

Line Editors

As a result, the first text editors on Unix implemented what is called line editing, and the standard line editor, ed, is still shipped with POSIX systems. When the user starts ed, they are presented with a prompt where they can enter commands. The most basic command is the p command, which prints the current line (or a range of lines) to the screen for viewing. The user can then use various other commands to move around the file, insert or delete lines, and perform other editing tasks.

Here are a few basic ed commands to illustrate the functionality of the editor:

n: Print the current line, prefixed with its line number.
a: Append text after the current line (as a new line).
i: Insert text before the current line (as a new line).
d: Delete the current line.
s: Substitute text in the current line (using regular expressions).
c: Change (replace) the current line.

While tedious, this method worked well on early terminals and teletype machines. A user could print out the contents or a range of lines with the n command, and then edit particular lines very easily with enough context, or even the full document, on a literal sheet of paper in front of them as a reference.

A line editing interface can be powerful and efficient for performing text editing tasks, but it can also be challenging to use for beginners, as it requires users to learn a series of commands and to understand how the commands operate on the text file, without actually being able to see much of what they are working on, or receive immediate feedback.

Although line editors are rarely used in most contexts, they still see use when working on embedded devices with limited memory resources, and they are sometimes used as well when working on remote or old machines over extremely slow or high latency connections where interactive editors are nauseatingly slow. Another common use-case is when writing text-processing scripts, since a program like ed can be used to perform many repetitive, sequenced operations that might be difficult using other tools.
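
As a sketch of the scripting use case, the example below drives ed non-interactively by feeding it commands through a here-doc. It uses the s, a, and d commands listed earlier, plus w (write) and q (quit). The file contents and the names demo_file and result are made up for illustration, and because ed is not installed on every system, the script checks for it first.

```shell
# Scratch file created only for this demonstration.
demo_file=$(mktemp)
printf 'alpha\nbeta\ngamma\n' > "$demo_file"

result=""
if command -v ed >/dev/null 2>&1; then
    # -s suppresses the byte counts ed normally prints.
    # The quoted 'EOF' stops the shell from expanding $ in the commands.
    ed -s "$demo_file" <<'EOF'
2s/beta/BETA/
$a
delta
.
1d
w
q
EOF
    result=$(cat "$demo_file")
    echo "$result"
else
    echo "ed is not installed on this system"
fi

# Remove the scratch file.
rm -f "$demo_file"
```

The commands substitute text on line 2, append a new line after the last line (the lone . ends the appended text), delete line 1, and then write the file and quit.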

Visual Editors

Advancements in terminal capabilities ushered in the development of visual, or display, editors that displayed real-time feedback to users as they edited text files. The vi editor is based heavily on its predecessor, ed, but with an added visual interface and full-screen display functionality. vi implemented what is called a modal interface design, where users switch between two modes of operation. The first mode, command mode, presents a very similar interface to the one that ed provided, but with much more powerful commands. The other mode, insert mode, allows direct insertion of text into the document. Unlike the earlier line editors that either removed, replaced, or inserted entire lines at a time, the cursor could be moved to any arbitrary position within a line before inserting or removing text, allowing much more efficient editing.

Compared to modern IDEs and text editors, however, vi is fairly dated. Features like syntax highlighting, text selection, and mouse support are missing from the basic vi program. Like ed, though, it is still regularly used on systems where no other options are available.

Eventually, vi was reimplemented and improved upon, bringing us vim (Vi IMproved). vim added additional modes, syntax highlighting, a powerful scripting language (vimscript), plugins, and much more. It continues to be one of the most popular and widely used text editors, and even serves as a fully-fledged IDE, among systems programmers and system administrators today because of its powerful features, efficient operation, and the fact that it is available almost everywhere.

vim offers several different modes for performing different tasks:

Normal (n) mode: In normal mode, each key (or key combination) on the keyboard is mapped to different functions which can be used to move the cursor around, search the document, change line indentation, record and play back macros, and so on.
Insert (i) mode: Insert mode is used for entering text into a document, and is fairly intuitive to use. Some control-key combinations are recognized to perform special tasks, but otherwise this is the basic text editing functionality of vim, and where authors spend most of their time.
Visual (v) mode: Visual mode is used for selecting text in order to use that selection as the target of a command or function, such as copying and pasting, deleting, indenting, performing a search and replace, and so on. There are a few different visual modes for selecting lines of text, adjacent characters, or rectangular blocks of text.
Select (s) mode: Similar to visual mode, but the selection is immediately replaced with whatever text is typed following selection, rather than acting as the target of a command.
Command (c) mode: Command mode offers an advanced mechanism of executing commands that can’t be performed in normal mode. Command mode is used for performing regular expression operations such as search and replace, opening, saving, and closing files, remapping keys, and more.
Ex (e) mode: Ex mode behaves like command mode, except that the editor stays in Ex mode after each command is executed instead of returning to normal mode, which enables entering multiple commands in sequence. A related feature, the command-line window, opens the command history as an actual window where all of the other modes can be used to edit previous commands and the current command line before it is executed.
Replace (r) mode: Replace mode works similarly to insert mode, except that input directly overwrites existing characters rather than being inserted at the location of the cursor.

Of the different modes, users spend the majority of their time in just three: Normal, Insert, and Command mode. Visual mode is used less frequently for selecting text, while Select, Replace, and Ex mode are rarely used at all.

All key mappings are fully customizable, and many different configuration settings and available plugins allow users to set up their vim environment exactly how they want for whatever tasks they regularly work on. vim is so popular that it is available as a default-installed package on most systems alongside the basic vi and ed utilities. In this class, we will be using a more modern editor in this family called neovim, which began as a fork of vim and was heavily refactored, adding a Lua-based scripting and plugin interface. neovim isn’t as widely pre-installed as vim, particularly on the headless servers and embedded devices that developers have to connect to remotely, but it is extremely popular where it is available, and many developers have already switched to neovim on their personal computers.

Documentation Browsers

Documentation browsers are tools used in Unix-like systems to access and navigate documentation. The “manual pager” (man) command provides brief summaries of commands and utilities, while the GNU Info (info) utility offers a more structured and comprehensive way to organize and present documentation.

GNU Info

GNU Info is a documentation system that provides help for GNU software packages, such as bash, the GNU core utilities (coreutils), the GNU Compiler Collection (gcc), the GNU C library (glibc), and so on. It is an alternative to the traditional Unix man page system that provides a more comprehensive and structured way to organize and present documentation.

The Info system is designed to be navigated through a series of hierarchical menus and nodes, with each node containing information on a specific topic. The top-level menu presents a list of all available Info documents on the system, while individual Info documents are organized into sections and subsections. Each subsection contains a summary of the topic and a list of related nodes, which can be followed to navigate to other related topics.

Info documentation is written in Texinfo, a plain-text markup format that can be converted to Info files as well as to other formats such as HTML or PDF. It can also include hyperlinks, cross-references, and indices to make it easier to navigate and find information. In general, info pages are written in conversational, verbose, free-form prose that is intended as learning material rather than reference material.

The GNU Info system includes a command-line utility called info that allows users to access and navigate the Info documentation on their system. Users can use the info command to search for specific topics, view the table of contents for a document, navigate between different nodes, and more.
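
As a rough illustration of how the two browsers are invoked, the sketch below lists a few common invocations as comments (each opens an interactive viewer, so they are not run here) and then checks which of the two tools are installed. The topics chosen (ls, printf, coreutils, bash) are only examples, and docs_report is a hypothetical variable name.

```shell
# Typical interactive invocations (press q to quit the viewer):
#   man ls               manual page for the ls command
#   man 3 printf         printf from section 3 (library functions)
#   man -k directory     search page summaries for a keyword
#   info coreutils       top node of the GNU coreutils manual
#   info bash            the bash manual as an Info document

# Non-interactive check: report which browsers exist on this system.
docs_report=$(
    for tool in man info; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: available"
        else
            echo "$tool: not installed"
        fi
    done
)
echo "$docs_report"
```

Whether a given topic has an Info document depends on which packages are installed; when none exists, the info command typically falls back to displaying the corresponding man page.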