File I/O

Unlike the standard i/o library, POSIX really only provides two system calls for reading and writing files, read() and write(), both declared in the unistd.h header. These two functions provide a simple, uniform interface for reading and writing data across a variety of file targets. Internally, the kernel contains separate implementations of these system calls for each type of file and each file system, and it delegates incoming read/write requests to the appropriate driver code for each file.

The function signatures are nearly identical,

  • ssize_t read(int fd, void *buf, size_t count);

  • ssize_t write(int fd, void const *buf, size_t count);

Each function transfers up to count bytes between the file specified by fd and the buffer specified by buf, and returns the number of bytes successfully transferred. If an error occurs, each function sets errno to an appropriate value and returns -1.

One important difference between the system calls and their standard i/o counterparts, fread() and fwrite(), is that the standard i/o functions return a short count only when an error or end-of-file occurs. The system calls, on the other hand, can successfully read or write fewer bytes than requested. This may be caused by internal limitations on the amount of data transferred in a single request, interruptions, file system limitations, or other issues.
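A short count is easy to observe with a pipe. The following is a minimal sketch, not a definitive test: the helper name short_read_demo is hypothetical, and it assumes a POSIX system where pipe() is available. It places 5 bytes in a pipe and then asks read() for up to 64 bytes; read() succeeds and reports only the bytes that were actually available.

```c
#include <unistd.h>

/* Hypothetical helper: put 5 bytes into a pipe, then request up to 64.
   Returns whatever read() reported, or -1 if any step failed. */
ssize_t
short_read_demo(void)
{
  int fds[2];
  char buf[64];
  ssize_t n = -1;

  if (pipe(fds) < 0) return -1;
  if (write(fds[1], "hello", 5) == 5) {
    /* 64 bytes requested, but only 5 are waiting in the pipe */
    n = read(fds[0], buf, sizeof buf);
  }
  close(fds[0]);
  close(fds[1]);
  return n;
}
```

The return value here is a successful short count, not an error and not end-of-file, which is why callers of read() and write() must always inspect it.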

Writing Data

When writing data, the return value is always either -1, to indicate an error, or a non-negative integer specifying the number of bytes written (a return value of 0 is normally seen only when count is 0). Since the number written may be fewer than requested, it is necessary to use a loop and keep track of the number of bytes successfully written. This is such a commonplace idiom that programmers often implement their own functions to abstract away this loop,

size_t
write_all(int fd, void const *buf, size_t count)
{
  char const *bytes = buf; /* Standard C has no arithmetic on void pointers */
  size_t offset = 0;
  while (offset < count) {
    ssize_t ret_val = write(fd, bytes + offset, count - offset);
    if (ret_val < 0) return offset; /* Error */
    offset += ret_val;
  }
  return offset;
}

Notice that a short count is returned in the event of an error; errno still holds the value set by the failed write().
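Interruptions were listed earlier as one cause of short transfers. When a signal handler interrupts write() before any data has been transferred, the call fails with errno set to EINTR, and many programs prefer to retry rather than give up. The variant below is a sketch of that policy; the name write_all_retry is hypothetical, and whether retrying is appropriate depends on the application.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical variant of write_all that retries when write() is
   interrupted by a signal before transferring any data. */
size_t
write_all_retry(int fd, void const *buf, size_t count)
{
  char const *bytes = buf; /* char pointer allows byte-wise arithmetic */
  size_t offset = 0;
  while (offset < count) {
    ssize_t ret_val = write(fd, bytes + offset, count - offset);
    if (ret_val < 0) {
      if (errno == EINTR) continue; /* Interrupted; try again */
      return offset;                /* Other error; errno holds the cause */
    }
    offset += (size_t)ret_val;
  }
  return offset;
}
```

As with write_all, a return value smaller than count means an error occurred, and errno describes it.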

Reading Data

Reading is similar to writing, except that, for a non-zero count, the special return value of 0 always unambiguously indicates end-of-file. For regular files, this has an obvious meaning; for other types of files, such as device files, a return value of 0 has an application-specific meaning that is supposed to signify "end of input". For example, an interactive terminal normally blocks on read() until input is available, which occurs when the user enters a newline or presses ctrl-d to manually flush the input buffer. When the user presses ctrl-d on an empty input buffer, the terminal driver allows a blocked read() call to immediately return with a value of 0, essentially emulating an end-of-file condition. This allows generic programs to read input from a variety of file types, because they all implement the same interface at the system call level.
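Because every file type reports end of input the same way, a generic read loop needs no knowledge of what fd refers to. The sketch below illustrates this; the helper name count_bytes and the 4096-byte buffer size are arbitrary choices made for the example. It consumes a descriptor until read() returns 0 and reports how many bytes were seen.

```c
#include <unistd.h>

/* Hypothetical helper: read fd until end-of-file.
   Returns the total number of bytes read, or -1 on error. */
long
count_bytes(int fd)
{
  char buf[4096];
  long total = 0;
  for (;;) {
    ssize_t n = read(fd, buf, sizeof buf);
    if (n < 0) return -1; /* Error; errno is set by read() */
    if (n == 0) break;    /* End of file (or end of input) */
    total += n;           /* Short counts are fine; keep going */
  }
  return total;
}
```

The same function works on a regular file, on a pipe whose write end has been closed, or on a terminal where the user presses ctrl-d on an empty line.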

A similar wrapper can be written for read,

size_t
read_all(int fd, void *buf, size_t count)
{
  char *bytes = buf; /* Standard C has no arithmetic on void pointers */
  size_t offset = 0;
  while (offset < count) {
    ssize_t ret_val = read(fd, bytes + offset, count - offset);
    if (ret_val <= 0) return offset; /* Error or eof */
    offset += ret_val;
  }
  return offset;
}

Notice that, as before, a short count is returned in the event of an error; however, a short count is also returned if end-of-file is reached. How can the caller differentiate these conditions? This can be accomplished by saving the value of errno and then setting it to zero prior to the call. A successful read() leaves errno unchanged, so by checking errno afterwards, it can be determined whether a short count is the result of an error or of end-of-file,

int old_errno = errno;
errno = 0;
size_t n_read = read_all(/*...*/, count);
if (n_read < count) {
  if (errno) {
    /* Error occurred */
  } else {
    /* End of file */
  }
}
errno = old_errno;

When an error condition can be indicated in the return value of a function, this is called in-band signalling, and it is generally preferred when possible since it is faster to check a return value rather than save, check, and restore a global variable like errno. In cases where it is not possible because all possible return values are reserved as valid, as in the above example, out-of-band signalling is used instead. Many library functions and POSIX system calls resort to out-of-band signalling when necessary, generally by setting errno, which is indicated in their respective documentation.
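The standard library function strtol() is a well-known case of out-of-band signalling: on overflow it returns LONG_MAX, which is also a perfectly valid result, so the only way to detect the error is through errno. The sketch below wraps it in a function that reports failure in-band instead. The name parse_long and its return convention (0 for success, -1 for failure) are assumptions made for this example.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical wrapper: converts strtol's out-of-band error report
   into an in-band one. Returns 0 and stores the value in *out on
   success; returns -1 on overflow or if no digits were found. */
int
parse_long(char const *s, long *out)
{
  char *end;
  int old_errno = errno;
  errno = 0;                       /* So a stale value is not mistaken for an error */
  long val = strtol(s, &end, 10);
  int failed = (errno != 0) || (end == s);
  errno = old_errno;               /* Leave the caller's errno untouched */
  if (failed) return -1;
  *out = val;
  return 0;
}
```

A caller can then simply write if (parse_long(text, &n) < 0) and never touch errno, at the cost of passing the result through a pointer parameter.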

Repositioning

Recall the following standard i/o functions:

  • int fseek(FILE *stream, long offset, int whence);

  • long ftell(FILE *stream);

POSIX provides a nearly identical interface through a single function, off_t lseek(int fd, off_t offset, int whence). Aside from taking a file descriptor number rather than a FILE* stream parameter, the major difference is that lseek returns the resulting offset, measured in bytes from the beginning of the file (or -1 on error), combining the features of both fseek and ftell into one function. Both lseek and fseek use the same SEEK_SET, SEEK_CUR, and SEEK_END macros for their whence parameters.
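Since lseek returns the resulting offset, an ftell equivalent is simply a seek of zero bytes from the current position. The sketch below shows that idiom along with a way to find a file's size without losing the current position. The helper names fd_tell and fd_size are hypothetical, and both assume fd refers to a seekable file; on pipes and terminals lseek fails.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical ftell() equivalent: seek nowhere and report the offset.
   Returns -1 on error (for example, if fd is a pipe). */
off_t
fd_tell(int fd)
{
  return lseek(fd, 0, SEEK_CUR);
}

/* Hypothetical helper: report the file size while preserving the
   current offset. Returns -1 on error. */
off_t
fd_size(int fd)
{
  off_t here = lseek(fd, 0, SEEK_CUR);  /* Remember where we are */
  if (here == (off_t)-1) return -1;
  off_t end = lseek(fd, 0, SEEK_END);   /* Offset of end = size in bytes */
  if (end == (off_t)-1) return -1;
  if (lseek(fd, here, SEEK_SET) == (off_t)-1) return -1; /* Go back */
  return end;
}
```

With standard i/o, the same task would take separate fseek and ftell calls for each step.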