Character i/o

Character i/o reads and writes individual bytes, one at a time, and can be used with either text or binary data.

Input

Individual bytes may be read from an arbitrary stream by the int fgetc(FILE *stream) function, which returns the next byte as an unsigned char converted to int. The wider return type exists so that the special EOF (end-of-file) indicator, which is returned both at end of input and on a read error, cannot be confused with a valid byte value; EOF is a macro that expands to a negative value (typically -1).

A common programming bug is to store the result of fgetc in a character type, which is a narrowing conversion. Consider the following code,

T c = fgetc(stdin);
if (c < 0) {
  /* End of input or error */
}

Suppose that,

  • T is type unsigned char: If fgetc returns EOF, the negative value will be converted to a valid, non-negative byte value (UCHAR_MAX, typically 255, when EOF is -1). The if statement will never detect errors or end of input.

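  • T is type signed char: If fgetc returns a value greater than SCHAR_MAX, which is the case for half of the possible byte values, then the result of the conversion is implementation-defined, or an implementation-defined signal is raised. On common two's complement implementations the value simply wraps, so the valid byte 255 becomes -1 and is mistaken for EOF.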

  • T is type char: Either of the above situations will occur, depending on whether char is signed or unsigned.
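The first case can be illustrated with a minimal sketch. The helper names detects_end_unsigned and detects_end_int are hypothetical and exist only for this illustration; each receives a value as fgetc would return it and applies the c < 0 test from the example above.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: stores an fgetc-style return value in an
   unsigned char before testing it, as in the buggy pattern above. */
static int detects_end_unsigned(int ret)
{
  unsigned char c = (unsigned char)ret; /* EOF wraps to a non-negative value */
  int widened = c;                      /* promotion keeps it non-negative */
  return widened < 0;                   /* never true */
}

/* Hypothetical helper: keeps the value in an int, so EOF stays negative. */
static int detects_end_int(int ret)
{
  /* fgetc returns either EOF or a value in the unsigned char range. */
  assert(ret == EOF || (ret >= 0 && ret <= UCHAR_MAX));
  return ret < 0;
}
```

Passing EOF to the first helper yields 0, because the negative value was lost in the conversion; the second helper reports it correctly.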

Therefore, it is very important to always use an int to store the return value of fgetc. If the return value is not EOF, then it is safe to convert it to an unsigned char if desired,

int ret = fgetc(stdin);
if (ret < 0) {
  /* End of input or error */
} else {
  unsigned char c = ret; /* Ok */
}

Tip

Notice that the comparison used is ret < 0, rather than ret == EOF. Both are correct, because fgetc returns either EOF, which is negative, or a non-negative byte value. On some architectures a comparison against zero encodes slightly more compactly than a comparison against another constant, since the latter needs the constant as an immediate operand; on AMD64, for example, a compiler can typically use a two-byte test instruction for the former and a three-byte cmp instruction for the latter. The saving is negligible next to the cost of the i/o itself, and compilers are free to choose either encoding, so the choice is mostly a matter of style; ret == EOF is the more common idiom.

The int getchar(void); function is equivalent to calling getc with stdin as its argument. It reads from the same buffered stream, so any difference in speed compared to fgetc(stdin) depends on the implementation.

The int getc(FILE *stream); function is equivalent to fgetc, except that, if it is implemented as a macro, it may evaluate stream more than once, so its stream argument should never be an expression with side effects. In many modern implementations the two behave identically, which leaves the distinction largely a historical artefact.
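As a minimal sketch of a typical read loop, the following counts the bytes remaining in a stream with getc. The function name count_bytes and its -1 error convention are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts the bytes remaining in `stream`.
   Returns the count, or -1 if reading stopped because of an error. */
static long count_bytes(FILE *stream)
{
  long n = 0;
  int c; /* int, not char, so that EOF stays distinguishable */

  assert(stream != NULL);

  /* `stream` is a plain variable without side effects, so it is safe
     even if getc is a macro that evaluates its argument repeatedly. */
  while ((c = getc(stream)) != EOF) {
    n++;
  }

  /* EOF alone does not say why reading stopped; ferror tells the cases apart. */
  return ferror(stream) ? -1 : n;
}
```

For example, count_bytes(stdin) would consume standard input to the end and report how many bytes were read.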

Peeking

The int ungetc(int c, FILE *stream); function may be used to push back a character to the specified stream so that it may be subsequently read; the character is pushed back to an internal buffer and does not affect actual external storage. On success it returns the character pushed back. It is guaranteed that one character may be pushed back; pushing back additional characters may be supported, but the call may also fail and return EOF. Passing EOF as c always fails and leaves the stream unchanged. A successful call to a file positioning function (fseek, fsetpos, or rewind) discards any pushed-back characters that haven’t been re-read yet. This function is typically used for look-ahead parsers; for example,

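/* Returns the next byte of stream without consuming it, or EOF. */
int peekc(FILE *stream)
{
  int c = fgetc(stream);
  if (c != EOF) {
    ungetc(c, stream); /* push the byte back so the next read sees it again */
  }
  return c;
}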
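A look-ahead parser usually reads until it meets a byte that does not belong to the current token, then pushes that byte back. The following is a minimal sketch under that assumption; the function name read_digits and its -1 "no digits" convention are hypothetical, and overflow is not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: reads a run of decimal digits from `stream` and
   returns its value, or -1 if the next byte is not a digit. The first
   byte that is not a digit is pushed back for the next reader. */
static long read_digits(FILE *stream)
{
  long value = 0;
  int seen = 0;
  int c;

  assert(stream != NULL);

  /* c is in the unsigned char range here, as isdigit requires. */
  while ((c = fgetc(stream)) != EOF && isdigit(c)) {
    value = value * 10 + (c - '0');
    seen = 1;
  }

  if (c != EOF) {
    ungetc(c, stream); /* a single push-back is guaranteed to be supported */
  }
  return seen ? value : -1;
}
```

Because only one byte is ever pushed back between reads, the sketch stays within what ungetc guarantees.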

Output

The int fputc(int c, FILE *stream), int putchar(int c), and int putc(int c, FILE *stream) functions are the output counterparts of the fgetc, getchar, and getc input functions, respectively. Each writes c, converted to unsigned char, to its stream; putchar is equivalent to putc with stdout as its stream argument, and putc may be a macro that evaluates stream more than once. The return value is the character written, or EOF on error.
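The input and output functions combine naturally into a byte-by-byte copy. The following is a minimal sketch; the function name copy_stream and its 0 / -1 return convention are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copies every remaining byte from `in` to `out`.
   Returns 0 on success, or -1 if a read or a write fails. */
static int copy_stream(FILE *in, FILE *out)
{
  int c; /* int, so that every byte value and EOF are distinct */

  assert(in != NULL && out != NULL);

  while ((c = fgetc(in)) != EOF) {
    if (fputc(c, out) == EOF) {
      return -1; /* write error */
    }
  }

  /* Reading stopped: distinguish a read error from a normal end of input. */
  return ferror(in) ? -1 : 0;
}
```

Since c is kept in an int, a byte with value 255 is copied like any other and is not mistaken for the end of input.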