7.19.6.2 The fscanf function
Synopsis
#include <stdio.h>
int fscanf(FILE * restrict stream,
const char * restrict format, ...);
Description
The fscanf function reads input from the stream pointed to by stream, under control of the string pointed to by format that specifies the admissible input sequences and how they are to be converted for assignment, using subsequent arguments as pointers to the objects to receive the converted input. If there are insufficient arguments for the format, the behavior is undefined. If the format is exhausted while arguments remain, the excess arguments are evaluated (as always) but are otherwise ignored.
The format shall be a multibyte character sequence, beginning and ending in its initial shift state. The format is composed of zero or more directives: one or more white-space characters, an ordinary multibyte character (neither % nor a white-space character), or a conversion specification. Each conversion specification is introduced by the character %. After the %, the following appear in sequence:
An optional assignment-suppressing character *.
An optional decimal integer greater than zero that specifies the maximum field width (in characters).
An optional length modifier that specifies the size of the receiving object.
A conversion specifier character that specifies the type of conversion to be applied.
The fscanf function executes each directive of the format in turn. If a directive fails, as detailed below, the function returns. Failures are described as input failures (due to the occurrence of an encoding error or the unavailability of input characters), or matching failures (due to inappropriate input).
A directive composed of white-space character(s) is executed by reading input up to the first non-white-space character (which remains unread), or until no more characters can be read.
A directive that is an ordinary multibyte character is executed by reading the next characters of the stream. If any of those characters differ from the ones composing the directive, the directive fails and the differing and subsequent characters remain unread. Similarly, if end-of-file, an encoding error, or a read error prevents a character from being read, the directive fails.
A directive that is a conversion specification defines a set of matching input sequences, as described below for each specifier. A conversion specification is executed in the following steps:
Input white-space characters (as specified by the isspace function) are skipped, unless the specification includes a [, c, or n specifier.[1]
An input item is read from the stream, unless the specification includes an n specifier. An input item is defined as the longest sequence of input characters which does not exceed any specified field width and which is, or is a prefix of, a matching input sequence.[2] The first character, if any, after the input item remains unread. If the length of the input item is zero, the execution of the directive fails; this condition is a matching failure unless end-of-file, an encoding error, or a read error prevented input from the stream, in which case it is an input failure.
Except in the case of a % specifier, the input item (or, in the case of a %n directive, the count of input characters) is converted to a type appropriate to the conversion specifier. If the input item is not a matching sequence, the execution of the directive fails: this condition is a matching failure. Unless assignment suppression was indicated by a *, the result of the conversion is placed in the object pointed to by the first argument following the format argument that has not already received a conversion result. If this object does not have an appropriate type, or if the result of the conversion cannot be represented in the object, the behavior is undefined.
The length modifiers and their meanings are:
- hh
Specifies that a following d, i, o, u, x, X, or n conversion specifier applies to an argument with type pointer to signed char or unsigned char.
- h
Specifies that a following d, i, o, u, x, X, or n conversion specifier applies to an argument with type pointer to short int or unsigned short int.
- l (ell)
Specifies that a following d, i, o, u, x, X, or n conversion specifier applies to an argument with type pointer to long int or unsigned long int; that a following a, A, e, E, f, F, g, or G conversion specifier applies to an argument with type pointer to double; or that a following c, s, or [ conversion specifier applies to an argument with type pointer to wchar_t.
- ll (ell-ell)
Specifies that a following d, i, o, u, x, X, or n conversion specifier applies to an argument with type pointer to long long int or unsigned long long int.
- j
Specifies that a following d, i, o, u, x, X, or n conversion specifier applies to an argument with type pointer to intmax_t or uintmax_t.
- z
Specifies that a following d, i, o, u, x, X, or n conversion specifier applies to an argument with type pointer to size_t or the corresponding signed integer type.
- t
Specifies that a following d, i, o, u, x, X, or n conversion specifier applies to an argument with type pointer to ptrdiff_t or the corresponding unsigned integer type.
- L
Specifies that a following a, A, e, E, f, F, g, or G conversion specifier applies to an argument with type pointer to long double.
If a length modifier appears with any conversion specifier other than as specified above, the behavior is undefined.
The conversion specifiers and their meanings are:
- d
Matches an optionally signed decimal integer, whose format is the same as expected for the subject sequence of the strtol function with the value 10 for the base argument. The corresponding argument shall be a pointer to signed integer.
- i
Matches an optionally signed integer, whose format is the same as expected for the subject sequence of the strtol function with the value 0 for the base argument. The corresponding argument shall be a pointer to signed integer.
- o
Matches an optionally signed octal integer, whose format is the same as expected for the subject sequence of the strtoul function with the value 8 for the base argument. The corresponding argument shall be a pointer to unsigned integer.
- u
Matches an optionally signed decimal integer, whose format is the same as expected for the subject sequence of the strtoul function with the value 10 for the base argument. The corresponding argument shall be a pointer to unsigned integer.
- x
Matches an optionally signed hexadecimal integer, whose format is the same as expected for the subject sequence of the strtoul function with the value 16 for the base argument. The corresponding argument shall be a pointer to unsigned integer.
- a,e,f,g
Matches an optionally signed floating-point number, infinity, or NaN, whose format is the same as expected for the subject sequence of the strtod function. The corresponding argument shall be a pointer to floating.
- c
Matches a sequence of characters of exactly the number specified by the field width (1 if no field width is present in the directive).[3] If no l length modifier is present, the corresponding argument shall be a pointer to the initial element of a character array large enough to accept the sequence. No null character is added. If an l length modifier is present, the input shall be a sequence of multibyte characters that begins in the initial shift state. Each multibyte character in the sequence is converted to a wide character as if by a call to the mbrtowc function, with the conversion state described by an mbstate_t object initialized to zero before the first multibyte character is converted. The corresponding argument shall be a pointer to the initial element of an array of wchar_t large enough to accept the resulting sequence of wide characters. No null wide character is added.
- s
Matches a sequence of non-white-space characters.[3] If no l length modifier is present, the corresponding argument shall be a pointer to the initial element of a character array large enough to accept the sequence and a terminating null character, which will be added automatically. If an l length modifier is present, the input shall be a sequence of multibyte characters that begins in the initial shift state. Each multibyte character is converted to a wide character as if by a call to the mbrtowc function, with the conversion state described by an mbstate_t object initialized to zero before the first multibyte character is converted. The corresponding argument shall be a pointer to the initial element of an array of wchar_t large enough to accept the sequence and the terminating null wide character, which will be added automatically.
- [
Matches a nonempty sequence of characters from a set of expected characters (the scanset).[3] If no l length modifier is present, the corresponding argument shall be a pointer to the initial element of a character array large enough to accept the sequence and a terminating null character, which will be added automatically. If an l length modifier is present, the input shall be a sequence of multibyte characters that begins in the initial shift state. Each multibyte character is converted to a wide character as if by a call to the mbrtowc function, with the conversion state described by an mbstate_t object initialized to zero before the first multibyte character is converted. The corresponding argument shall be a pointer to the initial element of an array of wchar_t large enough to accept the sequence and the terminating null wide character, which will be added automatically. The conversion specifier includes all subsequent characters in the format string, up to and including the matching right bracket (]). The characters between the brackets (the scanlist) compose the scanset, unless the character after the left bracket is a circumflex (^), in which case the scanset contains all characters that do not appear in the scanlist between the circumflex and the right bracket. If the conversion specifier begins with [] or ⌈], the right bracket character is in the scanlist and the next following right bracket character is the matching right bracket that ends the specification; otherwise the first following right bracket character is the one that ends the specification. If a - character is in the scanlist and is not the first, nor the second where the first character is a ^, nor the last character, the behavior is implementation-defined.
- p
Matches an implementation-defined set of sequences, which should be the same as the set of sequences that may be produced by the %p conversion of the fprintf function. The corresponding argument shall be a pointer to a pointer to void. The input item is converted to a pointer value in an implementation-defined manner. If the input item is a value converted earlier during the same program execution, the pointer that results shall compare equal to that value; otherwise the behavior of the %p conversion is undefined.
- n
No input is consumed. The corresponding argument shall be a pointer to signed integer into which is to be written the number of characters read from the input stream so far by this call to the fscanf function. Execution of a %n directive does not increment the assignment count returned at the completion of execution of the fscanf function. No argument is converted, but one is consumed. If the conversion specification includes an assignment- suppressing character or a field width, the behavior is undefined.
- %
Matches a single % character; no conversion or assignment occurs. The complete conversion specification shall be %%.
If a conversion specification is invalid, the behavior is undefined.[4]
The conversion specifiers A, E, F, G, and X are also valid and behave the same as, respectively, a, e, f, g, and x.
Trailing white space (including new-line characters) is left unread unless matched by a directive. The success of literal matches and suppressed assignments is not directly determinable other than via the %n directive.
Returns
The fscanf function returns the value of the macro EOF if an input failure occurs before any conversion. Otherwise, the function returns the number of input items assigned, which can be fewer than provided for, or even zero, in the event of an early matching failure.
EXAMPLE 1
The call:
#include <stdio.h>
/* ... */
int n, i; float x; char name[50];
n = fscanf(stdin, "%d%f%s", &i, &x, name);
with the input line:
25 54.32E-1 thompson
will assign to n the value 3, to i the value 25, to x the value 5.432, and to name the sequence thompson\0.
EXAMPLE 2
The call:
#include <stdio.h>
/* ... */
int i; float x; char name[50];
fscanf(stdin, "%2d%f%*d %[0123456789]", &i, &x, name);
with input:
56789 0123 56a72
will assign to i the value 56 and to x the value 789.0, will skip 0123, and will assign to name the sequence 56\0. The next character read from the input stream will be a.
EXAMPLE 3
To accept repeatedly from stdin a quantity, a unit of measure, and an item name:
#include <stdio.h>
/* ... */
int count; float quant; char units[21], item[21];
do {
count = fscanf(stdin, "%f%20s of %20s", &quant, units, item);
fscanf(stdin,"%*⌈\n]");
} while (!feof(stdin) && !ferror(stdin));
If the stdin stream contains the following lines:
2 quarts of oil
-12.8degrees Celsius
lots of luck
10.0LBS of
dirt
100ergs of energy
the execution of the above example will be analogous to the following assignments:
quant = 2; strcpy(units, "quarts"); strcpy(item, "oil");
count = 3;
quant = -12.8; strcpy(units, "degrees");
count = 2; // "C" fails to match "o"
count = 0; // "l" fails to match "%f"
quant = 10.0; strcpy(units, "LBS"); strcpy(item, "dirt");
count = 3;
count = 0; // "100e" fails to match "%f"
count = EOF;
EXAMPLE 4
In:
#include <stdio.h>
/* ... */
int d1, d2, n1, n2, i;
i = sscanf("123", "%d%n%n%d", &d1, &n1, &n2, &d2);
the value 123 is assigned to d1 and the value 3 to n1. Because %n can never get an input failure the value of 3 is also assigned to n2. The value of d2 is not affected. The value 1 is assigned to i.
EXAMPLE 5
In these examples, multibyte characters do have a state-dependent encoding, and the members of the extended character set that consist of more than one byte each consist of exactly two bytes, the first of which is denoted here by a and the second by an uppercase letter, but are only recognized as such when in the alternate shift state. The shift sequences are denoted by (uparrow) and (downarrow), in which the first causes entry into the alternate shift state.
After the call:
#include <stdio.h>
/* ... */
char str[50];
fscanf(stdin, "a%s", str);
with the input line:
a(uparrow) X Y(downarrow) bc
str will contain (uparrow) X Y(downarrow)\0 assuming that none of the bytes of the shift sequences (or of the multibyte characters, in the more general case) appears to be a single-byte white-space character.
In contrast, after the call:
#include <stdio.h>
#include <stddef.h>
/* ... */
wchar_t wstr[50];
fscanf(stdin, "a%ls", wstr);
with the same input line, wstr will contain the two wide characters that correspond to X and Y and a terminating null wide character.
However, the call:
#include <stdio.h>
#include <stddef.h>
/* ... */
wchar_t wstr[50];
fscanf(stdin, "a(uparrow) X(downarrow)%ls", wstr);
with the same input line will return zero due to a matching failure against the (downarrow) sequence in the format string.
Assuming that the first byte of the multibyte character X is the same as the first byte of the multibyte character Y, after the call:
#include <stdio.h>
#include <stddef.h>
/* ... */
wchar_t wstr[50];
fscanf(stdin, "a(uparrow) Y(downarrow)%ls", wstr);
with the same input line, zero will again be returned, but stdin will be left with a partially consumed multibyte character.
Forward References
Footnotes