6.4.5 String literals
Syntax
string-literal:
" s-char-sequence_opt "
L" s-char-sequence_opt "
s-char-sequence:
s-char
s-char-sequence s-char
s-char:
any member of the source character set except the double-quote ", backslash \, or new-line character
escape-sequence
Description
A character string literal is a sequence of zero or more multibyte characters enclosed in double-quotes, as in "xyz". A wide string literal is the same, except prefixed by the letter L.
The same considerations apply to each element of the sequence in a character string literal or a wide string literal as if it were in an integer character constant or a wide character constant, except that the single-quote ' is representable either by itself or by the escape sequence \', but the double-quote " shall be represented by the escape sequence \".
Semantics
In translation phase 6, the multibyte character sequences specified by any sequence of adjacent character and wide string literal tokens are concatenated into a single multibyte character sequence. If any of the tokens are wide string literal tokens, the resulting multibyte character sequence is treated as a wide string literal; otherwise, it is treated as a character string literal.
In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals.[1] The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence; for wide string literals, the array elements have type wchar_t, and are initialized with the sequence of wide characters corresponding to the multibyte character sequence, as defined by the mbstowcs function with an implementation-defined current locale. The value of a string literal containing a multibyte character or escape sequence not represented in the execution character set is implementation-defined.
It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
EXAMPLE
This pair of adjacent character string literals
"\x12" "3"
produces a single character string literal containing the two characters whose values are '\x12' and '3', because escape sequences are converted into single members of the execution character set just prior to adjacent string literal concatenation.
Forward References
Footnotes