2006/12/18

raw string

Raw string literals are parsed in exactly the same way as ordinary string literals; it's just the conversion from string literal to string object that's different. This means that all string literals must end with an even number of backslashes; otherwise, the unpaired backslash at the end escapes the closing quote character, leaving an unterminated string.

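Raw strings were designed to ease creating input for processors (chiefly regular expression engines) that want to do their own backslash escape processing. Such processors consider an unmatched trailing backslash to be an error anyway, so raw strings disallow that. In return, they allow you to pass on the string quote character by escaping it with a backslash; note that the backslash itself stays in the resulting string.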
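A minimal sketch of these two rules, for illustration only. The variable names (`two_backslashes`, `escaped_quote`, `windows_dir`) and the sample path are made up for this example, and the concatenation at the end is just one common workaround for a single trailing backslash, not the only one.

```python
# Raw strings keep every backslash, but a backslash still decides whether
# the following quote character ends the literal.
two_backslashes = r'sss\\'      # even number of trailing backslashes: valid
escaped_quote = r'it\'s'        # the quote is kept, and so is the backslash

print(len(two_backslashes))     # counts: s, s, s, \, \
print(len(escaped_quote))       # counts: i, t, \, ', s

# Hypothetical path: a raw string cannot end in a single backslash, so the
# last backslash is appended from an ordinary (non-raw) literal.
windows_dir = r'C:\temp' + '\\'
print(windows_dir)
```

Both `len()` calls report 5 characters, which shows that nothing is dropped from a raw string, not even the backslash that escapes the quote.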

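Regular expressions also use the backslash as their escape character, which collides with Python's use of the same character in string literals. For example, to match a literal backslash, one might have to write '\\\\' as the pattern string, because the regular expression must be "\\", and each backslash must be expressed as "\\" inside a regular Python string literal.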

The solution is to use Python's raw string notation for regular expression patterns; backslashes are not handled in any special way in a string literal prefixed with "r". So r"\n" is a two-character string containing "\" and "n", while "\n" is a one-character string containing a newline.
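To make the regular expression case concrete, here is a small sketch using the standard `re` module. The sample `text` and the variable names are assumptions chosen for this example; it only shows that the four-backslash ordinary literal and the two-backslash raw literal are the same pattern.

```python
import re

text = 'C:\\temp'               # hypothetical sample: 7 characters, one backslash

# Both literals produce the same two-character regular expression: \\
plain_pattern = '\\\\'
raw_pattern = r'\\'
print(plain_pattern == raw_pattern)

match = re.search(raw_pattern, text)
print(match.group())            # the single backslash that was matched
print(match.start())            # index of that backslash in text
```

The raw form is easier to read because what you type is what the regular expression engine receives.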

>>> r'ss\n'
'ss\\n'
>>> 'ss\n'
'ss\n'
>>> 'sss\'
File "", line 1
'sss\'
^
SyntaxError: EOL while scanning single-quoted string

>>> r'sss\'
File "", line 1
r'sss\'
^
SyntaxError: EOL while scanning single-quoted string
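The same check can be done from a script instead of the interactive prompt. This is only a sketch: `is_valid_literal` is a helper invented for this example, built on the built-in `compile()`. It catches `SyntaxError` without looking at the message, because the exact wording differs between Python versions.

```python
def is_valid_literal(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, '<example>', 'eval')
        return True
    except SyntaxError:
        return False

backslash = chr(92)                      # avoids escaping confusion in this demo

odd = "r'sss" + backslash + "'"          # the source text  r'sss\'
even = "r'sss" + backslash * 2 + "'"     # the source text  r'sss\\'

print(is_valid_literal(odd))             # odd number of trailing backslashes
print(is_valid_literal(even))            # even number of trailing backslashes
```

The first call reports that the literal does not compile and the second that it does, which matches the rule that a raw string must end with an even number of backslashes.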
