Literals are notations for constant values of some built-in types.
String literals are described by the following lexical definitions:
stringliteral   ::= [stringprefix](shortstring | longstring)
stringprefix    ::= "r" | "u" | "ur" | "R" | "U" | "UR" | "Ur" | "uR"
shortstring     ::= "'" shortstringitem* "'" | '"' shortstringitem* '"'
longstring      ::= "'''" longstringitem* "'''" | '"""' longstringitem* '"""'
shortstringitem ::= shortstringchar | escapeseq
longstringitem  ::= longstringchar | escapeseq
shortstringchar ::= <any ASCII character except "\" or newline or the quote>
longstringchar  ::= <any ASCII character except "\">
escapeseq       ::= "\" <any ASCII character>
One syntactic restriction not indicated by these productions is that whitespace is not allowed between the stringprefix and the rest of the string literal.
In plain English: String literals can be enclosed in matching
single quotes (') or double quotes (").
They can also be enclosed in matching groups of three single or
double quotes (these are generally referred to as triple-quoted
strings). The backslash (\) character is used to
escape characters that otherwise have a special meaning, such as
newline, backslash itself, or the quote character. String literals
may optionally be prefixed with a letter "r" or "R"; such strings
are called raw strings
and use different rules for interpreting backslash
escape sequences. A prefix of "u" or
"U" makes the string a Unicode string.
Unicode strings use the Unicode character set as defined by the
Unicode Consortium and ISO 10646. Some additional escape
sequences, described below, are available in Unicode strings. The
two prefix characters may be combined; in this case, "u" must appear before "r".
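As a rough illustration of the quoting and prefix rules above, the following sketch writes the same text with each quoting style and contrasts a raw string with an ordinary one. The variable names are illustrative only. It is written to run under Python 3, where the "u" prefix is still accepted but redundant and the combined "ur" prefix is no longer valid, so only single prefixes are shown.

```python
# The same text written with each quoting style; all four literals are equal.
single = 'spam'
double = "spam"
triple_single = '''spam'''
triple_double = """spam"""

# A backslash escapes the quote character that would otherwise end the literal.
escaped_quote = 'it\'s'

# With an "r" prefix the backslash is kept in the resulting string.
raw = r'\t'      # two characters: a backslash and "t"
cooked = '\t'    # one character: a horizontal tab

# A "u" prefix marks a Unicode string (redundant in Python 3,
# where every str is already a Unicode string).
unicode_text = u'spam'
```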
In triple-quoted strings, unescaped newlines and quotes are
allowed (and are retained), except that three unescaped quotes in a
row terminate the string. (A ``quote'' is the character used to
open the string, i.e. either ' or ".)
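A small sketch of the triple-quoted rule, with illustrative variable names: newlines and lone quote characters are retained, and escaping one quote keeps a run of three from terminating the string.

```python
# Newlines and single quote characters inside a triple-quoted
# string are kept exactly as written.
text = """first "quoted" line
second line"""

lines = text.split("\n")

# Escaping one quote stops three quotes in a row from ending the string early.
three_quotes = """a\"""b"""
```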
Unless an "r" or "R" prefix is present, escape sequences in strings are interpreted according to rules similar to those used by Standard C. The recognized escape sequences are:
| Escape Sequence | Meaning | Notes |
|---|---|---|
| \newline | Ignored | |
| \\ | Backslash (\) | |
| \' | Single quote (') | |
| \" | Double quote (") | |
| \a | ASCII Bell (BEL) | |
| \b | ASCII Backspace (BS) | |
| \f | ASCII Formfeed (FF) | |
| \n | ASCII Linefeed (LF) | |
| \N{name} | Character named name in the Unicode database (Unicode only) | |
| \r | ASCII Carriage Return (CR) | |
| \t | ASCII Horizontal Tab (TAB) | |
| \uxxxx | Character with 16-bit hex value xxxx (Unicode only) | (1) |
| \Uxxxxxxxx | Character with 32-bit hex value xxxxxxxx (Unicode only) | (2) |
| \v | ASCII Vertical Tab (VT) | |
| \ooo | ASCII character with octal value ooo | (3) |
| \xhh | ASCII character with hex value hh | (4) |
Notes:
Unlike Standard C, all unrecognized escape sequences are left in the string unchanged, i.e., the backslash is left in the string. (This behavior is useful when debugging: if an escape sequence is mistyped, the resulting output is more easily recognized as broken.) It is also important to note that the escape sequences marked as ``(Unicode only)'' in the table above fall into the category of unrecognized escapes for non-Unicode string literals.
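The escape rules can be checked with a short sketch such as the one below; the variable names are illustrative. One assumption to flag: newer Python 3 interpreters still leave an unrecognized escape such as \d in the string but emit a warning for it (and may reject it in a future release), so that literal is evaluated from source text with warnings silenced instead of being written inline.

```python
import ast
import warnings

# Recognized escapes each produce a single character.
newline = "\n"
tab = "\t"
from_octal = "\101"   # octal 101 is 65, the letter A
from_hex = "\x41"     # hex 41 is 65, the letter A
backslash = "\\"

# A backslash followed by a newline is ignored,
# so the literal simply continues on the next line.
joined = "abc\
def"

# "\d" is not a recognized escape, so the backslash stays in the string.
# The literal is evaluated from source text so the warning that newer
# interpreters emit for it can be silenced in one place.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    unrecognized = ast.literal_eval(r'"\d"')
```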
When an "r" or "R" prefix is present, a character following a
backslash is included in the string without change, and all
backslashes are left in the string. For example, the string
literal r"\n" consists of two characters: a backslash
and a lowercase "n". String quotes can
be escaped with a backslash, but the backslash remains in the
string; for example, r"\"" is a valid string literal
consisting of two characters: a backslash and a double quote;
r"\" is not a valid string literal (even a raw string
cannot end in an odd number of backslashes). Specifically, a raw
string cannot end in a single backslash (since the backslash
would escape the following quote character). Note also that a
single backslash followed by a newline is interpreted as those two
characters as part of the string, not as a line
continuation.
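The raw-string cases described above can be sketched as follows. The helper `is_valid_literal` is hypothetical, introduced here only to show that a raw string ending in a single backslash does not compile.

```python
# r"\n" keeps both characters: a backslash and a lowercase "n".
two_chars = r"\n"

# The quote is escaped, but the backslash stays in the string.
backslash_quote = r"\""

# A backslash before a newline is kept as two characters,
# not treated as a line continuation.
kept = r"""a\
b"""


def is_valid_literal(source):
    """Return True if `source` compiles as a Python expression (hypothetical helper)."""
    try:
        compile(source, "<literal>", "eval")
    except SyntaxError:
        return False
    return True


# The source text r"\" has no closing quote, because the backslash escapes it.
single_backslash_ok = is_valid_literal('r"\\"')
# The source text r"\\" ends in an even number of backslashes and is accepted.
even_backslashes_ok = is_valid_literal('r"\\\\"')
```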
When an "r" or "R" prefix is used in conjunction with a
"u" or "U"
prefix, then the \uXXXX escape sequence is processed
while all other backslashes are left in the string. For
example, the string literal ur"\u0062\n" consists of
three Unicode characters: `LATIN SMALL LETTER B', `REVERSE
SOLIDUS', and `LATIN SMALL LETTER N'. Backslashes can be escaped
with a preceding backslash; however, both remain in the string. As
a result, \uXXXX escape sequences are only recognized
when there are an odd number of backslashes.
Multiple adjacent string literals (delimited by whitespace),
possibly using different quoting conventions, are allowed, and
their meaning is the same as their concatenation. Thus,
"hello" 'world' is equivalent to
"helloworld". This feature can be used to reduce the
number of backslashes needed, to split long strings conveniently
across long lines, or even to add comments to parts of strings, for
example:
re.compile("[A-Za-z_]" # letter or underscore
"[A-Za-z0-9_]*" # letter, digit or underscore
)
Note that this feature is defined at the syntactical level, but implemented at compile time. The `+' operator must be used to concatenate string expressions at run time. Also note that literal concatenation can use different quoting styles for each component (even mixing raw strings and triple quoted strings).
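A minimal runnable sketch of literal concatenation, reusing the pattern from the example above. The variable names are illustrative, and the `import re` line is added so the snippet stands on its own.

```python
import re

# Adjacent literals are joined by the compiler, whatever quoting each one uses.
greeting = "hello" 'world'
mixed = r"\d+" """ items"""   # a raw string next to a triple-quoted string

# Comments can sit between the pieces of a long pattern.
identifier = re.compile("[A-Za-z_]"       # letter or underscore
                        "[A-Za-z0-9_]*"   # letter, digit or underscore
                        )

# Values that exist only at run time need the + operator instead.
prefix = "hello"
at_runtime = prefix + "world"
```

Because the joining happens before the program runs, `identifier.pattern` is a single string, exactly as if the pattern had been written as one literal.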
There are four types of numeric literals: plain integers, long integers, floating point numbers, and imaginary numbers. There are no complex literals (complex numbers can be formed by adding a real number and an imaginary number).
Note that numeric literals do not include a sign; a phrase like
-1 is actually an expression composed of the unary
operator `-' and the literal 1.
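One way to see that the sign is not part of the literal is to inspect the parsed expression with the standard `ast` module, as in this sketch (variable names are illustrative). It assumes Python 3, which no longer has a separate long integer type, so only the sign and complex-number points are shown.

```python
import ast

# Parse the text "-1" and inspect the expression tree without evaluating it.
tree = ast.parse("-1", mode="eval").body
is_unary_minus = isinstance(tree, ast.UnaryOp) and isinstance(tree.op, ast.USub)
operand = ast.literal_eval(tree.operand)   # the literal the operator applies to

# A complex value is formed by adding a real number and an imaginary literal.
z = 3.0 + 4j
```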