

2.4 Literals

Literals are notations for constant values of some built-in types.



2.4.1 String literals

String literals are described by the following lexical definitions:

stringliteral  ::=  [stringprefix](shortstring | longstring)
stringprefix  ::=  "r" | "u" | "ur" | "R" | "U" | "UR" | "Ur" | "uR"
shortstring  ::=  "'" shortstringitem* "'" | '"' shortstringitem* '"'
longstring  ::=  "'''" longstringitem* "'''"
    | '"""' longstringitem* '"""'
shortstringitem  ::=  shortstringchar | escapeseq
longstringitem  ::=  longstringchar | escapeseq
shortstringchar  ::=  <any ASCII character except "\" or newline or the quote>
longstringchar  ::=  <any ASCII character except "\">
escapeseq  ::=  "\" <any ASCII character>

One syntactic restriction not indicated by these productions is that whitespace is not allowed between the stringprefix and the rest of the string literal.

In plain English: String literals can be enclosed in matching single quotes (') or double quotes ("). They can also be enclosed in matching groups of three single or double quotes (these are generally referred to as triple-quoted strings). The backslash (\) character is used to escape characters that otherwise have a special meaning, such as newline, backslash itself, or the quote character. String literals may optionally be prefixed with a letter "r" or "R"; such strings are called raw strings and use different rules for interpreting backslash escape sequences. A prefix of "u" or "U" makes the string a Unicode string. Unicode strings use the Unicode character set as defined by the Unicode Consortium and ISO 10646. Some additional escape sequences, described below, are available in Unicode strings. The two prefix characters may be combined; in this case, "u" must appear before "r".
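The quoting styles and the raw prefix described above can be compared directly. The following is only a minimal sketch with illustrative variable names; the "u" prefix is left out because whether it is accepted depends on the Python version in use.

```python
# The same text written with three different quoting styles.
single = 'spam'
double = "spam"
triple = '''spam'''

# A backslash escapes the quote character that would otherwise end the literal.
escaped_quote = 'it\'s'
other_quote = "it's"

# Without a prefix, \t is one TAB character; with the "r" prefix the
# backslash and the "t" both stay in the string.
cooked = "a\tb"   # a, TAB, b
raw = r"a\tb"     # a, backslash, t, b

print(single == double == triple)    # True
print(escaped_quote == other_quote)  # True
print(len(cooked), len(raw))         # 3 4
```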

In triple-quoted strings, unescaped newlines and quotes are allowed (and are retained), except that three unescaped quotes in a row terminate the string. (A "quote" is the character used to open the string, i.e. either ' or ".)
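As a small illustration of the rule above, the sketch below (names are illustrative) puts an unescaped newline and unescaped double quotes inside a triple-quoted string and checks that both are retained.

```python
# The literal spans two source lines; the newline between them and the
# double quotes around "line" are kept in the resulting string.
text = """first "line"
second line"""

lines = text.split("\n")
print(lines)             # ['first "line"', 'second line']
print(text.count("\n"))  # 1
```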

Unless an "r" or "R" prefix is present, escape sequences in strings are interpreted according to rules similar to those used by Standard C. The recognized escape sequences are:

Escape Sequence  Meaning                                                      Notes
\newline         Ignored
\\               Backslash (\)
\'               Single quote (')
\"               Double quote (")
\a               ASCII Bell (BEL)
\b               ASCII Backspace (BS)
\f               ASCII Formfeed (FF)
\n               ASCII Linefeed (LF)
\N{name}         Character named name in the Unicode database (Unicode only)
\r               ASCII Carriage Return (CR)
\t               ASCII Horizontal Tab (TAB)
\uxxxx           Character with 16-bit hex value xxxx (Unicode only)          (1)
\Uxxxxxxxx       Character with 32-bit hex value xxxxxxxx (Unicode only)      (2)
\v               ASCII Vertical Tab (VT)
\ooo             ASCII character with octal value ooo                         (3)
\xhh             ASCII character with hex value hh                            (4)

Notes:

(1) Individual code units which form parts of a surrogate pair can be encoded using this escape sequence.
(2) Any Unicode character can be encoded this way, but characters outside the Basic Multilingual Plane (BMP) will be encoded using a surrogate pair if Python is compiled to use 16-bit code units (the default). Individual code units which form parts of a surrogate pair can be encoded using this escape sequence.
(3) As in Standard C, up to three octal digits are accepted.
(4) Unlike in Standard C, at most two hex digits are accepted.
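A few of the escape sequences in the table, together with notes (3) and (4), can be checked with a short sketch. The variable names are illustrative, and the escapes marked "(Unicode only)" are deliberately left out because their treatment in an unprefixed literal depends on the Python version.

```python
bell = "\a"       # ASCII Bell, code 7
newline = "\n"    # a single Linefeed character

octal_a = "\101"  # octal 101 is 65, the letter A
hex_a = "\x41"    # hex 41 is 65, the letter A

# Note (3): up to three octal digits are read, so the fourth digit is literal text.
octal_then_digit = "\1011"
# Note (4): at most two hex digits are read, so the third digit is literal text.
hex_then_digit = "\x411"

# A backslash at the end of a source line is ignored together with the newline.
joined = "abc\
def"

print(ord(bell), len(newline))           # 7 1
print(octal_a, hex_a)                    # A A
print(octal_then_digit, hex_then_digit)  # A1 A1
print(joined)                            # abcdef
```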

Unlike Standard C, all unrecognized escape sequences are left in the string unchanged, i.e., the backslash is left in the string. (This behavior is useful when debugging: if an escape sequence is mistyped, the resulting output is more easily recognized as broken.) It is also important to note that the escape sequences marked as "(Unicode only)" in the table above fall into the category of unrecognized escapes for non-Unicode string literals.

When an "r" or "R" prefix is present, a character following a backslash is included in the string without change, and all backslashes are left in the string. For example, the string literal r"\n" consists of two characters: a backslash and a lowercase "n". String quotes can be escaped with a backslash, but the backslash remains in the string; for example, r"\"" is a valid string literal consisting of two characters: a backslash and a double quote; r"\" is not a valid string literal (even a raw string cannot end in an odd number of backslashes). Specifically, a raw string cannot end in a single backslash (since the backslash would escape the following quote character). Note also that a single backslash followed by a newline is interpreted as those two characters as part of the string, not as a line continuation.
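The raw-string rules above can be tried out as follows. This is a sketch only: is_valid_literal is a hypothetical helper written for this illustration (it is not part of Python), and it relies on the built-in compile() raising SyntaxError for a malformed literal.

```python
raw_newline = r"\n"  # backslash and lowercase n
raw_quote = r"\""    # backslash and double quote

print(len(raw_newline), raw_newline == "\\n")  # 2 True
print(len(raw_quote), raw_quote == '\\"')      # 2 True


def is_valid_literal(source):
    """Return True if the source text compiles as an expression (hypothetical helper)."""
    try:
        compile(source, "<literal>", "eval")
    except SyntaxError:
        return False
    return True


# The source text r"\\" ends in an even number of backslashes: accepted.
print(is_valid_literal('r"\\\\"'))  # True
# The source text r"\" ends in a single backslash: rejected.
print(is_valid_literal('r"\\"'))    # False
```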

When an "r" or "R" prefix is used in conjunction with a "u" or "U" prefix, then the \uXXXX escape sequence is processed while all other backslashes are left in the string. For example, the string literal ur"\u0062\n" consists of three Unicode characters: 'LATIN SMALL LETTER B', 'REVERSE SOLIDUS', and 'LATIN SMALL LETTER N'. Backslashes can be escaped with a preceding backslash; however, both remain in the string. As a result, \uXXXX escape sequences are only recognized when there are an odd number of backslashes.



2.4.2 String literal concatenation

Multiple adjacent string literals (delimited by whitespace), possibly using different quoting conventions, are allowed, and their meaning is the same as their concatenation. Thus, "hello" 'world' is equivalent to "helloworld". This feature can be used to reduce the number of backslashes needed, to split long strings conveniently across long lines, or even to add comments to parts of strings, for example:

import re

re.compile("[A-Za-z_]"       # letter or underscore
           "[A-Za-z0-9_]*"   # letter, digit or underscore
          )

Note that this feature is defined at the syntactical level, but implemented at compile time. The "+" operator must be used to concatenate string expressions at run time. Also note that literal concatenation can use different quoting styles for each component (even mixing raw strings and triple quoted strings).
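The difference between literal concatenation and the "+" operator can be sketched as below. The variable names are illustrative, and the use of the ast module to inspect the parsed literal assumes a Python version that provides ast.Constant (3.8 or later); it is one way to observe the behavior, not the only one.

```python
import ast

# Adjacent literals with different quoting styles form one string.
greeting = "hello" 'world'
# A raw string and a triple-quoted string can be mixed as well.
mixed = r"\d+" """-suffix"""

# Concatenating a variable with a literal is an expression and needs "+".
prefix = "hello"
runtime = prefix + "world"

# The parser already sees the adjacent literals as a single constant.
node = ast.parse('"hello" \'world\'', mode="eval").body

print(greeting, runtime)                 # helloworld helloworld
print(mixed == "\\d+-suffix")            # True
print(type(node).__name__, node.value)   # Constant helloworld
```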



2.4.3 Numeric literals

There are four types of numeric literals: plain integers, long integers, floating point numbers, and imaginary numbers. There are no complex literals (complex numbers can be formed by adding a real number and an imaginary number).

Note that numeric literals do not include a sign (plus or minus); a phrase like -1 is actually an expression composed of the unary operator "-" and the literal 1.
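Both points can be observed by parsing small expressions. The sketch below uses the standard ast module and assumes a Python version that provides ast.Constant (3.8 or later); the variable names are illustrative.

```python
import ast

# "-1" parses as a unary minus applied to the literal 1.
negative = ast.parse("-1", mode="eval").body
print(type(negative).__name__)     # UnaryOp
print(type(negative.op).__name__)  # USub
print(negative.operand.value)      # 1

# "3 + 4j" is an addition of a real literal and an imaginary literal.
complex_expr = ast.parse("3 + 4j", mode="eval").body
print(type(complex_expr).__name__)  # BinOp

# Evaluating that addition yields a complex number.
z = 3 + 4j
print(z == complex(3, 4), z.real, z.imag)  # True 3.0 4.0
```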


Word count: 8102. Last updated: 03-15 20:24 by 月落晨星.
Page editor: 月落晨星