2. Lexical analysis

A Python program is read by a parser. Input to the parser is a stream of tokens, generated by the lexical analyzer. This chapter describes how the lexical analyzer breaks a file into tokens.
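
As a rough illustration of what "breaking a file into tokens" means, the sketch below feeds one line of source to the standard library's `tokenize` module and lists the tokens it reports. It assumes a Python 3 interpreter, and the helper name `token_stream` is made up for this example. `tokenize` is a pure-Python scanner that closely follows the interpreter's lexical analyzer but is not the analyzer itself.

```python
import io
import tokenize


def token_stream(source):
    """Return (token type name, token text) pairs for a source string."""
    # generate_tokens() expects a readline-style callable that yields str lines.
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
    ]


# Print every token found in a one-line program.
for name, text in token_stream("x = 1 + 2\n"):
    print(name, repr(text))
```

The list starts with a `NAME` token for `x` and ends with an `ENDMARKER` token; the operators and numbers in between are each reported as separate tokens.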

Python uses the 7-bit ASCII character set for program text. New in version 2.3: An encoding declaration can be used to indicate that string literals and comments use an encoding different from ASCII. For compatibility with older versions, Python only warns if it finds 8-bit characters; those warnings should be corrected either by declaring an explicit encoding or, if the bytes are binary data rather than characters, by using escape sequences.
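
The text above does not show what an encoding declaration looks like. The sketch below assumes the common `# -*- coding: <name> -*-` comment form on the first line of the file, and uses Python 3's `tokenize.detect_encoding` to read it back. The sample source bytes are invented for illustration. Note that Python 3 falls back to UTF-8, not ASCII, when no declaration is present.

```python
import io
import tokenize

# A two-line source file held as raw bytes. The first line is the assumed
# declaration form; the second contains the single Latin-1 byte 0xE9.
source = b"# -*- coding: latin-1 -*-\nname = 'caf\xe9'\n"

# detect_encoding() reads at most two lines and returns the normalized
# encoding name together with the lines it consumed.
encoding, consumed = tokenize.detect_encoding(io.BytesIO(source).readline)

# Decode the whole file with the declared encoding.
text = source.decode(encoding)

# With no declaration, Python 3 reports its default source encoding.
default_encoding, _ = tokenize.detect_encoding(io.BytesIO(b"x = 1\n").readline)

print(encoding, default_encoding)
```

The declared name `latin-1` is reported in its normalized spelling, `iso-8859-1`.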

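Python uses the 7-bit ASCII character set for program text and string literals. 8-bit characters may also be used in string literals and comments, but their interpretation is platform dependent; the proper way to insert 8-bit characters into a string is to use octal or hexadecimal escape sequences.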
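
A minimal sketch of the escape-sequence approach, assuming a Python 3 interpreter, where a bytes literal plays the role of the 8-bit string described here. The source stays pure ASCII, and the byte with value 233 is written once in octal and once in hexadecimal.

```python
# The same 8-bit value written with an octal and a hexadecimal escape.
octal_form = b"caf\351"   # octal 351 is decimal 233
hex_form = b"caf\xe9"     # hexadecimal e9 is decimal 233

# Both literals produce identical byte strings.
same = octal_form == hex_form

# Indexing a bytes object yields the integer value of the byte.
last_byte = octal_form[-1]

print(same, last_byte)
```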

The run-time character set depends on the I/O devices connected to the program but is generally a superset of ASCII.

Future compatibility note: It may be tempting to assume that the character set for 8-bit characters is ISO Latin-1 (an ASCII superset that covers most western languages that use the Latin alphabet), but it is possible that in the future Unicode text editors will become common. These generally use the UTF-8 encoding, which is also an ASCII superset, but with very different use for the characters with ordinals 128-255. While there is no consensus on this subject yet, it is unwise to assume either Latin-1 or UTF-8, even though the current implementation appears to favor Latin-1. This applies both to the source character set and the run-time character set.
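
The "very different use" of ordinals 128-255 can be made concrete with a short sketch, assuming a Python 3 interpreter. The byte 0xE9 is a complete character in Latin-1 but an incomplete sequence in UTF-8, and the two-byte UTF-8 form of the same character turns into two unrelated characters when it is misread as Latin-1.

```python
raw = b"\xe9"

# In Latin-1, every byte maps directly to the code point of the same value.
as_latin1 = raw.decode("latin-1")          # U+00E9, e with acute accent

# In UTF-8, 0xE9 only starts a multi-byte sequence, so decoding it alone fails.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# The same character takes two bytes in UTF-8 ...
utf8_bytes = "\u00e9".encode("utf-8")      # b'\xc3\xa9'

# ... and reading those bytes as Latin-1 yields two different characters.
misread = utf8_bytes.decode("latin-1")     # U+00C3 followed by U+00A9

# Below 128 the two encodings agree, since both are ASCII supersets.
ascii_agrees = b"abc".decode("latin-1") == b"abc".decode("utf-8")

print(utf8_ok, utf8_bytes, ascii_agrees)
```

This is why guessing either encoding for 8-bit input is risky: a wrong guess either raises an error or silently produces different text.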

Last updated 03-15 20:07 by 月落晨星 (page editor).