PyRe - Woodpecker Wiki for Pythonic

Working with regular expressions, so I translated the Python documentation along the way
::-- ZoomQuiet [2005-04-28 04:15:10]
Date: 2005-4-28 11:08 AM
Subject: [python-chinese] Working with regular expressions, so I translated the Python documentation along the way
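Most of the rules are the same as in other languages, but some differ. I have a job at hand that needs regular expressions, so I translated Python's help documentation along the way. It is not very formally organized, but it should be easy enough to follow.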

###########################################################
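Special characters:
###########################################################
   "."      Matches any single character except "\n". To match any character including "\n", use the DOTALL (S) flag or a pattern such as '[\s\S]'. (Inside [], "." is a literal dot, so '[.\n]' matches only a dot or a newline.)
   "^"      Matches the start of the input string.
   "$"      Matches the end of the input string, or just before a newline at the end of the string.
   "*"      Matches the preceding subexpression zero or more times (greedy). For example, zo* matches "z" as well as "zoo". * is equivalent to {0,}.
   "+"      Matches the preceding subexpression one or more times (greedy). For example, 'zo+' matches "zo" and "zoo" but not "z". + is equivalent to {1,}.
   "?"      Matches the preceding subexpression zero or one time (greedy).
   *?,+?,?? Non-greedy versions of the three special characters above.
   {m,n}    Matches at least m and at most n times (m and n are non-negative integers with m <= n).
   {m,n}?   Non-greedy version of the expression above.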
   "\\"      Either escapes special characters or signals a special sequence.
   []       Denotes a set of characters; matches any one character in the set.
            A "^" as the first character makes it a complementing set.
   "|"      A|B matches either A or B.
   (...)    Matches the RE inside the parentheses and captures the match as a group.
            The contents can be retrieved or matched later in the string.
   (?iLmsux) Sets the I, L, M, S, U, or X flag (see below).
   (?:...)  Non-grouping version of regular parentheses.
   (?P<name>...) The substring matched by the group is accessible by name.
   (?P=name) Matches the text matched earlier by the group named name.
   (?#...)  A comment; ignored.
   (?=...)  Matches if ... matches next, but doesn't consume the string.
   (?!...)  Matches if ... doesn't match next.
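
A minimal sketch of a few entries from the table above: greedy versus non-greedy quantifiers, how "." treats "\n", a named group with its backreference, and a lookahead. The sample strings and variable names are made up for illustration.

```python
import re

# Greedy "*" takes as much as possible; "*?" takes as little as possible.
html = "<b>bold</b>"
greedy = re.match(r"<.*>", html).group()
lazy = re.match(r"<.*?>", html).group()

# "." skips "\n" unless DOTALL is set; [\s\S] matches any character either way.
dot_default = re.match(r"a.b", "a\nb")
dot_all = re.match(r"a.b", "a\nb", re.DOTALL)
any_char = re.match(r"a[\s\S]b", "a\nb")

# Inside a character class "." is literal, so "[.\n]" does not match "x".
class_dot = re.match(r"a[.\n]b", "axb")

# Named group, backreference by name, and a lookahead that consumes nothing.
doubled = re.search(r"(?P<word>\w+) (?P=word)", "it is is fine")
lookahead = re.match(r"foo(?=bar)", "foobar")

print(greedy)                 # <b>bold</b>
print(lazy)                   # <b>
print(doubled.group("word"))  # is
print(lookahead.group())      # foo
```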

The special sequences consist of "\\" and a character from the list
below.  If the ordinary character is not on the list, then the
resulting RE will match the second character.
   \number  Matches the contents of the group of the same number.
   \A       Matches only at the start of the string.
   \Z       Matches only at the end of the string.
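   \b       Matches the empty string, but only at the start or end of a word (a word boundary).
   \B       Matches the empty string, but not at the start or end of a word (a non-word boundary).
   \d       Matches a digit character. Equivalent to [0-9].
   \D       Matches a non-digit character. Equivalent to [^0-9].
   \s       Matches any whitespace character, including space, tab, form feed, and so on. Equivalent to [ \f\n\r\t\v].
   \S       Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
   \w       Matches any word character, including the underscore. Equivalent to '[A-Za-z0-9_]'.
            With LOCALE, it will match the set [0-9_] plus characters defined
            as letters for the current locale.
   \W       Matches the complement of \w (any non-word character). Equivalent to '[^A-Za-z0-9_]'.
   \\       Matches a literal backslash "\".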
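
As a rough illustration of the special sequences above, the following sketch tries \b, \B, \d, a numbered backreference, and an escaped backslash on small ASCII-only sample strings (the strings are invented for the example).

```python
import re

text = "cat concat cat_1 42"

# \b anchors at word boundaries, so "concat" and "cat_1" are not matched.
whole_words = re.findall(r"\bcat\b", text)

# \B requires a non-boundary before "cat": only the one inside "concat" qualifies.
inner = re.findall(r"\Bcat", text)

# \d matches digits; \1 refers back to whatever group 1 matched.
digits = re.findall(r"\d+", text)
repeated = re.search(r"(\w)\1", "balloon").group()

# "\\" in the pattern matches one literal backslash.
backslashes = re.findall(r"\\", r"C:\temp\x")

print(whole_words)  # ['cat']
print(digits)       # ['1', '42']
print(repeated)     # ll
```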

##########################################################
The following functions are available:
##########################################################
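   match    Match a regular expression pattern at the beginning of a string.
   search   Search a string for the presence of a pattern.
   sub      Substitute occurrences of a pattern found in a string.
   subn     Same as sub, but also return the number of substitutions made.
   split    Split a string by the occurrences of a pattern.
   findall  Find all occurrences of a pattern in a string.
   compile  Compile a pattern into a RegexObject.
   purge    Clear the regular expression cache.
   escape   Backslash all non-alphanumerics in a string.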

Some of the functions in this module take flags as optional parameters:
   I  IGNORECASE  Perform case-insensitive matching.
   L  LOCALE      Make \w, \W, \b, \B, dependent on the current locale.
   M  MULTILINE   "^" matches the beginning of lines as well as the string.
                  "$" matches the end of lines as well as the string.
   S  DOTALL      "." matches any character at all, including the newline.
   X  VERBOSE     Ignore whitespace and comments for nicer looking RE's.
   U  UNICODE     Make \w, \W, \b, \B, dependent on the Unicode locale.
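
The flags are easier to follow with a small example. This is only a sketch on an invented two-line string; it shows IGNORECASE, MULTILINE, DOTALL and VERBOSE, and combining flags with "|".

```python
import re

text = "first line\nSecond line"

# IGNORECASE: "second" matches "Second".
ci = re.search(r"second", text, re.IGNORECASE)

# MULTILINE: "^" also matches right after each newline.
line_starts = re.findall(r"^\w+", text, re.MULTILINE)

# DOTALL: "." also matches "\n", so the match can span both lines.
spanning = re.search(r"first.*Second", text, re.DOTALL)
not_spanning = re.search(r"first.*Second", text)

# VERBOSE: whitespace and "#" comments inside the pattern are ignored.
number = re.compile(r"""
    \d+      # integer part
    \.       # decimal point
    \d+      # fractional part
""", re.VERBOSE)

# Flags are combined with "|"; re.I and re.M are the short names.
both = re.findall(r"^second", text, re.I | re.M)

print(ci.group())                          # Second
print(line_starts)                         # ['first', 'Second']
print(number.search("pi is 3.14").group()) # 3.14
```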

This module also defines an exception 'error'.
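
One way the exception might be used is to check a pattern that comes from user input. The helper name is_valid_pattern is hypothetical, not part of the re module.

```python
import re

def is_valid_pattern(pattern):
    """Return True if the pattern compiles, False if re raises its error."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True

print(is_valid_pattern(r"a+"))    # True
print(is_valid_pattern(r"(abc"))  # False: unbalanced parenthesis
```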

compile(pattern, flags=0)
Compile a regular expression pattern, returning a pattern object.
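
A short sketch of compile: the pattern object is built once and then reused through its own methods. The pattern and input string are illustrative.

```python
import re

# Compile once, with a flag, and reuse the pattern object.
word = re.compile(r"[a-z]+", re.IGNORECASE)

first = word.match("Hello world").group()
all_words = word.findall("Hello world")

print(first)      # Hello
print(all_words)  # ['Hello', 'world']
```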

escape(pattern)
Escape all non-alphanumeric characters in pattern.
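
escape is mostly useful when a literal string has to be searched for as-is. In this sketch the sample text is invented; the exact escaped form is not printed because it varies between Python versions.

```python
import re

literal = "1+1=2 (really?)"
haystack = "we know 1+1=2 (really?) holds"

# Escaped: "+", "(", "?" and ")" are treated as ordinary characters.
found = re.search(re.escape(literal), haystack)

# Unescaped: the same text is read as a regular expression and does not match.
unescaped = re.search(literal, haystack)

print(found.group())  # 1+1=2 (really?)
print(unescaped)      # None
```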

findall(pattern, string)
Return a list of all non-overlapping matches in the string.
If one or more groups are present in the pattern, return a
list of groups; this will be a list of tuples if the pattern
has more than one group.
Empty matches are included in the result.
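
The return shape of findall depends on the number of groups, which is easy to get wrong. A small sketch on an invented string:

```python
import re

text = "x=1, y=22, z=333"

no_group = re.findall(r"\w=\d+", text)        # list of whole matches
one_group = re.findall(r"\w=(\d+)", text)     # list of that group's text
two_groups = re.findall(r"(\w)=(\d+)", text)  # list of tuples

# A pattern that can match nothing also yields empty strings.
empties = re.findall(r"\d*", "a1")

print(no_group)    # ['x=1', 'y=22', 'z=333']
print(one_group)   # ['1', '22', '333']
print(two_groups)  # [('x', '1'), ('y', '22'), ('z', '333')]
```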

finditer(pattern, string)
Return an iterator over all non-overlapping matches in the
string.  For each match, the iterator returns a match object.
Empty matches are included in the result.
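
Because finditer yields match objects rather than strings, positions are available as well. A minimal sketch with an invented string:

```python
import re

text = "alpha beta gamma"

# Each item is a match object, so group(), start() and end() can be used.
spans = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\w+", text)]

print(spans)  # [('alpha', 0, 5), ('beta', 6, 10), ('gamma', 11, 16)]
```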

match(pattern, string, flags=0)
Try to apply the pattern at the start of the string, returning
a match object, or None if no match was found.
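
A sketch of the None check that match usually needs; the inputs are illustrative.

```python
import re

hit = re.match(r"\d+", "123abc")   # digits at the start: match object
miss = re.match(r"\d+", "abc123")  # digits not at the start: None

# Test for None before calling group().
if hit is not None:
    print(hit.group())  # 123
print(miss)             # None
```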

purge()
Clear the regular expression cache

search(pattern, string, flags=0)
Scan through string looking for a match to the pattern, returning
a match object, or None if no match was found.
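
The difference between match and search shows up as soon as the pattern does not occur at position 0. A small sketch with an invented string:

```python
import re

text = "order id: 4711"

anchored = re.match(r"\d+", text)  # only tries position 0, so None here
found = re.search(r"\d+", text)    # scans forward until it finds digits

print(anchored)                      # None
print(found.group(), found.start())  # 4711 10
```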

split(pattern, string, maxsplit=0)
Split the source string by the occurrences of the pattern,
returning a list containing the resulting substrings.
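
A sketch of split with and without maxsplit, using a comma separator that tolerates surrounding whitespace. The input string is made up.

```python
import re

data = "a , b,c ,d"

# Split on commas, swallowing any whitespace around them.
parts = re.split(r"\s*,\s*", data)

# maxsplit=1 stops after the first split; the rest stays in one piece.
limited = re.split(r"\s*,\s*", data, maxsplit=1)

print(parts)    # ['a', 'b', 'c', 'd']
print(limited)  # ['a', 'b,c ,d']
```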

sub(pattern, repl, string, count=0)
Return the string obtained by replacing the leftmost
non-overlapping occurrences of the pattern in string by the
replacement repl.
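
A few illustrative uses of sub: group references in repl, the count parameter, and a function as repl. The last form is not mentioned in the text above, but the re module accepts it.

```python
import re

text = "2005-04-28"

# \1, \2, \3 in repl refer to the groups of the pattern.
swapped = re.sub(r"(\d+)-(\d+)-(\d+)", r"\3/\2/\1", text)

# count=1 replaces only the leftmost occurrence.
first_only = re.sub(r"-", "/", text, count=1)

# repl may be a function that receives the match object.
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), "3 apples, 10 pears")

print(swapped)     # 28/04/2005
print(first_only)  # 2005/04-28
print(doubled)     # 6 apples, 20 pears
```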

subn(pattern, repl, string, count=0)
Return a 2-tuple containing (new_string, number).
new_string is the string obtained by replacing the leftmost
non-overlapping occurrences of the pattern in the source
string by the replacement repl.  number is the number of
substitutions that were made.
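
A minimal sketch of subn, unpacking the 2-tuple directly. The input string is invented.

```python
import re

# Collapse each run of whitespace into one space and count the replacements.
new_text, n = re.subn(r"\s+", " ", "a   b \t c")

print(new_text)  # a b c
print(n)         # 2
```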

template(pattern, flags=0)
Compile a template pattern, returning a pattern object.
