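This article is taken from the Python Cookbook.

The translation was made purely for personal study; it has nothing to do with any commercial copyright dispute.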

@.84@ [2004-09-27 09:11:33]

Description

...Walking Directory Trees

Credit: Robin Parmar, Alex Martelli

Problem

You need to examine a directory, or an entire directory tree rooted in a certain directory, and obtain a list of all the files (and optionally folders) that match a certain pattern.

Solution

os.path.walk is sufficient for this purpose, but we can pretty it up quite a bit:

    import os.path, fnmatch

    def listFiles(root, patterns='*', recurse=1, return_folders=0):

        # Expand patterns from semicolon-separated string to list
        pattern_list = patterns.split(';')
        # Collect input and output arguments into one bunch
        class Bunch:
            def __init__(self, **kwds): self.__dict__.update(kwds)
        arg = Bunch(recurse=recurse, pattern_list=pattern_list,
            return_folders=return_folders, results=[])

        def visit(arg, dirname, files):
            # Append to arg.results all relevant files (and perhaps folders)
            for name in files:
                # Normalize the joined path
                fullname = os.path.normpath(os.path.join(dirname, name))
                # Folders qualify only if return_folders is set; files always do
                if arg.return_folders or os.path.isfile(fullname):
                    # Patterns are OR-ed: one match is enough
                    for pattern in arg.pattern_list:
                        if fnmatch.fnmatch(name, pattern):
                            arg.results.append(fullname)
                            break
            # Block recursion if recursion was disallowed: emptying the list
            # in place leaves os.path.walk no subdirectories to descend into
            if not arg.recurse: files[:]=[]

        # os.path.walk exists only in Python 2 (it was removed in Python 3)
        os.path.walk(root, visit, arg)

        return arg.results

Discussion

The standard directory-tree function os.path.walk is powerful and flexible, but it can be confusing to beginners. This recipe dresses it up in a listFiles function that lets you choose the root folder, whether to recurse down through subfolders, the file patterns to match, and whether to include folder names in the result list.
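Because os.path.walk is available only in Python 2, readers on Python 3 may want an equivalent. The following is a minimal sketch, not the recipe's own code: it assumes os.walk as the traversal primitive, and the name list_files and its boolean defaults are hypothetical. One behavioral difference to be aware of: it trusts os.walk's split into files and directories instead of calling os.path.isfile on every entry, so entries such as broken symbolic links may be reported where the original would skip them.

```python
import fnmatch
import os


def list_files(root, patterns="*", recurse=True, return_folders=False):
    """Return normalized paths under root whose names match any pattern."""
    # Expand the semicolon-separated string into a list of patterns.
    pattern_list = patterns.split(";")
    results = []
    for dirname, dirs, files in os.walk(root):
        # Folders are candidates only when the caller asks for them.
        names = files + dirs if return_folders else files
        for name in names:
            # Patterns are OR-ed: one match is enough.
            if any(fnmatch.fnmatch(name, p) for p in pattern_list):
                results.append(os.path.normpath(os.path.join(dirname, name)))
        if not recurse:
            # Emptying dirs in place stops os.walk from descending further.
            dirs[:] = []
    return results
```

The order of the results follows os.walk and the underlying directory listing, so sort the returned list if a stable order matters.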

The file patterns are Unix-style shell wildcards (not regular expressions; see the fnmatch documentation), as supplied by the standard fnmatch module, which this recipe uses. Whether matching is case-sensitive depends on the platform: fnmatch.fnmatch compares case-sensitively on Unix-like systems and case-insensitively on systems whose filenames are case-insensitive, such as Windows, and the recipe inherits that behavior. To specify multiple patterns, join them with a semicolon. Note that this means that semicolons themselves can't be part of a pattern.
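To make the matching rules concrete, here is a small sketch of semicolon-separated matching with explicit control over case. The helper matches_any and its ignore_case parameter are hypothetical names introduced only for illustration; the sketch uses fnmatch.fnmatchcase, which never normalizes case, so the result does not depend on the operating system.

```python
import fnmatch


def matches_any(name, patterns, ignore_case=False):
    """Return True if name matches at least one semicolon-separated pattern."""
    if ignore_case:
        # Lowercase both sides so the comparison ignores case on every platform.
        name, patterns = name.lower(), patterns.lower()
    # fnmatchcase compares exactly as written, without platform normalization.
    return any(fnmatch.fnmatchcase(name, p) for p in patterns.split(";"))


print(matches_any("index.html", "*.py;*.htm;*.html"))        # True
print(matches_any("README.TXT", "*.txt"))                    # False
print(matches_any("README.TXT", "*.txt", ignore_case=True))  # True
```

A pattern must match the whole name, which is why "*.htm" alone does not match "index.html" and the list above needs both "*.htm" and "*.html".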

For example, you can easily get a list of all Python and HTML files in directory /tmp or any subdirectory thereof:

thefiles = listFiles('/tmp', '*.py;*.htm;*.html')

See Also

Documentation for the os.path module in the Library Reference.

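Translator's note: the Python 2.3 documentation says that os.walk() is easier to use, though I have not tried it. os.path.walk is indeed somewhat complicated.

Appendix: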

fnmatch:

fnmatch( filename, pattern) Test whether the filename string matches the pattern string, returning true or false. If the operating system is case-insensitive, then both parameters will be normalized to all lower- or upper-case before the comparison is performed. If you require a case-sensitive comparison regardless of whether that's standard for your operating system, use fnmatchcase() instead.

os.path :

walk( path, visit, arg)

Calls the function visit with arguments (arg, dirname, names) for each directory in the directory tree rooted at path (including path itself, if it is a directory). The argument dirname specifies the visited directory, the argument names lists the files in the directory (gotten from os.listdir(dirname)). The visit function may modify names to influence the set of directories visited below dirname, e.g., to avoid visiting certain parts of the tree. (The object referred to by names must be modified in place, using del or slice assignment.)
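The in-place rule quoted above carries over to the newer os.walk, whose documentation gives the same advice for its list of subdirectory names during a top-down walk. The following sketch illustrates it with os.walk rather than os.path.walk, since the latter exists only in Python 2; the name walk_skipping and its skip_names parameter are hypothetical.

```python
import os


def walk_skipping(root, skip_names):
    """Yield each visited directory, never descending into skip_names."""
    for dirname, dirs, files in os.walk(root):
        # Slice assignment mutates the very list os.walk iterates next.
        # A plain "dirs = [...]" would only rebind the local name and
        # prune nothing.
        dirs[:] = [d for d in dirs if d not in skip_names]
        yield dirname
```

For instance, list(walk_skipping('.', {'.git'})) visits the current tree while leaving any .git directory unexplored.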

Note: Symbolic links to directories are not treated as subdirectories, so walk() will not visit them. To visit linked directories you must identify them with os.path.islink(file) and os.path.isdir(file), and invoke walk() as necessary. Note: The newer os.walk() generator supplies similar functionality and can be easier to use.
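With os.walk, the manual islink/isdir check described in the note can usually be replaced by the followlinks argument, which defaults to False. The sketch below is illustrative only; visited_dirs and follow_links are hypothetical names. Keep in mind that following links can recurse forever if a link points back to one of its own parent directories, because os.walk does not track the directories it has already visited.

```python
import os


def visited_dirs(root, follow_links=False):
    """Return the directories os.walk visits under root, in walk order."""
    # With followlinks=False, a symbolic link to a directory still appears
    # in the parent's list of subdirectory names but is never descended into.
    return [dirname
            for dirname, dirs, files in os.walk(root, followlinks=follow_links)]
```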

PyCkBk-4-19 (last edited 2009-12-25 07:12:54 by localhost)