ObpLovelyPython/LpyAttAnswerCdays - Woodpecker Wiki for CPUG

Status: proofreading (lizzie), 100% complete


CDays-5

Is this year a leap year? A year is a leap year if it is divisible by 400, or if it is divisible by 4 but not by 100.
- Source code
```
#coding:utf-8
'''cdays-5-exercise-1.py  Determine whether the current year is a leap year
    @note: uses import, the time module, branching, and string formatting
'''

import time                                 # import the time module
thisyear = time.localtime()[0]              # get the current year
# Leap-year rule: divisible by 400, or divisible by 4 but not by 100
if thisyear % 400 == 0 or thisyear % 4 == 0 and thisyear % 100 != 0:
    print 'this year %s is a leap year' % thisyear
else:
    print 'this year %s is not a leap year' % thisyear
```
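The leap-year rule can also be wrapped in a function and cross-checked against the standard library. The following is a minimal sketch written for Python 3 (an assumption; the listings on this page use Python 2); the function name is_leap is hypothetical.

```python
import calendar

def is_leap(year):
    """Return True if `year` is a leap year in the Gregorian calendar."""
    return year % 400 == 0 or (year % 4 == 0 and year % 100 != 0)

# Cross-check the hand-written rule against calendar.isleap.
for y in (1900, 2000, 2008, 2009, 2100):
    assert is_leap(y) == calendar.isleap(y)
    print(y, is_leap(y))
```

The explicit parentheses are not required, since `and` binds tighter than `or`, but they make the rule easier to read.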

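Use Python as a scientific calculator. Get familiar with Python's common operators by evaluating the expressions 12*34+78-132/6, (12*(34+78)-132)/6, and (86/40)**5. Then use the math module to compute the remainder of 145/23 and the sine and cosine of 0.5 (note that sin and cos take their argument in radians). Hint: run import math; help("math") to read the math module's help.

Source code

```
#coding:utf-8
'''cdays-5-exercise-2.py  Evaluate expressions
    @note: basic expression evaluation, formatted output, the math module
    @see: math module reference: http://docs.python.org/lib/module-math.html
'''

import math                 # import the math module

x = 12*34+78-132/6          # evaluate the expressions
y = (12*(34+78)-132)/6
z = (86.0/40)**5            # 86/40 is integer division in Python 2 (result 2), so use a float operand

print '12*34+78-132/6 = %d' % x
print '(12*(34+78)-132)/6 = %d' % y
print '(86/40)**5 = %f' % z

a = math.fmod(145, 23)      # remainder
b = math.sin(0.5)           # sine (argument in radians)
c = math.cos(0.5)           # cosine (argument in radians)

print 'remainder of 145/23 = %d' % a
print 'sin(0.5) = %f' % b
print 'cos(0.5) = %f' % c
```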
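Division and remainder are easy to get wrong, so here is a short sketch of how they behave. It assumes Python 3, where / is always true division; in Python 2, / between two integers behaves like // below.

```python
import math

# True division always returns a float; floor division rounds down.
print(86 / 40)            # 2.15
print(86 // 40)           # 2
print((86 // 40) ** 5)    # 32

# For non-negative operands, % and math.fmod agree on the remainder.
print(145 % 23, math.fmod(145, 23))   # 7 7.0

# They differ for negative operands: % takes the sign of the divisor,
# math.fmod takes the sign of the dividend.
print(-7 % 3, math.fmod(-7, 3))       # 2 -1.0
```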


Find all prime numbers between 0 and 100.

Source code

```
#coding:utf-8
'''cdays-5-exercise-3.py  Find all primes between 0 and 100
    @note: for loops, the list type
    @see: math module reference: http://docs.python.org/lib/module-math.html
'''

from math import sqrt

N = 100
# Basic approach: trial division by every candidate up to sqrt(num)
result1 = []
for num in range(2, N):
    f = True
    for snu in range(2, int(sqrt(num))+1):
        if num % snu == 0:
            f = False
            break
    if f:
        result1.append(num)
print result1

# A more compact approach: the same test as a list comprehension
result2 = [ p for p in range(2, N) if 0 not in [ p% d for d in range(2, int(sqrt(p))+1)] ]
print result2
```
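Trial division tests each number separately. A common alternative, not used in the original answer, is the Sieve of Eratosthenes, which crosses out multiples instead. A minimal sketch for Python 3 (an assumption), with the hypothetical function name primes_up_to:

```python
def primes_up_to(n):
    """Return all primes <= n using the Sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            # Cross out every multiple of i, starting at i*i.
            for multiple in range(i * i, n + 1, i):
                is_prime[multiple] = False
    return [i for i, flag in enumerate(is_prime) if flag]

print(primes_up_to(100))
```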


CDays-4

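What other functionality does the os module offer? Hint: use dir() and help().
- The os module offers much more. The main items include:
  - os.error, os.path, os.popen, os.stat_result, os.sys, os.system, and so on. For the full list, run import os; dir(os), and read the Python help with help("os")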
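What other modes does open() support?
- open() supports the following modes:
  - 'r': open an existing file read-only; an exception is raised if the file does not exist. This is the default mode.
  - 'U' or 'rU': universal newline support. The file is opened as text, and a line may end with any of the usual conventions: '\n' on Unix, '\r' on Macintosh, or '\r\n' on Windows. The Python program sees all of them as '\n'.
  - 'w': open for writing, whether or not the file exists. If the file does not exist it is created; if it exists its contents are truncated.
  - 'a': open for appending. On Unix systems, everything written is appended to the end of the file regardless of the current seek position.
  - 'b': open in binary mode. Required for binary files; the mode exists for portability to systems that treat binary and text files differently.
  - 'r+', 'w+', and 'a+': open the file for updating, that is, reading and writing (note that 'w+' truncates the file)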
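The modes listed above can be tried out on a throwaway file. This is a small sketch assuming Python 3; the file name modes-demo.txt is hypothetical and the file is created in a temporary directory.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'modes-demo.txt')  # hypothetical file name

with open(path, 'w') as f:      # create (or truncate) and write
    f.write('first\n')
with open(path, 'a') as f:      # append to the end
    f.write('second\n')
with open(path, 'r+') as f:     # update in place: read and write, no truncation
    f.write('FIRST')
with open(path, 'r') as f:      # read-only (the default mode)
    content = f.read()
print(repr(content))            # 'FIRST\nsecond\n'

with open(path, 'w+') as f:     # 'w+' truncates before allowing reads
    f.seek(0)
    after_truncate = f.read()
print(repr(after_truncate))     # ''
```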
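Which data types can a for .. in .. loop operate on?
- A for..in loop works on any sequence (list, tuple, string). More generally, it works on any iterable object, for example a dict (iterating over its keys) or an open file (iterating over its lines).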
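To illustrate, the same loop syntax is applied below to several kinds of iterable. A sketch assuming Python 3; the names samples and collected are made up for the example.

```python
samples = {
    'list': [1, 2, 3],
    'tuple': ('a', 'b'),
    'string': 'hi',
    'dict (keys)': {'x': 1, 'y': 2},
    'range': range(3),
    'generator': (n * n for n in range(3)),
}

collected = {}
for label, iterable in samples.items():
    items = []
    for item in iterable:       # the same loop syntax works for every type
        items.append(item)
    collected[label] = items
    print(label, '->', items)
```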
Which other conversion types can a format specifier use?
- Format specifiers
- Details: http://docs.python.org/lib/typesseq-strings.html (short URL: http://bit.ly/2TH7cF)
  - d Signed integer decimal.
  - i Signed integer decimal.
  - o Unsigned octal.
  - u Unsigned decimal.
  - x Unsigned hexadecimal (lowercase).
  - X Unsigned hexadecimal (uppercase).
  - e Floating point exponential format (lowercase).
  - E Floating point exponential format (uppercase).
  - f Floating point decimal format.
  - F Floating point decimal format.
  - g Floating point format. Uses lowercase exponential format if the exponent is less than -4 or not less than the precision, decimal format otherwise.
  - G Floating point format. Uses uppercase exponential format if the exponent is less than -4 or not less than the precision, decimal format otherwise.
  - c Single character (accepts integer or single character string).
  - r String (converts any python object using repr()).
  - s String (converts any python object using str()).
  - % No argument is converted, results in a "%" character in the result.
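A few of the conversion types above in action. This sketch assumes Python 3, where the % operator on strings still works the same way for these conversions; the variable names are illustrative only.

```python
examples = [
    ('%d', 42),
    ('%o', 8),
    ('%x', 255),
    ('%X', 255),
    ('%e', 12345.678),
    ('%f', 3.14159),
    ('%g', 0.00001234),   # exponent below -4: exponential form
    ('%g', 1234.5),       # otherwise: decimal form
    ('%c', 65),
    ('%r', 'text'),
    ('%s', 'text'),
]
results = [fmt % value for fmt, value in examples]
for (fmt, value), text in zip(examples, results):
    print('%-4s %-12r -> %s' % (fmt, value, text))

percent = '%d%%' % 100    # '%%' produces a literal percent sign
print(percent)
```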
Is the current way of writing the file good? Is there room for improvement?
- What is good about CDay-4-5.py?
```
# coding: utf-8

import os

export = ""
for root, dirs, files in os.walk('/media/cdrom0'):
    export += "\n %s;%s;%s" % (root, dirs, files)
open('mycd2.cdc', 'w').write(export)
```
- And why is CDay-4-6.py better still?
```
# coding: utf-8

import os

export = []
for root, dirs, files in os.walk('/media/cdrom0'):
    export.append("\n %s;%s;%s" % (root, dirs, files))
open('mycd2.cdc', 'w').write(''.join(export))
```
- CDay-4-5.py concatenates strings with +, while CDay-4-6.py uses join. join is more efficient than repeated +: strings are immutable, so each + creates a new string object and copies the accumulated text, whereas join builds the result in a single built-in operation.
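The difference can be measured with timeit. The sketch below assumes Python 3 and uses hypothetical function names; the timings depend on the machine and interpreter (CPython optimizes some += cases), so no expected numbers are shown.

```python
import timeit

def build_with_plus(parts):
    out = ''
    for part in parts:
        out += part          # may copy the accumulated text on every step
    return out

def build_with_join(parts):
    return ''.join(parts)    # builds the result in one pass

parts = ['\n %s;%s;%s' % (i, [], []) for i in range(1000)]
assert build_with_plus(parts) == build_with_join(parts)

print('plus:', timeit.timeit(lambda: build_with_plus(parts), number=200))
print('join:', timeit.timeit(lambda: build_with_join(parts), number=200))
```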

Read the file cdays-4-test.txt, remove blank lines and comment lines, sort the remaining lines, and write the result to cdays-4-result.txt.

cdays-4-test.txt

#some words

Sometimes in life,
You find a special friend;
Someone who changes your life just by being part of it.
Someone who makes you laugh until you can't stop;
Someone who makes you believe that there really is good in the world.
Someone who convinces you that there really is an unlocked door just waiting for you to open it.
This is Forever Friendship.
when you're down,
and the world seems dark and empty,
Your forever friend lifts you up in spirits and makes that dark and empty world
suddenly seem bright and full.
Your forever friend gets you through the hard times,the sad times,and the confused times.
If you turn and walk away,
Your forever friend follows,
If you lose you way,
Your forever friend guides you and cheers you on.
Your forever friend holds your hand and tells you that everything is going to be okay.

Source code

```
#coding:utf-8
'''cdays-4-exercise-6.py  Basic file operations
    @note: reading and writing files, sorting a list, string operations
    @see: for string methods see help(str) or the online Python documentation http://docs.python.org/lib/string-methods.html
'''

f = open('cdays-4-test.txt', 'r')                   # open the file for reading
result = list()
for line in f.readlines():                          # read each line in turn
    line = line.strip()                             # strip leading and trailing whitespace
    if not len(line) or line.startswith('#'):       # blank line or comment line?
        continue                                    # if so, skip it
    result.append(line)                             # keep the line
f.close()
result.sort()                                       # sort the result
print result
open('cdays-4-result.txt', 'w').write('%s' % '\n'.join(result))  # save to the result file
```
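The filtering and sorting step can be separated from the file handling, which makes it easy to test. A sketch assuming Python 3; clean_and_sort is a hypothetical name and the sample lines are taken from cdays-4-test.txt.

```python
def clean_and_sort(lines):
    """Strip each line, drop blanks and '#' comments, return the rest sorted."""
    stripped = (line.strip() for line in lines)
    return sorted(line for line in stripped if line and not line.startswith('#'))

sample = ['#some words\n', '\n', 'You find a special friend;\n', 'Sometimes in life,\n']
print(clean_and_sort(sample))
```

With a real file, the same function could be called as clean_and_sort(open('cdays-4-test.txt')), since a file object iterates over its lines.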


CDays-3

Following DiPy section 10.6, "Handling command-line arguments" (http://www.woodpecker.org.cn/diveintopython/scripts_and_streams/command_line_arguments.html short URL: http://bit.ly/1x5gMw), use getopt.getopt() to improve the current functions.

Source code

```
# coding=utf-8
'''Lovely Python -3 PyDay
    PyCDC v0.3
    @see: http://www.woodpecker.org.cn/diveintopython/scripts_and_streams/command_line_arguments.html
'''
import os, sys
import getopt       # import the getopt module

CDROM = '/media/cdrom0'
def cdWalker(cdrom, cdcfile):
    export = ""
    for root, dirs, files in os.walk(cdrom):
        export += "\n %s;%s;%s" % (root, dirs, files)
    open(cdcfile, 'w').write(export)

def usage():
    print '''PyCDC usage:
    python cdays-3-exercise-1.py -d cdc -k 中国火
    # search the disc records in the cdc directory for files or directories
    # whose names contain "中国火", and report which disc they are on
        '''
try:
    # the long option 'help' is needed for the '--help' check below to ever match
    opts, args = getopt.getopt(sys.argv[1:], 'hd:e:k:', ['help'])
except getopt.GetoptError:
    usage()
    sys.exit()

if len(opts) == 0:
    usage()
    sys.exit()

c_path = ''
for opt, arg in opts:
    if opt in ('-h', '--help'):
        usage()
        sys.exit()
    elif opt == '-e':
        # check whether the argument contains a directory, so that it can be created automatically
        #cdWalker(CDROM, arg)
        print "recording disc info to %s" % arg
    elif opt == '-d':
        c_path = arg
    elif opt == '-k':
        if not c_path:
            usage()
            sys.exit()
        # run the file search here
```
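The option handling can be pulled into a function that takes the argument list explicitly, which makes it testable without touching sys.argv. A sketch assuming Python 3; parse_args and the dictionary keys are hypothetical names, while the option string matches the listing above.

```python
import getopt

def parse_args(argv):
    """Parse PyCDC-style options; return a dict of the options that were given."""
    opts, _args = getopt.getopt(argv, 'hd:e:k:', ['help'])
    settings = {}
    for opt, arg in opts:
        if opt in ('-h', '--help'):
            settings['help'] = True
        elif opt == '-d':
            settings['dir'] = arg
        elif opt == '-e':
            settings['export'] = arg
        elif opt == '-k':
            settings['keyword'] = arg
    return settings

print(parse_args(['-d', 'cdc', '-k', 'EVA']))   # {'dir': 'cdc', 'keyword': 'EVA'}
print(parse_args(['--help']))                   # {'help': True}
```

An unknown option such as -x makes getopt.getopt raise getopt.GetoptError, which the caller can catch to print the usage text.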

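Read a simple index file, cdays-3-test.txt, in which each line has the format "document-number keyword". Convert this information into an inverted index, that is, work out which documents each keyword appears in, using the format "number-of-documents keyword => document-numbers". The original index file is passed to the main program as a command-line argument. Write a collect function that gathers the "keyword <-> number" pairs, and print the result to the screen from the main program.

Contents of cdays-3-test.txt: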

1 key1
2 key2
3 key1
7 key3
8 key2
10 key1
14 key2
19 key4
20 key1
30 key3

Source code

```
#coding:utf-8
'''cdays-3-exercise-2.py  Using dictionaries
    @note: uses sys.argv, dictionary operations, function calls
    @see: for the sys module see help(sys)
'''

import sys                                          # import the sys module

def collect(file):
    ''' Turn key-value pairs into value-key pairs
    @param file: a file object
    @return: a dict containing the value-key pairs
    '''
    result = {}
    for line in file.readlines():                   # read each line in turn
        left, right = line.split()                  # split the line on whitespace into two parts
        if right in result:                         # is there already a key for right?
            result[right].append(left)              # yes: append to the list in result[right]
        else:
            result[right] = [left]                  # no: start a new list for result[right]
    return result

if __name__ == "__main__":
    if len(sys.argv) == 1:                          # check the number of arguments
        print 'usage:\n\tpython cdays-3-exercise-2.py cdays-3-test.txt'
    else:
        result = collect(open(sys.argv[1], 'r'))    # call collect and get the result
        for (right, lefts) in result.items():       # print the result
            print "%d '%s'\t=>\t%s" % (len(lefts), right, lefts)
```
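The "append or create" branch is a common pattern that collections.defaultdict handles directly. A sketch assuming Python 3; here collect takes any iterable of lines instead of a file object, which is a deliberate change for the example.

```python
from collections import defaultdict

def collect(lines):
    """Build {keyword: [doc_id, ...]} from 'doc_id keyword' lines."""
    index = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue                      # ignore blank lines
        doc_id, keyword = line.split()
        index[keyword].append(doc_id)     # the list is created on first use
    return dict(index)

sample = ['1 key1', '2 key2', '3 key1', '7 key3']
for keyword, doc_ids in sorted(collect(sample).items()):
    print("%d '%s'\t=>\t%s" % (len(doc_ids), keyword, doc_ids))
```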


The eight queens problem. Place 8 queens on an 8*8 chessboard so that no two queens share a row, a column, or a diagonal in either direction.

Source code

```
#coding:utf-8
'''cdays-3-exercise-3.py
    @note: uses global variables and recursive function calls
'''

global col                                  # name the global variables (these statements have no
global row                                  # effect at module level; the lists are created in __main__)
global pos_diag
global nag_diag
global count

def output():
    ''' Print one valid solution
    '''
    global count
    print row
    count += 1

def do_queen(i):
    ''' Generate all valid solutions
    @param i: index (row) of the queen being placed
    '''
    for j in range(0, 8):                   # try positions 0..7 in turn
        # if the column and both diagonals are free, place queen i here
        if col[j] == 1 and pos_diag[i-j+7] == 1 and nag_diag[i+j] == 1:
            row[i] = j
            col[j] = 0                      # update the state lists
            pos_diag[i-j+7] = 0
            nag_diag[i+j] = 0
            if i < 7:
                do_queen(i+1)               # could count up or down
            else:
                output()                    # a complete solution: print it
            col[j] = 1                      # restore the previous state
            pos_diag[i-j+7] = 1
            nag_diag[i+j] = 1

if __name__ == '__main__':
    col = []                                # per column: 1 if the column is free, 0 if a queen occupies it
    row = []                                # per row: the column of that row's queen; changes as the search runs and is printed directly
    pos_diag = []                           # positive diagonals: i-j is constant, -7..7, shifted by +7 to 0..14
    nag_diag = []                           # negative diagonals: i+j is constant, 0..14
    count = 0
    for index in range(0, 8):               # initialization
        col.append(1)
        row.append(0)
    for index in range(0, 15):
        pos_diag.append(1)
        nag_diag.append(1)
    do_queen(0)                             # start the recursion at queen 0 and count up; starting at 7 and counting down also works
    print 'Totally have %d solutions!' % count
```
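The same backtracking idea can be written without module-level state by keeping the occupied columns and diagonals in sets and yielding each solution. A sketch assuming Python 3; solve_queens and its local names are hypothetical.

```python
def solve_queens(n=8):
    """Yield every solution as a list where solution[row] is the queen's column."""
    cols, diag_down, diag_up = set(), set(), set()
    placement = []

    def place(row):
        if row == n:
            yield list(placement)
            return
        for col in range(n):
            if col in cols or (row - col) in diag_down or (row + col) in diag_up:
                continue                     # square is attacked
            cols.add(col)
            diag_down.add(row - col)
            diag_up.add(row + col)
            placement.append(col)
            yield from place(row + 1)
            placement.pop()                  # backtrack: undo the placement
            cols.discard(col)
            diag_down.discard(row - col)
            diag_up.discard(row + col)

    yield from place(0)

solutions = list(solve_queens(8))
print(solutions[0])
print('Totally have %d solutions!' % len(solutions))
```

For the 8*8 board this finds the well-known total of 92 solutions.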


CDays-2

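The grep example in the text does not handle subdirectories, and opening a directory for reading raises an error. The reader is therefore asked to modify the example so that it handles subdirectories, and then to embed the resulting cdcGrep() into pycdc-v0.5.py to complete this version of PyCDC. Hint: check each entry first, and if it is a subdirectory, call cdcGrep() recursively.

cdcGrep() could be modified as follows:

```
import os
import pickle

def cdcGrep(cdcpath, keyword):
    '''Keyword search over the disc info text files
    @note: uses the simplest built-in substring match to decide whether a line contains the keyword
    @param cdcpath: directory containing the *.cdc files
    @param keyword: the keyword to search for
    @return: the dict of matches, which is also dumped to the file searched.dump
    @todo: could be combined with a search engine for fuzzy search!
    '''
    expDict = {}
    filelist = os.listdir(cdcpath)              # list the entries in the directory
    for cdc in filelist:                        # loop over the entries
        fullpath = os.path.join(cdcpath, cdc)   # build the full path
        if os.path.isdir(fullpath):
            # subdirectory: search it recursively and merge its matches
            # (files with the same name in different directories share one key)
            expDict.update(cdcGrep(fullpath, keyword))
        else:
            cdcfile = open(fullpath)            # open the file
            expDict[cdc] = []                   # create the list before appending to it
            for line in cdcfile.readlines():    # loop over every line of the file
                if keyword in line:             # does the line contain the keyword?
                    #print line                 # print it
                    expDict[cdc].append(line)
            cdcfile.close()
    #print expDict
    pickle.dump(expDict, open("searched.dump", "w"))
    return expDict
```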

Source code

```
# coding= utf-8
'''pycdc-v0.5.py
Lovely Python -2 PyDay
@note: embeds cdcGrep() to complete this version of PyCDC
'''
import sys, cmd
from cdctools import *
class PyCDC(cmd.Cmd):
    def __init__(self):
        cmd.Cmd.__init__(self)                # initialize the base class
        self.CDROM = '/media/cdrom0'
        self.CDDIR = 'cdc/'
        self.prompt="(PyCDC)>"
        self.intro = '''PyCDC0.5 usage:
    dir <directory>   # set the directory for saving and searching; default is "cdc"
    walk <filename>   # set the disc info file name, using "*.cdc"
    find <keyword>    # search all .cdc files in the search directory and print the lines containing the keyword
    ?           # help
    EOF         # quit; Ctrl+D (Unix) or Ctrl+Z (Dos/Windows) also works
        '''

    def help_EOF(self):
        print "Quits the program"
    def do_EOF(self, line):
        sys.exit()

    def help_walk(self):
        print "walk cd and export into *.cdc"
    def do_walk(self, filename):
        if filename == "":filename = raw_input("Enter the cdc file name: ")
        print "Scanning the disc and saving to: '%s'" % filename
        cdWalker(self.CDROM,self.CDDIR+filename)

    def help_dir(self):
        print "Set the save/search directory"
    def do_dir(self, pathname):
        if pathname == "": pathname = raw_input("Enter the save/search directory: ")
        # print before assigning, so that the old value is still available
        print "Save/search directory: '%s' ; previously: '%s'" % (pathname,self.CDDIR)
        self.CDDIR = pathname

    def help_find(self):
        print "Search for a keyword"
    def do_find(self, keyword):
        if keyword == "": keyword = raw_input("Enter the search keyword: ")
        print "Searching for keyword: '%s'" % keyword
        cdcGrep(self.CDDIR,keyword)

if __name__ == '__main__':      # this way the module can be
    cdc = PyCDC()            # imported by other programs as well
    cdc.cmdloop()
```

Write a class that implements a simple stack. Data is handled in first in, last out (FILO) order. The main member functions are put(item), which pushes item onto the stack, and get(), which takes one item off the stack.

Source code

```
#coding:utf-8
'''cdays-2-exercise-2.py  A custom stack
    @note: using classes and objects
'''

class MyStack(object):
    '''MyStack
        A custom stack; the main operations are put(), get() and isEmpty()
    '''
    def __init__(self, max):
        '''
        Initialize the head pointer and clear the stack
        @param max: maximum length of the stack
        '''
        self.head = -1
        self.stack = list()
        self.max = max
        for i in range(self.max):
            self.stack.append(0)

    def put(self, item):
        '''
        Push item onto the stack
        @param item: the item to push
        '''
        if self.head >= self.max - 1:                   # is the stack full? (the last valid index is max-1)
            return 'Put Error: The Stack is Overflow!'  # report the overflow
        else:
            self.head += 1                              # not full: move the head pointer and store item
            self.stack[self.head] = item
            print 'Put %s Success' % item

    def get(self):
        '''
        Pop the item at the top of the stack
        @return: the top item
        '''
        if self.head < 0:                               # is the stack empty?
            return 'Get Error: The Stack is Empty!'     # report the empty stack
        else:
            self.head -= 1                              # pop: move the head pointer and return the old top
            return self.stack[self.head+1]

    def isEmpty(self):
        '''
        Report whether the stack is empty
        @return: True (empty) or False (not empty)
        '''
        if self.head < 0:                               # head is -1 when the stack is empty
            return True
        return False

if __name__ == "__main__":
    mystack = MyStack(100)
    mystack.put('a')
    mystack.put('b')
    print mystack.get()
    mystack.put('c')
    print mystack.get()
    print mystack.get()
    print mystack.get()
```

CDays-1

Automatically determine which encoding your own blog, or a friend's blog, uses.

Source code

```
#coding:utf-8
'''cdays-1-exercise-1.py
    @author: shengyan
    @version:$Id$
    @note: uses chardet and urllib2
    @see: chardet documentation: http://chardet.feedparser.org/docs/, urllib2 reference: http://docs.python.org/lib/module-urllib2.html
'''

import sys
import urllib2
import chardet

def blog_detect(blogurl):
    '''
    Detect the encoding of a blog
    @param blogurl: URL of the blog to check
    '''
    try:
        fp = urllib2.urlopen(blogurl)                       # try to open the given url
    except Exception, e:                                    # on failure, report the problem and return
        print e
        print 'download exception %s' % blogurl
        return 0
    blog = fp.read()                                        # read the content
    codedetect = chardet.detect(blog)["encoding"]           # detect the encoding
    print '%s\t<-\t%s' % (blogurl, codedetect)
    fp.close()                                              # close the connection
    return 1

if __name__ == "__main__":
    if len(sys.argv) == 1:
        print 'usage:\n\tpython cdays-1-exercise-1.py http://xxxx.com'
    else:
        blog_detect(sys.argv[1])
```


If the encoding is not utf-8, write a small program that automatically converts the given article to utf-8 and saves it.

Source code

```
#coding:utf-8
'''cdays-1-exercise-2.py  Getting familiar with chardet and urllib2
    @author: shengyan
    @version:$Id$
    @note: uses chardet and urllib2
    @see: chardet documentation: http://chardet.feedparser.org/docs/, urllib2 reference: http://docs.python.org/lib/module-urllib2.html
'''
import sys
import urllib2
import chardet

def blog_detect(blogurl):
    '''
    Detect the encoding of a blog and save the page as utf-8
    @param blogurl: URL of the blog to check
    '''
    try:
        fp = urllib2.urlopen(blogurl)                           # try to open the given url
    except Exception, e:                                        # on failure, report the problem and return
        print e
        print 'download exception %s' % blogurl
        return 0
    blog = fp.read()                                            # read the content
    fp.close()                                                  # close the connection
    codedetect = chardet.detect(blog)["encoding"]               # detect the encoding
    if codedetect and codedetect.lower() != 'utf-8':            # not utf-8? (detect may also return None)
        try:
            blog = unicode(blog, codedetect)                    # then try to convert
            #print blog
            blog = blog.encode('utf-8')
        except (UnicodeError, LookupError):
            print u"bad unicode encode try!"
            return 0
    filename = '%s_utf-8' % blogurl[7:]                         # save to a file (assumes the url starts with 'http://')
    filename = filename.replace('/', '_')
    open(filename, 'w').write('%s' % blog)
    print 'save to file %s' % filename
    return 1

if __name__ == "__main__":
    if len(sys.argv) == 1:
        print 'usage:\n\tpython cdays-1-exercise-2.py http://xxxx.com'
    else:
        blog_detect(sys.argv[1])
```
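The conversion itself is a decode followed by an encode. The sketch below isolates that step and assumes Python 3, where downloaded data is bytes and text is str. It takes the source encoding as a parameter instead of calling chardet, so it runs without third-party packages; to_utf8 is a hypothetical name.

```python
def to_utf8(data, source_encoding):
    """Re-encode `data` (bytes in `source_encoding`) as UTF-8 bytes."""
    if source_encoding.lower().replace('_', '-') == 'utf-8':
        return data                       # already UTF-8, nothing to do
    text = data.decode(source_encoding)   # raises UnicodeDecodeError on bad input
    return text.encode('utf-8')

original = '你好, blog'.encode('gb2312')
converted = to_utf8(original, 'gb2312')
print(converted.decode('utf-8'))
```

In a full program, the encoding reported by chardet.detect would be passed as source_encoding after checking that it is not None.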


CDays0

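Following the software release process and the coding conventions for software development, revise the programs you wrote in earlier chapters and release them. Also, besides epydoc, which other good Python documentation generators can you find?

Steps:

Write an epydoc configuration file, for example cdays0-epydoc.cfg.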

[epydoc] 
# Epydoc section marker (required by ConfigParser)

# Information about the project.
name: MyStack
url: http://wiki.woodpecker.org.cn/moin/ObpLovelyPython

# The list of modules to document.  Modules can be named using
# dotted names, module filenames, or package directory names.
# This option may be repeated.
modules:  ./cdays0-exercise-1.py

# Write html output to the directory "apidocs"
output: html
target: apidocs/

# Include all automatically generated graphs.  These graphs are
# generated using Graphviz dot.
graph: all
dotpath: /usr/bin/dot

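Run epydoc --config cdays0-epydoc.cfg in a terminal to generate the documentation. Other Python documentation generators besides epydoc include pydoc (part of the standard library), Sphinx, and Doxygen.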


CDays1

Implement the following and optimize it as far as possible: walk all files under a given directory and find the three files that take up the most space.

Source code

```
#coding:utf-8
'''cdays+1-exercise-1.py
    @note: uses os.stat to get file information and os.walk to traverse the tree
    @see: help(os)
    @author: shengyan
    @version: $Id$
'''
import sys
import os

def get_top_three(path):
    ''' Get the three largest files under the given path
    @param path: the path to search
    @return: a list whose items are (size, filename)
    '''
    all_file = {}
    for root, dirs, files in os.walk(path):             # walk path
        for onefile in files:
            fname = os.path.join(root, onefile)         # full name of the current file
            fsize = os.stat(fname).st_size              # size of the current file
            if fsize in all_file:                       # group the files by size
                all_file[fsize].append(fname)
            else:
                all_file[fsize] = [fname]
    fsize_key = all_file.keys()                         # all distinct file sizes
    fsize_key.sort()                                    # sort ascending
    result = []
    for size in reversed(fsize_key[-3:]):               # up to three largest sizes, largest first (safe with fewer than three)
        for j in all_file[size]:                        # collect the files of that size
            result.append((size, j))
    return result[:3]                                   # return the first three

if __name__ == "__main__":
    if len(sys.argv) == 1:
        print 'usage:\n\tpython cdays+1-exercise-1.py path'
    else:
        abs_path = os.path.abspath(sys.argv[1])         # get the absolute path
        if not os.path.isdir(abs_path):                 # check that the path is an existing directory
            print '%s does not exist or is not a directory' % abs_path
        else:
            top = get_top_three(abs_path)
            for (s, f) in top:
                print '%s\t->\t%s' % (f, s)
```
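Since only the three largest files are needed, sorting every size is more work than necessary. One possible optimization, given here as a sketch and not as the book's answer, is heapq.nlargest. It assumes Python 3; top_files and the demo file names are hypothetical, and the demo files are created in a temporary directory.

```python
import heapq
import os
import tempfile

def top_files(path, count=3):
    """Return up to `count` (size, filename) pairs for the largest files under `path`."""
    sizes = []
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                sizes.append((os.path.getsize(full), full))
            except OSError:
                continue                 # unreadable or vanished file: skip it
    return heapq.nlargest(count, sizes)

# Demo on a throwaway directory with files of known sizes.
demo_dir = tempfile.mkdtemp()
for name, size in [('a.bin', 10), ('b.bin', 300), ('c.bin', 20), ('d.bin', 5)]:
    with open(os.path.join(demo_dir, name), 'wb') as f:
        f.write(b'x' * size)
top = top_files(demo_dir)
for size, name in top:
    print('%s\t->\t%s' % (name, size))
```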


Using ConfigParser, store the result of the previous exercise in the file cdays+1-result.txt, following the format of cdays+1-my.ini.

Contents of cdays+1-my.ini:

[Number]
filesize = somefilesize
filename = somefilename

Source code

```
#coding:utf-8
'''cdays+1-exercise-2.py
    @note: uses ConfigParser to handle the ini format
    @see: documentation at http://pydoc.org/2.4.1/ConfigParser.html, another example at http://effbot.org/librarybook/configparser-example-1.py
    @author: shengyan
    @version:$Id$
'''
import os
import sys
from ConfigParser import RawConfigParser

def iniTT(size_file):
    ''' Store size_file in .ini format
    '''
    cfg = RawConfigParser()
    print size_file
    index = 1
    for (s, f) in size_file:
        cfg.add_section("%d" % index)                   # add a section
        cfg.set("%d" % index, 'Filename', f)            # set Filename and its value in that section
        cfg.set("%d" % index, 'FileSize', s)            # set FileSize and its value in that section
        index += 1

    cfg.write(open('cdays+1-result.txt',"w"))

def gtt(path):
    ''' Get the three largest files under the given path
    @param path: the path to search
    @return: a list whose items are (size, filename)
    '''
    all_file = {}
    for root, dirs, files in os.walk(path):             # walk path
        for onefile in files:
            fname = os.path.join(root, onefile)         # full name of the current file
            fsize = os.stat(fname).st_size              # size of the current file
            if fsize in all_file:                       # group the files by size
                all_file[fsize].append(fname)
            else:
                all_file[fsize] = [fname]
    fsize_key = all_file.keys()                         # all distinct file sizes
    fsize_key.sort()                                    # sort ascending
    result = []
    for size in reversed(fsize_key[-3:]):               # up to three largest sizes, largest first (safe with fewer than three)
        for j in all_file[size]:                        # collect the files of that size
            result.append((size, j))
    return result[:3]                                   # return the first three

if __name__ == "__main__":
    if len(sys.argv) == 1:
        print 'usage:\n\tpython cdays+1-exercise-2.py path'
    else:
        abs_path = os.path.abspath(sys.argv[1])
        if not os.path.isdir(abs_path):
            print '%s does not exist or is not a directory' % abs_path
        else:
            #from cdays+1-exercise-1 import get_top_three as gtt
            iniTT(gtt(abs_path))
```
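To check the INI output without touching the disk, the configuration can be written to an in-memory buffer and read back. A sketch assuming Python 3, where the module is named configparser; to_ini and the sample paths are hypothetical.

```python
import configparser
import io

def to_ini(size_file):
    """Render (size, filename) pairs as INI text, one numbered section per file."""
    cfg = configparser.RawConfigParser()
    for index, (size, name) in enumerate(size_file, start=1):
        section = str(index)
        cfg.add_section(section)
        cfg.set(section, 'filesize', str(size))
        cfg.set(section, 'filename', name)
    buffer = io.StringIO()
    cfg.write(buffer)
    return buffer.getvalue()

text = to_ini([(300, '/tmp/b.bin'), (20, '/tmp/c.bin')])
print(text)

# Reading the text back recovers the stored values.
check = configparser.RawConfigParser()
check.read_string(text)
print(check.get('1', 'filesize'))   # 300
```

Note that option names are lowercased by default, so 'Filename' and 'FileSize' in the listing above end up as filename and filesize in the written file.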


CDays2

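If cdctools.py is not copied into the webapps directory of the Karrigell instance, can index.ks still import it?
- Yes. index.ks can import cdctools.py without copying it into the webapps directory, in either of these ways:
  - Add the directory containing cdctools.py to the PYTHONPATH environment variable
  - Modify sys.path dynamically in the program:
    ```
    # -*- coding: utf-8 -*-

    import sys

    # add the directory containing cdctools.py to sys.path
    sys.path.append('/home/shengyan/workspace/obp/CDays/cdays2/')
    from cdctools import *
    # .......
    ```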

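After this chapter's introduction to Karrigell, build a simple web guestbook. Use Karrigell_QuickForm to submit messages and display them.

Steps:

Download karrigell and unpack it. It runs with the default settings, but usually you edit Karrigell.ini under conf/ and set port=8081 to use port 8081, then save.
Copy msg into webapps/, and add the link <a href='msg/'> Message</a> to index.pih.

Edit index.ks in msg to implement the required functionality.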

```
# -*- coding: utf-8 -*-

import os,sys
import pickle            # the handy serialization module
from HTMLTags import *  # Karrigell's page output support module
from Karrigell_QuickForm import Karrigell_QuickForm as KQF

def _htmhead(title):
    '''Default page header
    @note: factored into its own function for reuse; by Karrigell convention, a leading "_" in the name marks a function that is not served as a page
    @param title: page title
    @return: standard HTML code
    '''
    htm = """<html><HEAD>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>%s</title></HEAD>
<body>"""%title
    return htm
## default page footer
htmfoot="""
<h5>design by: lizzie
    powered by : <a href="http://python.org">Python</a> +
 <a href="http://karrigell.sourceforge.net"> KARRIGELL 2.4.0</a>
</h5>
</body></html>"""

def index(**args):
    '''Default main page
    @note: uses a simple form and links to give the existing functionality a web interface
    @param args: arbitrary keyword arguments
    @return: a standard HTML page
    '''
    print _htmhead("Leave Messages")
    p = KQF('fm_message','POST',"index","Message")
    p.addHtmNode('text','yname','Name',{'size':20,'maxlength':20})
    p.addTextArea('Words','10','90')
    p.addGroup(["submit","btn_submit","Submit","btn"])
    p.display()

    if 0 == len(QUERY):
        pass
    else:
        if "Submit" in QUERY['btn_submit']:
            yname = QUERY['yname']
            ywords = QUERY['Words']
            if 0 == len(ywords):
                print H3("please say something!")
            else:
                if 0 == len(yname):
                    yname = 'somebody'
                try:
                    msg = pickle.load(open("message.dump"))   # load the earlier messages
                except:
                    msg = []                                  # no message file yet
                msg.append((yname, ywords))
                index = len(msg)-1
                while index >= 0:                             # show the newest message first
                    print H5(msg[index][0]+' says: ')
                    print P('------ '+msg[index][1])
                    index -= 1
                pickle.dump(msg,open("message.dump","w"))
        else:
            pass
    print htmfoot
```

cd to the karrigell directory and run python Karrigell.py. Then enter localhost:8081 in the browser's address bar to see the page, and click the Message link to reach the guestbook.


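Think about it: how could the idea raised in this day's chapter, accumulating search results, be implemented? How can a search quickly recognize that the same keyword was searched before and output the earlier results directly, without actually opening the CD info files to match again?

Steps:

Record the search history. The next time a keyword is queried, look it up in the history first: if it is found, return the stored result directly; if not, run the full scan as before and add the new keyword's results to the history.
1. Modify the cdcGrep function in cdctools.py so that it looks up and updates the history file; see the code for details.
2. Test from the command line. The first search for the keyword EVA gives:
{'z.MCollection.29.cdc': [], 'mfj-00.cdc': [], 'MCollec.39.cdc': [], 'z.Animation.00.cdc': ['[L:\\mAnimi\\Gainax\\EVAalbumESP]\r\n'], 'z.MFC.pop.02.cdc': []}
The valid matches are then added to the file history_search.dump. Searching for the same keyword again gives:
{'z.Animation.00.cdc': ['[L:\\mAnimi\\Gainax\\EVAalbumESP]\r\n']}
Because the export format is the same, the page needs no changes.

Source code: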

```
# -*- coding: utf-8 -*-
import os
import pickle

HISTORY_SEARCH = './history_search.dump'
def cdcGrep(cdcpath,keyword):
    '''Keyword search over the disc info text files
    @note: uses the simplest built-in substring match to decide whether a line contains the keyword
    @param cdcpath: directory containing the *.cdc files; before running, set it in __main__ to where your cdc data is stored
    @param keyword: the keyword to search for
    @return: the matches are collected in a dict and dumped to the file searched.dump
    @todo: could be combined with a search engine for fuzzy search!
    '''
    expDict = {}
    print cdcpath
    try:
        h_search = pickle.load(open(HISTORY_SEARCH))
    except Exception:                       # no usable history file yet
        h_search = {}

    if keyword in h_search:                 # keyword seen before: reuse the history; the export format is unchanged
        for (c, l) in h_search[keyword]:
            expDict.setdefault(c, []).append(l)
        pickle.dump(expDict,open("searched.dump","w"))
        return

    filelist = os.listdir(cdcpath)          # list the files in the directory
    for cdc in filelist:                    # loop over the file list
        if cdc.endswith(".cdc"):
            cdcfile = open(os.path.join(cdcpath, cdc))  # build the file path and open the file
            expDict[cdc]=[]
            for line in cdcfile.readlines():    # loop over every line of the file
                if keyword in line:             # does the line contain the keyword?
                    #print line                  # print it
                    expDict[cdc].append(line)
                    # save the result for keyword in the form {keyword: [(cdc, line)]}
                    h_search.setdefault(keyword, []).append((cdc, line))
    #print expDict
    pickle.dump(expDict,open("searched.dump","w"))
    pickle.dump(h_search, open(HISTORY_SEARCH, 'w'))
```
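Stripped of the file handling, the idea is a dictionary that maps each keyword to its earlier results. The sketch below assumes Python 3; SearchCache and slow_search are hypothetical names, and slow_search merely stands in for the scan over the .cdc files.

```python
import pickle

class SearchCache:
    """Remember results per keyword so that a repeated search skips the slow scan."""
    def __init__(self, search_func):
        self._search = search_func
        self._history = {}

    def find(self, keyword):
        if keyword not in self._history:
            self._history[keyword] = self._search(keyword)   # slow path, first time only
        return self._history[keyword]

    def dumps(self):
        return pickle.dumps(self._history)   # could be written to a history file

calls = []
def slow_search(keyword):
    """Hypothetical stand-in for scanning every .cdc file."""
    calls.append(keyword)
    return {'z.Animation.00.cdc': ['line containing %s' % keyword]}

cache = SearchCache(slow_search)
first = cache.find('EVA')
second = cache.find('EVA')
print(first == second, len(calls))   # True 1
```

One trade-off of any such cache: results go stale when new .cdc files are added, so the history has to be cleared or refreshed after each walk.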

CDays3

Once you are familiar with threads, use Lock and RLock for simple synchronization between threads: have 10 threads increment the same shared variable, with locking to guarantee that the result is correct.

Source code

```
#coding:utf-8
'''cdays+3-exercise-1.py  Simple thread synchronization with Thread and RLock
    @note: Thread, RLock(http://docs.python.org/lib/rlock-objects.html)
    @see: see also http://linuxgazette.net/107/pai.html
    @author: shengyan
    @version:$Id$
'''
from threading import Thread
from threading import RLock
import time

class myThread(Thread):
    '''myThread
        A custom thread; several threads access one shared variable
    '''
    def __init__(self, threadname):
        Thread.__init__(self, name = threadname)

    def run(self):
        global share_var            # the shared global variable
        lock.acquire()              # acquire the lock
        share_var += 1              # modify the shared variable
        #time.sleep(2)
        print share_var
        lock.release()              # release the lock

if __name__ == "__main__":
    share_var = 0
    lock = RLock()
    threadlist = []

    for i in range(10):             # create 10 threads
        my = myThread('Thread%d' % i)
        threadlist.append(my)
    for i in threadlist:            # start the 10 threads
        i.start()
```
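The same exercise with a plain Lock, written as a sketch for Python 3 (an assumption). Each thread increments many times to make a missing lock more likely to show, and the main thread joins all workers before reading the result. The names worker and counter are hypothetical.

```python
import threading

counter = 0
lock = threading.Lock()

def worker(times):
    global counter
    for _ in range(times):
        with lock:              # released automatically, even if an exception occurs
            counter += 1

threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()                    # wait for all workers before reading the result
print(counter)                  # 10000
```

An RLock would work here as well; it differs from Lock in that the thread holding it may acquire it again without blocking.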


Use Queue to synchronize multiple threads. For example, ten input threads read strings from the terminal, and ten output threads take the strings in turn and print them to the screen.

Source code

```
#coding:utf-8
'''cdays+3-exercise-2.py  Keeping threads synchronized with Thread and Queue
    @see: Queue(http://doc.astro-wise.org/Queue.html)
    @author: shengyan
    @version:$Id$
'''
from threading import Thread
import Queue
import time

class Input(Thread):
    '''Input thread: reads a string from standard input and puts it on the queue
    '''
    def __init__(self, threadname):
        Thread.__init__(self, name = threadname)
    def run(self):
        some_string = raw_input('please input something for thread %s:' % self.getName()) # read a string
        global queue
        queue.put((self.getName(), some_string))            # put it on the queue
        #time.sleep(5)                                      # delay for a while

class Output(Thread):
    '''Output thread: takes a string from the queue and prints it to the screen
    '''
    def __init__(self, threadname):
        Thread.__init__(self, name = threadname)
    def run(self):
        global queue
        (iThread, something) = queue.get()                  # read from the queue
        print 'Thread %s get "%s" from Thread %s' % (self.getName(), something, iThread) # print it

if __name__ == "__main__":
    queue = Queue.Queue()                                   # create the Queue object
    inputlist = []
    outputlist = []
    for i in range(10):
        il = Input('InputThread%d' % i)                     # list of input threads
        inputlist.append(il)
        ol = Output('outputThread%d' % i)                   # list of output threads
        outputlist.append(ol)
    for i in inputlist:
        i.start()                                           # start the input threads one by one
        i.join()                                            # wait for each to finish
    for i in outputlist:
        i.start()                                           # start the output threads one by one
        #i.join()
```
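The queue takes care of the hand-over between threads, so consumers may even start before producers. The sketch below assumes Python 3, where the module is named queue, and replaces terminal input with generated strings so that it runs unattended; producer, consumer, and received are hypothetical names.

```python
import queue
import threading

q = queue.Queue()
received = []
received_lock = threading.Lock()

def producer(name):
    q.put((name, 'message from %s' % name))

def consumer():
    sender, text = q.get()      # blocks until an item is available
    with received_lock:
        received.append((sender, text))

names = ['InputThread%d' % i for i in range(10)]
consumers = [threading.Thread(target=consumer) for _ in range(10)]
producers = [threading.Thread(target=producer, args=(name,)) for name in names]
for t in consumers + producers:   # consumers start first and simply wait
    t.start()
for t in consumers + producers:
    t.join()
print(len(received))              # 10
```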


In Python, an Event is used for communication between threads and is based on an internal signal flag. Modify the program from the first exercise so that multiple threads increment the same shared variable using this signalling mechanism.

Source code

```
#coding:utf-8
'''cdays+3-exercise-3.py  Simple communication between threads with Thread and Event
    @see: Event(http://docs.python.org/lib/event-objects.html)
    @author: shengyan
    @version:$Id$
'''
from threading import Thread
from threading import Event
import time

class myThread(Thread):
    '''myThread
        A custom thread
    '''
    def __init__(self, threadname):
        Thread.__init__(self, name = threadname)

    def run(self):
        global event
        global share_var
        # Note: the isSet()/clear() pair is not atomic, so this demonstrates Event
        # signalling and is not a strict mutual-exclusion scheme.
        if event.isSet():           # check the event's flag
            event.clear()           # if it is set, clear it
            event.wait()            # and wait until another thread sets it again
            #time.sleep(2)
            share_var += 1          # modify the shared variable
            print '%s ==> %d' % (self.getName(), share_var)
        else:
            share_var += 1          # not set: modify the variable directly
            print '%s ==> %d' % (self.getName(), share_var)
            #time.sleep(1)
            event.set()             # set the flag

if __name__ == "__main__":
    share_var = 0
    event = Event()                 # create the Event object
    event.set()                     # set the internal flag to true
    threadlist = []

    for i in range(10):             # create 10 threads
        my = myThread('Thread%d' % i)
        threadlist.append(my)
    for i in threadlist:            # start the 10 threads
        i.start()
```
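One way to make the hand-over deterministic, shown here as a sketch and not as the book's answer, is to give every thread its own Event: thread i waits for its event, increments the variable, and then sets the event of thread i+1. It assumes Python 3; worker, events, and order are hypothetical names.

```python
import threading

share_var = 0
order = []
N = 10
# events[i] is set when it is thread i's turn; events[N] signals completion.
events = [threading.Event() for _ in range(N + 1)]

def worker(i):
    global share_var
    events[i].wait()            # block until the previous thread hands over
    share_var += 1
    order.append(i)
    events[i + 1].set()         # hand over to the next thread

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in reversed(threads):     # the start order does not matter
    t.start()
events[0].set()                 # let thread 0 go first
events[N].wait()                # wait for the last hand-over
for t in threads:
    t.join()
print(share_var, order)
```

Because only one thread's event is set at any time, the increments never overlap, at the cost of running the threads strictly one after another.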


ObpLovelyPython/LpyAttAnswerCdays (last edited 2009-12-25 07:08:53 by localhost)