Python Performance Tuning Notes
::-- Roka [2007-04-26 14:46:55]
1. Overview
TODO
1.1. String concatenation
(1)
Ordinary code:
s = ""
for substring in substrings:
    s += substring
High-performance code:
# join builds the result in a single pass
s = "".join(substrings)
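To check this claim on your own interpreter, the two approaches can be timed with the standard timeit module. The following is a minimal sketch written for Python 3 (an assumption, since the notes predate it); the function names concat_loop and concat_join and the sample data are made up for illustration, and the measured ratio will vary with interpreter version and input size.

```python
import timeit

def concat_loop(substrings):
    """Build the string by repeated +=."""
    s = ""
    for substring in substrings:
        s += substring
    return s

def concat_join(substrings):
    """Build the string with a single join call."""
    return "".join(substrings)

# Hypothetical sample data: 10,000 short strings.
substrings = [str(i) for i in range(10000)]

# Both approaches must produce the same result.
assert concat_loop(substrings) == concat_join(substrings)

loop_time = timeit.timeit(lambda: concat_loop(substrings), number=50)
join_time = timeit.timeit(lambda: concat_join(substrings), number=50)
print("loop: %.4f s, join: %.4f s" % (loop_time, join_time))
```

Timing both functions on the same data, after checking that they agree, keeps the comparison honest.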
(2)
Ordinary code:
s = ""
for x in somelist:
    s += someFunction(x)
High-performance code:
slist = [someFunction(x) for x in somelist]
s = "".join(slist)
(3)
Ordinary code:
out = "<html>" + head + prologue + query + tail + "</html>"
High-performance code:
# locals() supplies head, prologue, query and tail by name
out = "<html>%(head)s%(prologue)s%(query)s%(tail)s</html>" % locals()
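A variation on the same idea passes an explicit dictionary instead of locals(), which makes it clearer which names the template depends on. This is only a sketch: the render_page function and the sample values are hypothetical and not part of the original notes.

```python
def render_page(head, prologue, query, tail):
    """Assemble the page with a single %-formatting operation."""
    template = "<html>%(head)s%(prologue)s%(query)s%(tail)s</html>"
    # Explicit mapping of the names used by the template.
    parts = {"head": head, "prologue": prologue, "query": query, "tail": tail}
    return template % parts

page = render_page("<head></head>", "<p>intro</p>", "<p>q</p>", "<p>bye</p>")
print(page)
```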
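1.2. Loops
(1) A routine that converts words to uppercase:
Ordinary code:
newlist = []
for word in oldlist:
    newlist.append(word.upper())
High-performance code:
# map() is implemented in C, so it is fast. It does not disappear in
# Python 3 (Py3000), but there it returns an iterator instead of a list.
newlist = map(str.upper, oldlist)
# Or a list comprehension (Py >= 2.0)
newlist = [s.upper() for s in oldlist]
# Or a generator expression (Py >= 2.4); this is a lazy iterator, not a list
newlist = (s.upper() for s in oldlist)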
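The three forms differ in what they return, which matters once the result is used more than once. The sketch below assumes Python 3, where map() is lazy; the sample list is made up for illustration.

```python
oldlist = ["alpha", "beta", "gamma"]

# map() returns a lazy iterator in Python 3; wrap it in list() to get a list.
mapped = list(map(str.upper, oldlist))

# A list comprehension builds the whole list immediately.
comprehended = [s.upper() for s in oldlist]

# A generator expression produces items one at a time and can be consumed once.
generated = (s.upper() for s in oldlist)
first_pass = list(generated)
second_pass = list(generated)  # already exhausted, so this is empty

print(mapped, comprehended, first_pass, second_pass)
```

If the result will be indexed or iterated several times, the list comprehension is the safer choice; the generator form saves memory when a single pass is enough.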
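1.3. Object orientation
(1) If you cannot use map() or a list comprehension and must write a loop, avoid "dotted" names inside the loop, because each attribute lookup is repeated on every iteration:
upper = str.upper
newlist = []
append = newlist.append
# loop without dots
for word in oldlist:
    append(upper(word))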
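1.4. Local variables
(1) The ultimate trick: use local variables instead of global ones
def func(words):
    # every name used in the loop is now a local variable
    upper = str.upper
    newlist = []
    append = newlist.append
    for word in words:
        append(upper(word))
    return newlist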
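Whether hoisting the lookups pays off can be measured directly. The following is a rough sketch for Python 3 using timeit; the function names upper_dotted and upper_local and the input data are hypothetical, and on recent CPython versions the difference may be small.

```python
import timeit

words = ["spam", "eggs", "ham"] * 1000  # hypothetical input

def upper_dotted(words):
    newlist = []
    for word in words:
        newlist.append(word.upper())  # two attribute lookups per iteration
    return newlist

def upper_local(words):
    upper = str.upper          # looked up once
    newlist = []
    append = newlist.append    # looked up once
    for word in words:
        append(upper(word))
    return newlist

# Both versions must agree before their timings are compared.
assert upper_dotted(words) == upper_local(words)

for fn in (upper_dotted, upper_local):
    elapsed = timeit.timeit(lambda: fn(words), number=100)
    print("%s: %.4f s" % (fn.__name__, elapsed))
```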
1.5. Dictionaries
(1) Avoid an if test inside the loop:
Ordinary code:
wdict = {}
for word in words:
    if word not in wdict:
        wdict[word] = 0
    wdict[word] += 1
High-performance code:
# (Py < 2.x)
# the exception is raised only the first time each word is seen
wdict = {}
for word in words:
    try:
        wdict[word] += 1
    except KeyError:
        wdict[word] = 1

# (Py > 2.x)
wdict = {}
get = wdict.get
for word in words:
    wdict[word] = get(word, 0) + 1
If the dictionary values are objects or lists, you can also use the dict.setdefault method:
wdict.setdefault(key, []).append(newElement)
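As a small worked example of both idioms, the sketch below counts words with dict.get and groups them by first letter with dict.setdefault. The word list and the variable names counts and by_letter are made up for illustration.

```python
words = ["apple", "avocado", "banana", "apple", "blueberry"]

# Count occurrences with dict.get: a missing key counts as 0.
counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1

# Group words by first letter with dict.setdefault: a missing key
# is initialised to an empty list, which is then appended to.
by_letter = {}
for word in words:
    by_letter.setdefault(word[0], []).append(word)

print(counts)
print(by_letter)
```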
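1.6. Import
(1) An import inside a function can be cheaper than a module-level import: the module is loaded only when the function is first called, which reduces startup time.
(2) Make sure the module is imported only once.
# Lazy import: "pack" stands for the module to load on first use
pack = None

def parse_pack():
    global pack
    if pack is None:
        import pack
    ...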
1.7. Processing data in aggregate
(1) Avoid function calls inside loops
Ordinary code:
import time
x = 0
def doit(i):
    global x
    x = x + 1

numbers = range(100000)
t = time.time()
for i in numbers:
    doit(i)

print("%.3f" % (time.time() - t))
High-performance code:
import time
x = 0
def doit(numbers):
    global x
    # the loop now runs inside a single function call
    for i in numbers:
        x = x + 1

numbers = range(100000)
t = time.time()
doit(numbers)

print("%.3f" % (time.time() - t))
(What? It is actually more than 4 times faster!)
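The call overhead can also be isolated without a global variable, so that only the cost of the function call differs between the two versions. This is a variant of the listings above, not the original measurement: the names increment, call_per_item and loop_inside are hypothetical, the code assumes Python 3, and the speed-up will differ from the figure quoted above.

```python
import timeit

def increment(total):
    return total + 1

def call_per_item(numbers):
    """Call a function once per element."""
    total = 0
    for _ in numbers:
        total = increment(total)
    return total

def loop_inside(numbers):
    """Do the same work inline, inside one function call."""
    total = 0
    for _ in numbers:
        total = total + 1
    return total

numbers = range(100000)
assert call_per_item(numbers) == loop_inside(numbers) == 100000

for fn in (call_per_item, loop_inside):
    elapsed = timeit.timeit(lambda: fn(numbers), number=5)
    print("%s: %.3f s" % (fn.__name__, elapsed))
```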
1.8. Use xrange() instead of range()
# Measuring the performance using the profile module

def myFunc():
    b = []
    a = [[1, 2, 3], [4, 5, 6]]
    for x in range(len(a)):
        for y in range(len(a[x])):
            b.append(a[x][y])

import profile
profile.run("myFunc()", "myFunc.profile")
import pstats
pstats.Stats("myFunc.profile").sort_stats("time").print_stats()
Result:
Wed May 23 12:05:07 2007    myFunc.profile

         16 function calls in 0.001 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001    0.001    0.001 :0(setprofile)
        1    0.000    0.000    0.001    0.001 profile:0(myFunc())
        1    0.000    0.000    0.000    0.000 D:/Python25/measuringPerf.py:7(myFunc)
        6    0.000    0.000    0.000    0.000 :0(append)
        3    0.000    0.000    0.000    0.000 :0(range)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        3    0.000    0.000    0.000    0.000 :0(len)
        0    0.000             0.000          profile:0(profiler)
Now replace range() with xrange():
# Measuring the performance using the profile module

def myFunc():
    b = []
    a = [[1, 2, 3], [4, 5, 6]]
    for x in xrange(len(a)):
        for y in xrange(len(a[x])):
            b.append(a[x][y])

import profile
profile.run("myFunc()", "myFunc.profile")
import pstats
pstats.Stats("myFunc.profile").sort_stats("time").print_stats()
Result:
Wed May 23 12:05:59 2007    myFunc.profile

         13 function calls in 0.001 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001    0.001    0.001 :0(setprofile)
        1    0.000    0.000    0.001    0.001 profile:0(myFunc())
        1    0.000    0.000    0.000    0.000 D:/Python25/measuringPerf.py:7(myFunc)
        6    0.000    0.000    0.000    0.000 :0(append)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        3    0.000    0.000    0.000    0.000 :0(len)
        0    0.000             0.000          profile:0(profiler)
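Note that the number of function calls dropped from 16 to 13: the three range() calls no longer appear in the profile.
The CPU time used is the same, but this is the result of a single run only.
Notes:
(ncalls): number of calls.
(tottime): total time spent in the function, excluding sub-functions.
(cumtime): total time spent in the function, including sub-functions.
(percall): average time per call.
After all, xrange() is implemented entirely in C and, unlike range() in Python 2, does not build the whole list in memory. In Python 3, xrange() no longer exists and range() itself behaves this way.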
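For readers on Python 3, the same measurement can be sketched with cProfile, the C-implemented profiler that offers the same interface as profile. This is an adaptation rather than the original script: the function is renamed my_func, range() is used because xrange() is gone, and the report is written to an in-memory stream instead of a myFunc.profile file.

```python
import cProfile
import io
import pstats

def my_func():
    b = []
    a = [[1, 2, 3], [4, 5, 6]]
    for x in range(len(a)):
        for y in range(len(a[x])):
            b.append(a[x][y])
    return b

# Profile a single call; runcall returns the function's own result.
profiler = cProfile.Profile()
flattened = profiler.runcall(my_func)

# Write the report to an in-memory stream instead of a file on disk.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("time").print_stats()
report = stream.getvalue()
print(report)
```

As with the runs above, a single call this small mostly measures profiler overhead, so the call counts are more informative than the times.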