文章来自《Python cookbook》.


-- Zoom.Quiet [2004-08-11 16:57:15]


2.7 Sorting a List of Objects by an Attribute of the Objects 通过对象的属性来排序对象的列表

Credit: Yakov Markovitch, Nick Perkins

问题 Problem

You have a list of objects that you need to sort according to one attribute of each object, as rapidly and portably as possible.


解决 Solution

In this case, the obvious approach is concise, but quite slow: 在这种情况下,明显的方法是简明的,但是缓慢的:

   1 def sort_by_attr_slow(seq, attr):
   2     def cmp_by_attr(x, y, attr=attr):
   3         return cmp(getattr(x, attr), getattr(y, attr))
   4     seq.sort(cmp_by_attr)

There is a faster way, with DSU:


   1 def sort_by_attr(seq, attr):
   2     import operator
   4     intermed = map(None, map(getattr, seq, (attr,)*len(seq)),
   5         xrange(len(seq)), seq)
   6     intermed.sort(  )
   7     return map(operator.getitem, intermed, (-1,)*len(intermed))
   9 def sort_by_attr_inplace(lst, attr):
  10     lst[:] = sort_by_attr(lst, attr)

讨论 Discussion

Sorting a list of objects by an attribute of each object is best done using the DSU idiom. Since this recipe uses only built-ins and doesn't use explicit looping, it is quite fast. Moreover, the recipe doesn't use any Python 2.0-specific features (such as zip or list comprehensions), so it can be used for Python 1.5.2 as well.

通过每个对象的属性来排序对象的list, 用DSU是最好的。因为这个配方仅仅使用内建的并且不使用显式的循环。它相当的快。而且,配方不使用任何python2.0相关的特性(诸如zip或者list内涵)。所以它也能背python1.5.2使用。

List comprehensions are neater, but this recipe demonstrates that you can do without them, if and when you desperately need portability to stone-age Python installations. Of course, the correct use of map can be tricky. Here, for example, when we build the auxiliary list intermed, we need to call the built-in function getattr on each item of sequence seq using the same string, attr, as the second argument in each case. To do this, in the inner call to map, we need a tuple in which attr is repeated as many times as there are items in seq. We build that tuple by the not immediately obvious expression:

List内涵是灵巧的,但是这个配方证明你不用他们也能做。特别是如果当你很失望的要移植他们到石器时代的python配置上。当然map的正确使用是聪明的。 这里,例如,当我们构造辅助list的时候,我们需要在使用相同的string, attr,的序列seq的每个条目上调用内建的函数getattr, 就象在每个案例中的第二个参数。


which is len(seq) repetitions of the one-item tuple (attr,), whose only item is exactly attr.

那个是单条目tuple (attr,)的len(seq)重复,它唯一的条目正是attr。

If you do want to use list comprehensions, that's easy, too. Just substitute the sort_by_attr of the recipe with the following alternative version:

如果你想使用list内涵,那也很容易。 仅仅用下面的可选版本来替换配方的sort_by_attr

   1 def sort_by_attr2(seq, attr):
   2     intermed = [(getattr(seq[i], attr), i, seq[i]) for i in xrange(len(seq))]
   3     intermed.sort(  )
   4     return [ tup[-1] for tup in intermed ]

However, if this piece of code is run in a speed-critical bottleneck of your program, you should carefully measure performance. map is often surprisingly fast when compared to list comprehensions, at least when no lambda is involved in the map. The difference in performance is not huge either way, so it's worth exploring only if this code is run in a speed-critical bottleneck.


Whether you use map or list comprehensions, the point of the DSU idiom is to gain both speed and flexibility (for example, making the sort a stable one, as we did here) when compared to passing a comparison function to the sort method. For a simpler application and a somewhat wider discussion of DSU, see Recipe 2.4.

是否你使用map或者list内涵,当比较传递给排序方法的一个比较函数,DSU的观点是速度和灵活性都要获得(例如,使排序稳定, 就象我们在这里做的一样。)。 对于一个简单的应用和一个更广泛的DSU讨论, 请看 PyCkBk-2-4--Recipe 2.4

Note that in addition to making the sort stable, putting the index i as the second item of each tuple in the auxiliary list intermed (through the insertion of xrange in the call to map or, more simply, that of i in the list-comprehension version sort_by_attr2) also serves a potentially crucial role here. It ensures that two objects (two items of seq) will never be compared, even if their values are the same for the attribute named attr, because even in that case their indexes will surely differ, and thus the lexicographic comparison of the tuples will never get all the way to comparing the tuples' last items (the objects). Avoiding object comparison may save us from extremely slow operations or even from attempting forbidden ones. For example, when we sort a list of complex numbers by their real attributes, in Python 2.0 or later, we will get an exception if we try to compare two complex numbers directly, as no ordering is defined on complex numbers.

注意, 另外使排序稳定,放置索引i作为在辅助list里每个tuple的第二个元素(尽管在xrange的插入调用map或者比i在list内涵的版本sort_by_attr2更简单)也在这里扮演了至关重要的角色。 它确保两个对象(seq的两个条目)将从不被比较,即使他们的值对于命名为attr的属性来说是相同的。因为甚至在那种情况下他们的索引也的确不通,并且,tuple的词典比较将从未得到所有的途径去比较tuple的作后的条目(对象)。避免对象比较可以把我们从极端慢的操作或者甚至于企图阻止它的困境中解救出来。例如,当我们通过真正的属性来排列一个复数的list,在python2.0或者以后的版本,如果我们企图去直接比较两个复数的话,我们将得到一个异常。因为在复数上,没有顺序的定义。

参考 See Also

PyCkBk-2-4--Recipe 2.4.

PyCkBk-2-7 (last edited 2009-12-25 07:14:02 by localhost)