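This article is from the Python Cookbook.

The translation is solely for personal study and has nothing to do with any commercial copyright disputes.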

-- 大熊 [2004-10-25 05:41:38]

1. Description

12.6 Transforming an XML Document Using Python

Credit: David Ascher

1.1. Problem

12.6.1 Problem

You have an XML document that you want to tweak.

1.2. Solution

12.6.2 Solution

Suppose that you want to convert element attributes into child elements. A simple subclass of the XMLGenerator object gives you complete freedom in such XML-to-XML transformation tasks:

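    import sys
    from xml.sax import make_parser, saxutils

    class Tweak(saxutils.XMLGenerator):
        """Re-emit the document with each attribute turned into a child element."""

        def startElement(self, name, attrs):
            # Let the base class write the start tag, but without any attributes.
            super().startElement(name, {})
            # Emit one child element per attribute, in sorted name order.
            for attribute in sorted(attrs.keys()):
                # Call the base-class methods directly so the new child is not
                # routed back through this override; characters() also escapes
                # '&', '<' and '>' in the attribute value.
                super().startElement(attribute, {})
                self.characters(attrs[attribute])
                super().endElement(attribute)

    parser = make_parser()
    dh = Tweak(sys.stdout)            # write the transformed XML to stdout
    parser.setContentHandler(dh)
    parser.parse(sys.argv[1])         # path of the XML file to transform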
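To see what the transformation does to a concrete document, here is a minimal sketch that runs the same idea on an in-memory string instead of a file. The class name AttrsToChildren, the helper tweak and the sample document are hypothetical names chosen for this illustration, not part of the recipe, and the sketch assumes Python 3.

```python
import io
from xml.sax import parseString
from xml.sax.saxutils import XMLGenerator


class AttrsToChildren(XMLGenerator):
    """Hypothetical stand-in for the recipe's subclass (Python 3 assumed)."""

    def startElement(self, name, attrs):
        # Write the start tag without attributes.
        super().startElement(name, {})
        # Then write each attribute as a child element, sorted by name.
        for attribute in sorted(attrs.keys()):
            super().startElement(attribute, {})
            self.characters(attrs[attribute])
            super().endElement(attribute)


def tweak(xml_text):
    """Return the transformed document as a string (illustrative helper)."""
    out = io.StringIO()
    parseString(xml_text.encode("utf-8"), AttrsToChildren(out, encoding="utf-8"))
    return out.getvalue()


# Made-up sample input for illustration only.
SAMPLE = '<book id="42" lang="en"><title>SAX</title></book>'
result = tweak(SAMPLE)
print(result)
```

The id and lang attributes of book come out as child elements placed before the existing title child, and the base class adds an XML declaration at the top of the output.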

1.3. Discussion

12.6.3 Discussion

This particular recipe defines a Tweak subclass of the XMLGenerator class provided by the xml.sax.saxutils module. The only purpose of the subclass is to perform special handling of element starts while relying on its base class to do everything else. SAX is a nice and simple (after all, that's what the S stands for) API for processing XML documents. It defines various kinds of events that occur when an XML document is being processed, such as startElement and endElement.
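The event model is easier to picture with a handler that does nothing but record what the parser reports. The following is a small sketch, assuming Python 3; the EventLogger class and the sample document are hypothetical and only serve the illustration.

```python
from xml.sax import parseString
from xml.sax.handler import ContentHandler


class EventLogger(ContentHandler):
    """Hypothetical handler that records SAX events instead of writing XML."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # attrs is a dictionary-like object; copy it into a plain dict.
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # Skip whitespace-only text to keep the log short.
        if content.strip():
            self.events.append(("text", content))


logger = EventLogger()
parseString(b'<book id="42"><title>SAX</title></book>', logger)
for event in logger.events:
    print(event)
```

Each start tag, end tag and run of text arrives as a separate method call, in document order, which is why a handler only needs to override the calls it cares about.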

The key to understanding this recipe is to understand that Python's XML library provides a base class, XMLGenerator, which performs an identity transform. If you feed it an XML document, it will output an equivalent XML document. Using standard Python object-oriented techniques of subclassing and method override, you are free to specialize how the generated XML document differs from the source. The code above simply takes each element (attributes and their values are passed in as a dictionary-like object on startElement calls), relies on the base class to output the proper XML for the element (but omitting the attributes), and then writes an element for each attribute.
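The identity transform can be checked directly by using XMLGenerator as the content handler without subclassing it. This is a minimal sketch, assuming Python 3; the sample document is made up for the illustration.

```python
import io
from xml.sax import parseString
from xml.sax.saxutils import XMLGenerator

# Made-up sample input for illustration only.
source = '<book id="42"><title>SAX &amp; friends</title></book>'

out = io.StringIO()
# An unmodified XMLGenerator simply re-serializes every event it receives.
parseString(source.encode("utf-8"), XMLGenerator(out, encoding="utf-8"))
print(out.getvalue())
```

Apart from the XML declaration that the generator writes at the start, the output reproduces the input, including the escaped ampersand. Every override in a subclass is therefore a deliberate departure from that baseline.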

Subclassing the XMLGenerator class is a nice place to start when you need to tweak some XML, especially if your tweaks don't require you to change the existing parent-child relationships. For more complex jobs, you may want to explore some other ways of processing XML, such as minidom or pulldom. Or, if you're really into that sort of thing, you could use XSLT (see Recipe 12.5).
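For comparison, here is a rough sketch of the same attribute-to-child conversion done with minidom, assuming Python 3. The function name attributes_to_children and the sample document are hypothetical. Unlike the SAX version, this loads the whole document into memory, which is the price of being able to rearrange the tree freely.

```python
from xml.dom import minidom


def attributes_to_children(xml_text):
    """Illustrative helper: move every attribute into a child element."""
    doc = minidom.parseString(xml_text)
    # Snapshot the element list so elements created below are not revisited.
    for element in list(doc.getElementsByTagName("*")):
        # Remember the original first child so the new elements go before it.
        first = element.firstChild
        for name in sorted(element.attributes.keys()):
            child = doc.createElement(name)
            child.appendChild(doc.createTextNode(element.getAttribute(name)))
            element.insertBefore(child, first)
            element.removeAttribute(name)
    return doc.documentElement.toxml()


# Made-up sample input for illustration only.
converted = attributes_to_children(
    '<book id="42" lang="en"><title>SAX</title></book>'
)
print(converted)
```

Because the tree is fully built before any change is made, the same loop could just as easily move or delete whole subtrees, which is awkward to do from inside a stream of SAX events.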

1.4. See Also

12.6.4 See Also

Recipe 12.5 for various ways of driving XSLT from Python; Recipe 12.2, Recipe 12.3, and Recipe 12.4 for other uses of the SAX API.