-- 61.182.251.99 [2004-09-23 16:58:36]

# Description

Reading Lines with Continuation Characters

## Problem

You have a file that includes long logical lines split over two or more physical lines, with backslashes to indicate that a continuation line follows. You want to process a sequence of logical lines, rejoining those split lines.

## Solution

As usual, a class is the right way to wrap this functionality in Python 2.1:

```
class LogicalLines:

    def __init__(self, fileobj):

        # Ensure that we get a line-reading sequence in the best way possible:
        import xreadlines
        try:
            # Check if the file-like object has an xreadlines method
            self.seq = fileobj.xreadlines()
        except AttributeError:
            # No, so fall back to the xreadlines module's implementation
            self.seq = xreadlines.xreadlines(fileobj)

        self.phys_num = 0  # current index into self.seq (physical line number)
        self.logi_num = 0  # current index into self (logical line number)

    def __getitem__(self, index):
        if index != self.logi_num:
            raise TypeError, "Only sequential access supported"
        self.logi_num += 1
        result = []
        while 1:
            # Intercept IndexError, since we may have a last line to return
            try:
                # Let's see if there's at least one more line in self.seq
                line = self.seq[self.phys_num]
            except IndexError:
                # self.seq is finished, so break the loop if we have any
                # more data to return; else, reraise the exception, because
                # if we have no further data to return, we're finished too
                if result: break
                else: raise
            self.phys_num += 1
            if line.endswith('\\\n'):
                # Continuation marker: keep the text, drop the backslash
                # and the newline, and go on to the next physical line
                result.append(line[:-2])
            else:
                # No continuation: this physical line ends the logical line
                result.append(line)
                break
        # Return the physical pieces joined into one logical line
        return ''.join(result)

# Here's an example function, showing off usage:
def show_logicals(fileob, numlines=5):
    ll = LogicalLines(fileob)
    for l in ll:
        print "Log#%d, phys# %d: %s" % (
            ll.logi_num, ll.phys_num, repr(l))
        if ll.logi_num>numlines: break

if __name__=='__main__':
    from cStringIO import StringIO
    ff = StringIO(
r"""prima \
seconda \
terza
quarta \
quinta
sesta
settima \
ottava
""")
    show_logicals( ff )
```

## Discussion

This is another sequence-bunching problem, like Recipe 4.9. In Python 2.1, a class wrapper is the most natural approach to getting reusable code for sequence-bunching tasks. We need to support the sequence protocol ourselves and handle the sequence protocol in the sequence we wrap. In Python 2.1 and earlier, the sequence protocol is as follows: a sequence must be indexable by successively larger integers (0, 1, 2, ...), and it must raise an IndexError as soon as an integer that is too large is used as its index. So, if we need to work with Python 2.1 and earlier, we must behave this way ourselves and be prepared for just such behavior from the sequence we are wrapping.
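The older protocol described above can be seen in isolation in a minimal sketch. The class name `Squares` is hypothetical and not part of the recipe. The code is written so that it also runs on current Python versions, where the `for` statement still falls back to `__getitem__` when an object defines no `__iter__`.

```python
class Squares:
    """Hypothetical example: iterable through __getitem__ alone."""

    def __init__(self, count):
        self.count = count

    def __getitem__(self, index):
        # The for loop asks for index 0, 1, 2, ... in turn.
        if index >= self.count:
            # Raising IndexError is how the sequence says "no more items".
            raise IndexError(index)
        return index * index


collected = []
for value in Squares(4):
    collected.append(value)
print(collected)
```

`LogicalLines` plays both roles at once: it answers successive indices like `Squares` does, and it drives the wrapped sequence the same way the `for` loop drives `Squares`.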

In Python 2.2, thanks to iterators, the sequence protocol is much simpler. A call to the next method of an iterator yields its next item, and the iterator raises a StopIteration when it's done. Combined with a simple generator function that returns an iterator, this makes sequence bunching and similar tasks far easier:

```
from __future__ import generators

def logical_lines(fileobj):
    logical_line = []
    for physical_line in fileobj:
        if physical_line.endswith('\\\n'):
            logical_line.append(physical_line[:-2])
        else:
            yield ''.join(logical_line)+physical_line
            logical_line = []
    # The file ended on a continuation line: emit what was collected
    if logical_line: yield ''.join(logical_line)
```
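As a rough illustration of the iterator protocol at work, the sketch below restates the generator for current Python versions and drives it by hand. It is an adaptation under stated assumptions, not the recipe's own code: `io.StringIO` stands in for a real file, the built-in `next()` replaces the explicit call to the iterator's next method, and the names `pending`, `sample`, `first`, `rest`, and `exhausted` are made up for the example.

```python
import io


def logical_lines(fileobj):
    # Same logic as the recipe's generator, restated so the sketch is self-contained.
    pending = []
    for physical_line in fileobj:
        if physical_line.endswith('\\\n'):
            # Drop the trailing backslash and newline, keep collecting.
            pending.append(physical_line[:-2])
        else:
            yield ''.join(pending) + physical_line
            pending = []
    if pending:
        # Input ended on a continuation line.
        yield ''.join(pending)


sample = io.StringIO("prima \\\nseconda \\\nterza\nquarta \\\nquinta\nsesta\n")
lines = logical_lines(sample)

first = next(lines)   # ask the iterator for one item
rest = list(lines)    # consume whatever is left

try:
    next(lines)       # nothing left: the iterator signals the end
    exhausted = False
except StopIteration:
    exhausted = True

print(repr(first), rest, exhausted)
```

A `for` loop performs the same steps implicitly: it requests items one at a time and stops quietly when `StopIteration` is raised.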

See also: Perl Cookbook Recipe 8.1.

PyCkBk-4-10 (last edited 2009-12-25 07:09:08 by localhost)
