-- 0.706 [2004-09-26 17:12:42]

# Converting Between Different Naming Conventions

Credit: Sami Hangaslammi

## Problem

You have a body of code whose identifiers use one of the common naming conventions to represent multiple words in a single identifier (CapitalizedWords, mixedCase, or under_scores), and you need to convert the code to another naming convention in order to merge it smoothly with other code.

## Solution

re.sub covers the two hard cases, converting underscore to and from the others:

```
import re

def cw2us(x): # capwords to underscore notation
    return re.sub(r'(?<=[a-z])[A-Z]|(?<!^)[A-Z](?=[a-z])',
        r"_\g<0>", x).lower()

def us2mc(x): # underscore to mixed-case notation
    return re.sub(r'_([a-z])', lambda m: (m.group(1).upper()), x)
```
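To see which capitals the pattern treats as word boundaries, the following sketch lists the match positions and runs a round trip. The helper name `boundaries` and the constant `BOUNDARY` are hypothetical, introduced only for illustration; the two converters are restated so the block runs on its own.

```python
import re

# Same pattern as in cw2us: a capital starts a new word if it follows a
# lowercase letter, or if it is not at the start and precedes a lowercase letter.
BOUNDARY = re.compile(r'(?<=[a-z])[A-Z]|(?<!^)[A-Z](?=[a-z])')

def boundaries(name):
    """Hypothetical helper: indexes where cw2us would insert an underscore."""
    return [m.start() for m in BOUNDARY.finditer(name)]

def cw2us(x):
    return BOUNDARY.sub(r"_\g<0>", x).lower()

def us2mc(x):
    return re.sub(r'_([a-z])', lambda m: m.group(1).upper(), x)

print(boundaries("SetXYPosition"))       # [3, 5]
print(us2mc("set_xy_position"))          # setXyPosition
print(cw2us(us2mc("set_xy_position")))   # set_xy_position
```

Note that the round trip starting from the underscore form is lossless here, while a round trip starting from a name with an acronym is not: `XY` comes back as `Xy`, because the underscore form does not record the original capitalization.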

Mixed-case to underscore is just like capwords to underscore (the case-lowering of the first character becomes redundant, but it does no harm):

```
def mc2us(x): # mixed-case to underscore notation
    return cw2us(x)
```

Underscore to capwords can similarly exploit the underscore to mixed-case conversion, but it needs an extra twist to uppercase the start:

```
def us2cw(x): # underscore to capwords notation
    s = us2mc(x)
    return s[0].upper()+s[1:]
```
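One edge case worth knowing: indexing `s[0]` raises `IndexError` on an empty string. The sketch below restates the two functions and adds a hypothetical variant, `us2cw_safe` (not part of the recipe), that uses slicing so the empty string passes through unchanged.

```python
import re

def us2mc(x):
    return re.sub(r'_([a-z])', lambda m: m.group(1).upper(), x)

def us2cw(x):
    s = us2mc(x)
    return s[0].upper() + s[1:]

def us2cw_safe(x):
    # Hypothetical variant: s[:1] is '' for an empty string, so no IndexError.
    s = us2mc(x)
    return s[:1].upper() + s[1:]

print(us2cw("print_html"))       # PrintHtml
print(repr(us2cw_safe("")))      # ''
print(us2cw_safe("_private"))    # Private
```

The last call shows a second caveat: a leading underscore followed by a lowercase letter is consumed by `us2mc`, so names such as `_private` lose their prefix. Whether that matters depends on the code being converted.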

Conversion between mixed-case and capwords is, of course, just an issue of lowercasing or uppercasing the first character, as appropriate:

```
def mc2cw(x): # mixed-case to capwords
    return x[0].upper()+x[1:]

def cw2mc(x): # capwords to mixed-case
    return x[0].lower()+x[1:]
```

## Discussion

Here are some usage examples:

```
>>> cw2us("PrintHTML")
'print_html'
>>> cw2us("IOError")
'io_error'
>>> cw2us("SetXYPosition")
'set_xy_position'
>>> cw2us("GetX")
'get_x'
```

The set of functions in this recipe is useful and very practical if you need to homogenize naming styles in a body of code, but the approach may be a bit obscure. In the interest of clarity, you might want to adopt a conceptual stance that is general and fruitful: to convert a bunch of formats into each other, find a neutral format and write conversions from each of the N formats into the neutral one and back again. This means having 2N conversion functions rather than N x (N-1), a big win for large N, but the point here (in which N is only three) is really one of clarity.
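The two counts can be compared with a few lines of arithmetic. The function names below are hypothetical and serve only to label the two formulas from the paragraph above.

```python
def direct_converters(n):
    # One function for every ordered pair of distinct formats.
    return n * (n - 1)

def neutral_converters(n):
    # One function into the neutral format and one back out, per format.
    return 2 * n

for n in (3, 5, 10):
    print(n, direct_converters(n), neutral_converters(n))
# 3 6 6
# 5 20 10
# 10 90 20
```

For N = 3 the two counts are equal, which is why the argument in this recipe rests on clarity rather than on saving functions.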

Clearly, the underlying neutral format that each identifier style is encoding is a list of words. Let's say, for definiteness and without loss of generality, that they are lowercase words:

```
import string, re

def anytolw(x):  # any format of identifier to list of lowercased words

    # First, see if there are underscores:
    lw = x.split('_')
    if len(lw)>1: return [w.lower() for w in lw]

    # No. Then uppercase letters are the splitters:
    pieces = re.split('([A-Z])', x)

    # Ensure first word follows the same rules as the others:
    if pieces[0]: pieces = [''] + pieces
    else: pieces = pieces[1:]

    # Join two by two, lowercasing the splitters as you go
    return [pieces[i].lower()+pieces[i+1] for i in range(0,len(pieces),2)]
```
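The pairing step is easier to follow with the intermediate list in view. Because the pattern has a capturing group, `re.split` keeps each capital as its own list element, so the list alternates between splitters and the text that follows them. The variable names below are illustrative only.

```python
import re

# Capwords input: the list starts with '' (nothing precedes the first capital).
cw_pieces = re.split('([A-Z])', "SetXYPosition")
print(cw_pieces)   # ['', 'S', 'et', 'X', '', 'Y', '', 'P', 'osition']

# Mixed-case input: the list starts with the first word instead.
mc_pieces = re.split('([A-Z])', "setXYPosition")
print(mc_pieces)   # ['set', 'X', '', 'Y', '', 'P', 'osition']

# Normalize both so that elements come in (splitter, rest) pairs.
cw_pairs = cw_pieces[1:]       # drop the leading ''
mc_pairs = [''] + mc_pieces    # give the first word an empty splitter

def join_pairs(pairs):
    return [pairs[i].lower() + pairs[i + 1] for i in range(0, len(pairs), 2)]

print(join_pairs(cw_pairs))    # ['set', 'x', 'y', 'position']
print(join_pairs(mc_pairs))    # ['set', 'x', 'y', 'position']
```

Both styles reduce to the same word list. Each capital in a run such as `XY` becomes a separate one-letter word, which differs from `cw2us`, where the run stays together as `xy`.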

There's no need to specify the format, since it's self-describing. Conversely, when translating from our internal form to an output format, we do need to specify the format we want, but on the other hand, the functions are very simple:

```
def lwtous(x): return '_'.join(x)
def lwtocw(x): return ''.join([w.capitalize() for w in x])
def lwtomc(x): return x[0]+''.join([w.capitalize() for w in x[1:]])
```

Any other combination is a simple issue of functional composition:

```
def anytous(x): return lwtous(anytolw(x))
cwtous = mctous = anytous
def anytocw(x): return lwtocw(anytolw(x))
ustocw = mctocw = anytocw
def anytomc(x): return lwtomc(anytolw(x))
cwtomc = ustomc = anytomc
```
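The same composition can also be driven by a lookup table instead of one named function per target style. This is only a sketch: `WRITERS` and `convert` are hypothetical names not found in the recipe, and the reader and writer functions are restated in a form that runs on current Python so the block is self-contained.

```python
import re

def anytolw(x):
    # Any identifier style to a list of lowercase words.
    lw = x.split('_')
    if len(lw) > 1:
        return [w.lower() for w in lw]
    pieces = re.split('([A-Z])', x)
    if pieces[0]:
        pieces = [''] + pieces
    else:
        pieces = pieces[1:]
    return [pieces[i].lower() + pieces[i + 1] for i in range(0, len(pieces), 2)]

def lwtous(x): return '_'.join(x)
def lwtocw(x): return ''.join(w.capitalize() for w in x)
def lwtomc(x): return x[0] + ''.join(w.capitalize() for w in x[1:])

# Hypothetical dispatch table: target style code -> writer function.
WRITERS = {'us': lwtous, 'cw': lwtocw, 'mc': lwtomc}

def convert(name, style):
    return WRITERS[style](anytolw(name))

print(convert("print_html", "cw"))   # PrintHtml
print(convert("PrintHtml", "mc"))    # printHtml
print(convert("printHtml", "us"))    # print_html
print(convert("IOError", "us"))      # i_o_error
```

The last line shows where the two approaches diverge: the neutral-format route splits at every capital, so `IOError` becomes `i_o_error`, whereas `cw2us` yields `io_error`.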

The specialized approach is slimmer and faster, but this generalized stance may ease understanding as well as offering wider application.

See also: the Library Reference sections on the re and string modules.

PyCkBk-3-16 (last edited 2009-12-25 07:11:00 by localhost)
