# Converting Between Different Naming Conventions

## Problem

You have a body of code whose identifiers use one of the common naming conventions to represent multiple words in a single identifier (CapitalizedWords, mixedCase, or under_scores), and you need to convert the code to another naming convention in order to merge it smoothly with other code.

## Solution

The re.sub function covers the two hard cases, converting to and from underscore notation:

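```
import re

def cw2us(x):  # capwords to underscore notation
    # Insert '_' before an uppercase letter that follows a lowercase one,
    # or before a non-initial uppercase letter that starts a new word.
    return re.sub(r'(?<=[a-z])[A-Z]|(?<!^)[A-Z](?=[a-z])',
                  r"_\g<0>", x).lower()

def us2mc(x):  # underscore to mixed-case notation
    return re.sub(r'_([a-z])', lambda m: m.group(1).upper(), x)
```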

Mixed-case to underscore is just like capwords to underscore (the case-lowering of the first character becomes redundant, but it does no harm):

```
def mc2us(x):  # mixed-case to underscore notation
    return cw2us(x)
```

Underscore to capwords can similarly exploit the underscore to mixed-case conversion, but it needs an extra twist to uppercase the start:

```
def us2cw(x):  # underscore to capwords notation
    s = us2mc(x)
    return s[0].upper() + s[1:]
```

Conversion between mixed-case and capwords is, of course, just an issue of lowercasing or uppercasing the first character, as appropriate:

```
def mc2cw(x):  # mixed-case to capwords: uppercase the first character
    return x[0].upper() + x[1:]

def cw2mc(x):  # capwords to mixed-case: lowercase the first character
    return x[0].lower() + x[1:]
```

## Discussion

Here are some usage examples:

```
>>> cw2us("PrintHTML")
'print_html'
>>> cw2us("IOError")
'io_error'
>>> cw2us("SetXYPosition")
'set_xy_position'
>>> cw2us("GetX")
'get_x'
```
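Going back the other way is also worth a quick check. As a rough sketch (repeating the recipe's cw2us, us2mc, and us2cw so that the snippet runs on its own), converting to underscores and back does not restore the capitalization of acronyms, because the underscore form no longer records it:

```python
import re

def cw2us(x):  # capwords to underscore notation, as in the Solution
    return re.sub(r'(?<=[a-z])[A-Z]|(?<!^)[A-Z](?=[a-z])',
                  r"_\g<0>", x).lower()

def us2mc(x):  # underscore to mixed-case notation, as in the Solution
    return re.sub(r'_([a-z])', lambda m: m.group(1).upper(), x)

def us2cw(x):  # underscore to capwords notation, as in the Solution
    s = us2mc(x)
    return s[0].upper() + s[1:]

for name in ("PrintHTML", "IOError", "GetX"):
    underscored = cw2us(name)
    print(name, '->', underscored, '->', us2cw(underscored))
# PrintHTML -> print_html -> PrintHtml
# IOError -> io_error -> IoError
# GetX -> get_x -> GetX
```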

The set of functions in this recipe is useful, and very practical, if you need to homogenize naming styles in a bunch of code, but the approach may be a bit obscure. In the interest of clarity, you might want to adopt a conceptual stance that is general and fruitful. In other words, to convert a bunch of formats into each other, find a neutral format and write conversions from each of the N formats into the neutral one and back again. This means having 2N conversion functions rather than N × (N-1), a big win for large N, but the point here (in which N is only three) is really one of clarity.
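To make the count concrete, here is a small sketch comparing how many converters each design needs (the function names are illustrative, not part of the recipe). With N = 3 both come to six, which is why the gain in this recipe is clarity rather than fewer functions:

```python
def direct_converters(n):
    # One function for every ordered pair of distinct formats.
    return n * (n - 1)

def hub_converters(n):
    # One function into the neutral format and one back out, per format.
    return 2 * n

for n in (3, 5, 10):
    print(n, direct_converters(n), hub_converters(n))
# 3 6 6
# 5 20 10
# 10 90 20
```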

Clearly, the underlying neutral format that each identifier style is encoding is a list of words. Let's say, for definiteness and without loss of generality, that they are lowercase words:

```
import re

def anytolw(x):  # any format of identifier to list of lowercased words
    # First, see if there are underscores:
    lw = x.split('_')
    if len(lw) > 1:
        return [w.lower() for w in lw]

    # No. Then uppercase letters are the splitters:
    pieces = re.split('([A-Z])', x)

    # Ensure first word follows the same rules as the others:
    if pieces[0]:
        pieces = [''] + pieces
    else:
        pieces = pieces[1:]

    # Join two by two, lowercasing the splitters as you go
    return [pieces[i].lower() + pieces[i+1] for i in range(0, len(pieces), 2)]
```

There's no need to specify the format, since it's self-describing. Conversely, when translating from our internal form to an output format, we do need to specify the format we want, but on the other hand, the functions are very simple:

```
def lwtous(x): return '_'.join(x)
def lwtocw(x): return ''.join(map(str.capitalize, x))
def lwtomc(x): return x[0] + ''.join(map(str.capitalize, x[1:]))
```

Any other combination is a simple issue of functional composition:

```
def anytous(x): return lwtous(anytolw(x))
cwtous = mctous = anytous
def anytocw(x): return lwtocw(anytolw(x))
ustocw = mctocw = anytocw
def anytomc(x): return lwtomc(anytolw(x))
cwtomc = ustomc = anytomc
```
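The same composition can be driven by a lookup table instead of one named function per pair. The following is only a sketch, written for Python 3 (so str methods stand in for the string-module functions); the JOINERS table and the convert function are hypothetical names introduced for illustration:

```python
import re

def anytolw(x):  # any identifier style to a list of lowercased words
    lw = x.split('_')
    if len(lw) > 1:
        return [w.lower() for w in lw]
    pieces = re.split('([A-Z])', x)
    pieces = [''] + pieces if pieces[0] else pieces[1:]
    return [pieces[i].lower() + pieces[i + 1]
            for i in range(0, len(pieces), 2)]

# Hypothetical table: target style -> function that joins a word list.
JOINERS = {
    'us': '_'.join,
    'cw': lambda words: ''.join(w.capitalize() for w in words),
    'mc': lambda words: words[0] + ''.join(w.capitalize() for w in words[1:]),
}

def convert(name, target):
    # Compose "parse to the neutral form" with "emit the requested style".
    return JOINERS[target](anytolw(name))

print(convert('set_xy_position', 'cw'))  # SetXyPosition
print(convert('mixedCase', 'us'))        # mixed_case
print(convert('GetX', 'mc'))             # getX
```

Adding a fourth style under this design would mean teaching anytolw to recognize it and adding one entry to the table.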

The specialized approach is slimmer and faster, but this generalized stance may ease understanding as well as offering wider application.
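One place where the two approaches differ is acronyms: anytolw treats every uppercase letter as the start of a word, so it splits PrintHTML into print, h, t, m, l, whereas cw2us keeps html together. If that matters, one possible variation tokenizes with re.findall instead. This is not part of the original recipe; the name anytolw_acronyms and the pattern are assumptions made for this sketch:

```python
import re

# A run of capitals that is not immediately followed by a lowercase letter
# is one word (an acronym); otherwise a word is an optional capital
# followed by lowercase letters or digits.
_WORD = re.compile(r'[A-Z]+(?![a-z])|[A-Z]?[a-z0-9]+')

def anytolw_acronyms(x):
    # Underscores match neither alternative, so findall skips over them.
    return [w.lower() for w in _WORD.findall(x)]

print(anytolw_acronyms('PrintHTML'))        # ['print', 'html']
print(anytolw_acronyms('IOError'))          # ['io', 'error']
print(anytolw_acronyms('setXYPosition'))    # ['set', 'xy', 'position']
print(anytolw_acronyms('set_xy_position'))  # ['set', 'xy', 'position']
```

The result is still a list of lowercase words, so it can be passed to the same output functions as the list produced by anytolw.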

