25.4. Pyrex

The Pyrex language (http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/) is often the most convenient way to write extensions for Python. Pyrex is a large subset of Python, with the addition of optional C-like types for variables: you can automatically compile Pyrex programs (source files with extension .pyx) into machine code (via an intermediate stage of generated C code), producing Python-importable extensions. See the above URL for all the details of Pyrex programming; in this section, I cover only a few essentials to let you get started with Pyrex.

The limitations of the Pyrex language, compared with Python, are the following:

    • No nesting of def and class statements in other statements (except that one level of def within one level of class is okay, and indeed is the proper and normal way to define a class's methods).
    • No import *, generators, list comprehensions, decorators, or augmented assignment.
    • No globals and locals built-ins.
    • To give a class a staticmethod or classmethod, you must first def the function outside the class statement (in Python, it's normally defed within the class).
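The last limitation can be sketched as follows. This is a minimal illustration in plain Python syntax that avoids decorators and nested def statements; the names Pair, _from_string, and _describe are hypothetical and chosen only for this example. The same shape (module-level functions wrapped inside the class body) is what the restriction asks for.

```python
# Plain functions defined at module level, outside any class statement.
def _from_string(text):
    # Hypothetical helper: parse "a,b" into a Pair instance.
    first, second = text.split(",")
    return Pair(int(first), int(second))


def _describe(cls):
    # Receives the class itself once wrapped with classmethod.
    return "class named %s" % cls.__name__


class Pair:
    def __init__(self, first, second):
        self.first = first
        self.second = second

    # Wrap the module-level functions; no decorator syntax is needed.
    from_string = staticmethod(_from_string)
    describe = classmethod(_describe)
```

Calling Pair.from_string("3,4") builds an instance without going through __init__ directly, and Pair.describe() receives the class as its first argument.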

As you can see, while not quite as rich as Python proper, Pyrex is a vast subset indeed. More importantly, Pyrex adds to Python a few statements that allow C-like declarations, enabling easy generation of machine code (via an intermediate C-code generation step). Here is a simple example; code it in source file hello.pyx in a new empty directory:

def hello(char *name):
    return "Hello, " + name + "!"

This is almost exactly like Python, except that parameter name is preceded by a char*, declaring that its type must always be a C 0-terminated string (but, as you see from the body, in Pyrex, you can use its value as you would a normal Python string).

When you install Pyrex (by the usual python setup.py install route), you also gain a way to build Pyrex source files into Python dynamic-library extensions through the usual distutils approach. Code the following in file setup.py in the new directory:

from distutils.core import setup
from distutils.extension import Extension
from Pyrex.Distutils import build_ext

setup(name='hello',
      ext_modules=[Extension("hello", ["hello.pyx"])],
      cmdclass={'build_ext': build_ext})

Now run python setup.py install in the new directory (ignore compilation warnings; they're fully expected and benign). You can then import and use the new module, for example, from an interactive Python session:

>>> import hello
>>> hello.hello("Alex")
'Hello, Alex!'

Due to the way we've coded this Pyrex source, we must pass a string to hello.hello: passing no arguments, more than one argument, or a nonstring raises an exception:

>>> hello.hello()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: function takes exactly 1 argument (0 given)
>>> hello.hello(23)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: argument 1 must be string, not int

25.4.1. The cdef Statement and Function Parameters

You can use the keyword cdef mostly as you would def, but cdef defines functions that are internal to the extension module and are not visible on the outside, while def functions can also be called by Python code that imports the module. For both types of functions, parameters and return values with unspecified types, or, better, ones explicitly typed as object, become PyObject* pointers in the generated C code (with implicit and standard handling of reference incrementing and decrementing). cdef functions may also have parameters and return values of any other C type; def functions, in addition to untyped (or, equivalently, object), can only accept int, float, and char* types. For example, here's a cdef function to specifically sum two integers:

cdef int sum2i(int a, int b):
    return a + b

You can also use cdef to declare C variables: scalars, arrays, and pointers like in C:

cdef int x, y[23], *z

and struct, union, and enum with Pythonic syntax (colon on head clause, then indent):

cdef struct Ure:
    int x, y
    float z

(Afterward, refer to the new type by name only, e.g., Ure. Never use the keywords struct, union, and enum anywhere outside of the cdef that declares the type.)

25.4.1.1. External declarations

To interface with external C code, you can declare variables as cdef extern, with the same effect that extern has in the C language. More commonly, you will have the C declarations regarding some library you want to use available in a .h C header file; to ensure that the Pyrex-generated C code includes that header file, use the following form of cdef:

cdef extern from "someheader.h":

and follow with a block of indented cdef-style declarations (without repeating the keyword cdef in the block). You need only declare the functions and variables that you want to use in your Pyrex code; Pyrex does not read the C header file, but rather trusts your Pyrex declarations in the block, without generating any C code for them. Do not use const in your Pyrex declarations, since const is not a keyword known to Pyrex!

25.4.1.2. cdef classes

A cdef class statement lets you define a new Python type in Pyrex. It may include cdef declarations of attributes (which apply to every instance, not to the type as a whole), which are normally invisible from Python code; however, you can specifically declare attributes as cdef public to make them normal attributes from Python's viewpoint, or cdef readonly to make them visible but read-only from Python (such Python-visible attributes must be numbers, strings, or objects).

A cdef class supports special methods (with some caveats), properties (with a special syntax), and inheritance (single inheritance only). To declare a property, use the following within the body of the cdef class:

property name:

then use indented def statements for methods __get__(self), __set__(self, value), and __del__(self) (you may omit one or more of these methods if the property must not allow setting or deletion).
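For comparison only, here is a minimal plain-Python sketch of the behavior such a property block is meant to provide, written with the property built-in rather than Pyrex's special syntax. The class Celsius and the attribute degrees are hypothetical; the deleter is deliberately omitted to show what "must not allow deletion" looks like from the caller's side.

```python
class Celsius:
    def __init__(self, degrees):
        self._degrees = float(degrees)

    def _get_degrees(self):
        # Plays the role of the getter.
        return self._degrees

    def _set_degrees(self, value):
        # Plays the role of the setter; coerces to float on assignment.
        self._degrees = float(value)

    # No deleter is supplied, so "del obj.degrees" raises AttributeError.
    degrees = property(_get_degrees, _set_degrees)
```

Reading and assigning obj.degrees go through the two accessor functions, while an attempt to delete the attribute fails because no deletion method was given.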

A cdef class's __new__ is different from that of a normal Python class: the first argument is self, the new instance, already allocated and with its memory filled with 0s. At object destruction time, Pyrex calls a special method __dealloc__(self) to let you undo whatever allocations __new__ may have done (cdef classes have no __del__ special method).

There are no right-hand-side versions of arithmetic special methods, such as __radd__ to go with __add__, as there are in Python; rather, if (say) a+b can't find or use type(a).__add__, it next calls type(b).__add__(a, b). Note the order of arguments (no swapping!). You may need to attempt some type checking to ensure that you perform the correct operation in all cases.
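The kind of type checking this calls for can be sketched in plain Python. This is only an imitation under stated assumptions: the class Meters is hypothetical, and because ordinary Python never passes the operands this way by itself, the usage note calls the function directly to mimic the argument order described above.

```python
class Meters:
    # Hypothetical class whose __add__ makes no assumption about
    # which of its two arguments is the Meters instance.
    def __init__(self, value):
        self.value = value

    def __add__(x, y):
        # Either x or y (or both) may be a Meters instance: check both.
        if isinstance(x, Meters) and isinstance(y, Meters):
            return Meters(x.value + y.value)
        if isinstance(x, Meters) and isinstance(y, (int, float)):
            return Meters(x.value + y)
        if isinstance(y, Meters) and isinstance(x, (int, float)):
            return Meters(x + y.value)
        # Unsupported operand combination.
        return NotImplemented
```

Calling Meters.__add__(3, Meters(2)) imitates the case where the instance arrives as the second argument; the checks above still produce a Meters result rather than failing on x.value.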

To make the instances of a cdef class into iterators, define a special method __next__(self), not a method called next as you would do in Python.

Here is a Pyrex equivalent of Example 25-2:

cdef class intpair:
    cdef public int first, second
    def __init__(self, first, second):
        self.first = first
        self.second = second
    def __repr__(self):
        return 'intpair(%s,%s)' % (self.first, self.second)

Like the C-coded extension in Example 25-2, this Pyrex-coded extension also offers no substantial advantage with respect to a Python-coded equivalent. However, note that the simplicity and conciseness of the Pyrex code is much closer to that of Python than to the verbosity and boilerplate needed in C, and yet the machine code generated from this Pyrex file is very close to what gets generated from the C code in Example 25-2.

25.4.2. The ctypedef Statement

You can use the keyword ctypedef to declare a name (synonym) for a type:

ctypedef char* string

25.4.3. The for...from Statement

In addition to the usual Python statements, Pyrex allows another form of for statement:

for variable from lower_expression <= variable < upper_expression:

This is the most common form, but you can use either < or <= on either side of the variable after the from keyword; alternatively, you can use > and/or >= to have a backward loop (but you cannot mix a < or <= on one side and > or >= on the other). This statement is much faster than the usual Python for variable in range(...):, as long as the variable and both loop boundaries are all C-kind ints.

25.4.4. Pyrex Expressions

In addition to Python expression syntax, Pyrex can use some, but not all, of C's additions to it. To take the address of variable var, use &var, like in C. To dereference a pointer p, however, use p[0]; the C syntax *p is not valid Pyrex. Similarly, where in C you would use p->q, use p.q in Pyrex. The null pointer uses the Pyrex keyword NULL. For character constants, use the syntax c'x'. For casts, use angle brackets, such as <int>somefloat where in C you would code (int)somefloat. Also, use casts only on C values and onto C types, never with Python values and types (let Pyrex perform type conversion for you automatically when Python values or types occur).

25.4.5. A Pyrex Example: Greatest Common Divisor

Euclid's algorithm for GCD (Greatest Common Divisor) of two numbers is, of course, quite simple to implement in pure Python:

def gcd(dividend, divisor):
    remainder = dividend % divisor
    while remainder:
        dividend = divisor
        divisor = remainder
        remainder = dividend % divisor
    return divisor

The Pyrex version is very similar:

def gcd(int dividend, int divisor):
    cdef int remainder
    remainder = dividend % divisor
    while remainder:
        dividend = divisor
        divisor = remainder
        remainder = dividend % divisor
    return divisor

On my laptop, gcd(454803,278255) takes about 6 microseconds in the Python version, while the Pyrex version takes 1.75 microseconds. A speed three to four times faster for so little effort can be well worth the bother (assuming, of course, that this function takes up a substantial fraction of your program's execution time!), even though the pure Python version could have several practical advantages (it runs in Jython or IronPython just as well as in Classic Python, it works with longs just as well as with ints, it's perfectly cross-platform, and so on).
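To reproduce this kind of measurement on your own machine, a small harness built on the standard timeit module is enough. The sketch below times only the pure-Python gcd; the helper name microseconds_per_call and its default repetition counts are assumptions made for this example, and the figures you get will differ from those quoted above.

```python
import timeit


def gcd(dividend, divisor):
    # Pure-Python Euclid's algorithm, as in the text.
    remainder = dividend % divisor
    while remainder:
        dividend = divisor
        divisor = remainder
        remainder = dividend % divisor
    return divisor


def microseconds_per_call(func, args, number=100000, repeat=5):
    # Hypothetical helper: best-of-repeat average time per call, in
    # microseconds. The lambda wrapper adds a small constant overhead
    # to every call, so treat the result as an upper bound.
    timer = timeit.Timer(lambda: func(*args))
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e6


if __name__ == "__main__":
    elapsed = microseconds_per_call(gcd, (454803, 278255))
    print("%.2f microseconds per call" % elapsed)
```

To compare against the compiled version, import the gcd function from the built extension module and pass it to the same helper with the same arguments.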