尚未完全完工,先放这儿待续,若发现有错误的地方请直接修改之,谢谢!有愿意帮助完成的,也谢谢啦!

11.4 The Python Database API (DBAPI) 2.0

我前面提到过,Python 标准库本身没有提供任何关系数据库接口。因为有很多很多第三方模块让你很方便的访问特定的数据库。这些模块大部分都遵守 Python Database API 2.0 标准,该标准又称为 DBAPI.

到本书写作期间,Python 的 DBAPI 标准委员会 (SIG) 正忙于准备发布新版本的 DBAPI (正式发布时可能就叫 3.0). 这个版本将会提供有限的或者全部的向下兼容,也就是说依据 DBAPI 2.0 开发的软件通常都能继续在支持 DBAPI 3.0 的模块上工作。毫无疑问,新版本会提供更强的特性,具有显著的性能提升。

   这本书已经出版两年多了,DBAPI3.0我怎么还没看到? --WeiZhong

如果你的 Python 程序仅运行在 Windows 上,你可能宁愿通过 COM 使用 Microsoft's ADO 包来访问数据库。要了解这方面的详细信息,你可以阅读这本书《Python Programming on Win32》, 由于 ADO 和 COM 是依赖特定平台的,而本书的主题是 Python 的跨平台应用,因此本书的内容并不包括 COM 和 ADO。

导入一个 DBAPI-兼容模块之后,你就可以调用该模块的 connect 函数,当然要提供必要的合适的参数。若连接成功,该函数会返回一个 Connect 对象。Connect对象提供有 commit 和 rollback 方法供你处理事务。该对象还有一个 cursor 方法,用来返回一个 cursor 对象。cursor对象的数个方法和属性方便你进行任意的数据库操作,一个 DBAPI-兼容模块也提供异常类,描述性属性,工厂函数及类型描述属性。

11.4.1 Exception Classes

A DBAPI-兼容模块以类的形式提供 Warning, Error, 和几个 Error 的子类. Warning 用来指出类似插入数据时数据被截短这样的情况, Error 的子类则表示发生了某种错误。你可以用下面这样的形式来使用这些异常类:

try: 
    ...
except module.Error, err: 
    ...

11.4.2 Thread Safety

如果一个 DBAPI-兼容模块有一个属性 threadsafety 并且其值大于0 ,就表明该模块的数据库连接是线程安全的(在某种层次上)。 最好不要依赖这个属性,通过确保一个线程独占访问任何外部资源(比如数据库)会更安全和更具有可移植性。

11.4.3 参数风格

一个 DBAPI-兼容的模块有一个属性 paramstyle 标识SQL语句中参数标记符号的风格. 通过在SQL语句中插入这样的标识符号你就能够在运行时提供动态参数进行查询。举例来说,你需要在数据库的 ATABLE 表中取出字段 AFIELD 的值等于Python变量 x 的数据行.假设 cursor 实例对象的名字为 c 则你可以通过 Python 字符串格式符号 % 来进行这种替换。

c.execute('SELECT * FROM ATABLE WHERE AFIELD=%r' % x)

这种方式工作得很好,不过这不是推荐的方式。这种方式会生成一个新的SQL语句字符串,而不是修改原有的字符串。对 x 的每个可能值,都需要对这个SQL语句进行一次解析和重新生成工作。通过参数转换方式而不是字符串格式化方式,传递给 execute 方法的是一个带有占位符的字符串,而不是已经替换好的字符串。这样 execute 方法只需要对SQL语句解析和执行一次,对提高查询性能会有一定帮助。举例来说,如果模块的 paramstyle 属性是 'qmark', 你可以用下面方式进行刚才的查询:

c.execute('SELECT * FROM ATABLE WHERE AFIELD=?', [x])

paramstyle 属性

只读属性 paramstyle 会告诉你使用该模块执行SQL语句时如何提供运行时参数. 可能的值有以下几种:

format

参数标记符号为 %s, 就象一个字符串格式符号. 示例如下:

c.execute('SELECT * FROM ATABLE WHERE AFIELD=%s', [x])

named

参数标记符号为 :name,每个参数都是命名参数. 示例如下:

c.execute('SELECT * FROM ATABLE WHERE AFIELD=:x', {'x':x})

numeric

参数标记符号为 :n, 给定参数的序号.示例如下:

c.execute('SELECT * FROM ATABLE WHERE AFIELD=:1', [x])

pyformat

参数标记符号为 %(name)s, 每个参数都是命名参数. 示例如下:

c.execute('SELECT * FROM ATABLE WHERE AFIELD=%(x)s', {'x':x})

qmark

参数标记符号为 ?. 示例如下:

c.execute('SELECT * FROM ATABLE WHERE AFIELD=?', [x])

当 paramstyle 为非命名参数时, execute 方法的第二个参数是一个序列。反之则是一个字典。

11.4.4 Factory Functions

Parameters passed to the database via placeholders must typically be of the right type. This means Python numbers (integers or floating-point values), strings (plain or Unicode), and None to represent SQL NULL. Python has no specific types to represent dates, times, and binary large objects (BLOBs). A DBAPI-compliant module supplies factory functions to build such objects. The types used for this purpose by most DBAPI-compliant modules are those supplied by module mxDateTime, covered in Chapter 12, and strings or buffer types for BLOBs. The factory functions are as follows.

Binary


Binary(string)

Returns an object representing the given string of bytes as a BLOB.

Date


Date(year,month,day)

Returns an object representing the specified date.

DateFromTicks


DateFromTicks(s)

Returns an object representing the date that is s seconds after the epoch of module time, covered in Chapter 12. For example, DateFromTicks(time.time( )) is today's date.

Time


Time(hour,minute,second)

Returns an object representing the specified time.

TimeFromTicks


TimeFromTicks(s)

Returns an object representing the time that is s seconds after the epoch of module time, covered in Chapter 12. For example, TimeFromTicks(time.time( )) is the current time.

Timestamp


Timestamp(year,month,day,hour,minute,second)

Returns an object representing the specified date and time.

TimestampFromTicks


TimestampFromTicks(s)

Returns an object representing the date and time that is s seconds after the epoch of module time, covered in Chapter 12. For example, TimestampFromTicks(time.time( )) is the current date and time.

11.4.5 Type Description Attributes A Cursor instance's attribute description describes the types and other characteristics of each column of a query. Each column's type (the second item of the tuple describing the column) equals one of the following attributes of the DBAPI-compliant module:

BINARY Describes columns containing BLOBs

DATETIME Describes columns containing dates, times, or both

NUMBER Describes columns containing numbers of any kind

ROWID Describes columns containing a row-identification number

STRING Describes columns containing text of any kind

A cursor's description, and in particular each column's type, is mostly useful for introspection about the database your program is working with. Such introspection can help you write general modules that are able to work with databases that have different schemas, schemas that may not be fully known at the time you are writing your code.

11.4.6 connect 函数

一个 DBAPI-compliant 兼容的模块 connect 函数接受的参数与要连接的具体数据库有关. DBAPI 标准建议但不强制,该连接函数应该接受命名参数.特别的, connect 应该至少接受以下命名的可选参数:

database 要连接的数据库的名字

dsn Data-source name to use for the connection

host 主机名

password 口令

user 用户名

11.4.7 Connection Objects

A DBAPI-compliant module's connect 函数返回 Connection 类的实例 x . x 支持下列方法

x.close( )

关闭数据库连接并释放一切相关资源.当你做完必要的数据库操作之后,应该养成立刻切断数据库连接的习惯,否则会造成不必要的性能问题.

x.commit( )

提交当前的事务. 如果数据库不支持事务, x.commit( ) 就什么也不做.

x.cursor()

返回类 Cursor 的一个新的实例, 详情参见后文

x.rollback( )

取消当前的事务.如果数据库不支持事务, x.rollback( ) 会引发一个异常. DBAPI 建议但不强制,对于不支持事务的数据库, Connection 类最好是根本不提供 rollback 方法, 这样 x.rollback( ) 会引发 AttributeError 异常. 用 hasattr(x,'rollback') 函数可以方便的测试该对象是否支持 rollback 方法.

11.4.8 Cursor Objects

Connection 实例的 cursor 方法返回的对象 c 是一个Cursor实例. A SQL cursor represents the set of results of a query and lets you work with the records in that set, in sequence, one at a time. A cursor as modeled by the DBAPI is a richer concept, since it also represents the only way in which your program executes SQL queries in the first place. On the other hand, a DBAPI cursor allows you only to advance in the sequence of results (some relational databases, but not all, also provide richer cursors that are able to go backward as well as forward), and does not support the SQL clause WHERE CURRENT OF CURSOR. These limitations of DBAPI cursors enable DBAPI-compliant modules to provide cursors even on RDBMSes that provide no real SQL cursors at all. An instance of class Cursor c supplies many attributes and methods; the most frequently used ones are documented here.

c.close( )

销毁对象 c 并释放相关资源.

description

一个只读属性,提供最后一次查询的字段信息。这是一个由7个元素组成的元组的序列。这七个元素分别是字段的:

name, typecode, displaysize, internalsize, precision, scale, nullable

信息。如果 c 根本没有执行过 execute 方法,或者最后一次查询没有返回任何结果,则 c.description 的值为 None。 A cursor's description is mostly useful for introspection about the database your program is working with. Such introspection can help you write general modules that are able to work with databases that have different schemas, including schemas that may not be fully known at the time you are writing your code.

c.execute(statement,parameters=None)

执行一个 SQL 语句。parameters 是可选的参数。当模块的 paramstyle 属性值为 'format', 'numeric', 或 'qmark' 时,parameters 是一个序列;当模块的 paramstyle 属性值为 'named' 或 'pyformat' 时,parameters 是一个字典。

c.executemany(statement,*parameters)

以不同的参数多次执行同一个SQL语句。*parameters 是一个参数序列。当模块的 paramstyle 属性值为 'format', 'numeric', 或 'qmark' 时,*parameters 是一个序列的序列;当模块的 paramstyle 属性值为 'named' 或 'pyformat' 时,parameters 是一个字典的序列。举例来说:

c.executemany('UPDATE atable SET x=? WHERE y=?',(12,23),(23,34))

该例子中模块的 paramstyle 属性值为 'qmark', 该语句等价于下面这两条语句,不过有可能运行的更快。

c.execute('UPDATE atable SET x=12 WHERE y=23')
c.execute('UPDATE atable SET x=23 WHERE y=34')

c.fetchall( )

以一个元组序列的形式返回最后一次查询的全部剩余结果行。若最后一次查询不是 SELECT 查询,则引发异常。

c.fetchmany(n)

以一个元组序列的形式返回最后一次查询的不超过 n 的剩余结果行。若最后一次查询不是 SELECT 查询,则引发异常。

c.fetchone( )

以一个元组的形式返回最后一次查询下一结果行。若最后一次查询不是 SELECT 查询,则引发异常。

c.rowcount

一个只读属性(长整数). 提供最后一次查询返回的结果行数或影响的行数。如果模块不能确定该值,则返回 -1

11.4.9 DBAPI-Compliant Modules

Whatever relational database you want to use, there's at least one (and often more than one) DBAPI-compliant module that you can download from the Internet. All modules listed in the following sections, except mxODBC, have liberal licenses that are mostly similar to Python's license (the SAP DB, however, is licensed under GPL) and that let you use them freely in either open source or closed source programs. mxODBC can be used freely for noncommercial purposes, but you must purchase a license for any commercial use. There are so many relational databases that it's impossible to list them all, but here are some of the most popular ones:

ODBC Open DataBase Connectivity (ODBC) is a popular standard that lets you connect to many different relational databases, including ones not otherwise supported by DBAPI-compliant modules, such as Microsoft Jet (also known as the Access database). The Windows distribution of Python contains an odbc module, but the module is unsupported and complies to an older version of the DBAPI, not to the current version 2.0. On either Unix or Windows, use mxODBC, available at http://www.lemburg.com/files/Python/mxODBC.html. mxODBC's paramstyle is 'qmark'. Its connect function accepts three optional arguments, named dsn, user, and password.

Oracle Oracle is a widespread, commercial RDBMS. To interface to Oracle, I recommend module DCOracle2, available at http://www.zope.org/Members/matt/dco2. DCOracle2's paramstyle is 'numeric'. Its connect function accepts a single optional, unnamed argument string with the syntax:

'user/password@service' Microsoft SQL Server To interface to Microsoft SQL Server, I recommend module mssqldb, available at http://www.object-craft.com.au/projects/mssql/. mssqldb's paramstyle is 'qmark'. Its connect function accepts three arguments, named dsn, user, and passwd, as well as an optional database argument.

DB2 For IBM DB/2, try module DB2, available at ftp://people.linuxkorea.co.kr/pub/DB2/. DB2's paramstyle is 'format'. Its connect function accepts three optional arguments, named dsn, uid, and pwd.

MySQL MySQL 是一个广泛应用的开源的 RDBMS. 要在Python中使用 MySQL, 试试 MySQLdb 模块, 该模块的站点是 http://sourceforge.net/projects/mysql-python. MySQLdb 的paramstyle 是 'format'. 它的 connect 函数接受四个可选命名参数( db, host, user, 和 passwd).

PostgreSQL PostgreSQL is an excellent open source RDBMS. To interface to PostgreSQL, I recommend psycopg, available at http://initd.org/Software/psycopg. psycopg's paramstyle is 'pyformat'. Its connect function accepts a single mandatory argument, named dsn, with the syntax:

'host=host dbname=dbname user=username password=password' SAP DB SAP DB, once known as Adabas, is a powerful RDBMS that used to be closed source, but is now open source. SAP DB comes with sapdbapi, available at http://www.sapdb.org/sapdbapi.html, as well as other useful Python modules. sapdbapi's paramstyle is 'pyformat'. Its connect function accepts three mandatory arguments, named user, password, and database, and an optional argument named host.

11.4.10 Gadfly

Gadfly, available at http://gadfly.sf.net, is not an interface to some other RDBMS, but rather a complete RDBMS engine written in Python. Gadfly supports a large subset of standard SQL. For example, Gadfly lacks NULL, but it does support VIEW, which is a crucial SQL feature that engines such as MySQL still lack at the time of this writing. Gadfly can run as a daemon server, to which clients connect with TCP/IP. Alternatively, you can run the Gadfly engine directly in your application's process, if you don't need other processes to be able to access the same database concurrently.

The gadfly module has several discrepancies from the DBAPI 2.0 covered in this chapter because Gadfly implements a variant of the older DBAPI 1.0. The concepts are quite close, but several details differ. The main differences are:

gadfly does not supply custom exception classes, so Gadfly operations that fail raise normal Python exceptions, such as IOError, NameError, etc.

gadfly does not supply a paramstyle attribute. However, the module behaves as if it supplied a paramstyle of 'qmark'.

gadfly does not supply a function named connect; use the gadfly.gadfly or gadfly.client.gfclient functions instead.

gadfly does not supply factory functions for data types.

Gadfly cursors do not supply the executemany method. Instead, in the specific case in which the SQL statement is an INSERT, the execute method optionally accepts as its second argument a list of tuples and inserts all the data.

Gadfly cursors do not supply the rowcount method.

The gadfly module supplies the following functions.

gadfly


gadfly.gadfly(dbname,dirpath)

Returns a connection object for the database named dbname, which must have been previously created in the directory indicated by string dirpath. The database engine runs in the same process as your application.

gfclient


gadfly.client.gfclient(policyname, port, password, host)

Returns a connection object for the database served by a gfserve process on the given host and port. policyname identifies the level of access required, and is often 'admin' to specify unlimited access.

PyDBAPI2 (last edited 2009-12-25 07:14:13 by localhost)