Much more is happening in the Python world than just the recent release of 3.0, momentous as that is. Python has Linux stories to tell in education, The Enterprise, and several other domains.
It's healthy to start, however, with an understanding of Python 3.0; its release is certainly among the five most important events in Python's two decades of history. Even though a great deal has already been written about 3.0, including an outstanding "What's New..." and tutorial, it deserves a profile for Regular Expressions readers who might be familiar with the language, but not expert in it.
What is a Language?
Senior Python implementers--founder Guido van Rossum and his "lieutenants"--have emphasized that Python 3.0 is a different language than all previous Pythons. What was
print "Hello, world."
before, now is
print("Hello, world.")
They're right, of course; the most important summary of Python 3.0 is that it's incompatible with previous Pythons, although a 2to3 source translator automates many or all of the changes. There's no expectation that any particular source will survive the transition unchanged. At the same time, "Python 3.0 still remains very much 'Pythonic'", as release-master Barry Warsaw put it in the development team's release statement, echoing a common expression.
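To make the print change concrete, here is a minimal sketch of what the function form allows, assuming a Python 3 interpreter. The helper name render is ours, invented for illustration; it is not part of Python.

```python
import io


def render(*values, sep=" ", end="\n"):
    """Hypothetical helper: return what print() would write, as a string."""
    buffer = io.StringIO()
    # In Python 3, print is an ordinary function: the separator, the line
    # ending, and the destination are keyword arguments, not special syntax.
    print(*values, sep=sep, end=end, file=buffer)
    return buffer.getvalue()


greeting = render("Hello,", "world.")
csv_line = render("a", "b", "c", sep=",", end="")
```

Because print is now a function, it can be wrapped, passed around, or replaced like any other callable, which the old statement form did not allow.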
What does "Pythonic" mean? Think of it this way: as frequently taught in school, and as seemed more justified in industrial practice perhaps a quarter-century ago, it's syntax that characterizes a language. To understand a language, start with its BNF.
What Changes, What Remains
At the syntactic level, Python 3.0 unmistakably and deliberately represents a break. Very few programs "work" in both Python 2.6 and Python 3.0. They're simply different.
A computing language has several dimensions besides syntax, though, and a fuller perspective helps in understanding the close connections between 3.0 and its predecessors. Besides syntax, crucial to a language are its:
- semantics;
- library;
- implementations;
- extensions; and
- community.
Python 3.0 has the same Pythonic "culture" as previous Pythons. While its collection of keywords has changed slightly, semantics are largely the same: the meaning of a Python 3.0 program is at most slightly different from that of the corresponding Python 2.x program.
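The keyword changes are easy to inspect from within the language. The following is a small sketch, assuming a Python 3 interpreter and the standard keyword module; the variable names are ours.

```python
import keyword

# print and exec were statements (and so reserved words) in Python 2;
# in Python 3 they are ordinary built-in functions.
print_is_keyword = keyword.iskeyword("print")
exec_is_keyword = keyword.iskeyword("exec")

# nonlocal is new in 3.0, and True, False, and None became reserved words.
new_keywords = [word for word in ("nonlocal", "True", "False", "None")
                if keyword.iskeyword(word)]
```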
Notice, among other things, how frequently the core developers refer to themselves as "the Python development team": not new and old, or before and after, or 3.0 and 2.6. 3.0 has been released, and work on its enhancement will continue. At the same time, 2.6 remains in active development. Its maintenance will continue, and by many of the same people. The languages are different, in that a particular program only works in 3.0 or 2.6, but not both--yet they are closely related, sharing a great deal of culture.
Python's run-time libraries change quite a bit with 3.0, but mostly in ways that appear cosmetic. Naming conventions are now observed more consistently: what was SocketServer is now socketserver, but its functionality is preserved. Internally, there's plenty of clean-up; usage need change only a little, though. Only a very few programmers will notice loss of support for Mac OS 9, for example. For most Python users, rationalizations like the collection of the historically distinct urllib, urllib2, and urlparse modules into the new urllib package are only a benefit.
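A short sketch of how the renamed modules look in use, assuming a Python 3 interpreter. The example.com URLs and the "sketch" User-Agent string are placeholders of ours, and nothing here touches the network.

```python
import socketserver                         # was SocketServer in Python 2
from urllib.parse import urlparse, urljoin  # was the urlparse module
from urllib.request import Request          # was urllib2.Request

# URL parsing behaves as before; only the import location has moved.
parts = urlparse("http://example.com/docs/index.html?lang=en")
host = parts.netloc
query = parts.query
sibling = urljoin("http://example.com/docs/index.html", "whatsnew.html")

# Building a Request object does not open a connection.
request = Request("http://example.com/docs/", headers={"User-Agent": "sketch"})
method = request.get_method()

# The server classes keep their old names inside the renamed module.
handler_base = socketserver.BaseRequestHandler
```

In most programs the corresponding change is confined to the import lines, which is the kind of edit 2to3 is designed to automate.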
For us, the biggest incompatibility at the source-code level with Python 3.0 has to do with strings. Even that label is a bit of a misnomer; as van Rossum vividly advises in the "What's New..." article mentioned above, "Everything you thought you knew about binary data and Unicode has changed." The good news, though, is that van Rossum goes on to detail exactly what a working programmer needs to know to make the transition: the essence is that strings are now Unicode. Note that the reference for 3.0's handling of character and binary data is PEP 3137, which van Rossum wrote about a year ago.
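The split between text and binary data can be sketched in a few lines, assuming a Python 3 interpreter. The sample word and the variable names are ours, chosen only for illustration.

```python
text = "caf\u00e9"              # str: a sequence of Unicode code points
data = text.encode("utf-8")     # bytes: a sequence of integers in 0..255

char_count = len(text)          # counts characters
byte_count = len(data)          # counts bytes; the accented e needs two in UTF-8
round_trip = data.decode("utf-8")

# Mixing the two types raises an error instead of converting implicitly.
try:
    text + data
    mixed_ok = True
except TypeError:
    mixed_ok = False

first_byte = data[0]            # indexing bytes yields an int, not a 1-char string
```

The practical rule is to decode bytes to str at the edges of a program (files, sockets, pipes) and to encode again only on the way out.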
The Importance of Extensions
The impact of the transition to 3.0 is hardest to understand for Python extensions; it's also the area which has spawned the most emotion and discord.
If languages competed in the '60s and '70s on syntax, and on their libraries in the next two decades, now much of the focus is on extensions. It's a commonplace to laud Perl for the riches of CPAN; Tcl is sometimes disparaged as significant only because of Expect; there are Ruby programmers who sincerely believe the only use of the language is in Rails; and so on. As much as Python promotes the "batteries included" in the standard library, production use of Python commonly depends on domain-specific binary extensions: Zope for Web application development, SciPy for engineering work, Twisted for network programming, and PyQt and others for graphical user interfaces are only a few of the most prominent extensions in wide use.
For most working programmers, rewrites to accommodate Python 3.0 syntax, including string handling, will range in difficulty from trivial to modest. The new libraries with 3.0 appear to be of high quality, and the changes they will require are small. Extensions, though, present significant uncertainty and risk. In general, 3.0-based extensions won't require changes in application code; most projects we've researched are planning 3.0 releases that are source-compatible. The difficulty is in scheduling availability of 3.0 releases. Large extensions require substantial investments to move from one release of Python to another, even within the 2.x series, and no one knows yet how much bigger the effort will be to move to 3.0. Zope, for instance, only achieved a 2.6-based "technology preview" in October 2008, half a year after the first alpha release of Python 2.6.
If you're new to Python, 3.0 is a great place to start learning. If you're experienced with Python, you'll likely pick up 3.0 quickly, and enjoy it. Be very careful in planning maintenance projects, though: talk with the managers of extensions on which you rely, and don't be surprised if they have no definite schedule for the transition to 3.0. For more on 3.0, see these comments by Jens Alfke and James Bennett.
Advances on Many Fronts
A symptom of Python's vitality is how much is going on apart from 3.0; Python has many independent centers of energy. Congratulations, for example, to Sidnei da Silva and everyone else involved in the Zope 2-for-2.6 release mentioned above; Zope is a crucial extension for thousands of Web sites, and its maintenance has been unexpectedly demanding.
Version 2.1.4 of lxml was just released in mid-December 2008. Beautiful Soup has a well-earned reputation as a fantastic tool for Web scraping; as Ian Bicking points out, though, lxml is arguably better than Beautiful Soup.
Tryton is an intriguing framework for enterprise resource planning (ERP) and related applications which appears to be in active use in at least four European languages. ERP is a domain currently dominated by products with proprietary and very expensive licenses; the whole area seems ripe for open-source-based advance, and Tryton might well play a large role in that.
Also, Python is making inroads in the classroom; quite a few colleges appear to be teaching it in one guise or another. A marker of this acceptance was the publication last summer in IEEE Computer of an article titled "In Praise of Scripting...", which Lambda the Ultimate discussed. Moreover, books on or about Python continue to appear monthly, including Expert Python Programming in September 2008, Python for Unix and Linux System Administration in August 2008, and Essential SQLAlchemy in June 2008. Several books covering 3.0 just made it to bookstore shelves: A Byte of Python has sections on both 3.0 and 2.6, and is available both for downloading and on paper. Programming in Python 3 just appeared in December 2008; it methodically treats most of the topics newcomers to the language want to know. Beginning Python: From Novice to Professional is a massive treatment which became available in September 2008 both on paper and as an eBook.
Kathryn and Cameron run their own consultancy, Phaseit, Inc., specializing in high-reliability and high-performance applications managed by high-level languages. They write about high-level languages and related topics in their "Regular Expressions" columns.
