If you've been programming on a Linux system, you may be coding in C or C++. If you're a systems administrator, you may be programming in Perl, Tcl, Awk, or one of the various (sh/csh/bash) shell scripting languages. Maybe you wrote a script to do a particular job, but now find that it doesn't scale up very well. You might be writing C applications, but now wish you didn't have to be bogged down in the low-level details. Or you may simply be intrigued by the possibility of doing high-level, object-oriented programming in a friendly, interpreted environment.
If any of the above applies to your situation, you may be interested in Python. Python is a powerful language for the rapid development of applications. The interpreter is easily extensible, and you may embed your favorite C code as a compiled extension module.
Python is not one of the research languages which seem to get promoted solely for pedagogical reasons. It is possible to do useful coding almost immediately. Python seems to encourage object-oriented programming by clearing the paths, rather than erecting parapets.
    $ python
    Python 1.2 (Jun 3, 1995) [GCC 2.6.3]
    Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
    >>> print 'hello, bruce'
    hello, bruce
    >>> [CONTROL]-D

Most Python programs, though developed incrementally, are executed as normal scripts. The next program illustrates some extensions to the original. The new version will identify who you are, based on your user account in /etc/passwd.
     1  #!/usr/local/bin/python
     2
     3  import posix
     4  import string
     5
     6  uid = `posix.getuid()`
     7  passwd = open('/etc/passwd')
     8  for line in passwd.readlines():
     9      rec = string.splitfields(line, ':')
    10      if rec[2] == uid:
    11          print 'hello', rec[0],
    12          print 'mind if we call you bruce?'
    13          break
    14  else:
    15      print "I can't find you in /etc/passwd"

A line by line explanation of the program is as follows:
1 | Command interpreter to invoke |
3-4 | Import two standard Python modules, posix and string |
6 | Get the user id using the posix module. The enclosing backticks (`) tell Python to convert this value to its string representation, so it can be compared with the text fields read from /etc/passwd. |
7 | Open the /etc/passwd file in read mode. |
8 | Start a for loop, reading in all the lines of /etc/passwd. Compound statements, such as conditionals and loops, have headers that start with a keyword (if, while, for, try) and end with a colon. |
9 | Each line in /etc/passwd is read and split into array rec[] based on a colon ':' boundary, using string.splitfields() |
10 | If rec[2] from /etc/passwd matches our call to posix.getuid() we have identified the user. The first 3 fields of /etc/passwd are: rec[0] = name, rec[1] = password, and rec[2] = uid |
11-12 | Print the user's account name to stdout. The trailing comma avoids the newline after the output. |
13 | Break the for loop. |
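14-15 | The else clause belongs to the for loop, not to the if: it runs only when the loop finishes without a break, so the message is printed if we can't locate the user in /etc/passwd. |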
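For readers on a current interpreter, the same lookup can be sketched in Python 3 syntax, which differs from the Python 1.2 syntax used in this article. The function name find_user and the sample records are assumptions made for illustration; they are not part of the original program.

```python
# A sketch in Python 3 syntax. find_user and the sample records are
# hypothetical; they do not appear in the original program.

def find_user(lines, uid):
    """Return the account name whose uid field matches, or None."""
    for line in lines:
        rec = line.rstrip('\n').split(':')   # same role as string.splitfields()
        if rec[2] == str(uid):               # str() plays the part of the backticks
            name = rec[0]
            break                            # leaving via break skips the else clause
    else:
        # Reached only when the loop runs to completion without a break.
        name = None
    return name


sample = [
    'root:x:0:0:root:/root:/bin/sh\n',
    'bruce:x:501:100:Bruce:/home/bruce:/bin/sh\n',
]

print(find_user(sample, 501))
print(find_user(sample, 999))
```

Passing the lines in as an argument, rather than opening /etc/passwd inside the function, keeps the sketch runnable on systems without that file.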
The observant reader will note that the control statements lack any form of BEGIN/END keywords or matching braces. This is because the indentation defines the way statements are grouped. Not only does this eliminate the need for braces, but it enforces a readable coding style. No doubt this design feature will turn off a few potential Python hackers, but in practice, it is useful. I can think of numerous times I've spent tracking bugs in C resulting from misinterpreting code that looked like any of these fragments, usually deeply nested:
    if (n == 0)
        x++;
        y--;
    z++;

    if (m == n || (n != o && o == q)) {
        j++; }
        k++;
    q = 0;

    while (y--)
        *ptr++;
        if (m == n) {
            x++; }
A coding style enforced in the language definition would have saved me much frustration. Python code written by another programmer is usually very readable.
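Returning to the earlier example, the entire /etc/passwd lookup can be reduced to a single statement using the standard pwd module:

    import posix, pwd
    print 'hello', pwd.getpwuid(posix.getuid())[0]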
This points out another nicety about Python that is critical for any new language's success: the robustness of its library. As mentioned earlier, you may extend Python by adding a compiled extension module to your personal library, but in most cases you don't have to.
Take the ftplib module, for instance. If you want to write a Python script to automatically download the latest FAQ, you can simply use ftplib, as in the following example:
    #!/usr/local/bin/python

    from ftplib import FTP

    ftp = FTP('ftp.python.org')      # connect to host
    ftp.login()                      # login anonymous
    ftp.cwd('pub/python/doc')        # change directory
    ftp.retrlines('LIST')            # list python/doc
    F = open('python.FAQ', 'w')      # file: python.FAQ
    ftp.retrbinary('RETR FAQ', F.write, 1024)
    ftp.quit()
Python has numerous features which make programming fun and restore your perspective of the design objectives. The language encourages you to explore its features by writing experimental functions during program development. Several notable Python features:
     1  #!/usr/local/bin/python
     2
     3  StackingException = 'StackingException'
     4
     5  class StackingThings:
     6      names = ('llama', 'spam', '16 ton weight', \
     7               'dead parrot')
     8      weights = {}
     9      weights['llama'] = 300
    10      weights['spam'] = 1
    11      weights['16 ton weight'] = 32000
    12      weights['dead parrot'] = 2
    13      breakpt = {}   # breaking points
    14      breakpt['llama'] = 200
    15      breakpt['spam'] = 1000
    16      breakpt['16 ton weight'] = 1000000
    17      breakpt['dead parrot'] = 15
    18
    19      def __init__(self):
    20          self.items_stacked = []
    21      def add(self, item):
    22          if item not in self.names:
    23              raise StackingException, \
    24                    item+' not a stackable object'
    25          self.items_stacked.insert(0, item)
    26          try:
    27              self.test_strength(item)
    28          except StackingException, val:
    29              print item, val
    30      def test_strength(self, item):
    31          wt = 0
    32          bp = 1000000
    33          for i in self.items_stacked:
    34              wt = wt + self.weights[i]
    35              if wt > bp:
    36                  self.items_stacked.remove(item)
    37                  raise StackingException, \
    38                        'exceeds breaking point!'
    39              bp = self.breakpt[i]
    40
    41  # user code to test StackingThings class
    42
    43  s = StackingThings()
    44
    45  s.add('llama')
    46  s.add('spam')
    47  s.add('spam')
    48  s.add('spam')
    49  s.add('dead parrot')
    50  s.add('16 ton weight')
    51
    52  print 'items stacked = ', s.items_stacked
    53
    54  try:
    55      s.add('bad object')
    56  except StackingException, msg:
    57      print 'exception:', msg

This script produces the following output:
    16 ton weight exceeds breaking point!
    items stacked = ['dead parrot', 'spam', 'spam', 'spam', 'llama']
    exception: bad object not a stackable object
The StackingThings class itself consists of three methods: __init__(), add(), and test_strength(). When instantiating StackingThings, we use the special __init__ method to create its initial state by initializing the list of stacked items: items_stacked = []. The add() method is essentially the only method that is accessed by the user of StackingThings. And test_strength() is called by add() to verify that we have not exceeded our breaking point.
The first argument to each method in our example is called self. This is just a convention, but it makes our code much more readable. The first argument to a Python method is used in a fashion somewhat similar to the this keyword in C++, except that it is named explicitly in the method's argument list.
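The role of self can be seen by calling a method through the class and passing the instance by hand. The sketch below uses Python 3 syntax rather than the Python 1.2 syntax of this article, and the Stack class is a hypothetical example, not part of StackingThings.

```python
# A sketch in Python 3 syntax; the Stack class is hypothetical.

class Stack:
    def __init__(self):
        self.items = []              # state belonging to this instance

    def push(self, item):
        self.items.insert(0, item)   # newest item goes on top


s = Stack()
s.push('spam')            # the instance is supplied as self automatically
Stack.push(s, 'llama')    # the same call with self written out explicitly

print(s.items)
```

Both calls reach the same function; the dotted form simply fills in the first argument.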
Python provides exception handling for both built-in exceptions (e.g. ZeroDivisionError, TypeError, NameError, etc.) and user-defined exceptions. The latter are especially useful in developing robust classes. Python uses the try/except syntax for exception handling:
    try:
        DenominateZero()
    except ZeroDivisionError, val:
        print 'Whoops:', val
Our add() method catches the exception raised by test_strength(), and itself raises an exception when we pass it an illegal stacking item.
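On a current interpreter, a user-defined exception is normally a class rather than a string. The following is a minimal sketch in Python 3 syntax of the same raise-and-catch pattern; StackingError, check_item, and try_item are hypothetical names introduced for illustration.

```python
# A sketch in Python 3 syntax. StackingError, check_item and try_item
# are hypothetical names, not taken from the article's program.

class StackingError(Exception):
    """Class-based stand-in for the string exception StackingException."""


def check_item(item, legal=('llama', 'spam', '16 ton weight', 'dead parrot')):
    # Raise the user-defined exception for anything not in the legal names.
    if item not in legal:
        raise StackingError(item + ' not a stackable object')
    return item


def try_item(item):
    # Catch the exception and turn it into a message, as the user code does.
    try:
        check_item(item)
    except StackingError as val:
        return 'exception: ' + str(val)
    return 'ok: ' + item


print(try_item('spam'))
print(try_item('bad object'))
```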
Two of the built-in methods for Python lists that are demonstrated in the example on lines 25 and 36 are insert() and remove(). Other supported operations on list objects include append(), count(), index(), reverse(), and sort().
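The list methods named above can be tried in sequence. This sketch uses Python 3 syntax, and the list contents are made up for illustration; the comments show the list after each step.

```python
# A sketch in Python 3 syntax; the list contents are illustrative only.

stack = ['spam', 'llama']

stack.insert(0, 'dead parrot')     # ['dead parrot', 'spam', 'llama']
stack.append('spam')               # ['dead parrot', 'spam', 'llama', 'spam']
spam_count = stack.count('spam')   # number of occurrences of 'spam'
llama_at = stack.index('llama')    # position of the first 'llama'
stack.remove('spam')               # drops the first 'spam' only
stack.reverse()                    # reverses in place
stack.sort()                       # sorts in place

print(spam_count, llama_at, stack)
```

Note that insert(), remove(), reverse(), and sort() all modify the list in place rather than returning a new list.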
The data attributes may be accessed by the methods of the class as well as the user code. Either print self.names within a class method or print s.names from the user code will print the tuple of legal stacking things:
    ('llama', 'spam', '16 ton weight', 'dead parrot')
I frequently deal with ICD-9-CM codes in medical applications. These codes are usually numeric, but sometimes alphanumeric. They usually have a decimal point, but sometimes don't. Some of the codes may be further subdivided into additional ICD-9 codes. Furthermore, codes are added and deleted periodically, but most don't change. Normally, the lookup of ICD-9 codes will be done in a relational database, but it is also convenient to use small data sets within an application. For example, given the dictionaries icd9 and subdivide:
x | subdivide[x] | icd9[x] |
'692' | 1 | 'Contact dermatitis' |
'692.0' | 0 | 'Due to detergents' |
'692.2' | 0 | 'Due to solvents' |
'692.7' | 1 | 'Due to solar radiation' |
'692.70' | 0 | 'Unspecified dermatitis' |
'692.71' | 0 | 'Sunburn' |
'692.72' | 0 | 'Other: Photodermatitis' |
We can manipulate the ICD-9 codes in the following manner:
    for code in icd9.keys():
        if subdivide[code]:
            print 'ICD-9',code,'may be further subdivided'
        else:
            print 'Description for',code,'is:',icd9[code]
This would produce the following output:
    ICD-9 692.7 may be further subdivided
    Description for 692.70 is: Unspecified dermatitis
    Description for 692.0 is: Due to detergents
    ICD-9 692 may be further subdivided
    Description for 692.71 is: Sunburn
    Description for 692.2 is: Due to solvents
    Description for 692.72 is: Other: Photodermatitis
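The output above appears in no particular order because the keys of these dictionaries come back in arbitrary order. A sketch in Python 3 syntax, with the two dictionaries built from the table and the keys sorted for a stable report, might look like this; the helper name describe is an assumption for illustration.

```python
# A sketch in Python 3 syntax. The dictionaries are filled in from the
# table in the article; describe() is a hypothetical helper.

icd9 = {
    '692':    'Contact dermatitis',
    '692.0':  'Due to detergents',
    '692.2':  'Due to solvents',
    '692.7':  'Due to solar radiation',
    '692.70': 'Unspecified dermatitis',
    '692.71': 'Sunburn',
    '692.72': 'Other: Photodermatitis',
}
subdivide = {
    '692': 1, '692.0': 0, '692.2': 0, '692.7': 1,
    '692.70': 0, '692.71': 0, '692.72': 0,
}


def describe(code):
    # A true value in subdivide marks a code that has finer-grained codes.
    if subdivide[code]:
        return 'ICD-9 %s may be further subdivided' % code
    return 'Description for %s is: %s' % (code, icd9[code])


# sorted() iterates over the keys in string order.
report = [describe(code) for code in sorted(icd9)]
for line in report:
    print(line)
```

Because the codes are stored as strings, sorting is lexicographic, which happens to place each parent code directly before its subdivisions in this data set.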
Lines 8-17 of our StackingThings example use dictionaries, but the initialization was broken into several lines for clarity. This could be reduced to:
    weights = {'llama':300, 'spam':1, '16 ton weight':32000, 'dead parrot':2}
    breakpt = {'llama':200, 'spam':1000, '16 ton weight':1000000, 'dead parrot':15}
Finally, inheritance is provided in Python, although it is not demonstrated in this example. The derived class may override methods of its base class or classes (yes, multiple inheritance is supported in a limited form). In C++ parlance, all methods in a Python class are "virtual".
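Since the example program does not show inheritance, here is a minimal sketch of overriding and multiple inheritance in Python 3 syntax. All of the class names are hypothetical and are not part of StackingThings.

```python
# A sketch in Python 3 syntax; every class here is hypothetical.

class Stackable:
    def name(self):
        return 'thing'

    def describe(self):
        # self.name() resolves to the most derived override at run time,
        # which is what "virtual" means in C++ terms.
        return 'a ' + self.name()


class Llama(Stackable):
    def name(self):                  # overrides Stackable.name()
        return 'llama'


class Heavy:
    def weight(self):
        return 32000


class HeavyLlama(Llama, Heavy):      # multiple inheritance: two base classes
    pass


print(Stackable().describe())
print(Llama().describe())
print(HeavyLlama().describe(), HeavyLlama().weight())
```

The describe() method is written once in the base class, yet it picks up the derived class's name() without any special declaration.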
Python is extensible. If you can program in C, you can add a new low-level module to the interpreter. We are currently doing this at our company for a distributed database system. The Python interpreter will be the high-level command language for many of the applications.
In addition to Linux, Python runs on several other platforms: OS/2, Windows, Macintosh, and many flavors of Unix. And like Linux, all of these versions are freely available and distributable.
The documentation for Python is of very high quality and is written by Guido van Rossum, the creator of Python. Four separate user manuals in PostScript format are available at the Python ftp site (see sidebar "Python Information"). These documents have also been converted to HTML and Microsoft help file formats. A Python FAQ, quick reference guide, and testimonials are also available. O'Reilly & Associates also intends to publish Programming Python early next year.
Python has its own active newsgroup (comp.lang.python) as well as a mailing list which receives the same messages as the newsgroup. To subscribe to the mailing list, send mail to python-list-request@cwi.nl. Various Python special interest groups have been formed: Matrix-SIG, GUI-SIG, and Locator-SIG.
Finally, the Python Software Activity (PSA) has been established to foster the common interests of the Python development community. The PSA, unlike the GNU Project, does not do the actual development of software (although many of its members probably do), but rather acts as a clearinghouse for Python software modules developed by others. It also hosts workshops and related activities to help promote the use of the Python language. Additional information about the PSA may be obtained by visiting the Python home page: http://www.python.org.
Special thanks to Mark Lutz, Aaron Watters, the PSA, and, of course, Guido van Rossum.
Jeff Bauer has spent the past 16 years developing health care software. His current project involves interfacing pen-based computers with Unix systems to track clinical information.