PyCon 2004
March 24, 2004
A.M. Kuchling
www.amk.ca
amk @ amk.ca
Quixote is a Web development framework written in Python.
Some of Quixote's goals:
Related tools:
Sites:
Applications:
Quixote was originally written for the MEMS Exchange, a project that aims to implement a network for distributed semiconductor fabrication, a network coordinated over the web. For more information about the architecture we used on that project, see "The MEMS Exchange Architecture", a paper presented at PyCon 2003.
Linux Weekly News is the highest-traffic Quixote site, and demonstrates that Quixote can be pretty scalable. Using Quixote and mod_python, LWN survived a Slashdotting while running on a relatively small machine, a 1 GHz Pentium with 512 MB of RAM.
Most Quixote projects are for internal use. One publicly available project is Cartwheel, which performs genomic sequence analysis. I'm working on a Slashdot clone named Solidus, and hope to have an alpha version available before PyCon.
http://example.com/catalog/item/details
→ ['catalog', 'item', 'details']
http://example.com/ → store.web._q_index()
http://.../catalog/ → store.web.catalog() or store.web.catalog._q_index()
/catalog/item/ → store.web.catalog.item() or store.web.catalog.item._q_index()
output = store.web.catalog(request)
Quixote applications are Python packages, so they can be installed using the Distutils and similar tools. Incoming HTTP requests are mapped to a chunk of Python code, which is executed and passed an object representing the contents of the request; the code returns a string containing the contents that will be returned to the client.
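The code to run is determined like this:
Quixote starts from the application's root package, such as store.web.
The URL path /catalog/item/details becomes ['catalog', 'item', 'details'].
Quixote looks up the first component in store.web, and finds the corresponding object.
It then looks up the next component in store.web.catalog, and so forth.
If the final object is callable (a function, or an object with a __call__ method), it's called. If not, Quixote looks for a _q_index method.
The callable is passed the request, and must return a string. (Actually, it can also return an instance of a Stream class in order to return streaming output.)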
This is something like Zope's traversal, but the rules are simpler; applications can't change this algorithm or override it. There are still some special names, though, that we'll look at after providing a simple example.
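The traversal steps above can be sketched in a few lines of plain Python. This is a simplified illustration written for Python 3 using only the standard library, not Quixote's actual publisher: split_path(), traverse(), and the stand-in objects are hypothetical names, empty path components are simply dropped, and the special names other than _q_index are ignored.

```python
from types import SimpleNamespace


def split_path(path):
    """Split '/catalog/item/details' into ['catalog', 'item', 'details']."""
    return [part for part in path.split("/") if part]


def traverse(root, path, request):
    """Walk from `root` along `path`, then call whatever is found at the end."""
    obj = root
    for component in split_path(path):
        obj = getattr(obj, component)
    if not callable(obj):
        # The endpoint is not callable, so fall back to its _q_index.
        obj = obj._q_index
    return obj(request)


# Hypothetical stand-ins for the store.web package and its contents.
def details(request):
    return "details page"


item = SimpleNamespace(details=details, _q_index=lambda request: "item index")
catalog = SimpleNamespace(item=item, _q_index=lambda request: "catalog index")
web = SimpleNamespace(catalog=catalog, _q_index=lambda request: "front page")
```

Here traverse(web, "/catalog/item/details", request) ends at a function and calls it, while traverse(web, "/catalog/", request) ends at a non-callable object and uses its _q_index instead.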
From quixote/demo/__init__.py:
# Every publicly accessible attribute has to be listed in _q_exports.
_q_exports = ["simple"]

def _q_index (request):
    return """..."""

def simple (request):
    # This function returns a plain text document, not HTML.
    request.response.set_content_type("text/plain")
    return "This is the Python function 'quixote.demo.simple'.\n"
Because Quixote publishes the contents of Python modules, there has to be a way of declaring which functions should be considered public and can be called through HTTP requests. This is done by listing the public names in a _q_exports module variable or object attribute; Quixote will not traverse into an object or module that lacks a _q_exports attribute.
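The effect of _q_exports can be illustrated with a small check that runs before each attribute lookup. This is only a sketch of the rule described above, in standard-library Python 3: get_exported(), NotExported, and the Demo class are hypothetical names and not part of Quixote.

```python
class NotExported(Exception):
    """Raised when a URL component names something that is not public."""


def get_exported(container, name):
    """Return container.name only if `name` is listed in _q_exports."""
    exports = getattr(container, "_q_exports", None)
    if exports is None:
        # No _q_exports at all: never traverse into this object.
        raise NotExported("%r has no _q_exports" % (container,))
    if name not in exports:
        raise NotExported("%r is not listed in _q_exports" % (name,))
    return getattr(container, name)


class Demo:
    """Hypothetical stand-in for a published module or object."""

    _q_exports = ["simple"]

    def simple(self, request):
        return "public page"

    def helper(self, request):
        return "internal helper"


demo = Demo()
```

With this check in place, get_exported(demo, "simple") succeeds, while "helper" is rejected even though the attribute exists.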
All the names special to Quixote begin with _q_.
_q_index: if traversal ends up at an object that isn't callable, this name is checked for and called.
_q_lookup: if an attribute isn't found, this name is checked for and called with the attribute name.
_q_access: at every step, this name is checked for and called to perform access checks.
_q_resolve: like a memoized version of _q_lookup (rarely used).
This example handles URLs such as /whatever/1/, .../2/, etc.
from quixote.errors import TraversalError

def _q_lookup (request, component):
    try:
        key = int(component)
    except ValueError:
        raise TraversalError("URL component is not an integer")
    obj = ... database lookup (key) ...
    if obj is None:
        raise TraversalError("No such object.")
    # Traversal will continue with the ObjectUI instance
    return ObjectUI(obj)
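To show where _q_lookup fits into traversal, here is a rough standalone sketch in Python 3 of the fallback rule: try the attribute first, and only call _q_lookup when it is missing. The names get_component() and Whatever, the in-memory dictionary standing in for the database, and the local TraversalError class are all assumptions made for illustration.

```python
class TraversalError(Exception):
    """Local stand-in for quixote.errors.TraversalError."""


_missing = object()


def get_component(container, name, request):
    """Find `name` on `container`, falling back to _q_lookup if absent."""
    found = getattr(container, name, _missing)
    if found is not _missing:
        return found
    lookup = getattr(container, "_q_lookup", None)
    if lookup is None:
        raise TraversalError("no such component: %r" % (name,))
    return lookup(request, name)


class Whatever:
    """Handles URLs such as /whatever/1/ and /whatever/2/."""

    # Hypothetical in-memory replacement for the database lookup.
    objects = {1: "first object", 2: "second object"}

    def _q_lookup(self, request, component):
        try:
            key = int(component)
        except ValueError:
            raise TraversalError("URL component is not an integer")
        obj = self.objects.get(key)
        if obj is None:
            raise TraversalError("No such object.")
        return obj
```

A call such as get_component(Whatever(), "1", request) returns the object stored under key 1, and unknown or non-numeric components raise TraversalError.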
_q_access is always called before traversing any further.
This example requires all users to be logged in.
from quixote.errors import AccessError, TraversalError

def _q_access (request):
    if request.session.user is None:
        raise AccessError("You must be logged in.")
    # exits quietly if nothing is wrong

def _q_index [html] (request):
    """Here is some security-critical material ..."""
_q_access is used to impose an access control condition on an entire object; this saves the user from having to add access control checks to each attribute and running the risk of forgetting one. At every step of traversal, _q_access is checked for and called if present. The function can raise an exception to abort further traversal; if no exception is raised, any return value is ignored.
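The way a single _q_access guards everything beneath an object can be sketched as follows. This is an illustrative Python 3 approximation, not Quixote's code: traverse_checked(), require_login(), the local AccessError class, and the request with a plain .user attribute are assumptions, and the sketch calls the check on each object just before looking up the next name in it.

```python
from types import SimpleNamespace


class AccessError(Exception):
    """Local stand-in for quixote.errors.AccessError."""


def traverse_checked(root, components, request):
    """Follow `components` from `root`, running _q_access at every step."""
    obj = root
    for name in components:
        check = getattr(obj, "_q_access", None)
        if check is not None:
            # An exception aborts traversal; any return value is ignored.
            check(request)
        obj = getattr(obj, name)
    return obj


def require_login(request):
    if request.user is None:
        raise AccessError("You must be logged in.")


# One check on `private` protects every attribute below it.
private = SimpleNamespace(
    _q_access=require_login,
    report="secret report",
    summary="secret summary",
)
root = SimpleNamespace(private=private)
```

Neither report nor summary carries its own check, yet both are unreachable for a request whose user is None.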
.response -- an HTTPResponse instance
.session -- a Session instance
request.get_environ('SERVER_PORT', 80)
request.get_form_var('user') -- get form variables
request.get_cookie('session') -- get cookie values
request.get_url(n=1)
request.get_accepted_types()
browser, version = request.guess_browser_version()
return request.redirect('../../catalog')
.headers -- dict of HTTP headers
.cache -- number of seconds to cache response
.set_content_type('text/plain')
.set_cookie('session', '12345', path='/')
.expire_cookie('session')
Instead of Publisher, use SessionPublisher:
from quixote.publisher import SessionPublisher
app = SessionPublisher('quixote.demo')
The request will then have a .session attribute containing a Session instance.
Two other classes:
SessionManager -- stores/retrieves sessions
Session -- can hold .user attribute
SessionManager is a dictionary-like object responsible for storing sessions. The default implementation stores sessions in-memory, but you can provide your own session manager that stores them using a persistence mechanism such as ZODB or a relational database.
The only interesting attribute of Session is a .user attribute, whose value is undefined by Quixote and left up to the application.
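As an illustration of a dictionary-like store with a persistence mechanism behind it, here is a minimal sketch using the standard sqlite3 and pickle modules in Python 3. It is not a Quixote SessionManager subclass: the class name SqliteSessionStore, the table layout, and the use of a plain dictionary as the session are all assumptions, and pickle should only be used on data the application itself wrote.

```python
import pickle
import sqlite3


class SqliteSessionStore:
    """Dictionary-like session storage backed by a SQLite table."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS sessions "
            "(id TEXT PRIMARY KEY, data BLOB)"
        )

    def __setitem__(self, session_id, session):
        self.db.execute(
            "INSERT OR REPLACE INTO sessions (id, data) VALUES (?, ?)",
            (session_id, pickle.dumps(session)),
        )
        self.db.commit()

    def __getitem__(self, session_id):
        row = self.db.execute(
            "SELECT data FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
        if row is None:
            raise KeyError(session_id)
        return pickle.loads(row[0])

    def __delitem__(self, session_id):
        cursor = self.db.execute(
            "DELETE FROM sessions WHERE id = ?", (session_id,)
        )
        self.db.commit()
        if cursor.rowcount == 0:
            raise KeyError(session_id)

    def __contains__(self, session_id):
        row = self.db.execute(
            "SELECT 1 FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
        return row is not None
```

Passing a file name instead of ":memory:" would let sessions survive a restart; the interface the rest of the application sees stays the same as for a dictionary.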
Several options:
demo.cgi:
#!/www/python/bin/python

# Example driver script for the Quixote demo:
# publishes the quixote.demo package.

from quixote import Publisher

# Create a Publisher instance, giving it the root package name
app = Publisher('quixote.demo')

# Open the configured log files
app.setup_logs()

# Enter the publishing main loop
app.publish_cgi()
The above code will also handle FastCGI. CGI scripts will run through publish_cgi() once and exit; under FastCGI it will loop and service multiple requests.
Running a server on localhost is really easy:
import os, time
from quixote.server import medusa_http

if __name__ == '__main__':
    s = medusa_http.Server('quixote.demo', port=8000)
    s.run()
This can even be used for writing desktop applications: run a Quixote server locally and use the open() function in Python's webbrowser module to open a browser pointing at it.
PTL = Python Templating Language
example.ptl:
# To callers, templates behave like regular Python functions
def cell [html] (content):
    '<td>'     # Literal expressions are appended to the output
    content    # Expressions are evaluated, too.
    '</td>'

def row [html] (L):
    # L: list of strings containing cell content
    '<tr>'
    for s in L:
        cell(s)
    '</tr>\n'

def loop (n=10):
    # No [html], so this is a regular Python function
    output = ""
    for i in range(1, n):
        output += row([str(i), i*'a', i*'b'])
    return output
Templates live in .ptl files, which can be imported. To enable this:
import quixote ; quixote.enable_ptl() # Enable import hook
Templates behave just like Python functions:
>>> import example
>>> example.cell('abc')
<htmltext '<td>abc</td>'>
>>> example.loop()
<htmltext '<tr><td>1</td><td>a</td><td>b</td>...</tr>\n'>
In .ptl files, methods can even be PTL templates.
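One way to picture what a template does is to write the accumulation out by hand in ordinary Python. The sketch below (Python 3, standard library only) is a rough approximation of the cell and row templates, not the code the PTL compiler actually generates; in particular it returns plain strings rather than htmltext, so row() has to know that the output of cell() is already HTML.

```python
from html import escape


def cell(content):
    """Hand-written approximation of the `cell [html]` template."""
    output = []
    output.append('<td>')                  # literal: appended as-is
    output.append(escape(str(content)))    # expression: converted and escaped
    output.append('</td>')
    return ''.join(output)


def row(cells):
    """Hand-written approximation of the `row [html]` template."""
    output = []
    output.append('<tr>')
    for s in cells:
        output.append(cell(s))             # already HTML, so not escaped again
    output.append('</tr>\n')
    return ''.join(output)
```

Each bare expression statement in the template corresponds to one append() call here, which is the bookkeeping PTL takes care of.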
System | Syntax |
---|---|
Apache SSI | <!--#include virtual="/script/"--> |
PHP | <?php func()?> |
ASP | <% func() %> |
ZPT | <span tal:replace="content">...</span> |
PTL | def f [html] (): content |
PTL's advantages over other syntaxes:
for, if, while, exceptions, classes, nested functions.
def no_quote [plain] (arg):
    '<title>'
    arg     # Converted to string
    '</title>'

def quote [html] (arg):
    '<title>'
    arg     # Converted to string and HTML-escaped
    '</title>'

>>> no_quote('A history of the < symbol')
'<title>A history of the < symbol</title>'
>>> quote('A history of the < symbol')
<htmltext '<title>A history of the &lt; symbol</title>'>
By using '[html]' instead of '[plain]', string literals are compiled as htmltext instances. When combined with regular strings using a + b or '%s' % b, htmltext HTML-escapes the regular string.
This mechanism is both a convenience for the application writer and a security feature. Cross-site scripting (XSS) attacks are a class of security hole caused by forgetting to escape HTML tags in untrusted data; you might forget to escape the title of a mail message, for example. An attacker could insert JavaScript that opened pop-up windows or redirected the user to another site.
It's easy to forget the required function call, and forgetting to escape a single snippet is all it takes. PTL's automatic escaping trusts only the string literals supplied in the program text, and it also fails securely. When you mess up, the usual result is double-escaping a string, resulting in web site users seeing '<p>blah blah blah...'. This is embarrassing, but doesn't open up any vulnerabilities.
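The automatic escaping described above can be imitated with a small wrapper class. This is a simplified sketch in Python 3, not Quixote's htmltext implementation: the HtmlText class and quote() helper are hypothetical, only + and %s-style formatting are handled, and escaping is delegated to html.escape from the standard library.

```python
from html import escape


def quote(value):
    """Escape `value` unless it is already trusted HTML."""
    if isinstance(value, HtmlText):
        return value.text
    return escape(str(value))


class HtmlText:
    """Marks text as trusted HTML; anything combined with it is escaped."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text

    def __repr__(self):
        return "<htmltext %r>" % self.text

    def __add__(self, other):
        return HtmlText(self.text + quote(other))

    def __radd__(self, other):
        return HtmlText(quote(other) + self.text)

    def __mod__(self, args):
        # Only %s-style conversions are supported in this sketch.
        if isinstance(args, tuple):
            return HtmlText(self.text % tuple(quote(a) for a in args))
        return HtmlText(self.text % quote(args))
```

Only text explicitly wrapped in HtmlText is trusted, so HtmlText('<title>') + 'A history of the < symbol' + HtmlText('</title>') escapes the middle string. Escaping something twice, as in quote(quote('<p>')), only garbles the displayed text; it never lets raw markup through.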
It should be remembered that while we think PTL is really neat, it's still optional. Using alternative templating isn't hard, and there are Quixote users who never use PTL.
Graham Fawcett wrote a small Nevow implementation.
from nevow import *

_q_exports = ['template']

def template(doctitle, docbody):
    """
    A page template. The stylesheet is there as a visual check that
    class and id attributes are set properly.
    """
    return html [
        head [
            title[doctitle],
            style(type='text/css')[
                'body { background-color: lightblue; } ',
                '.section { border: blue 3px solid; padding: 6px; } ',
                '#mainbody { background-color: white; } '
            ],
        ],
        body [
            h1 [doctitle],
            div({'class':'section'}, id='mainbody')[docbody],
            hr
        ]
    ]
Quixote contains a set of classes for implementing forms. Example:
from quixote.form import Form

class UserForm (Form):
    def __init__ (self):
        Form.__init__(self)
        user = get_session().user
        self.add_widget("string", "name", title="Your name",
                        value=user.name)
        self.add_widget("password", "password1", title="Password",
                        value="")
        self.add_widget("password", "password2", title="Password, again",
                        value="")
        self.add_widget("single_select", "vote",
                        title="Vote on proposal",
                        allowed_values=[None] + range(4),
                        descriptions=['No vote', '+1', '+0', '-0', '-1'],
                        hint="Your vote on this proposal")
        self.add_widget("submit_button", "submit",
                        value="Update information")
The basic idea is that you subclass the Form class to create a single new form. A form contains a number of widgets. Widgets represent a form element such as a text field or checkbox, or multiple form elements; multiple form elements could be used to enter a date, for example. Widgets can also perform additional checks such as requiring that a text field contain an integer.
The framework handles processing of a form. The Form instance creates widgets in its __init__ method. The render() method is called to generate HTML to display the form. On submitting the form, the process() method is called to read the values of fields and perform any error checking, and if no errors are reported, the action() method is called to perform the actual work of the form (e.g. inserting data into a database, sending an e-mail, etc.).
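The render/process/action sequence described above can be sketched as a small driver method. This is a toy model in Python 3, not the quixote.form implementation: SketchForm, PasswordForm, and handle() are hypothetical names, the request is just a dictionary of submitted values, and widgets are left out entirely.

```python
class SketchForm:
    """Toy form showing the order in which the three methods are called."""

    def __init__(self):
        self.error = {}

    def render(self, request):
        lines = ["<form method='post'>"]
        for name, message in sorted(self.error.items()):
            lines.append("<p class='error'>%s: %s</p>" % (name, message))
        lines.append("</form>")
        return "\n".join(lines)

    def process(self, request):
        # Read the submitted values; subclasses add error checks.
        return dict(request)

    def action(self, request, values):
        raise NotImplementedError

    def handle(self, request):
        if not request:
            # Nothing submitted yet: just display the form.
            return self.render(request)
        values = self.process(request)
        if self.error:
            # Redisplay the form along with the error messages.
            return self.render(request)
        return self.action(request, values)


class PasswordForm(SketchForm):
    def process(self, request):
        values = SketchForm.process(self, request)
        if values.get("password1") != values.get("password2"):
            self.error["password1"] = "The two passwords must match."
        return values

    def action(self, request, values):
        return "password updated"
```

Because action() only runs when process() recorded no errors, it can assume its input has already been validated.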
For a more detailed explanation of the form framework, see part 2 of the Quixote tutorial at http://www.quixote.ca/learn/2.
class UserForm (Form):
    ...
    def render [html] (self, request, action_url):
        standard.header("Edit Your User Information")
        Form.render(self, request, action_url)
        standard.footer()

    def process (self, request):
        values = Form.process(self, request)
        if values['password1'] != values['password2']:
            self.error['password1'] = 'The two passwords must match.'
        return values

    def action (self, request, submit, values):
        user = request.session.user
        user.name = values['name']
        if values['password1'] is not None:
            user.password = values['password1']
        return request.response.redirect(request.get_url(1))
This render() implementation uses the default rendering of the form, but wraps our own header/footer around that rendering.
process() gets the values and performs error checks.
action() does the work of the form, and can assume the input data is all correct.
For Quixote-only apps, you often need to return static files such as PNGs, PDFs, etc.
from quixote.util import StaticFile, StaticDirectory

_q_exports = ['images', 'report_pdf']

report_pdf = StaticFile('/www/sites/qx/docroot/report.pdf',
                        mime_type='application/pdf')
images = StaticDirectory('/www/sites/qx/docroot/images/')
The quixote.util module also contains helpers for XML-RPC, for streaming files back to the client, etc.
These classes include a number of conveniences. If you don't provide a MIME media type, Python's mimetypes module will be used to guess the correct MIME type. Files can optionally be cached in memory to save on I/O.
StaticDirectory defaults to being secure: it doesn't follow symlinks or allow listing the directory unless you explicitly enable this.
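The two behaviours mentioned here, guessing a MIME type and refusing symlinks by default, can be approximated with the standard library. The helpers below are a sketch in Python 3 and are not the quixote.util implementation: guess_mime_type(), safe_join(), the follow_symlinks flag, and the application/octet-stream fallback are assumptions chosen for illustration.

```python
import mimetypes
import os


def guess_mime_type(path, default="application/octet-stream"):
    """Guess a MIME type from the file name, with a generic fallback."""
    mime_type, _encoding = mimetypes.guess_type(path)
    return mime_type or default


def safe_join(root, component, follow_symlinks=False):
    """Join one URL component onto `root`, rejecting anything suspicious."""
    if component in ("", ".", "..") or "/" in component or os.sep in component:
        raise ValueError("unsafe path component: %r" % (component,))
    path = os.path.join(root, component)
    if not follow_symlinks and os.path.islink(path):
        # Secure by default: symlinks are served only when explicitly enabled.
        raise ValueError("refusing to follow symlink: %r" % (path,))
    return path
```

For example, guess_mime_type('report.pdf') yields 'application/pdf' without the caller having to spell it out, and safe_join() rejects '..' so a request cannot climb out of the served directory.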
The canonical bad URL:
http://gandalf.example.com/cgi-bin/catalog.py?item=9876543&display=complete&tag=nfse_lfde
A better set of URLs:
http://www.example.com/catalog/9876543/complete
.../brief
.../features
Quixote features such as _q_lookup make it easy to support sensible URLs.
Don't mix the basic objects for your problem with the HTML for the user interface.
Represent each object by a class, and put its user interface in a separate *UI class elsewhere.
Advantages:
Directory organization:
qx/bin/                          # Various scripts
qx/conference/__init__.py        # Marker
qx/conference/objects.py         # Basic objects: Proposal, Author, Review
qx/ui/conference/__init__.py
qx/ui/conference/email.ptl       # Text of e-mail messages
qx/ui/conference/standard.ptl    # Header, footer, display_proposal()
qx/ui/conference/pages.ptl       # Login form, base CSS
qx/ui/conference/proposal.ptl    # ProposalUI class
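The split between basic objects and their *UI classes might look like the following sketch. It is written in plain Python 3 rather than PTL, and the attribute names on Proposal and the HTML produced by ProposalUI are invented for illustration; in the layout above the two classes would live in objects.py and proposal.ptl respectively.

```python
from html import escape


class Proposal:
    """Basic object: knows nothing about HTML or HTTP."""

    def __init__(self, title, author):
        self.title = title
        self.author = author


class ProposalUI:
    """Web interface for one Proposal, kept apart from the object itself."""

    _q_exports = []

    def __init__(self, proposal):
        self.proposal = proposal

    def _q_index(self, request):
        # Plain strings are escaped by hand here; PTL would do this for us.
        return "<h1>%s</h1>\n<p>by %s</p>" % (
            escape(self.proposal.title),
            escape(self.proposal.author),
        )
```

Since Proposal has no dependency on the web layer, it can be used and tested from scripts without any of the UI code being imported.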
Naming conventions for common modules:
application/ui/standard.ptl -- header(), footer(), and any other commonly-used HTML snippets.
application/session.py
application/pages.ptl -- PTL pages not tied to an object
These slides: www.amk.ca/talks/quixote
Quixote BoF session: Tonight at 8PM, in Room 1
Quixote home page: www.mems-exchange.org/software/quixote
Quixote resources: