Application Configuration Using ZConfig

Introduction

ZConfig is a Python package supporting application configuration. The basic facilities include a simple API for the application, a configuration file format, and support for configuration schema.

ZConfig performs some of the same duties as the ConfigParser module in Python's standard library, but also provides support for hierarchical sections, data conversion, and input validation. The data conversion and input validation are based on a declarative schema language in XML. Using a schema to describe configuration helps avoid some kinds of application and configuration errors.

Comparison with ConfigParser

ConfigParser provides a simple and easy-to-use API that allows an application to handle any configuration file it can parse (essentially .INI files if we ignore some historical accidents). While this can be convenient, it makes writing the application more tedious because the application is responsible for performing every data conversion and supplying every default value each time it consults the configuration data. While this kind of work can easily be pushed off to a utility, that utility needs to be written for every application. Historical circumstances also make the ConfigParser API more difficult to use than it should be.
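
To make that burden concrete, here is a minimal sketch using the Python 3 spelling of the module (configparser). The "server" section and its "port" and "debug" options are invented for illustration. Note how each accessor has to restate both the conversion and the default:

```python
import configparser

# Hypothetical configuration text; only "port" is actually set.
SAMPLE = """
[server]
port = 8080
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE)

def get_port(parser):
    # The conversion (getint) and the default (80) must be repeated at
    # every place the application consults this value.
    return parser.getint("server", "port", fallback=80)

def get_debug(parser):
    # A missing option silently falls back; nothing validates the file.
    return parser.getboolean("server", "debug", fallback=False)

port = get_port(parser)
debug = get_debug(parser)
```

With a schema, the conversion and the default would instead be declared once, next to the definition of the key.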

Particular points of difficulty that seem to cause problems for ConfigParser users on a recurring basis include:

ZConfig addresses these points in the following ways:

ZConfig provides the following additional functionality:

ZConfig Configuration Files

The syntax for the configuration files used by ZConfig was designed with a target audience of system administrators operating under great pressure. This means the following goals were dominant:

There are very few redundant characters in the ZConfig syntax, and none that are required for a simple named configuration variable. Those that are present are used to express the hierarchical structure of the configuration file, which changes less often than the specific values for many applications. Since these are very similar to what is found in the syntax for other applications, this level of redundancy and familiarity was chosen to ensure readability while maintaining terseness for simple value settings.

There are four aspects of the syntax which can be described separately. In all cases, leading and trailing whitespace are ignored, as are blank lines.

Comments

Any line which has a "#" character as the first non-blank character is a comment and is ignored.

Individual configuration settings

A single setting, or key-value pair, is expressed on a single line:

        key value

The key and value are separated by whitespace, and the value may be empty. The characters allowed for the key include alpha-numeric characters, the underscore, period, and hyphen. The first character must be a letter.
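
The rules for comments and single settings can be sketched in a few lines of Python. This is only an illustration of the syntax described above, not ZConfig's own parser; the function name parse_setting is hypothetical:

```python
import re

# A key starts with a letter and continues with letters, digits,
# underscores, periods, or hyphens.
_KEY_RE = re.compile(r"[A-Za-z][A-Za-z0-9_.\-]*\Z")

def parse_setting(line):
    """Return (key, value) for a setting line, or None for blanks/comments."""
    stripped = line.strip()  # leading and trailing whitespace is ignored
    if not stripped or stripped.startswith("#"):
        return None
    parts = stripped.split(None, 1)  # split on the first run of whitespace
    key = parts[0]
    value = parts[1] if len(parts) > 1 else ""  # the value may be empty
    if not _KEY_RE.match(key):
        raise ValueError("invalid key: %r" % key)
    return key, value

setting = parse_setting("  path /var/log/access.log  ")
empty_value = parse_setting("debug")
comment = parse_setting("   # comments can be here")
```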

Hierarchical sections

Sections may be nested hierarchically to an arbitrary depth. Sections require both "start" and "end" markers in the syntax. Every section has a type described by the schema, and may have a name which may be constrained by the schema.

This is an example section, with "meta" names filled in for each variable component of the syntax:

        <type name>
        </type>

Configuration variables that are part of the section must be placed between these markers:

        <log access>
          # comments can be here
          path /var/log/access.log
        </log>

There is a shortcut syntax for sections which contain no values:

        <type name/>
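
As a rough illustration of how the start, end, and shortcut markers nest, the following sketch builds a tree of plain dictionaries from configuration text. It is an assumption-laden toy, not ZConfig's parser: it applies no schema, and the dictionary layout and the function name parse are invented here.

```python
import re

# <type>, <type name>, or the shortcut form <type name/>
_START = re.compile(r"<([A-Za-z][\w.\-]*)(?:\s+([^/>\s]+))?\s*(/?)>\Z")
# </type>
_END = re.compile(r"</([A-Za-z][\w.\-]*)>\Z")

def _new_section(type_, name):
    return {"type": type_, "name": name, "settings": [], "sections": []}

def parse(text):
    root = _new_section(None, None)
    stack = [root]  # innermost open section is last
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are ignored
        end = _END.match(line)
        if end:
            if len(stack) == 1 or stack[-1]["type"] != end.group(1):
                raise ValueError("unexpected end marker: " + line)
            stack.pop()
            continue
        start = _START.match(line)
        if start:
            section = _new_section(start.group(1), start.group(2))
            stack[-1]["sections"].append(section)
            if not start.group(3):  # no trailing "/": section stays open
                stack.append(section)
            continue
        parts = line.split(None, 1)
        value = parts[1] if len(parts) > 1 else ""
        stack[-1]["settings"].append((parts[0], value))
    if len(stack) != 1:
        raise ValueError("unclosed section: " + stack[-1]["type"])
    return root

SAMPLE = """
<log access>
  # comments can be here
  path /var/log/access.log
</log>
<log error/>
"""
tree = parse(SAMPLE)
```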

Directives

The directives to the ZConfig parser are used to support string substitution and resource inclusion:

What is a ZConfig Schema?

In general, a "schema" describes what structure and values are allowed for a set of data. For ZConfig, a schema specifies the allowed configuration keys and sections for a given configuration, and what type conversions must be applied to each component value. The behavior of the data binding machinery can be controlled by specifying attribute names and data types.

ZConfig schemas are expressed using XML. The schema language is simple, so it should not present a hurdle for developers.

A ZConfig configuration file can be expressed in terms of keys, sections, and section types. Section types may be provided by schema components.

The ZConfig API

Schema and configuration files may be loaded from the local filesystem or from any URL scheme supported by Python's "urllib2" module.

Loading a schema:

        schema = ZConfig.loadSchema(schema_url)

Loading configuration data:

        config, handler = ZConfig.loadConfig(schema, config_url)

Type Conversion Routines

ZConfig uses a type registry to look up datatype conversion routines. A default registry is provided which has conversions for a variety of simple types and can load additional functions if given the Python dotted-name of the conversion function. An alternate registry may be supplied by the application using a lower level of the API.

Type conversion routines may be provided for either simple keys or for sections. For keys, the string value is passed to the conversion function, and the return value is used as the actual value for that key. Default values are converted if they are used.

For sections, an object is passed which makes the values of the contained keys and sections available as attributes; lists of values are supplied for multikeys and multisections.

For multikeys and multisections, each value is converted separately, so the same conversion functions can be used.

The default datatype conversion is no conversion at all (i.e., "string"). For sections, the default is to return the value that would be passed to the conversion routine.
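
The shape of these conversion routines can be sketched without ZConfig itself. In the example below, port_number, server_address, and convert_multikey are hypothetical names, and a SimpleNamespace stands in for the section object that ZConfig would pass to a section converter:

```python
from types import SimpleNamespace

def port_number(value):
    # Key converter: receives the string from the file (or the default)
    # and returns the value the application will actually see.
    port = int(value)
    if not (1 <= port <= 65535):
        raise ValueError("port out of range: %r" % value)
    return port

def server_address(section):
    # Section converter: receives an object whose attributes are the
    # already-converted keys of the section.
    return (section.host, section.port)

def convert_multikey(converter, raw_values):
    # Each value of a multikey is converted separately, so the same
    # converter works for single keys and multikeys.
    return [converter(v) for v in raw_values]

single = port_number("8080")
many = convert_multikey(port_number, ["80", "443"])

# Stand-in for the section object (an assumption for this sketch).
section = SimpleNamespace(host="localhost", port=single)
address = server_address(section)
```

Because a converter signals bad input by raising ValueError, validation and conversion happen in the same place.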

There are a number of simple datatype converters "built-in" to the ZConfig schema machinery. Programmers may extend the datatype conversion routines which are available by creating new datatype conversions and referencing them via a Python dotted-name (e.g. "Zope.Startup.datatypes.null_handler"), or by the "shorthand" enabled by a "prefix" attribute of a containing tag:

    <schema prefix="Zope.Startup.datatypes">
      <key name="foo" datatype=".null_handler"/>
    </schema>
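
The way a leading "." combines with a prefix can be illustrated with a small resolver. This is a sketch of the naming rule only, not the code of ZConfig's type registry; the function name resolve is hypothetical, and os.path.join is used as the target simply because it is always importable:

```python
import importlib
import os.path

def resolve(name, prefix=None):
    # A leading "." means "relative to the prefix of a containing tag".
    if name.startswith("."):
        if not prefix:
            raise ValueError("prefix required for shorthand name: " + name)
        name = prefix + name
    module_name, _, attr = name.rpartition(".")
    if not module_name:
        raise ValueError("not a dotted name: " + name)
    module = importlib.import_module(module_name)
    return getattr(module, attr)

full = resolve("os.path.join")                 # fully dotted name
short = resolve(".join", prefix="os.path")     # shorthand plus prefix
```

Both calls yield the same function object, which is the point of the shorthand: the schema stays readable while the converters live in one module.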

Configuring a Simple Application

Consider an application which has relatively simple configuration requirements, such as a particular usage of the (new-in-Python-2.3) "logging" package. The logging package offers lots of knobs, and users should be able to turn them. In the application, a configuration of a "logger" instance is described, which can be turned on and off, and which can send data to a file or to the Windows event log.

The first step in designing a schema is to determine what an actual configuration file should look like. This is an example of what the file should look like for the simple application:

    logging on

    <logger>
      <file-handler>
        path /home/chrism/var/log.data
      </file-handler>
    </logger>

The top-level "logging" key is a boolean which describes whether logging will be done. The "logger" section which follows describes the "logger" instance. It may have one or more "handler" subsections, each of which describes a particular output channel for log data (limited for purposes of demonstration to a logging.FileHandler or a logging.handlers.NTEventLogHandler instance).
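
For the "logging" key, a boolean converter in the spirit of ZConfig's built-in one might look like the sketch below. The examples in this paper only use "on" and "off"; the other accepted spellings are an assumption of this sketch.

```python
def boolean(value):
    # Compare case-insensitively against the accepted spellings.
    text = str(value).lower()
    if text in ("yes", "true", "on"):
        return True
    if text in ("no", "false", "off"):
        return False
    raise ValueError("not a valid boolean value: " + repr(value))

logging_enabled = boolean("on")
logging_disabled = boolean("Off")
```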

To wire this up to ZConfig using a schema, first create a schema file. The contents of the schema file are as follows:

    <!-- schema outer element, which describes our default datatype
         converter prefix -->

    <schema prefix="ourlogger.datatypes">

      <!-- marker to describe an abstract type -->

      <abstracttype name="loghandler"/>

      <!-- handler sectiontype declarations which reference the
           "loghandler" abstract type  -->

      <sectiontype name="file-handler"
                   datatype=".file_handler"
                   implements="loghandler">
        <key name="path"       required="yes"/>
        <key name="format"     default="------\n%(asctime)s %(message)s"/>
        <key name="dateformat" default="%Y-%m-%dT%H:%M:%S"/>
        <key name="level"      default="info"
                               datatype=".logging_level"/>
      </sectiontype>

      <sectiontype name="nteventlog-handler"
                   datatype=".nteventlog_handler"
                   implements="loghandler">
        <key name="appname"    default="ourapp"/>
        <key name="format"     default="%(message)s"/>
        <key name="dateformat" default="%Y-%m-%dT%H:%M:%S"/>
        <key name="level"      default="info"
                               datatype=".logging_level"/>
      </sectiontype>

      <!-- logger concrete sectiontype declaration -->

      <sectiontype name="logger"
                   datatype=".logger">
         <key name="level"
              datatype=".logging_level"
              default="info"/>
         <multisection name="*"
                       type="loghandler"
                       attribute="handlers"/>
      </sectiontype>

      <!-- our logging key and logger section declaration -->

      <key name="logging"
           datatype="boolean"
           default="on"/>

      <section name="*"
               type="logger"
               attribute="logger"/>

    </schema>

The schema has an outer element which declares a "prefix" attribute. This prefix is used as the default package name in which to look for datatype converters which have a "." as their first character. In this case, the converters are looked up in the "ourlogger.datatypes" module. An "ourlogger" package can be created and placed on the PYTHONPATH. Inside it, create a "datatypes" module, which has the following content:

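      import logging
      import logging.handlers  # needed for NTEventLogHandler below

      def logger(section):
          # configure the root logger from the converted section values
          logger = logging.getLogger('')
          logger.setLevel(section.level)
          logger.handlers = []
          for handler in section.handlers:
              logger.addHandler(handler)
          # the return value becomes the value of the section
          return logger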

      def file_handler(section):
          format = section.format
          dateformat = section.dateformat
          level = section.level
          path = section.path
          formatter = logging.Formatter(format, dateformat)
          handler = logging.FileHandler(path)
          handler.setFormatter(formatter)
          handler.setLevel(level)
          return handler

      def nteventlog_handler(section):
          appname = section.appname
          format = section.format
          dateformat = section.dateformat
          level = section.level
          formatter = logging.Formatter(format, dateformat)
          handler = logging.handlers.NTEventLogHandler(appname)
          handler.setFormatter(formatter)
          handler.setLevel(level)
          return handler

      class LogLevelConversion:
          _levels = {
              "fatal": 50,
              "error": 40,
              "warn": 30,
              "info": 20,
              "debug": 10,
              "all": 0,
              }

          def __call__(self, value):
              s = str(value).lower()
              if s in self._levels:
                  return self._levels[s]
              else:
                  v = int(s)
                  if not (0 <= v <= 50):
                      raise ValueError("log level not in range: " + repr(v))
                  return v

      logging_level = LogLevelConversion()

This module defines all of the "non-built-in" datatype conversion functions specified within the schema file ("logger", "file_handler", "nteventlog_handler", and "logging_level"). There is another datatype mentioned in the schema (boolean), but this is a ZConfig built-in converter, so there is no need to write a datatype conversion for it.
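
A section converter such as file_handler can be exercised without ZConfig by handing it a stand-in object. In the sketch below, SimpleNamespace plays the role of the section object (an assumption of this example), the level has already been converted to a number as logging_level would do, and the log file lives in a temporary directory:

```python
import logging
import os
import shutil
import tempfile
from types import SimpleNamespace

def file_handler(section):
    # Same logic as the converter above, written compactly.
    formatter = logging.Formatter(section.format, section.dateformat)
    handler = logging.FileHandler(section.path)
    handler.setFormatter(formatter)
    handler.setLevel(section.level)
    return handler

tmpdir = tempfile.mkdtemp()
fake_section = SimpleNamespace(
    path=os.path.join(tmpdir, "log.data"),
    format="%(message)s",
    dateformat="%Y-%m-%dT%H:%M:%S",
    level=20,  # what logging_level would return for "info"
)

handler = file_handler(fake_section)
handler_level = handler.level
handler_datefmt = handler.formatter.datefmt

# Release the file and clean up the temporary directory.
handler.close()
shutil.rmtree(tmpdir)
```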

Creating the "application glue" which uses this schema and its associated datatype converters does not require much additional work. The application could pass the name of the configuration file to a function like this one, which loads the schema and the configuration file, and then finishes setting up the "logging" module accordingly:

        import os

        import ZConfig

        myfile = os.path.abspath(__file__)
        SCHEMA = os.path.join(os.path.dirname(myfile), "schema.xml")

        def configure(config_file_name):
            schema = ZConfig.loadSchema(SCHEMA)
            cfg, nil = ZConfig.loadConfig(schema, config_file_name)
            # if logging is switched off, remove any handlers that the
            # datatype conversion attached to the root logger
            if not cfg.logging and cfg.logger:
                from logging import getLogger
                logger = getLogger('')
                logger.handlers = []

That's it. The logging environment is configured, given that a proper config_file_name is passed in and the schema is stored as schema.xml alongside the module. Since the datatype handlers take care of configuring the logger instance with the proper handlers, one only needs to be sure that the "logging" flag is respected and to delete any handlers from the logger if they've been added as the result of datatype conversion.

As a result of this work, the application's users can specify a config file that looks like this:

        logging off

and logging will be off.

The users can specify a config file that looks like this:

        logging on

        <logger>
          <file-handler>
            path /home/chrism/var/log.data
          </file-handler>
        </logger>

and logging will be on, with a single logfile written to the path /home/chrism/var/log.data.

The users can specify a config file that looks like this:

        logging on

        <logger>
          <file-handler>
            path /home/chrism/var/log.data
            level info
          </file-handler>
          <nteventlog-handler>
            appname MyApp
            level warn
          </nteventlog-handler>
        </logger>

and logging will be on, with messages of level "info" and above written to the logfile at /home/chrism/var/log.data and messages of level "warn" and above sent to the local NT event log.

The users can specify a config file that looks like this:

        logging on

        <logger>
          <file-handler>
            path /home/chrism/var/log.data
            level info
            format %(message)s
          </file-handler>
        </logger>

and logging will be on, with messages of level "info" and above written to the logfile at /home/chrism/var/log.data using a log format that doesn't include the date or time or any intermediate characters between log records.

Case Study: Using ZConfig to Configure Zope

The Zope application server is a large application, mostly written in Python. Since Zope is more of a framework than an application, it has many "knobs". Most of these knobs can be turned from within the Zope Management Interface, a web-based UI that provides Zope users and developers a mechanism to interact with the objects that comprise their applications.

Some knobs cannot be turned via this UI, particularly those having to do with server configuration and behavior and other system-global settings. Historically, these configuration parameters were tunable via the use of environment variables. Zope 2.6 has 41 individual environment variables that are used to specify runtime configuration parameters.

For Zope 2.7, we have allowed these configuration parameters to be specified within a configuration file using ZConfig. Zope makes heavy use of ZConfig schemas to perform its configuration duties. Additionally, some of ZConfig's design is influenced by the makeup of Zope; however, nothing in ZConfig depends on Zope. ZConfig can be used anywhere that you run Python 2.2.

Simple ZConfig Uses In The Zope Configuration Schema

In Zope, there is a master schema within the Zope.Startup package named zopeschema.xml.

Within this schema file, Zope makes use of "simple" ZConfig keys as global configuration parameters. For example, the Zope schema allows the specification of an instancehome parameter, which is the place on the filesystem containing the directory structures that make up a single Zope "instance", and a clienthome parameter, which is the place on the filesystem in which variable data files are stored.

The relevant schema portions that define these simple keys are:

      <key name="instancehome" datatype="existing-directory"
            required="yes">
        <description>
          The top-level directory which contains the "instance" of the
          application server.
        </description>
      </key>

      <key name="clienthome" datatype="existing-directory">
        <description>
          The directory used to store the file-storage used to back the
          ZODB database by default, as well as other files used by the
          Zope application server to control the run-time behavior.
        </description>
        <metadefault>$INSTANCE_HOME/var</metadefault>
      </key>

The datatype existing-directory is a standard ZConfig datatype. It is backed by this piece of code, defined within the ZConfig.datatypes module:

      def existing_directory(v):
          if os.path.isdir(v):
              return v
          raise ValueError('%s is not an existing directory' % v)

As you can see, if the user specifies a directory which doesn't exist at the time of configuration parsing for either the clienthome parameter or the instancehome parameter, the datatype handler will raise a ValueError, preventing configuration from completing.
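
The behavior is easy to check in isolation. The following sketch repeats the converter (using the call form of raise, which works in both old and new Pythons) and runs it against a temporary directory and a path that does not exist:

```python
import os
import tempfile

def existing_directory(v):
    if os.path.isdir(v):
        return v
    raise ValueError('%s is not an existing directory' % v)

good = tempfile.mkdtemp()
missing = os.path.join(good, "no-such-subdirectory")

# An existing directory is returned unchanged.
accepted = existing_directory(good)

# A missing directory is rejected with ValueError.
try:
    existing_directory(missing)
    rejected = False
except ValueError:
    rejected = True

os.rmdir(good)  # clean up the temporary directory
```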

Zope may run under one of two "security policy" implementations: "Python" or "C" (one is implemented in Python, the other in C). The key that this feature relies on is a custom-defined one:

       <key name="security-policy-implementation"
            datatype=".security_policy_implementation"
            default="C"/>

Note, however, that unlike the last example, this key uses a "custom" datatype (as indicated by the dot before the datatype name). In this case, the datatype handler is defined within the Zope.Startup.datatypes module (named by the prefix of the schema definition itself), and it looks like this:

        def security_policy_implementation(value):
            value = value.upper()
            ok = ('PYTHON', 'C')
            if value not in ok:
                raise ValueError(
                    "security_policy_implementation must be one of %s" % (ok,)
                    )
            return value

We can see that ZConfig is flexible enough to let us define our own simple datatypes for use within our schemas.
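
A short usage sketch shows both outcomes of this converter: the value is normalized to upper case, and anything else is rejected. The tuple of allowed values is wrapped as (ok,) here so that the "%s" formatting receives it as a single argument:

```python
def security_policy_implementation(value):
    value = value.upper()
    ok = ('PYTHON', 'C')
    if value not in ok:
        raise ValueError(
            "security_policy_implementation must be one of %s" % (ok,))
    return value

# Lower-case input from the configuration file is normalized.
normalized = security_policy_implementation("python")

# An unknown implementation name produces a readable error message.
try:
    security_policy_implementation("java")
    message = None
except ValueError as exc:
    message = str(exc)
```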

The Use of Schema Components in Zope

Zope consists of about 50 or so Python packages, some of which are usable outside the Zope framework. It was decided that the configuration parameters for Zope should reflect the "sum of its parts". This meant that the definition of Zope's possible configuration parameters could not be monolithic, but needed to be distributed across its various packages. Though it was decentralized enough, the "41 environment variable" approach was becoming unworkable because it was very ad-hoc and there was a lack of understanding on the part of sysadmins (who are very used to interpreting configuration files, but not very used to spelunking scattered docs about environment variables) about how to tell Zope to configure itself in some particular way. This was because the documentation was as decentralized and ad-hoc as the code itself. Thus, we decided to make ZConfig schemas extensible enough to allow the inclusion of schema components on a per-package basis from a master schema.

This zopeschema.xml schema imports the type definitions defined within packages included within Zope itself. For example, the master schema XML file includes the type definitions exported by the zLOG package (the legacy Zope logging package), the ZODB package (the Zope Object Database package), and the ZServer package (the package that allows Zope to run network servers):

        <import package="zLOG"/>
        <import package="ZODB"/>
        <import package="ZServer"/>

These three schema statements instruct the Zope master schema to find files named component.xml within the zLOG, ZODB, and ZServer packages respectively and load them into the current schema namespace.
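
The lookup rule itself (a file named component.xml inside the package directory) amounts to simple path arithmetic. The sketch below is not ZConfig's loader; the function name component_path is hypothetical, and the standard library's json package is used only because it is always importable:

```python
import importlib
import os

def component_path(package_name, filename="component.xml"):
    # Import the package and look next to its __init__ module.
    package = importlib.import_module(package_name)
    package_dir = os.path.dirname(package.__file__)
    return os.path.join(package_dir, filename)

# For <import package="ZODB"/> the argument would be "ZODB"; "json" is a
# stand-in so the sketch runs anywhere.
path = component_path("json")
```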

Each schema component defines abstract types and section types for the component which it represents. For example, the ZODB schema component defines abstract types for storage and database sections. Additionally, it defines concrete sectiontypes for different kinds of storages and databases. Here is an elided excerpt from the ZODB schema component.xml which defines an abstract storage type and a concrete sectiontype which can be used to define an actual storage section:

        <component prefix="ZODB.config">
           <abstracttype name="storage"/>
           ... other abstract types defined here ...
           <sectiontype name="filestorage" datatype=".FileStorage"
                        implements="storage">
              <key name="path" required="yes">
                <description>
                  Path name to the main storage file.  The names for
                  supplemental files, including index and lock files, will be
                  computed from this.
                </description>
              </key>
              <key name="create" datatype="boolean" default="false">
                <description>
                  Flag that indicates whether the storage should be truncated
                  if it already exists.
                </description>
              </key>
              <key name="read-only" datatype="boolean" default="false">
                <description>
                  If true, only reads may be executed against the storage. Note
                  that the "pack" operation is not considered a write operation
                  and is still allowed on a read-only filestorage.
                </description>
              </key>
              <key name="quota" datatype="byte-size">
                <description>
                  Maximum allowed size of the storage file.  Operations which
                  would cause the size of the storage to exceed the quota will
                  result in a ZODB.FileStorage.FileStorageQuotaError being
                  raised.
                </description>
              </key>
           </sectiontype>
            .... other concrete types which implement the defined
            abstract types defined here ...
        </component>

The zLOG and ZServer packages provide similar component.xml definitions for the abstract types and section types that comprise their settings.

How Zope Configures Itself

The set of steps that Zope takes to configure itself based on values from a ZConfig configuration file and command-line parameters is straightforward.

Summary

ZConfig is a powerful and general tool that may be used to supply type-checked runtime configuration parameters to most Python applications.

Approximate time: 30 minutes.