Translucent Inter-Process Service Migration

Jean-Paul Calderone
exarkun@divmod.com

Abstract

What Is Migration?

Service migration is a term coined for the process involving a hand-off of application-level duties from one process to another. Historically this has typically been achieved by stopping one process and then starting another. In the case of certain platforms, wider reaching restart procedures (such as system reboots) have been required, but ultimately the effect on the user has been the same: application state must be saved or lost, a period of absence of service must be tolerated, and finally application state must be reloaded or recreated.

Examples of service migration are widespread and exist anywhere software is in use. From a desktop user loading a new version of a word processing tool, to a system administrator upgrading a company's web server, to a phone company rolling over the software that drives millions of telephone calls every day; the fundamental requirement of using new software to process old data is the same in each case even though the risk, difficulty, and consequences differ greatly.

Why Translucent Migration?

Circumstances exist where it is desirable to change the behavior of a running application without interrupting the service of those clients who are using it at the time. In high load environments, it is often the case that there will always be a significant number of users relying on the service at all times. In these environments, the simplistic approach of waiting for the service to become quiescent, then shutting down and restarting the server software is not feasible.

Translucent migration removes the most user-visible portion of the migration process: absence of service. Instead of starting a new process after the old process has been shut down, the order is reversed causing the new and old processes to exist at the same time. During this overlap, required state can be moved between the two processes while one of the processes continues to serve user requests. Expensive operations, such as opening a database, establishing connections to remote hosts, or building in-memory caches can all be performed by the new process while the old is still responding to user requests, minimizing the time during which the application is non-responsive.

Application Requirements

There are several requirements made of software which is to participate in this migration scheme. The requirements are minimally invasive to the application itself, but necessary for services to be migrated to a new process. It is for these reasons that the process of migration is referred to as translucent.

Migration Server

The first requirement is that a migration server be made to listen on a socket providing virtual circuit capabilities. In the demonstration, a stream oriented UNIX socket (AF_UNIX, SOCK_STREAM) is used, but a TCP/IP socket (AF_INET, SOCK_STREAM) would work equally well. When migration is to occur, the process assuming responsibility for services (henceforth referred to as the service recipient) will connect to this server to initiate the migration procedure. The server must be made aware of all groupings of objects which represent services (henceforth referred to as service object graphs) to be made available for migration. Each of these service object graphs will be transferred to the new process separately from the others, though typically they will all be transferred in rapid succession.

Serializability

The second requirement is that every service object graph must be serializable [1]. This is necessary so that they can be migrated to another process. Most objects will be serialized as a simple byte stream and sent over a stream-oriented socket, but some may be serialized in more exotic manners. Serializers for instances of most built-in types, including ints, strings, lists, and dicts, are provided by framework code. Additionally, instances of most user-defined classes can also be serialized by using the fully qualified name of their class and the contents of their instance __dict__. Since the focus of this migration technique is internet servers, a serializer for file-like objects which have been wrapped in a twisted.internet.abstract.FileDescriptor object is provided as well. This allows sockets to be migrated without special consideration from application-level code.

Migration Client

The final requirement is placed upon the service recipient rather than the service originator. It must be capable of issuing requests to a service originator to take responsibility for a service. As previously mentioned, responsibility for a service is assumed by connecting to a server running in the process which is to give up the service in question. At this point, a sequence of events which ultimately result in the state of one or more services being transferred to a new process occurs.

Authentication

First, if desired, the client can be required to authenticate with the server, proving its identity. While UNIX sockets already provide some level of security with respect to who may connect to the migration server, at some times it may be desirable to pass services to a process running as a different user, a process which would be otherwise unable to access a socket restricted by filesystem permissions, or a process for which filesystem permissions alone are too coarsely grained to prevent access.

Secondary Transports

Second, secondary transports are established, providing a path for objects which cannot be serialized solely as a byte stream to be transported over. There is one example of this in the current implementation: a second UNIX socket is opened and used to pass file descriptors (via sendmsg(2) and recvmsg(2)) to the process assuming responsibility. Passing such things as POSIX capabilities (via the same mechanism) is another use-case for UNIX sockets as secondary transports.

Service Object Graph Migration

Next, the actual migration occurs. The client selects a service from among those the server publishes as available for migration. The server serializes the service's objects and seconds them over whatever transports are appropriate. While the service is in transport, it is important that parts of it not be allowed to alter state in such a way as to invalidate the object graph received by the client. In a network server, this typically means that the server must cease socket related activities, such as receiving new data from the kernel and attempting to send buffered output data, as well as not invoking any methods on objects belonging to the service. It also marks the service as in-transit so that it cannot be requests by a second migration client while it is still being sent to the first. Simultaneously, the client is deserializing bytes it receives and gradually building up an object graph representing the service it has requested. Deserialization is straightforward, with the added requirement that information from more than one transport may need to be combined to produce the correct objects. If there is an error while migrating the service, the server can mark the service as no longer in-transit and resume normal processing on it, resulting in no user-visible consequences of the failure. If the migration is successful, then the client has assumed full responsibility for the service and the server simply discards all objects related to that service.

Shutdown

Once all services have been migrated out of a process, the process may terminate normally.

Serializers

Abstract File Descriptors

As previously mentioned, twisted.spread.jelly is used to communicate objects between processes. This implementation uses interfaces, adapters, and components to maintain a separation of concerns between application logic and serialization logic. Annotated versions of the serializer and deserializer for twisted.internet.abstract.FileDescriptor are given below:

from twisted.spread import interfaces as ispread
from twisted.python import components

# Serialization

# When the jellier invokes ispread.IJellyable(obj), and obj is an instance
# of FileDescriptor, an instance of this class will be returned.
class FileDescriptorJellier(components.Adapter):
    __implements__ = (ispread.IJellyable,)

    # This is the main external entry point into the file descriptor
    # serialization code.  The jellier will call .jellyFor(self) on the
    # object returned by ispread.IJellyable(obj).  The object returned is an
    # s-expression representing the state of this adapted FileDescriptor
    # object.
    def jellyFor(self, jellier):
        # This is where we indicate to Twisted's reactor that this socket
        # should no longer be polled for input nor have its output buffer
        # flushed.  This ensures that state remains consistent between the
        # times when serialization begins and ends.
        self.original.stopReading()
        self.original.stopWriting()

        # The next three lines set up failure handlers, so that if there
        # is an error transferring the service this socket is part of to
        # the client, normal processing can be resumed on the server.
        a = jellier.invoker.serializingPerspective
        j = IJanitor(a)
        j.track(a.sendingServerID,
                lambda: self.original.socket.close(),
                lambda: (self.original.startReading(), self.original.startWriting()))

        # Here is where the object state is actually turned into an
        # s-expression.
        sxp = jellier.prepare(self.original)
        sxp.extend([
            reflect.qual(self.original.__class__),
            jellier.jelly(self.getStateFor(jellier))])
        return jellier.preserve(self.original, sxp)

    # This function computes the actual instance state of the
    # FileDescriptor.
    def getStateFor(self, jellier):
        state = self.original.__dict__.copy()
        ISocketStorage(jellier).put(state.pop('socket'))

        # dChannel is the secondary transport used for transmitting file
        # descriptors.  The file descriptor is sent now, then the related
        # items of state are removed from the state dictionary.  They will
        # be re-associated with the FileDescriptor which is constructed in
        # the client process.
        jellier.invoker.serializingPerspective.dChannel.transport.sendFileDescriptors([state['fileno']()])
        del state['fileno']
        return state

# Deserialization

# Unjelliers are simpler.  They are nothing more than callables which are
# passed an unjellier object and an s-expression and which return an object.

# FileDescriptorUnjellier can work on both server ports and connected
# sockets.  The mode argument specifies which.  For server ports, the mode
# is READ.  For other sockets, it is READ | WRITE.
READ = 1
WRITE = 2

class FileDescriptorUnjellier:
    def __init__(self, mode):
        self.mode = mode

    def __call__(self, unjellier, jellyList):
        # Unjellier.invoker.fdproto is the client's handle on the secondary
        # transport used to transfer file descriptors.
        fdproto = unjellier.invoker.fdproto

        # This is a common deserialization technique.
        klass = reflect.namedAny(jellyList[0])
        inst = _DummyClass()
        inst.__class__ = klass
        state = unjellier.unjelly(jellyList[1])
        inst.__dict__ = state

        # Here we determine what kind of socket has been transferred.
        addressFamily = getattr(klass, 'addressFamily', socket.AF_INET)
        socketType = getattr(klass, 'socketType', socket.SOCK_STREAM)

        # Finally, we create a socket from the file descriptor and associate
        # it with the FileDescriptor that has been instantiated.

        # fdproto.fds.pop() gives the appropriate file descriptor because
        # unjellying happens in the reverse order as jellying.  For
        # serialization schemes where this does not hold, it is a simple
        # matter to include an identifier which allows the right file
        # descriptor to be found.
        skt = socket.fromfd(fdproto.fds.pop(), addressFamily, socketType)
        socketInMyPocket(skt, inst, 'socket', self.mode)
        return inst

def socketInMyPocket(skt, instance, attribute, mode):
    setattr(instance, attribute, skt)
    instance.fileno = skt.fileno
    if mode & READ:
        instance.startReading()
    if mode & WRITE:
        instance.startWriting()

Real File Descriptors

Migrating abstract file descriptors is only of use if the real file descriptors they represent can be migrated as well. A reactor-based API based on sendmsg(2) and recvmsg(2) is used by this migration implementation, and part of its implementation is presented below:


from twisted.internet import unix

from sendmsg import sendmsg, recvmsg
from sendmsg import SCM_RIGHTS

class Server(unix.Server):
    def sendFileDescriptors(self, fileno, data="Filler"):
        """
        @param fileno: An iterable of the file descriptors to pass.
        """
        payload = struct.pack("%di" % len(fileno), *fileno)
        return sendmsg(self.fileno(), data, 0, (socket.SOL_SOCKET, SCM_RIGHTS, payload))

class Client(unix.Client):
    def doRead(self):
        while self.connected:
            try:
                msg, flags, ancillary = recvmsg(self.fileno())
            except socket.error, e:
                if e[0] == errno.EAGAIN:
                    break
                else:
                    log.err()
            except:
                log.err()
            else:
                if ancillary:
                    buf = ancillary[0][2]
                    fds = struct.unpack('%di' % (len(buf) / 4), buf)
                    try:
                        self.protocol.fileDescriptorsReceived(fds)
                    except:
                        log.err()
                else:
                    break
        return unix.Client.doRead(self)

FileDescriptorJellier and FileDescriptorUnjellier rely on these classes to ensure the correct real file descriptor is associated with each abstract file descriptor which is migrated to a new process.

Results

Interactive Python Shell / Telnet Server

To demonstrate this technique in a complex and stateful environment, a migration server was added to a telnet server connected to an interactive Python shell. The telnet portion of the server uses classes which have been part of Twisted for nearly two years as of the writing of this paper, and only makes a few lines of modifications to allow migration to be possible.

The first portion of a transcript of a brief telnet session with this server is as follows:

exarkun@boson:~$ telnet localhost 8000
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

telnet.ShellFactory
Twisted 1.2.1alpha1
username: admin
password: *****
>>> import sys, os
>>> sys.version
'2.2.3+ (#1, Feb 25 2004, 23:29:31) \n[GCC 3.3.3 (Debian)]'
>>> os.getpid()
880

At this point, a migration client is started and instructed to retrieve both the telnet server and all connected clients from the service originator. The remainder of the transcript results from the migration client running the telnet server:

>>> os.getpid()
884
>>> try:
...     os.kill(880, 0)
... except OSError, e:
...     print e
... else:
...     print 'No exception occurred'
...
[Errno 3] No such process
>>> sys.version
'2.3.3 (#2, Jan 13 2004, 00:47:05) \n[GCC 3.3.3 20040110 (prerelease) (Debian)]'
>>> ^]
telnet> c
Connection closed.

The PID change indicates that a new process is handling the session and the No such process error from os.kill indicates that the originating process no longer exists. Additionally, the presence of the os and sys modules in the interpreter namespace indicate that application state as well as socket state has successfully migrated to the service recipient. Finally, the new value of sys.version indicates that an entirely different version of the Python interpreter is now handling this telnet session.

Flexibility and Security Concerns

It is significant to note that services can be transferred not only between two versions of the same application, but also between otherwise unrelated applications, so long as all the necessary deserializers are present in the receiving application. In the above demonstration, all of the application logic was actually carried along with the service object graph. The service recipient used was extremely generic and had no code relating to telnet or interactive Python shells until it received the service and deserialized the ShellFactory it contained. This should further underscore the need for authentication before migrating services: this process relies on the client deserializing an arbitrary object graph, one which a malicious server could manipulate to gain control over the client.

Conclusions

Upgrading software in a production environment can be a daunting prospect; even in the best case, when software can be upgraded and restarted without incident, customers are always unhappy with service outages. Translucent service migration can provide a way to test software updates on a live system before enacting them as well as a way to minimize user-visible system downtime.

Translucent migration can also provide a means of a very rapid test/edit/test cycle. While the author strongly believes that unit tests are a sounder overall approach to writing robust, correct software, he recognizes that one of Python's many strengths is the ability of a programmer to get rapid feedback to code changes; the test/edit/test cycle is sometimes very useful in this area. Similar to the reload built-in, migration can provide a way to move changes into a running application very easily. Unlike reload, though, migration provides a very robust error handling capability. Changes which would cause an entirely application to abort abnormally if loaded with reload have absolutely no effect on a migration, relieving the programmer of the need to recover from a mistake by restarting an application.

Footnotes

  1. The initial implementation of this migration technique uses twisted.spread.jelly as the serialization library, however it is not restricted to this library alone; the pickle module provides all the necessary hooks for implementing the same functionality.<-