Here are some notes about how NEXTSTEP has approached persistence in
Objective-C.
Their first go at it was called typed streams. I can send out PostScript
documentation of this to anyone who asks. To dump some objects you
write a root object out to the typed stream. That object implements
-read and -write methods which put data into and get data out of the
stream.
When an object writes its instance variables, it can write another
object directly (an intrinsic object) or it can write an object
reference (an extrinsic object). Writing an object reference means: if
anybody else writes that object as an intrinsic object, then I want a
reference to it, but don't put it in the typed stream just for me. For
instance, the View object has a superview instance variable (attribute),
which it writes out as an object reference. So if you use a View as a
root object when writing to a typed stream, it will not archive its
superview (and then its superview, and so on, until you have the whole
window archived...) You only get that View and its subviews. The
subviews all keep their references to their superviews because those
have been archived already.
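To make the intrinsic/extrinsic distinction concrete, here is a little Python sketch. All the class and method names here are made up for illustration; the real typed stream API is a set of C functions and looks nothing like this. The point is just that an extrinsic reference only survives if somebody else archives the target intrinsically:

```python
class Ref:
    """Placeholder for an extrinsic (by-reference) write."""
    def __init__(self, obj):
        self.obj = obj

class Archiver:
    def __init__(self):
        self.records = {}   # archive id -> encoded instance variables
        self.ids = {}       # id(obj) -> archive id

    def write_object(self, obj):
        """Intrinsic write: put the object itself into the stream."""
        if obj is None:
            return None
        if id(obj) in self.ids:
            return self.ids[id(obj)]
        aid = len(self.ids)
        self.ids[id(obj)] = aid          # register before recursing (cycles)
        self.records[aid] = obj.write(self)
        return aid

    def write_object_reference(self, obj):
        """Extrinsic write: 'if anybody else archives this object, give
        me a reference to it, but don't archive it just for me'."""
        return None if obj is None else Ref(obj)

    def write_root(self, obj):
        root = self.write_object(obj)
        # Resolve placeholders: a reference survives only if its target
        # was archived intrinsically somewhere; otherwise it becomes nil.
        def fix(v):
            if isinstance(v, Ref):
                return self.ids.get(id(v.obj))
            if isinstance(v, list):
                return [fix(x) for x in v]
            if isinstance(v, dict):
                return {k: fix(x) for k, x in v.items()}
            return v
        self.records = {k: fix(v) for k, v in self.records.items()}
        return root

class View:
    def __init__(self):
        self.superview = None
        self.subviews = []
    def add_subview(self, child):
        child.superview = self
        self.subviews.append(child)
    def write(self, archiver):
        return {
            # superview by reference, subviews intrinsically
            "superview": archiver.write_object_reference(self.superview),
            "subviews": [archiver.write_object(v) for v in self.subviews],
        }

window = View()
panel = View()
button = View()
window.add_subview(panel)
panel.add_subview(button)

a = Archiver()
root = a.write_root(panel)   # archive the panel, not the whole window
# The panel's superview (the window) was only written by reference, so
# it drops out; the button's superview (the panel) resolves.
print(a.records[root]["superview"])                # None
print(a.records[a.ids[id(button)]]["superview"])   # panel's archive id
```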
An object can optionally write a version number to the typed stream.
When reading it back, that number can be used to detect an old version;
the object can then read the data the old way and convert it to the new
representation.
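Here's roughly what that conversion looks like, sketched in Python with a made-up record format (version 1 supposedly stored ints, version 2 stores floats; both details are invented for illustration):

```python
CURRENT_VERSION = 2

def read_point(record):
    """Unarchive a point, converting old archives on the fly."""
    version = record.get("version", 1)
    if version == 1:
        # old representation: integer coordinates; convert to the
        # new float representation as we read
        return (float(record["x"]), float(record["y"]))
    return (record["x"], record["y"])

print(read_point({"version": 1, "x": 3, "y": 4}))      # (3.0, 4.0)
print(read_point({"version": 2, "x": 3.5, "y": 4.5}))  # (3.5, 4.5)
```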
Objective-C has a single inheritance model. Every subclass would call
[super write:typedStream] or [super read:typedStream] when archiving, so
that attributes owned by superclasses were preserved also. I don't know
what you would want to do with a multiple inheritance object. It seems
like you should give them all a chance to write what they need to.
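The chaining looks something like this Python sketch (class names and attributes are invented; Python's super() stands in for Objective-C's super):

```python
class Responder:
    def __init__(self):
        self.next_responder = "app"
    def write(self, record):
        # the superclass archives the attributes it owns
        record["next_responder"] = self.next_responder
        return record
    def read(self, record):
        self.next_responder = record["next_responder"]

class View(Responder):
    def __init__(self):
        super().__init__()
        self.frame = (0, 0, 100, 20)
    def write(self, record):
        super().write(record)   # give the superclass its chance first
        record["frame"] = self.frame
        return record
    def read(self, record):
        super().read(record)
        self.frame = record["frame"]

v = View()
v.frame = (10, 10, 50, 50)
archived = v.write({})

restored = View()
restored.read(archived)
print(restored.frame, restored.next_responder)  # (10, 10, 50, 50) app
```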
Typed streams have a few problems. The biggest flaw is that the data in
them is strictly tied to the inheritance hierarchy. Try to insert a new
superclass in there somewhere and things break. It also makes it
impossible to write out foo objects and then try to read them in as bar
objects. Another gripe was that it was an unpublished binary format,
which made it impossible to manipulate typed stream files, and if you
didn't have all of the classes it needed you were sunk.
Their second attempt was DBKit, which has evolved into the Enterprise
Objects Framework. EOF is specifically designed around their own take on
Entity-Relationship modelling of data. It is designed to be used with
relational databases like Sybase or Oracle. It has many levels of
abstraction, which makes it quite dense but very flexible. I'm not very
familiar with it, so if there seem to be inconsistencies in this
description then it's probably me, not NeXT.
There is an "Access Layer" which specifies a protocol that the kit works
with so that you can write the glue to any database you want or even
switch databases without having to change any of the top layers. The
Access Layer throws around the tables as dictionaries. There is a layer
above that which specifies how to make objects out of these dictionaries.
To create attributes of an object you can reference tables and do joins
and calculations and whatnot. Above that you've got a really slick user
interface layer which binds an object's attributes directly to screen
objects, so the programmer doesn't have to write the glue to have a text
field change an object's attribute.
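The idea of making objects out of those access-layer dictionaries might look like this in Python (the table, column, and class names are all made up, and total_comp is a toy "calculation" attribute; this is just the shape of the idea, not EOF's API):

```python
class Employee:
    def __init__(self, name, total_comp):
        self.name = name
        self.total_comp = total_comp

def employee_from_row(row):
    # 'row' stands in for the dictionary the access layer hands up;
    # total_comp is a derived attribute computed from two columns.
    return Employee(row["NAME"], row["SALARY"] + row["BONUS"])

e = employee_from_row({"NAME": "Kim", "SALARY": 50000, "BONUS": 5000})
print(e.name, e.total_comp)   # Kim 55000
```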
The system lets you do transactions and validation and all that stuff. I
like the concept but I don't know how well it works in practice.
Another thing to consider is tgen. I'm pretty sure that's the name. It's
a Smalltalk tool which lets you build a grammar to translate a file into a
bunch of objects and vice versa. I haven't used it but I like the idea
that the data is independent of any particular object class. The best
part about it is that it's interpreted, because the grammar is really
just a bunch of objects.
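Here's the idea as I understand it, sketched in Python (tgen itself is a Smalltalk tool and surely looks nothing like this): the grammar is built out of objects and interpreted at run time to translate a file into objects and back:

```python
class Field:
    """One grammar element: a named field with conversions both ways."""
    def __init__(self, name, convert, render=str):
        self.name = name
        self.convert = convert
        self.render = render

class RecordGrammar:
    """A 'grammar' that is just a list of Field objects, interpreted at
    run time, so the file format stays independent of any object class."""
    def __init__(self, fields, sep=","):
        self.fields = fields
        self.sep = sep
    def parse(self, line):
        parts = line.split(self.sep)
        return {f.name: f.convert(p) for f, p in zip(self.fields, parts)}
    def unparse(self, record):
        return self.sep.join(f.render(record[f.name]) for f in self.fields)

grammar = RecordGrammar([Field("name", str), Field("age", int)])
rec = grammar.parse("Ada,36")
print(rec)                   # {'name': 'Ada', 'age': 36}
print(grammar.unparse(rec))  # Ada,36
```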