Perhaps one of the most important structures of the Python object system is the structure that defines a new type: the PyTypeObject structure. Type objects can be handled using any of the PyObject_*() or PyType_*() functions, but do not offer much that's interesting to most Python applications. These objects are fundamental to how objects behave, so they are very important to the interpreter itself and to any extension module that implements new types.
Type objects are fairly large compared to most of the standard types. The reason for the size is that each type object stores a large number of values, mostly C function pointers, each of which implements a small part of the type's functionality. The fields of the type object are examined in detail in this section. The fields will be described in the order in which they occur in the structure.
Typedefs: unaryfunc, binaryfunc, ternaryfunc, inquiry, coercion, intargfunc, intintargfunc, intobjargproc, intintobjargproc, objobjargproc, destructor, freefunc, printfunc, getattrfunc, getattrofunc, setattrfunc, setattrofunc, cmpfunc, reprfunc, hashfunc
The structure definition for PyTypeObject can be found in Include/object.h. For convenience of reference, this repeats the definition found there:
typedef struct _typeobject {
    PyObject_VAR_HEAD
    char *tp_name; /* For printing, in format "<module>.<name>" */
    int tp_basicsize, tp_itemsize; /* For allocation */

    /* Methods to implement standard operations */

    destructor tp_dealloc;
    printfunc tp_print;
    getattrfunc tp_getattr;
    setattrfunc tp_setattr;
    cmpfunc tp_compare;
    reprfunc tp_repr;

    /* Method suites for standard classes */

    PyNumberMethods *tp_as_number;
    PySequenceMethods *tp_as_sequence;
    PyMappingMethods *tp_as_mapping;

    /* More standard operations (here for binary compatibility) */

    hashfunc tp_hash;
    ternaryfunc tp_call;
    reprfunc tp_str;
    getattrofunc tp_getattro;
    setattrofunc tp_setattro;

    /* Functions to access object as input/output buffer */
    PyBufferProcs *tp_as_buffer;

    /* Flags to define presence of optional/expanded features */
    long tp_flags;

    char *tp_doc; /* Documentation string */

    /* Assigned meaning in release 2.0 */
    /* call function for all accessible objects */
    traverseproc tp_traverse;

    /* delete references to contained objects */
    inquiry tp_clear;

    /* Assigned meaning in release 2.1 */
    /* rich comparisons */
    richcmpfunc tp_richcompare;

    /* weak reference enabler */
    long tp_weaklistoffset;

    /* Added in release 2.2 */
    /* Iterators */
    getiterfunc tp_iter;
    iternextfunc tp_iternext;

    /* Attribute descriptor and subclassing stuff */
    struct PyMethodDef *tp_methods;
    struct PyMemberDef *tp_members;
    struct PyGetSetDef *tp_getset;
    struct _typeobject *tp_base;
    PyObject *tp_dict;
    descrgetfunc tp_descr_get;
    descrsetfunc tp_descr_set;
    long tp_dictoffset;
    initproc tp_init;
    allocfunc tp_alloc;
    newfunc tp_new;
    freefunc tp_free; /* Low-level free-memory routine */
    inquiry tp_is_gc; /* For PyObject_IS_GC */
    PyObject *tp_bases;
    PyObject *tp_mro; /* method resolution order */
    PyObject *tp_cache;
    PyObject *tp_subclasses;
    PyObject *tp_weaklist;

} PyTypeObject;
The type object structure extends the PyVarObject structure. The ob_size field is used for dynamic types (created by type_new(), usually called from a class statement). Note that PyType_Type (the metatype) initializes tp_itemsize, which means that its instances (i.e. type objects) must have the ob_size field.
The _ob_next and _ob_prev fields are only present when the macro Py_TRACE_REFS is defined. Their initialization to NULL is taken care of by the PyObject_HEAD_INIT macro. For statically allocated objects, these fields always remain NULL. For dynamically allocated objects, these two fields are used to link the object into a doubly-linked list of all live objects on the heap. This could be used for various debugging purposes; currently the only use is to print the objects that are still alive at the end of a run when the environment variable PYTHONDUMPREFS is set.
These fields are not inherited by subtypes.
The ob_refcnt field is the type object's reference count, initialized to 1 by the PyObject_HEAD_INIT macro. Note that for statically allocated type objects, the type's instances (objects whose ob_type points back to the type) do not count as references. But for dynamically allocated type objects, the instances do count as references.
This field is not inherited by subtypes.
The ob_type field is the type's type, in other words its metatype. It is initialized by the argument to the PyObject_HEAD_INIT macro, and its value should normally be &PyType_Type. However, for dynamically loadable extension modules that must be usable on Windows (at least), the compiler complains that this is not a valid initializer. Therefore, the convention is to pass NULL to the PyObject_HEAD_INIT macro and to initialize this field explicitly at the start of the module's initialization function, before doing anything else. This is typically done like this:
Foo_Type.ob_type = &PyType_Type;
This should be done before any instances of the type are created. PyType_Ready() checks if ob_type is NULL, and if so, initializes it: in Python 2.2, it is set to &PyType_Type; in Python 2.2.1 and later it is initialized to the ob_type field of the base class. PyType_Ready() will not change this field if it is non-NULL.
In Python 2.2, this field is not inherited by subtypes. In 2.2.1, and in 2.3 and beyond, it is inherited by subtypes.
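The ob_size field is not inherited by subtypes.
The tp_name field points to a NUL-terminated string containing the name of the type. For types that are accessible as module globals, the string should be the full module name, followed by a dot, followed by the type name. For example, a type named T defined in module M in subpackage Q in package P should have the tp_name initializer "P.Q.M.T".
For dynamically allocated type objects, this should just be the type name, and the module name explicitly stored in the type dict as the value for key '__module__'.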
For statically allocated type objects, the tp_name field should contain a dot. Everything before the last dot is made accessible as the __module__ attribute, and everything after the last dot is made accessible as the __name__ attribute.
If no dot is present, the entire tp_name field is made accessible as the __name__ attribute, and the __module__ attribute is undefined (unless explicitly set in the dictionary, as explained above). This means your type will be impossible to pickle.
This field is not inherited by subtypes.
The tp_basicsize and tp_itemsize fields allow calculating the size in bytes of instances of the type. There are two kinds of types: types with fixed-length instances have a zero tp_itemsize field, while types with variable-length instances have a non-zero tp_itemsize field. For a type with fixed-length instances, all instances have the same size, given in tp_basicsize.
For a type with variable-length instances, the instances must have an ob_size field, and the instance size is tp_basicsize plus N times tp_itemsize, where N is the "length" of the object. The value of N is typically stored in the instance's ob_size field. There are exceptions: for example, long ints use a negative ob_size to indicate a negative number, and N is abs(ob_size) there. Also, the presence of an ob_size field in the instance layout doesn't mean that the instance structure is variable-length (for example, the structure for the list type has fixed-length instances, yet those instances have a meaningful ob_size field).
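The size rule above can be written out as a small function. This is only a sketch: instance_size is a hypothetical helper, not part of the Python API, and it takes the three values as plain arguments instead of reading them from a type object and an instance.

```c
#include <stddef.h>

/* Hypothetical helper: size in bytes of an instance, following the rule
 * tp_basicsize + N * tp_itemsize, with N taken as abs(ob_size). */
size_t instance_size(size_t basicsize, size_t itemsize, long ob_size)
{
    size_t n;

    /* Long ints keep their sign in ob_size, so use the magnitude.
     * Written so that even the most negative long does not overflow. */
    if (ob_size < 0)
        n = (size_t)(-(ob_size + 1)) + 1u;
    else
        n = (size_t)ob_size;

    if (itemsize == 0)
        return basicsize;          /* fixed-length instances */
    return basicsize + n * itemsize;
}
```

For a fixed-length type the ob_size argument is ignored, which matches the remark about the list type: a meaningful ob_size does not by itself make the structure variable-length.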
The basic size includes the fields in the instance declared by the macro PyObject_HEAD or PyObject_VAR_HEAD (whichever is used to declare the instance struct) and this in turn includes the _ob_prev and _ob_next fields if they are present. This means that the only correct way to get an initializer for the tp_basicsize is to use the sizeof operator on the struct used to declare the instance layout. The basic size does not include the GC header size (this is new in Python 2.2; in 2.1 and 2.0, the GC header size was included in tp_basicsize).
These fields are inherited separately by subtypes. If the base type has a non-zero tp_itemsize, it is generally not safe to set tp_itemsize to a different non-zero value in a subtype (though this depends on the implementation of the base type).
A note about alignment: if the variable items require a particular alignment, this should be taken care of by the value of tp_basicsize. Example: suppose a type implements an array of double. tp_itemsize is sizeof(double). It is the programmer's responsibility that tp_basicsize is a multiple of sizeof(double) (assuming this is the alignment requirement for double).
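One way to meet the alignment responsibility is to round the header size up before using it as tp_basicsize. The helpers below are hypothetical and the sizes in the usage note (a 12-byte header, 8-byte alignment for double) are assumptions chosen for illustration; real values depend on the platform.

```c
#include <stddef.h>

/* Hypothetical helper: round size up to the next multiple of align
 * (align must be non-zero). */
size_t round_up(size_t size, size_t align)
{
    size_t rem = size % align;

    return rem == 0 ? size : size + (align - rem);
}

/* Hypothetical check: do items placed directly after basicsize bytes
 * start on a multiple of item_align? */
int items_are_aligned(size_t basicsize, size_t item_align)
{
    return basicsize % item_align == 0;
}
```

With the assumed numbers, a 12-byte header would leave the first double misaligned, while round_up(12, 8) gives a basic size of 16 that satisfies the check.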
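The tp_dealloc field points to the instance destructor function. This function must be defined unless the type guarantees that its instances will never be deallocated (as is the case for the singletons None and Ellipsis).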
The destructor function is called by the Py_DECREF() and Py_XDECREF() macros when the new reference count is zero. At this point, the instance is still in existence, but there are no references to it. The destructor function should release all references which the instance owns, free all memory buffers owned by the instance (using the freeing function corresponding to the allocation function used to allocate the buffer), and finally (as its last action) call the type's tp_free function. If the type is not subtypable (doesn't have the Py_TPFLAGS_BASETYPE flag bit set), it is permissible to call the object deallocator directly instead of via tp_free. The object deallocator should be the one matching the function used to allocate the instance; this is normally PyObject_Del() if the instance was allocated using PyObject_New() or PyObject_NewVar(), or PyObject_GC_Del() if the instance was allocated using PyObject_GC_New() or PyObject_GC_NewVar().
This field is inherited by subtypes.
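The order of work in a destructor can be illustrated without the Python headers. Everything below is a stand-in invented for this sketch (DemoObject, DemoType, demo_decref, demo_free and the Holder type are not Python names); only the three steps in holder_dealloc follow the description above.

```c
#include <stdlib.h>

/* Stand-in declarations; these are NOT the real Python structures. */
typedef struct demo_type DemoType;

typedef struct demo_object {
    long refcnt;
    DemoType *type;
} DemoObject;

struct demo_type {
    void (*tp_dealloc)(DemoObject *self);
    void (*tp_free)(void *mem);
};

/* Counting wrapper around free(), so the effect can be observed. */
int demo_free_calls = 0;

void demo_free(void *mem)
{
    demo_free_calls++;
    free(mem);
}

/* Plays the role of Py_DECREF: run the destructor at count zero. */
void demo_decref(DemoObject *op)
{
    if (--op->refcnt == 0)
        op->type->tp_dealloc(op);
}

/* Destructor for an object that owns nothing. */
void leaf_dealloc(DemoObject *self)
{
    self->type->tp_free(self);
}

/* An instance that owns one reference and one malloc'ed buffer. */
typedef struct {
    DemoObject head;
    DemoObject *owned;   /* may be NULL */
    char *buffer;        /* allocated with malloc() */
} Holder;

void holder_dealloc(DemoObject *self)
{
    Holder *h = (Holder *)self;

    if (h->owned != NULL)
        demo_decref(h->owned);   /* 1. release owned references */
    free(h->buffer);             /* 2. free buffers with the matching routine */
    self->type->tp_free(self);   /* 3. last action: free the instance itself */
}
```

Going through the type's tp_free in the last step, rather than calling free() directly, is what lets a subtype with a different allocation strategy reuse the same destructor.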
The print function is only called when the instance is printed to a real file; when it is printed to a pseudo-file (like a StringIO instance), the instance's tp_repr or tp_str function is called to convert it to a string. These are also called when the type's tp_print field is NULL. A type should never implement tp_print in a way that produces different output than tp_repr or tp_str would.
The print function is called with the same signature as PyObject_Print(): int tp_print(PyObject *self, FILE *file, int flags). The self argument is the instance to be printed. The file argument is the stdio file to which it is to be printed. The flags argument is composed of flag bits. The only flag bit currently defined is Py_PRINT_RAW. When the Py_PRINT_RAW flag bit is set, the instance should be printed the same way as tp_str would format it; when the Py_PRINT_RAW flag bit is clear, the instance should be printed the same way as tp_repr would format it. It should return -1 and set an exception condition when an error occurs during printing.
It is possible that the tp_print field will be deprecated. In any case, it is recommended not to define tp_print, but instead to rely on tp_repr and tp_str for printing.
This field is inherited by subtypes.
The tp_getattr field is deprecated. When it is defined, it should point to a function that acts the same as the tp_getattro function, but taking a C string instead of a Python string object to give the attribute name. The signature is the same as for PyObject_GetAttrString().
This field is inherited by subtypes together with tp_getattro: a subtype inherits both tp_getattr and tp_getattro from its base type when the subtype's tp_getattr and tp_getattro are both NULL.
The tp_setattr field is deprecated. When it is defined, it should point to a function that acts the same as the tp_setattro function, but taking a C string instead of a Python string object to give the attribute name. The signature is the same as for PyObject_SetAttrString().
This field is inherited by subtypes together with tp_setattro: a subtype inherits both tp_setattr and tp_setattro from its base type when the subtype's tp_setattr and tp_setattro are both NULL.
The tp_compare function has the same signature as PyObject_Compare(). It should return 1 if self is greater than other, 0 if self is equal to other, and -1 if self is less than other. It should return -1 and set an exception condition when an error occurred during the comparison.
This field is inherited by subtypes together with tp_richcompare and tp_hash: a subtype inherits all three of tp_compare, tp_richcompare, and tp_hash when the subtype's tp_compare, tp_richcompare, and tp_hash are all NULL.
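The three-way return convention can be shown on plain C values. The function below is a hypothetical example operating on longs, not on Python objects, so it has no error case; a real comparison function that returns -1 for an error must also set an exception, since -1 is a legitimate "less than" result as well.

```c
/* Hypothetical three-way comparison on plain C longs, using the
 * return convention described for tp_compare. */
int compare_longs(long self, long other)
{
    if (self < other)
        return -1;
    if (self > other)
        return 1;
    return 0;
}
```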
The signature is the same as for PyObject_Repr(); it must return a string or a Unicode object. Ideally, this function should return a string that, when passed to eval(), given a suitable environment, returns an object with the same value. If this is not feasible, it should return a string starting with "<" and ending with ">" from which both the type and the value of the object can be deduced.
When this field is not set, a string of the form "<%s object at %p>" is returned, where %s is replaced by the type name, and %p by the object's memory address.
This field is inherited by subtypes.
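The default format can be reproduced with the standard C library. This is a sketch only: format_default_repr is a hypothetical helper that writes into a caller-supplied buffer, whereas the interpreter builds a Python string, and the exact text produced for %p is platform-dependent.

```c
#include <stdio.h>

/* Hypothetical helper: write "<NAME object at ADDRESS>" into buf.
 * Returns the value of snprintf(), i.e. the length the full string
 * needs, or a negative value on an output error. */
int format_default_repr(char *buf, size_t buflen,
                        const char *type_name, const void *address)
{
    return snprintf(buf, buflen, "<%s object at %p>",
                    type_name, (void *)address);
}
```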
PyNumberMethods *tp_as_number;
XXX
PySequenceMethods *tp_as_sequence;
XXX
PyMappingMethods *tp_as_mapping;
XXX
The signature is the same as for PyObject_Hash(); it must return a C long. The value -1 should not be returned as a normal return value; when an error occurs during the computation of the hash value, the function should set an exception and return -1.
When this field is not set, two possibilities exist: if the tp_compare and tp_richcompare fields are both NULL, a default hash value based on the object's address is returned; otherwise, a TypeError is raised.
This field is inherited by subtypes together with tp_richcompare and tp_compare: a subtype inherits all three of tp_compare, tp_richcompare, and tp_hash, when the subtype's tp_compare, tp_richcompare and tp_hash are all NULL.
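Because -1 is reserved for errors, a hash computation that could naturally produce -1 has to map it to some other value. The sketch below assumes an object that simply wraps a C long and uses the value as its own hash; hash_long_value is a hypothetical name and the choice of -2 as the substitute is an assumption of this example.

```c
/* Hypothetical hash for an object that wraps a C long: the value is its
 * own hash, except that -1 is reserved for signalling errors. */
long hash_long_value(long value)
{
    if (value == -1)
        return -2;   /* never hand back -1 as a normal result */
    return value;
}
```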
The tp_call field is inherited by subtypes.
The signature is the same as for PyObject_Str(); it must return a string or a Unicode object. This function should return a "friendly" string representation of the object, as this is the representation that will be used by the print statement.
When this field is not set, PyObject_Repr() is called to return a string representation.
This field is inherited by subtypes.
The signature is the same as for PyObject_GetAttr(). It is usually convenient to set this field to PyObject_GenericGetAttr(), which implements the normal way of looking for object attributes.
This field is inherited by subtypes together with tp_getattr: a subtype inherits both tp_getattr and tp_getattro from its base type when the subtype's tp_getattr and tp_getattro are both NULL.
The signature is the same as for PyObject_SetAttr(). It is usually convenient to set this field to PyObject_GenericSetAttr(), which implements the normal way of setting object attributes.
This field is inherited by subtypes together with tp_setattr: a subtype inherits both tp_setattr and tp_setattro from its base type when the subtype's tp_setattr and tp_setattro are both NULL.
The tp_as_buffer field is not inherited, but the contained fields are inherited individually.
Inheritance of the tp_flags field is complicated. Most flag bits are inherited individually, i.e. if the base type has a flag bit set, the subtype inherits this flag bit. The flag bits that pertain to extension structures are strictly inherited if the extension structure is inherited, i.e. the base type's value of the flag bit is copied into the subtype together with a pointer to the extension structure. The Py_TPFLAGS_HAVE_GC flag bit is inherited together with the tp_traverse and tp_clear fields: all three are inherited if the Py_TPFLAGS_HAVE_GC flag bit is clear in the subtype and the tp_traverse and tp_clear fields in the subtype exist (as indicated by the Py_TPFLAGS_HAVE_RICHCOMPARE flag bit) and have NULL values.
The following bit masks are currently defined; these can be or-ed together using the | operator to form the value of the tp_flags field. The macro PyType_HasFeature() takes a type and a flags value, tp and f, and checks whether tp->tp_flags & f is non-zero.
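The flag test can be sketched with stand-in definitions. The DEMO_ names and bit positions below are hypothetical and chosen only for this example; the real Py_TPFLAGS_* values and the real PyType_HasFeature() macro are defined in Include/object.h.

```c
/* Hypothetical flag bits; the real Py_TPFLAGS_* values live in object.h. */
#define DEMO_TPFLAGS_HAVE_GC        (1L << 0)
#define DEMO_TPFLAGS_BASETYPE       (1L << 1)
#define DEMO_TPFLAGS_HAVE_WEAKREFS  (1L << 2)

/* Stand-in for a type object; only the flags field is modelled. */
typedef struct {
    long tp_flags;
} FlagDemoType;

/* Same test as described for PyType_HasFeature(): is any bit of f set? */
#define DEMO_HAS_FEATURE(tp, f)  (((tp)->tp_flags & (f)) != 0)
```

A type initialized with DEMO_TPFLAGS_HAVE_GC | DEMO_TPFLAGS_BASETYPE passes the test for either of those bits and fails it for DEMO_TPFLAGS_HAVE_WEAKREFS.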
The tp_doc field is not inherited by subtypes.
The following three fields only exist if the Py_TPFLAGS_HAVE_RICHCOMPARE flag bit is set.
The tp_traverse field is inherited by subtypes together with tp_clear and the Py_TPFLAGS_HAVE_GC flag bit: the flag bit, tp_traverse, and tp_clear are all inherited from the base type if they are all zero in the subtype and the subtype has the Py_TPFLAGS_HAVE_RICHCOMPARE flag bit set.
The tp_clear field is inherited by subtypes together with tp_traverse and the Py_TPFLAGS_HAVE_GC flag bit, under the same condition.
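The signature is the same as for PyObject_RichCompare(), so the function returns a new reference to the result of the comparison (usually Py_True or Py_False) rather than a C integer. If the comparison is not implemented for the given operands, it should return Py_NotImplemented; when an error occurs during the comparison, it should return NULL and set an exception condition.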
This field is inherited by subtypes together with tp_compare and tp_hash: a subtype inherits all three of tp_compare, tp_richcompare, and tp_hash, when the subtype's tp_compare, tp_richcompare, and tp_hash are all NULL.
The following constants are defined to be used as the third argument for tp_richcompare and for PyObject_RichCompare():
| Constant | Comparison |
|---|---|
| Py_LT | < |
| Py_LE | <= |
| Py_EQ | == |
| Py_NE | != |
| Py_GT | > |
| Py_GE | >= |
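A common way to use these constants is to compute a three-way result once and then map it to the truth value of the requested operator. The sketch below uses a stand-in enum (the DEMO_ names and their values are assumptions of this example, not the definitions from object.h) and a hypothetical helper; a real implementation would turn the resulting truth value into whatever the slot is expected to return.

```c
/* Stand-in constants, named after the table above; the values are
 * chosen for this sketch, the real ones come from object.h. */
enum demo_op { DEMO_LT, DEMO_LE, DEMO_EQ, DEMO_NE, DEMO_GT, DEMO_GE };

/* Turn a three-way result c (<0, 0, >0) into the truth value of the
 * requested comparison. Returns 1 or 0, or -1 for an unknown op. */
int demo_richcompare_result(int c, int op)
{
    switch (op) {
    case DEMO_LT: return c < 0;
    case DEMO_LE: return c <= 0;
    case DEMO_EQ: return c == 0;
    case DEMO_NE: return c != 0;
    case DEMO_GT: return c > 0;
    case DEMO_GE: return c >= 0;
    default:      return -1;
    }
}
```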
The next field only exists if the Py_TPFLAGS_HAVE_WEAKREFS flag bit is set.
Do not confuse the tp_weaklistoffset field with tp_weaklist; the latter is the list head for weak references to the type object itself.
This field is inherited by subtypes, but see the rules listed below. A subtype may override this offset; this means that the subtype uses a different weak reference list head than the base type. Since the list head is always found via tp_weaklistoffset, this should not be a problem.
When a type defined by a class statement has no __slots__ declaration, and none of its base types are weakly referenceable, the type is made weakly referenceable by adding a weak reference list head slot to the instance layout and setting the tp_weaklistoffset to that slot's offset.
When a type's __slots__ declaration contains a slot named __weakref__, that slot becomes the weak reference list head for instances of the type, and the slot's offset is stored in the type's tp_weaklistoffset.
When a type's __slots__ declaration does not contain a slot named __weakref__, the type inherits its tp_weaklistoffset from its base type.
The next two fields only exist if the Py_TPFLAGS_HAVE_CLASS flag bit is set.
This function has the same signature as PyObject_GetIter().
This field is inherited by subtypes.
Iterator types should also define the tp_iter function, and that function should return the iterator instance itself (not a new iterator instance).
This function has the same signature as PyIter_Next().
This field is inherited by subtypes.
The next fields, up to and including tp_weaklist, only exist if the Py_TPFLAGS_HAVE_CLASS flag bit is set.
For each entry in the tp_methods array, an entry is added to the type's dictionary (see tp_dict below) containing a method descriptor.
This field is not inherited by subtypes (methods are inherited through a different mechanism).
For each entry in the tp_members array, an entry is added to the type's dictionary (see tp_dict below) containing a member descriptor.
This field is not inherited by subtypes (members are inherited through a different mechanism).
For each entry in the tp_getset array, an entry is added to the type's dictionary (see tp_dict below) containing a getset descriptor.
This field is not inherited by subtypes (computed attributes are inherited through a different mechanism).
Docs for PyGetSetDef (XXX belong elsewhere):
typedef PyObject *(*getter)(PyObject *, void *);
typedef int (*setter)(PyObject *, PyObject *, void *);

typedef struct PyGetSetDef {
    char *name;    /* attribute name */
    getter get;    /* C function to get the attribute */
    setter set;    /* C function to set the attribute */
    char *doc;     /* optional doc string */
    void *closure; /* optional additional data for getter and setter */
} PyGetSetDef;
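The shape of a getset table can be sketched without the Python headers. Everything here is a simplified stand-in: Point, DemoGetSetDef and demo_find_getset are hypothetical, the getters return a plain long instead of an object, and the linear search merely stands in for the lookup of a descriptor in the type's dictionary. The sentinel entry with a NULL name is the convention this sketch assumes for ending the array.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins shaped like the declarations above; not the real headers. */
typedef struct { long x; } Point;

typedef long (*demo_getter)(Point *, void *);
typedef int (*demo_setter)(Point *, long, void *);

typedef struct {
    const char *name;   /* attribute name; NULL marks the end of the table */
    demo_getter get;    /* function to compute the attribute */
    demo_setter set;    /* NULL for a read-only attribute */
    const char *doc;    /* optional doc string */
    void *closure;      /* optional additional data, unused here */
} DemoGetSetDef;

static long point_get_x(Point *self, void *closure)
{
    (void)closure;
    return self->x;
}

static int point_set_x(Point *self, long value, void *closure)
{
    (void)closure;
    self->x = value;
    return 0;           /* 0 for success */
}

static long point_get_double_x(Point *self, void *closure)
{
    (void)closure;
    return 2 * self->x;
}

DemoGetSetDef point_getset[] = {
    {"x", point_get_x, point_set_x, "the x coordinate", NULL},
    {"double_x", point_get_double_x, NULL, "twice x (read-only)", NULL},
    {NULL, NULL, NULL, NULL, NULL}   /* sentinel */
};

/* Linear search by name, standing in for the dictionary lookup. */
const DemoGetSetDef *demo_find_getset(const DemoGetSetDef *table,
                                      const char *name)
{
    for (; table->name != NULL; table++)
        if (strcmp(table->name, name) == 0)
            return table;
    return NULL;
}
```

Leaving the set member NULL is how the second entry expresses a computed, read-only attribute.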
The tp_base field is not inherited by subtypes (obviously), but it defaults to &PyBaseObject_Type (which Python programmers know as the type named object).
The tp_dict field should normally be initialized to NULL before PyType_Ready() is called; it may also be initialized to a dictionary containing initial attributes for the type. Once PyType_Ready() has initialized the type, extra attributes for the type may be added to this dictionary only if they don't correspond to overloaded operations (like __add__()).
This field is not inherited by subtypes (though the attributes defined in here are inherited through a different mechanism).
The function signature is
PyObject * tp_descr_get(PyObject *self, PyObject *obj, PyObject *type);
XXX blah, blah.
This field is inherited by subtypes.
The function signature is
int tp_descr_set(PyObject *self, PyObject *obj, PyObject *value);
This field is inherited by subtypes.
XXX blah, blah.
Do not confuse the tp_dictoffset field with tp_dict; the latter is the dictionary for attributes of the type object itself.
If the value of this field is greater than zero, it specifies the offset from the start of the instance structure. If the value is less than zero, it specifies the offset from the end of the instance structure. A negative offset is more expensive to use, and should only be used when the instance structure contains a variable-length part. This is used for example to add an instance variable dictionary to subtypes of str or tuple. Note that the tp_basicsize field should account for the dictionary added to the end in that case, even though the dictionary is not included in the basic object layout. On a system with a pointer size of 4 bytes, tp_dictoffset should be set to -4 to indicate that the dictionary is at the very end of the structure.
The real dictionary offset in an instance can be computed from a negative tp_dictoffset as follows:
dictoffset = tp_basicsize + abs(ob_size)*tp_itemsize + tp_dictoffset
if dictoffset is not aligned on sizeof(void*):
    round up to sizeof(void*)
where tp_basicsize, tp_itemsize and tp_dictoffset are taken from the type object, and ob_size is taken from the instance. The absolute value is taken because long ints use the sign of ob_size to store the sign of the number. (There's never a need to do this calculation yourself; it is done for you by _PyObject_GetDictPtr().)
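For illustration only, the computation can be written as a C function. real_dict_offset is a hypothetical helper, and the pointer size is passed in as a parameter (an assumption of this sketch) so that the arithmetic can be checked independently of the platform; as noted, real code should rely on _PyObject_GetDictPtr() instead.

```c
/* Hypothetical helper following the computation above. ptr_size plays
 * the role of sizeof(void*); tp_dictoffset is expected to be negative. */
long real_dict_offset(long basicsize, long itemsize, long ob_size,
                      long tp_dictoffset, long ptr_size)
{
    long n = ob_size < 0 ? -ob_size : ob_size;   /* abs(ob_size) */
    long offset = basicsize + n * itemsize + tp_dictoffset;
    long rem = offset % ptr_size;

    if (rem != 0)
        offset += ptr_size - rem;   /* round up to a pointer boundary */
    return offset;
}
```

With a basic size of 24, an item size of 1, an ob_size of -5, a tp_dictoffset of -8 and 8-byte pointers, the unrounded value is 21 and the result is 24.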
This field is inherited by subtypes, but see the rules listed below. A subtype may override this offset; this means that the subtype instances store the dictionary at a different offset than the base type. Since the dictionary is always found via tp_dictoffset, this should not be a problem.
When a type defined by a class statement has no __slots__ declaration, and none of its base types has an instance variable dictionary, a dictionary slot is added to the instance layout and the tp_dictoffset is set to that slot's offset.
When a type defined by a class statement has a __slots__ declaration, the type inherits its tp_dictoffset from its base type.
(Adding a slot named __dict__ to the __slots__ declaration does not have the expected effect; it just causes confusion. Maybe this should be added as a feature just like __weakref__ though.)
The tp_init function corresponds to the __init__() method of classes. As with __init__(), it is possible to create an instance without calling it, and to reinitialize an instance by calling it again.
The function signature is
int tp_init(PyObject *self, PyObject *args, PyObject *kwds)
The self argument is the instance to be initialized; the args and kwds arguments represent positional and keyword arguments of the call to __init__().
The tp_init function, if not NULL, is called when an instance is created normally by calling its type, after the type's tp_new function has returned an instance of the type. If the tp_new function returns an instance of some other type that is not a subtype of the original type, no tp_init function is called; if tp_new returns an instance of a subtype of the original type, the subtype's tp_init is called. (VERSION NOTE: described here is what is implemented in Python 2.2.1 and later. In Python 2.2, the tp_init of the type of the object returned by tp_new was always called, if not NULL.)
This field is inherited by subtypes.
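The interplay of tp_new and tp_init when a type is called (the Python 2.2.1 behaviour described above) can be modelled with stand-in structures. All names below (CallType, CallObject, demo_call_type and the rest) are hypothetical, arguments are omitted, and error handling is reduced to freeing the object; this is a sketch of the rule, not the interpreter's code.

```c
#include <stdlib.h>

/* Stand-in structures; not the real PyTypeObject or PyObject. */
typedef struct call_type CallType;

typedef struct call_object {
    CallType *type;
    int initialized;
} CallObject;

struct call_type {
    CallType *base;                            /* NULL for a root type */
    CallObject *(*tp_new)(CallType *subtype);  /* may be NULL */
    int (*tp_init)(CallObject *self);          /* may be NULL */
};

/* Is a the same type as b, or derived from it? */
int demo_is_subtype(CallType *a, CallType *b)
{
    for (; a != NULL; a = a->base)
        if (a == b)
            return 1;
    return 0;
}

/* Sketch of "calling a type": tp_new first, then tp_init where it applies. */
CallObject *demo_call_type(CallType *type)
{
    CallObject *obj;

    if (type->tp_new == NULL)
        return NULL;                 /* the type cannot be called */
    obj = type->tp_new(type);
    if (obj == NULL)
        return NULL;
    /* An instance of an unrelated type is returned without initialization. */
    if (!demo_is_subtype(obj->type, type))
        return obj;
    /* Use the tp_init of the object's own type, which may be a subtype. */
    if (obj->type->tp_init != NULL && obj->type->tp_init(obj) < 0) {
        free(obj);
        return NULL;
    }
    return obj;
}

/* Minimal tp_new and tp_init used to exercise the sketch. */
CallObject *demo_new(CallType *subtype)
{
    CallObject *obj = calloc(1, sizeof *obj);

    if (obj != NULL)
        obj->type = subtype;
    return obj;
}

int demo_init(CallObject *self)
{
    self->initialized = 1;
    return 0;
}

/* A tp_new that deliberately returns an instance of an unrelated type. */
CallType demo_other_type = { NULL, demo_new, demo_init };

CallObject *demo_new_other(CallType *subtype)
{
    (void)subtype;
    return demo_new(&demo_other_type);
}
```

A type whose tp_new is demo_new_other yields an object that is never initialized through demo_call_type, because the object's type is not a subtype of the type that was called.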
The function signature is
PyObject *tp_alloc(PyTypeObject *self, int nitems)
The purpose of this function is to separate memory allocation from memory initialization. It should return a pointer to a block of memory of adequate length for the instance, suitably aligned, and initialized to zeros, but with ob_refcnt set to 1 and ob_type set to the type argument. If the type's tp_itemsize is non-zero, the object's ob_size field should be initialized to nitems and the length of the allocated memory block should be tp_basicsize + nitems*tp_itemsize, rounded up to a multiple of sizeof(void*); otherwise, nitems is not used and the length of the block should be tp_basicsize.
Do not use this function to do any other instance initialization, not even to allocate additional memory; that should be done by tp_new.
This field is inherited by static subtypes, but not by dynamic subtypes (subtypes created by a class statement); in the latter, this field is always set to PyType_GenericAlloc(), to force a standard heap allocation strategy. That is also the recommended value for statically defined types.
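What an allocation function along these lines has to do can be sketched with stand-in structures. AllocType, AllocVarObject, demo_alloc_size and demo_alloc are hypothetical names, and the sketch assumes that tp_basicsize is at least sizeof(AllocVarObject); it is not the implementation of PyType_GenericAlloc().

```c
#include <stdlib.h>

/* Stand-in header for a variable-length object; not the real PyVarObject. */
typedef struct alloc_type AllocType;

typedef struct {
    long ob_refcnt;
    AllocType *ob_type;
    long ob_size;
} AllocVarObject;

struct alloc_type {
    size_t tp_basicsize;   /* assumed >= sizeof(AllocVarObject) */
    size_t tp_itemsize;
};

/* Length of the block: tp_basicsize + nitems*tp_itemsize rounded up to a
 * multiple of sizeof(void*), or just tp_basicsize for fixed-length types. */
size_t demo_alloc_size(const AllocType *type, size_t nitems)
{
    size_t size = type->tp_basicsize;
    size_t rem;

    if (type->tp_itemsize == 0)
        return size;               /* nitems is not used */
    size += nitems * type->tp_itemsize;
    rem = size % sizeof(void *);
    if (rem != 0)
        size += sizeof(void *) - rem;
    return size;
}

/* Allocate a zeroed block and fill in only the header fields. */
AllocVarObject *demo_alloc(AllocType *type, size_t nitems)
{
    AllocVarObject *op = calloc(1, demo_alloc_size(type, nitems));

    if (op == NULL)
        return NULL;
    op->ob_refcnt = 1;
    op->ob_type = type;
    if (type->tp_itemsize != 0)
        op->ob_size = (long)nitems;
    return op;
}
```

Nothing beyond the header is initialized here; anything else belongs in the creation function, in keeping with the separation of allocation from initialization.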
If the tp_new function is NULL for a particular type, that type cannot be called to create new instances; presumably there is some other way to create instances, like a factory function.
The function signature is
PyObject *tp_new(PyTypeObject *subtype, PyObject *args, PyObject *kwds)
The subtype argument is the type of the object being created; the args and kwds arguments represent positional and keyword arguments of the call to the type. Note that subtype doesn't have to equal the type whose tp_new function is called; it may be a subtype of that type (but not an unrelated type).
The tp_new function should call subtype->tp_alloc(subtype, nitems) to allocate space for the object, and then do only as much further initialization as is absolutely necessary. Initialization that can safely be ignored or repeated should be placed in the tp_init handler. A good rule of thumb is that for immutable types, all initialization should take place in tp_new, while for mutable types, most initialization should be deferred to tp_init.
This field is inherited by subtypes, except it is not inherited by static types whose tp_base is NULL or &PyBaseObject_Type. The latter exception is a precaution so that old extension types don't become callable simply by being linked with Python 2.2.
The signature of the tp_free function has changed slightly: in Python 2.2 and 2.2.1, its signature is destructor:
void tp_free(PyObject *)
In Python 2.3 and beyond, its signature is freefunc:
void tp_free(void *)
The only initializer that is compatible with both versions is _PyObject_Del, whose definition was suitably adapted in Python 2.3.
This field is inherited by static subtypes, but not by dynamic subtypes (subtypes created by a class statement); in the latter, this field is set to a deallocator suitable to match PyType_GenericAlloc() and the value of the Py_TPFLAGS_HAVE_GC flag bit.
The garbage collector needs to know whether a particular object is collectible or not. Normally, it is sufficient to look at the object's type's tp_flags field, and check the Py_TPFLAGS_HAVE_GC flag bit. But some types have a mixture of statically and dynamically allocated instances, and the statically allocated instances are not collectible. Such types should define this function; it should return 1 for a collectible instance, and 0 for a non-collectible instance.
The signature is
int tp_is_gc(PyObject *self)
(The only examples of this are types themselves. The metatype, PyType_Type, defines this function to distinguish between statically and dynamically allocated types.)
This field is inherited by subtypes. (VERSION NOTE: in Python 2.2, it was not inherited. It is inherited in 2.2.1 and later versions.)
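How the flag bit and tp_is_gc could combine can be sketched with stand-in structures. The names, the flag value and the on_heap marker below are all hypothetical, and checking the flag before consulting tp_is_gc is an assumption of this sketch, not a statement about how the collector is implemented.

```c
#include <stddef.h>

/* Stand-ins, not the real structures or flag values. */
#define GCDEMO_TPFLAGS_HAVE_GC  (1L << 0)   /* hypothetical bit */

typedef struct gcdemo_type GcDemoType;

typedef struct gcdemo_object {
    GcDemoType *ob_type;
    int on_heap;     /* hypothetical marker: 1 if dynamically allocated */
} GcDemoObject;

struct gcdemo_type {
    long tp_flags;
    int (*tp_is_gc)(GcDemoObject *self);   /* may be NULL */
};

/* A tp_is_gc for a type with mixed static and dynamic instances. */
int gcdemo_is_gc(GcDemoObject *self)
{
    return self->on_heap ? 1 : 0;
}

/* Collectible? Check the flag first, then tp_is_gc if the type has one. */
int gcdemo_object_is_gc(GcDemoObject *op)
{
    GcDemoType *tp = op->ob_type;

    if ((tp->tp_flags & GCDEMO_TPFLAGS_HAVE_GC) == 0)
        return 0;
    if (tp->tp_is_gc == NULL)
        return 1;
    return tp->tp_is_gc(op);
}
```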
The tp_bases field is set for types created by a class statement. It should be NULL for statically defined types.
This field is not inherited.
The tp_mro field is not inherited; it is calculated fresh by PyType_Ready().
The remaining fields are only defined if the feature test macro COUNT_ALLOCS is defined, and are for internal use only. They are documented here for completeness. None of these fields are inherited by subtypes.
Also, note that, in a garbage collected Python, tp_dealloc may be called from any Python thread, not just the thread which created the object (if the object becomes part of a refcount cycle, that cycle might be collected by a garbage collection on any thread). This is not a problem for Python API calls, since the thread on which tp_dealloc is called will own the Global Interpreter Lock (GIL). However, if the object being destroyed in turn destroys objects from some other C or C++ library, care should be taken to ensure that destroying those objects on the thread which called tp_dealloc will not violate any assumptions of the library.