Re: Python portability planning ( and prototypes )

Guido van Rossum (Guido.van.Rossum@cwi.nl)
Thu, 06 Feb 1992 11:47:36 +0100

>But what I'm aiming for is that source code libraries be portable
                                 ^^^^^^^^^^^^^^^^^^^^^
>*without* renaming or editing files.

Good point! (I assume you mean what I call the "standard library
modules".)

>[ It is not a big bother right
>now, but it will grow as the libraries grow. I'm trying to nip library
>management in the bud.] So if there IS posix emulation, then either
>POSIXposix, macposix, or dosposix will be loaded. ( Not exactly accurate,
>I know posix & mac are "built-in", but you get the idea. ).

Excellent idea, this is a very nice solution. As an extension, the
built-in posix emulations could export themselves under two names:
'posix' for naive programs, and '<OSNAME>posix' for programs that want
to know. (I suggest using 'unix' for <OSNAME> in the UNIX case.)

>guido> I don't see much use for knowing the CPU type, byte order, word size
>guido> and floating point format unless pack/unpack get implemented. Come
>guido> on, most C code doesn't know or care about these (surely 99% of my own
>guido> code doesn't make any assumptions beyond what ANSI C guarantees), so
>guido> why should Python programs care?
>
>You got it backward: If there is a low level implementation of pack/unpack,
>then we *DON'T* need to know byte-order or word size. If pack/unpack and
>other network & native binary conversions are done in python-source, THEN
>we DO need byte-order and word size. [ I don't really care about CPU type -
>that was only because posix uname ( or arch ) seemed to be the only
>(indirect) way of inferring the byte-order & (possibly) word size. ]

I still don't agree. You don't need the byte order and word size
of the current machine -- you need the byte order and word size of the
machine that created the binary file you are trying to read (or where
the binary file you are going to write will be used). Since you say
you want to do this for conversion jobs, I expect these will often be
different from the current machine!
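
A minimal sketch of this point, written in present-day Python 3 with
the standard struct module (which postdates this message). The caller
states the byte order of the machine that produced the data, so the
result does not depend on the machine running the code. The helper
name read_u32 is hypothetical.

```python
import struct

def read_u32(data, byte_order):
    """Decode a 4-byte unsigned integer using the byte order of the
    machine that wrote the data, not of the machine running this code."""
    # ">" and "<" select big- and little-endian with standard sizes,
    # so "I" is always exactly 4 bytes here.
    prefix = {"big": ">", "little": "<"}[byte_order]
    (value,) = struct.unpack(prefix + "I", data)
    return value

raw = bytes([0x00, 0x00, 0x01, 0x02])
as_big = read_u32(raw, "big")        # 0x00000102 == 258
as_little = read_u32(raw, "little")  # 0x02010000 == 33619968
```

The same four bytes decode to different values depending on the
declared origin, which is why the origin's byte order is the parameter
that matters.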

>Again, I admit: Infrequently needed. But essential for a whole class of
>problems. C network programs use htons(), etc. to convert to/from native
>and network byte orders. C programs are FULL of "sizeof()"'s [ The need
>for which disappears in Python 98+% of the time, but again, that 1 or 2%
>either shuts out a whole class of problems or it forces someone to dig
>into internals ( either of Python or the machine ) to find out the answer
>and HARD-WIRE it into their program - the extreme of non-portability! ]

It's extremely simple to add a command line option, environment
variable or more sophisticated configuration mechanism whereby byte
order and word size can be specified to such conversion programs.
Hard-wiring in the native parameters makes the program non-portable if
the conversion has to run on a different platform. As an example, C++
to C translators usually read a little file specifying the size and
alignment of C data structures on the target machine.
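
One way this could look, as a sketch in present-day Python 3 using the
standard argparse module. The option names (--byte-order, --word-size),
the environment variable names (CONV_BYTE_ORDER, CONV_WORD_SIZE) and
the built-in defaults are all hypothetical, chosen only for
illustration.

```python
import argparse
import os

def parse_target(argv, environ=None):
    """Read the target's byte order and word size from the command line,
    falling back to environment variables, then to fixed defaults."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(description="binary conversion (sketch)")
    parser.add_argument("--byte-order", choices=["big", "little"],
                        default=environ.get("CONV_BYTE_ORDER", "big"))
    parser.add_argument("--word-size", type=int, choices=[2, 4, 8],
                        default=int(environ.get("CONV_WORD_SIZE", "4")))
    return parser.parse_args(argv)

# Example: the command line overrides the environment.
args = parse_target(["--word-size", "8"],
                    environ={"CONV_BYTE_ORDER": "little",
                             "CONV_WORD_SIZE": "2"})
```

Note that in this sketch values taken from the environment are not
checked against the allowed choices; only command-line values are.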

I admit that it's probably useful to have the native parameters
available for defaults in many cases. So they should be available,
and getting these from a built-in interface is more reliable than
using a configuration file. The next version of the posix module will
already contain a uname() call, so it's also possible to make a
portable library module that translates pairs of (osname, machinetype)
into byte order and int size. (There's no rule in POSIX that says
that different OS's running on the same machine must return the same
string, so the osname must figure in the argument.)
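
A rough sketch of such a library module in present-day Python 3. It
uses platform.uname() as a stand-in for the posix uname() call, since
it is available on non-POSIX systems too. The table name, the function
names and the two table entries are illustrative assumptions, not a
verified catalogue of platforms.

```python
import platform
import struct
import sys

# Keyed on (osname, machinetype). Illustrative entries only; a real
# table would need checking for every platform it lists.
KNOWN_PLATFORMS = {
    ("Linux", "x86_64"): ("little", 4),
    ("Darwin", "arm64"): ("little", 4),
}

def lookup(osname, machine):
    """Return (byte_order, int_size_in_bytes), or None if the pair is unknown."""
    return KNOWN_PLATFORMS.get((osname, machine))

def native_params():
    """Return (byte_order, int_size_in_bytes) for the running machine."""
    info = platform.uname()
    known = lookup(info.system, info.machine)
    if known is not None:
        return known
    # Unknown pair: ask the running interpreter directly.
    return (sys.byteorder, struct.calcsize("i"))
```

A conversion program could use native_params() only to supply
defaults, and still let the user override them for data that comes
from another machine.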

>[ I think that aesthetically, Guido, you want things to be *SO*
> portable that you DON'T NEED to know WHAT machine you are running on,
> ( or what byte-order is native, etc. ) but I'm arguing that the only
> way to MAKE some things have a portable interface is for some level
> to know that information, and modify its actions accordingly. ]

Agreed. Is what I said above enough water in the wine?

>You might reply that binary file conversion is not a proper job for Python.
>If I want to get down and twiddle bits, I should be using C!

But I won't. Python can do this, and if reading a binary file is a
small part of the conversion process, or if it's a one-time job,
Python is quite suitable (especially if you already know how to program in
Python).

>*IF* we need a routine to convert to/from a canonical ( probably
>posix style ) pathname to/from a local style pathname ( I'm not sure
>that we do, but I think so, unless everyone remembers to only use
>the least common denominator path names, or unless that conversion
>is built into the builtin module's "open" . ) then that module has to
>have some local information!

I am very much against adopting a standard Python style of pathnames
(i.e. saying that '/' is the pathname delimiter on all systems and
translating to the native delimiter on systems where it differs).
The arguments passed to open(), stat(), listdir() etc. and the return
values of getcwd(), readlink() etc. must definitely be pathnames in
the native system's syntax. Since most pathnames will be typed by the
user (e.g. on the command line) or retrieved from the system somehow,
this can't be a big problem. Where you hard-code pathnames in a
program they usually have to be edited when moving the program to a
different site anyway.
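
To illustrate working with native pathnames throughout, here is a small
sketch in present-day Python 3 using the standard os.path functions
(which postdate this message). The starting path comes from the system,
and new paths are derived from it without hard-coding any delimiter.
The helper name sibling_path and the file names are hypothetical.

```python
import os

def sibling_path(native_path, name):
    """Return the path of `name` in the same directory as `native_path`,
    using the native system's pathname syntax throughout."""
    directory = os.path.dirname(native_path)
    return os.path.join(directory, name)

cwd = os.getcwd()                        # already in native syntax
target = os.path.join(cwd, "data.bin")   # delimiter chosen by the system
other = sibling_path(target, "other.bin")
```

Nothing here translates between pathname styles; every value passed to
or returned from the system stays in native form.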

--Guido van Rossum, CWI, Amsterdam <guido@cwi.nl>
"I could be arguing in my spare time."