should left shift test for overflow?

Guido van Rossum (guido@cwi.nl)
Mon, 20 Jan 92 13:54:19 +0100

Hi folks, after the successful (:-) event of dropping '\E', here I am
again with a microscopic nit in the definition of Python on which I'd
like to see your considered opinion.

I am currently in the process of rationalizing Python's shift and mask
operations. Long (arbitrary length) integers will pretend they're
using 2's complement notation with an infinite number of bits (e.g., 1
is ...0001, -1 is ...1111, -2 is ...1110, and so on. This will mean
that any expression involving plain integers that doesn't raise an
exception has the *same* result when one or more of the operands are
replaced by a long integer with the same value.

Currently the only operator for which this isn't the case is left shift.
Bits shifted out of plain (32-bit) integers are dropped mercilessly.
Of course this is done in analogy with C, but it loses the
corrspondence with long integers; e.g., 1000<<25 is -805306368 but
1000L<<25 is 33554432000L. It also loses the correspondence between
left-shift and multiplication by a power of 2.

But the advantage of dropping the bits is that I can write the
following function to read a 32-bit binary integer from a file:

def rdlongb(f): # Read a "long" (32-bit int) in big-endian format
s = f.read(4)
a, b, c, d = ord(s[0]), ord(s[1]), ord(s[2]), ord(s[3])
return a<<24 | b<<16 | c<<8 | d

If we interpret left-shift as multiplication by a power of 2, the
"a<<24" part would cause overflow if the sign bit of the byte was on
(i.e., if a >= 128). To fix this, I would have to add something like
"if a >= 128: a = a-256" before the return statement.

Note that the same problem does not exist for right shifts: dropping
bits there is perfectly analogous with division by a power of two
(e.g., 1000>>25 is 0 in any domain). Also note that I propose to
change integer division of negative numbers such that -1/2 is -1: x/y
is floor(x//y) if x//y means mathematical (loss-less) division, so the
correspondence between (sign-extending) right-shift and division is
perfect. Modulo will be fixed accordingly: -1%2 will be 1.

My question is, what's more important here: the ability to construct a
negative number by left-shifting a positive number, or correspondence
with long integers and multiplication?

(If you're still with me, you may also tell me if you can find a
reason for unsigned shifts. I currently plan to make all shift
operations sign-preserving, for correspondence with
multiplication/division with powers of 2. What are the advantages of
unsigned shifts?)

--Guido van Rossum, CWI, Amsterdam <guido@cwi.nl>
"Programming - the skill (not an art nor a science) of intentional
premature binding. The best programmers are those who can best make
proper decisions too early." --B. A. Creech, 1969