Programmer's Python Data - Bits and BigNum
Written by Mike James   
Monday, 25 March 2024
Article Index
Programmer's Python Data - Bits and BigNum
The Bitwise Operators
NOT

The Bitwise Operators

Python has a number of operators designed to allow you to perform bit manipulation. There are four bitwise operators:

AND

&

OR

|

XOR (exclusive or)

^

NOT

~

As you would expect, the NOT operator has the highest priority.

Notice that there are also corresponding logical operators  and, or and not that only work with Boolean values and not bit patterns.

If you are more familiar with other languages you might well confuse ^ with raise to a power. In Python ** means raise to a power.

The bitwise operators combine bit patterns to produce new bit patterns. The operations always work on pairs of bits one from each bit pattern in an obvious way. The rules for each of the operators are:

a
b
a&b

0

0

0

0

1

0

1

0

0

1

1

1

 

a
b
a|b

0

0

0

0

1

1

1

0

1

1

1

1

 

a
b
a^b

0

0

0

0

1

1

1

0

1

1

1

0

You can see that the operations are well named. The & (AND) operation only gives a 1 if both a and b are one, the | (Inclusive OR) operator gives a 1 if either of a or b are 1 and the ^ (exclusive OR) operator gives a 1 only if one, but not both, of a or b are a 1.

The bitwise operators work with bignum data, which is converted from its internal representation to a bit pattern that is as large as required, combined with another bit pattern and then reconverted back to a bignum. In most cases you can ignore the fact that there is a conversion between representations because the result is just the combining of bit patterns to produce a new one according to the tables given above and with no limit on the number of bits involved.

For example:

a = 0x012345678
b = 0xAAAAAAAAA
print(bin(a&b))
print(bin(a|b))
print(bin(a^b))

displays:

0b10001000000000001000101000
0b101010111010101111101111111011111010
0b101010111000100111101111110011010010

which you can verify by computing the results bit by bit. Notice that the first result doesn’t have the same number of bits displayed as the input bit patterns. This is because all Python bit patterns have an arbitrary number of leading zeros and when printed these are dropped. Most of the time this doesn’t matter and it is the equivalent of not printing leading zeros for a decimal number.

You can think of every bignum bit pattern as having a single zero bit at the start which is extended as needed to provide additional bits to combine with any other bit pattern involved in the expression. This simple idea that a bit pattern has as many leading zeros as are needed gets more complicated when you consider negative numbers and the NOT operator, ~.

Negative Numbers

Now we come to a complication caused by the way bignums are used to represent bit patterns – negative numbers. As long as you restrict your attention to positive integers, or bit patterns specified using positive hex or binary values, there is little need to worry about the effect of negative numbers and you can skip this “advanced” section.

Python’s bignum uses a very simple representation of negative numbers. It simply stores a flag to indicate that the number is negative. You can set this flag by typing a minus sign in front of a literal or use an expression that evaluates to a negative number. You cannot create a negative number using a bit pattern as a literal. To do so you would have to set the most significant bit to a 1, but as an integer can be as big as you like there is no most significant bit. When you write a = b1111, this is taken to mean 01111 and the sign bit is 0, i.e. a positive integer. It helps to think that Python adds one extra “sign” bit to any bit pattern you specify and this is by default set to zero to indicate a positive value.

The only way you can create a negative is to use a minus sign, for example:

a = -0b1111
print(a)
print(bin(a))

which displays -15 and -0b1111. If you are familiar with bit manipulation in other languages this will not be what you expect.

Negative integers are represented in two’s complement form and the bit pattern that represents -15 is not the bit pattern 1111 with a minus sign in front of it. Two’s complement representation is only possible if you specify a maximum number of bits that are going to be used. If the number of bits you choose is n then the two’s complement representation of -x is given by 2n-x. For example, if we are working with no more than five bits then n is 5 and for -x = -15 the two's complement is 25 – 15 = 32 – 15 = 17, which is 0b10001. The first bit is the sign bit and as it is a 1, it indicates that this is a negative number.

Notice that the two's complement representation depends on the number of bits you opt to use. For example, if we increase the number of bits to six then n is 6 and for -x = -15 the two's complement is 26 – 15 = 64 – 15 = 17, which is 0b110001. However, you can see that adding one bit to the representation has simply added a 1 to the start of the bit pattern. This is not an accident and it provides an easy way of working with two's complement when the exact number of bits isn’t determined.

To state it informally, if you are working with n-bit two's complement representation then increasing the number of bits to n+1 is a matter of adding a leading zero for a positive number or adding a leading one for a negative number.

This expresses the idea that all two's complement representations have to have a leading sign bit and if you need to extend the number of bits you simply “extend the sign bit” with a zero for positive numbers or a one for negative numbers.

Now we can understand what is happening when we evaluate a bitwise expression:

  1. The value stored in a bignum always starts with a sign bit followed by the shortest bit pattern that represents the positive value. That is, 15 is represented as 0 1111 and -15 is 1 1111. Notice that the bit pattern always starts with a one bit and the sign bit isn’t part of it.

  2. When a bit pattern is needed to be used as part of a bitwise expression the value is converted to two's complement form with the minimum number of bits plus one sign bit. That is, 15 is converted to 0 1111 and -15 is converted to 1 1000.

  3. The sign bits of any bit patterns in the bitwise expressions are extended as needed to match the largest bit pattern in the expression. For example:

    15| 0xFF → 0 1111 | 0 11111111 →

    0 00001111 | 0 11111111 →

    0 11111111

    -15| 0xFF → 1 1000 | 0 11111111 →
    1 11111000 | 0 11111111 →

    1 11111111

  4. When the bit pattern is stored back into the bignum the sign bit is used to convert the value from two's complement to a signed positive value. For example, the bit patterns in the previous examples only differ in the sign bit, but when they are stored back into a big integer the result is very different:
    15| 0xFF → 0 11111111

    is converted to +0b11111111 which is 255 in decimal whereas

    -15| 0xFF → 1 11111111

    is converted to – 0b1 which is -1 in decimal.

You might think that if you avoid using negative numbers then you can ignore the whole problem. Unfortunately NOT doesn’t make much sense unless you understand the use of two's complement.



Last Updated ( Tuesday, 26 March 2024 )