Programmer's Python Data - Bits and BigNum |
Written by Mike James | ||||||||||||||||||||||||||||||||||||||||||||||||||||
Monday, 25 March 2024 | ||||||||||||||||||||||||||||||||||||||||||||||||||||
Page 1 of 3 Bits are at the bottom of it all but Python is high level so how do you work with bits in Python? Find out how it all works in this extract from Programmer's Python: Everything is Data. Programmer's Python
|
0 |
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
A |
B |
C |
D |
E |
F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0000 |
0001 |
0010 |
0011 |
0100 |
0101 |
0110 |
0111 |
1000 |
1001 |
1010 |
1011 |
1100 |
1101 |
1110 |
1111 |
If you know binary you will recognize each bit pattern as the binary number that corresponds to the hexadecimal symbol. That is, hex 5 is 0101 which is also 5 in decimal. The hex value E is 1110 which is the binary representation of 14.
You can use hex symbols to specify any bit pattern of any length simply by concatenating symbols. For example, suppose you want to specify the bit pattern 0111100 then starting from the leftmost four bits 1100 we can use the table to work out that this is C. The next four bits are 0011, as we always add zero bits to make the group up to four bits, and this corresponds to 3. So the bit pattern is specified in hex as 0x3C. The leading 0x is the standard way in many languages including Python of specifying a hex literal. This works in both directions, so any set of hex symbols specifies a bit pattern. For example, 0xABCDE specifies:
101010111100110111101111
Converting Binary
Specifying bit patterns makes hex so much easier than using alternatives such as decimal. After all, what is the decimal representation of 0111100? To answer this question you have to do a full bit-by-bit conversion to decimal to get 60. You can’t convert binary to decimal or vice versa so many bits or digits at a time. This only works if the base you are converting to is a power of two, so it works for octal (base eight) and hex (base sixteen).
If you want to use octal literals then start the value with a 0o where the second character is the lower-case letter o. That is, 0o7 is 7 in octal and 0o100 is 100 in octal and 64 in decimal.
Octal has the same property as hex in being used to specify bit patterns, but in this case each octal symbol determines three bits:
0 |
1 |
2 |
3 |
4 |
5 |
6 |
7 |
---|---|---|---|---|---|---|---|
000 |
001 |
010 |
011 |
100 |
101 |
110 |
111 |
Octal can be easier to use in some situations, but notice that you need more octal symbols to specify a given bit pattern. Similarly you can specify a binary literal using the prefix 0b. For example, 0b101 is decimal 5.
You might think that the easiest way to specify a bit pattern is to use a binary literal but typing a lot of zeros and ones is very error prone. Hex is a shorter representation and hence easier to type in correctly and easier to verify.
Bit Patterns In Python
In most languages the standard way to work with bit patterns is to use a fixed-size integer. For example, in C a typical variable references a 16- or 32-bit integer and this means every bit pattern you work with is exactly 16 or 32 bits long. This leads, not so much to problems, but to ways of thinking. You tend to think of working with bits at a fixed size and then converting down if fewer bits are needed or putting multiple variables together for more bits. This makes working with bits more difficult until you get used to it.
Python, on the other hand, doesn’t have a fixed-size integer, instead as discussed in Chapter 2 the storage allocated to an integer, referred to as “bignum”, grows as needed to store the value it holds. As a result you can do integer arithmetic in Python without any worries about overflow until you run out of memory to hold all of the bits of the result.
You can also use hex and binary to set up bit patterns using the bignum representation. As a result you can set up and work with bit patterns of any size, limited only by the amount of memory the machine has. If you are familiar with other languages and their use of bit patterns this is very different. If you want to limit the number of bits you are working with you have to do so explicitly, as will be explained later. What all this means is that when you are working with bit patterns you have an unlimited number of bits. For example:
a = 0xAAAAAAAAAAAAAAAAAA print(a) print(hex(a)) print(bin(a))
displays:
3148244321913096809130 0xaaaaaaaaaaaaaaaaaa 0b101010101010101010101010101010101010101010101010101010101010101010101010
As you would expect, 0xA is 1010 which makes it useful for testing alternating bit pattern.
Notice that hex(value) will convert a bignum to a hex string and bin(value) converts it to a binary string. Both return strings and not numeric values, but the strings are formatted correctly as literals. To convert a correctly formatted literal string to an integer you can use the function:
int(string, base)
This converts the string to an integer assuming that it is a representation using the specified base. If you specify zero for the base then the format of the string determines the base. For example:
a = 0xAAAAAAAAAAAAAAAAAA print(a) print(int(hex(a),0)) print(int(bin(a),0))
displays 3148244321913096809130 three times. If the string isn’t a valid literal with prefix 0x, 0o or 0b then you have to specify the base i.e. 16, 8 or 2.