Programmer's Python Data - Byte Manipulation
Written by Mike James
Monday, 05 June 2023
Page 2 of 2
Multibyte Shifts

When you have byte sequences to work with it is tempting to simply write a for loop that processes each byte in turn. However, there is no easy way to implement a shift operation on a bytes or bytearray object because you have to arrange to move bits from one byte to another. On the other hand, implementing a shift on a bignum is a single operation. For example:

    myBytes=bytes([0xFF,0xAA,0x55])
    bits=int.from_bytes(myBytes,byteorder="big")
    bits=bits>>4
    print(bits.to_bytes(3,byteorder="big"))

displays:

    b'\x0f\xfa\xa5'

which, as you can see, has shifted the low four bits of each byte into the high four bits of the next byte.

Doing this without converting to bignums is a more difficult task. It involves masking out the low-order bits of the previous byte and shifting them to become the high-order bits of the current byte. For example, to implement a shift right of four bits:

    myBytes1=bytearray([0xFF,0xAA,0x55])
    myBytes2=bytearray(3)
    for i in range(len(myBytes1)):
        # high four bits of this byte move down to the low four bits
        myBytes2[i]=myBytes1[i]>>4
        if i>0:
            # low four bits of the previous byte become the high four bits
            myBytes2[i]=myBytes2[i]|((myBytes1[i-1]&0x0F)<<4)

In most cases it is preferable to convert to bignums.

One-Time Pad

As an example of this approach to byte manipulation consider the common task of XORing a set of random bits with a bit pattern. The reason you might want to do this is to encrypt the data. This is a very secure code usually known as a “one-time pad”. You can recover the original data by simply performing the XOR operation a second time, as (x ^ y) ^ y is x. This doesn’t sound very secure, but to decode it you need the random bits to perform the XOR the second time. Without the one-time pad it is impossible to recover the original text.

Start with a suitable message as an ASCII string:

    myBytes = b"Hello World Of Secrets"

which could have been in the form of a Unicode string converted to an ASCII string.
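The two facts relied on so far, that a Unicode string containing only ASCII characters can be converted to bytes and that XORing twice with the same value restores the original, are easy to check. The following is a minimal sketch; the names text, x and y are chosen here purely for illustration and the bit patterns are arbitrary:

```python
# A Unicode string that contains only ASCII characters
text = "Hello World Of Secrets"
myBytes = text.encode("ascii")   # raises UnicodeEncodeError for non-ASCII text
print(myBytes == b"Hello World Of Secrets")   # True

# XORing twice with the same value gives back the original value
x = 0b10101010   # stands in for one byte of the message
y = 0b11001100   # stands in for one byte of the pad
print((x ^ y) ^ y == x)   # True
```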
Next we need a one-time pad:

    oneTime = int.from_bytes(random.randbytes(len(myBytes)), byteorder="big")

To understand this you need to know that:

    random.randbytes(len(myBytes))

generates the specified number of random bytes as a bytes object (randbytes is available from Python 3.9). We then use the from_bytes method to create the bignum oneTime with the same bit pattern.

To XOR the message with the one-time pad we need to convert the ASCII string to a bignum:

    msg = int.from_bytes(myBytes,byteorder="big")

Now we have both bit patterns as bignums and so can perform the XOR:

    crypt=msg ^ oneTime

To decrypt the message we just need to repeat the XOR:

    decrypt=crypt ^ oneTime

and to see it we need to convert it back to an ASCII string:

    decrypt=decrypt.to_bytes((decrypt.bit_length()+7)//8,byteorder="big")

Putting all this together, and adding some print instructions gives:

    import random
    myBytes=b"Hello World Of Secrets"
    oneTime=int.from_bytes(random.randbytes(len(myBytes)), byteorder="big")
    msg=int.from_bytes(myBytes,byteorder="big")
    crypt=msg ^ oneTime
    print(crypt)
    decrypt=crypt ^ oneTime
    decrypt=decrypt.to_bytes((decrypt.bit_length()+7)//8,byteorder="big")
    print(decrypt)

Of course, in a real application the one-time pad would be available at another site and the encoded message would be transmitted between them. Sharing the pad securely is usually a difficult task. The one-time pad may be uncrackable, but it isn’t convenient.

How would you implement this using direct operations on the byte sequences? In what follows msg and oneTime are the bytes objects themselves rather than bignums. The most obvious way to a programmer used to for loops in other languages would be to use a loop index:

    crypt=bytearray(len(msg))
    for i in range(len(msg)):
        crypt[i]=msg[i]^oneTime[i]
    print(crypt)

Notice that you need to use a bytearray and not a bytes object because of the need to modify it in-place. A more Pythonic approach would be to use a comprehension:

    crypt=bytes([a^b for a, b in zip(msg,oneTime)])

This is more compact and arguably easier to understand, but only if you are happy with comprehensions, the zip function, tuples, destructuring and the bytes constructor. In principle it also has the potential to be faster than the index loop approach, but this does depend on the quality of the compiler or interpreter in use.
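As a rough check that the bignum, index loop and comprehension approaches agree, the sketch below wraps each one in a function and compares the results. The function names xor_bignum, xor_loop and xor_zip are made up for this illustration and are not part of the chapter's code. The bignum version passes the message length to to_bytes instead of deriving it from bit_length, a choice made here so that any leading zero bytes in the result are kept. All three assume the pad is the same length as the message:

```python
import random

def xor_bignum(data, pad):
    # Convert both byte sequences to ints, XOR once, convert back.
    # Using len(data) keeps any leading zero bytes in the result.
    n = int.from_bytes(data, byteorder="big") ^ int.from_bytes(pad, byteorder="big")
    return n.to_bytes(len(data), byteorder="big")

def xor_loop(data, pad):
    # Index loop writing into a mutable bytearray
    result = bytearray(len(data))
    for i in range(len(data)):
        result[i] = data[i] ^ pad[i]
    return bytes(result)

def xor_zip(data, pad):
    # Pair up the bytes with zip and XOR each pair
    return bytes(a ^ b for a, b in zip(data, pad))

msg = b"Hello World Of Secrets"
one_time = random.randbytes(len(msg))   # needs Python 3.9 or later

crypt = xor_bignum(msg, one_time)
print(crypt == xor_loop(msg, one_time) == xor_zip(msg, one_time))   # True
print(xor_zip(crypt, one_time))   # b'Hello World Of Secrets'
```

Which version runs fastest is not shown here; if it matters, it would have to be measured, for example with the timeit module.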
A complete program using comprehensions is:

    import random
    msg=b"Hello World Of Secrets"
    oneTime=random.randbytes(len(msg))
    crypt=bytes([a^b for a, b in zip(msg,oneTime)])
    print(crypt)
    decrypt=bytes([a^b for a, b in zip(crypt,oneTime)])
    print(decrypt)

In chapter but not in this extract
Summary
Programmer's Python
Last Updated ( Monday, 05 June 2023 )