The Trick Of The Mind - Representation
Written by Mike James   
Monday, 18 July 2022

We have only bits, so how can sets of bits represent anything we want to work with inside a program? The answer is that everything is a number.

The Trick Of The Mind - Programming & Computational Thought


Chapter List

  1. The Trick Of The Mind

  2. Little Languages
       Extract: Little Languages Arithmetic

  3. Big Languages Are Turing Complete

  4. The Strange Incident of The Goto Considered Harmful
       Extract: The Goto Considered Harmful

  5. On Being Variable  

  6. Representation ***NEW!!

  7. The Loop Zoo

  8. Modules, Subroutines, Procedures and Functions

  9. Top-Down Programming

  10. Algorithms

  11. The Scientific Method As Debugging

  12. The Object Of It All


What can you store in a variable? The answer is a number. This seems very limited, but in practice you really don’t need anything else. What is surprising is that everything can be represented using nothing but numbers, although sometimes you need more than just one variable to do it. This is a powerful idea that lies at the heart of programming, mathematics and science in general. The only question to be answered is: how does everything become represented using just numbers?

Numbers Are Finite

If I give you a piece of paper and ask you to write down a 100-digit number you might be able to just fit everything on. If I upped the challenge to a 1000-digit number you would have to write smaller. In the case of computers there is no “write smaller” - each variable that you use has an upper limit on the number of digits it can store and retrieve. You might know that computer memory is measured in bytes and that a single byte can store a whole number from 0 to 255. This isn’t very big and to make things more useful we generally use more than one byte to store the value in a variable. Python even uses a system that expands the number of bytes used if the number grows during a calculation, but this is unusual and it only applies to integers, that is whole number values, not values with fractional parts. Of course, if the number gets too big then the computer runs out of memory to store it and we have a problem.
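Both points, the 255 limit of a single byte and Python's growing integers, can be checked directly. What follows is only a small sketch; the variable names are made up for the illustration.

```python
# A single byte holds whole numbers from 0 to 255.
one_byte = bytes([255])              # fine: 255 fits in one byte

try:
    bytes([256])                     # 256 does not fit in one byte
    fits_in_byte = True
except ValueError:
    fits_in_byte = False

# Python integers grow as needed, so a 100-digit number is no problem.
big = 10 ** 99                       # the smallest 100-digit number
digits = len(str(big))               # count its decimal digits
bytes_needed = (big.bit_length() + 7) // 8   # bytes needed for the value itself

print(fits_in_byte, digits, bytes_needed)
```

The count in `bytes_needed` covers only the value; Python stores some bookkeeping alongside it, so the real memory used is a little larger.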

In most cases the size of a variable is fixed at a small number of bytes, usually four or eight, and if you try to store a number that is too big the result is an overflow error, which causes all sorts of serious problems and is a major cause of computer bugs, crashes and disasters. It is not an exaggeration to say that people have died because of overflow in programs that are working in mission-critical hardware such as medical devices and military weapons.
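Python's own integers don't overflow, so to see what happens in a fixed four-byte variable we have to imitate one. The sketch below assumes a signed four-byte variable that silently "wraps around", which is what many languages do; `wrap_int32` is a hypothetical helper written for this illustration, not a standard function. The standard `struct` module shows the other possibility, refusing the value outright.

```python
import struct

def wrap_int32(n):
    """Return n as a signed four-byte variable would hold it, wrapping around."""
    n &= 0xFFFFFFFF                      # keep only the lowest 32 bits
    return n - 2**32 if n >= 2**31 else n

largest = 2**31 - 1                      # biggest value a signed four-byte variable holds
wrapped = wrap_int32(largest + 1)        # one more than the limit becomes negative

try:
    struct.pack("<i", largest + 1)       # struct refuses to squeeze it into four bytes
    packed_ok = True
except struct.error:
    packed_ok = False

print(largest, wrapped, packed_ok)
```

Adding one to the largest value gives the most negative value, which is exactly the sort of surprise that turns into a bug if the program isn't expecting it.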

In many ways the upper limit on the size of a number is something we can mostly ignore as long as the limit is big enough. This is probably the reason that overflow remains a serious problem, but for the sake of simplicity we can continue to ignore it.

Representing Characters

So we can accommodate whole numbers, but we want to write programs that do more than just work with numbers – we want text. How can we work with text if all we can store are numbers? The answer is very simple – assign a number to each of the letters of the alphabet. For example, letting A be 1, B be 2 and so on allows the values:

08 05 12 12 15 23 15 18 12 04

to represent:

HELLOWORLD 

If you want to include a space between the two words you could allocate the code 0 to a space character – and, yes, space is a character. You may ignore spaces when writing on paper, but with a computer you can’t take spaces for granted. With 0 representing space we have:

08 05 12 12 15 00 23 15 18 12 04

representing

HELLO WORLD
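This simple code is easy to try out. The following is a minimal sketch of it in Python; the names `ALPHABET`, `encode` and `decode` are invented for the example.

```python
# Position in this string is the code: space is 0, A is 1, B is 2 and so on.
ALPHABET = " ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode(text):
    """Turn text into a list of numbers using the simple code."""
    return [ALPHABET.index(ch) for ch in text]

def decode(codes):
    """Turn a list of numbers back into text."""
    return "".join(ALPHABET[n] for n in codes)

codes = encode("HELLO WORLD")
print(" ".join(f"{n:02d}" for n in codes))   # 08 05 12 12 15 00 23 15 18 12 04
print(decode(codes))                         # HELLO WORLD
```

Any character that isn't in `ALPHABET`, a lower case letter for example, makes `encode` fail, which shows how limited such a small code is.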

There are many possible representations of character sets. Until recently, the most common was ASCII, the American Standard Code for Information Interchange. In this code A is 65 and lower values represent digits, punctuation and control symbols. For example, 32 is a space character and 07 is Bell, and yes, it used to ring the bell on old teletype machines and it still makes a “beep” on most modern machines.
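Python lets you look up these codes using its built-in `ord` and `chr` functions. A short sketch, with variable names chosen just for this example:

```python
code_A = ord("A")            # 65
code_space = ord(" ")        # 32
code_zero = ord("0")         # 48: digits have lower codes than letters

bell = chr(7)                # the Bell control character
is_bell = bell == "\a"       # "\a" is Python's way of writing Bell

print(code_A, code_space, code_zero, is_bell)
```

Whether printing `bell` actually makes a sound depends on the terminal you are using.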

The problem with the ASCII representation is that each character uses only a single byte of storage and this means that we are limited to 256 characters. In fact, true ASCII doesn’t even define this many characters, having only 128 standard codes. Today we have mostly moved on from ASCII to using Unicode, which uses values from 0 to 1,114,111. Not all the available values are defined as characters or symbols, but more are being added all the time. For backward compatibility the first 128 characters are the same as ASCII, and the first 256 are the same as its common Latin-1 extension.
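The Unicode range and its overlap with ASCII can be confirmed from Python, which uses Unicode for its text. This is a sketch only; the variable names are assumptions made for the illustration, and UTF-8 is just one common way of storing Unicode values as bytes.

```python
import sys

highest = sys.maxunicode                 # the largest Unicode value Python accepts

# Decoding every ASCII byte gives the same characters as the first 128 Unicode values.
first_128 = "".join(chr(n) for n in range(128))
same_as_ascii = bytes(range(128)).decode("ascii") == first_128

# The same holds for Latin-1 and the first 256 Unicode values.
first_256 = "".join(chr(n) for n in range(256))
same_as_latin1 = bytes(range(256)).decode("latin-1") == first_256

# A character well outside the one-byte range needs more than one byte to store.
euro_code = ord("\N{EURO SIGN}")
euro_bytes = len("\N{EURO SIGN}".encode("utf-8"))

print(highest, same_as_ascii, same_as_latin1, euro_code, euro_bytes)
```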

For obvious reasons computer languages don’t make you type in or read text as numbers – they automatically convert to the representation in use. For example you can write:

"Hello World"

and it will be converted to the sequence of values appropriate for the representation in use – usually Unicode. This allows most people to simply ignore the idea of representing text as numbers most of the time.
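If you do want to see the numbers behind a piece of text, Python will show them to you. A minimal sketch, assuming the names `text`, `codes` and `back` purely for illustration:

```python
text = "Hello World"

# The Unicode value of each character, in order.
codes = [ord(ch) for ch in text]
print(codes)   # [72, 101, 108, 108, 111, 32, 87, 111, 114, 108, 100]

# Converting the numbers back gives the original text.
back = "".join(chr(n) for n in codes)
print(back)    # Hello World
```

Notice that upper and lower case letters have different codes, and the space in the middle shows up as 32.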

Notice that

"Hello World"

is another example of a literal – see the previous chapter. It is exactly what it looks like – the characters that form the text Hello World. In this case we need the quotes around the literal because without them Hello World would look like the names of two variables – Hello and World.



Last Updated ( Monday, 18 July 2022 )