The Trick Of The Mind - On Being Variable
Written by Mike James   
Monday, 09 May 2022

Naming is Hard?

Now that we have the ability to name locations where things can be stored, we have solved a problem and created one – naming. It is often said that naming is the hardest problem of all, but this is an exaggeration because even a name that isn’t as good as it could be is often good enough. The problem is that you have to find a name that is meaningful, but not so long as to make it difficult to work with. I suppose you could sum it up as – a name has to be short and meaningful. You could also add, as an extra condition, that the name should not be easily confused with a similar name.

If you have never considered this problem it might seem easy. Back in the early days of computing programmers didn’t care much about naming variables – A and B were perfectly fine. Later, as it became more obvious that programs were written for humans to read and understand and for computers to obey, meaningful names seemed essential.

So instead of a variable called A you might call it Total. This seems like a big improvement and it indicates that the variable is a total, but of what? So how about Totalsales? More informative but what are the units? Let’s try Totalsalesindollars. But which country? So the name is now Totalsalesindollarscanada and so the problem gets worse. As you make the name more specific it gets longer and harder to read.

You can make long names easier to read by splitting the name up. You can’t use a space for this because Total sales in dollars canada could be five different variables, not one. One solution is to use a character with a lower visibility, an underscore for example – Total_sales_in_dollars_canada.

 

Another option is to use Camel case, where the first letter of each word is set to upper case apart from the first word – totalSalesInDollarsCanada. If you capitalize the first letter as well, it is called Pascal case, after the programming language in which it was most often used, or Upper Camel Case – TotalSalesInDollarsCanada.
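As a rough check of the naming styles discussed so far, here is a minimal sketch in Python, which is simply the language chosen for illustration here. It uses the built-in string method isidentifier() to ask whether each candidate is acceptable as a single name. The list and dictionary names are made up for the example.

```python
# Candidate names taken from the discussion above.
candidates = [
    "Total sales in dollars canada",   # spaces: not a single name
    "Total_sales_in_dollars_canada",   # underscores
    "totalSalesInDollarsCanada",       # Camel case
    "TotalSalesInDollarsCanada",       # Pascal case
]

# str.isidentifier() reports whether a string is a legal Python name.
results = {name: name.isidentifier() for name in candidates}

for name, is_legal in results.items():
    print(f"{name!r}: {is_legal}")
```

Only the version with spaces is rejected. The other three are all legal, so choosing between them is a matter of readability and convention rather than of what the language allows.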

This brings us to another difficult point – are names that differ only by case different or the same? Early languages didn’t distinguish between names that differed only in case, but more recently languages have treated them as different. So MikesAge is usually a different variable to mikesAge. Unfortunately this brings us to the problem of confusion. If you have defined a variable called MikesAge and accidentally render it as mikesAge then in most languages you are working with a different variable – not good. Even in a language that doesn’t distinguish between names with different case, you can still have problems with names like Total1 and Total2 if you can’t remember which is which. Meaningful names are a guard against such confusion.
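To see the case problem in action, here is a small sketch in Python, one of the languages that treats names differing only in case as different. The values are arbitrary and chosen only to make the two variables easy to tell apart.

```python
MikesAge = 42   # the variable we intended to use
mikesAge = 24   # a slip of the shift key creates a second, unrelated variable

# Both names now exist side by side, each with its own value.
print(MikesAge, mikesAge)  # prints: 42 24
```

Nothing warns you that there are now two variables, which is exactly why this kind of slip is so easy to miss.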

Finally there is the question of abbreviations. At the time a program is written it might be perfectly obvious that TN and TG are short for Nett total and Gross total, but after a few hours away from the program you might well forget, and a new programmer reading it might never guess.

Naming isn’t difficult like rocket science and it almost certainly shouldn’t be listed as one of the most difficult things in programming. It is very important in the sense that a program that uses good names is likely to be much easier to understand than one that uses cryptic names. Indeed using nonsense names is a well known way of “obfuscating” programs so that they are difficult to understand. What is perhaps important is that programmers don’t recoil in awe of the naming problem but they should be aware that they need to put some effort into finding good names – and many are not so aware.

Assignment

Named variables were a big step forward, but programmers really didn’t want to write long sentences to store and retrieve values in variables. Some languages, notably Cobol, did make programmers write long English phrases to get the job done, but this wasn’t, and isn’t, popular. Instead early languages such as FORTRAN borrowed a symbol from mathematics – the equals sign – to make things simpler. Unfortunately, this was a poor decision that gave us a headache that persists to this day.

 

The idea was that instead of:

store 123 in Total

you would write

Total = 123

and instead of:

retrieve the value stored in Total

you would simply write:

Total

In other words, to store a value in a variable you simply used the equals sign and to retrieve a value you simply used its name with the convention that variable names were replaced by the value stored in them. So in this case after the equality instruction Total is the same as 123 in the sense that when you write it the value it contains, 123, is retrieved and used.
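The same store and retrieve convention can be tried out in Python, used here only as a convenient example language. The variable Copy is a made-up name introduced to show a retrieval.

```python
Total = 123     # store 123 in Total
Copy = Total    # writing the name Total retrieves the value stored in it

print(Total)    # prints: 123
print(Copy)     # prints: 123
```

Notice that the name Total on the left of the equals sign means "store here", while the same name on the right means "retrieve from here".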

While this seems like a wonderful idea in that it lets us write programs that are shorter and easier to understand, it also introduces a trap for the unwary.

Arithmetic Expressions

Once you have the idea of a variable you need to answer the question of what you can store in one. The simplest solution was to allow arithmetic expressions to determine the value to be stored as this gives you something very useful. For example:

Total = 123 + 42

would store 165 in the variable Total. This is attractive because now we can get some real work done and store the result. However, if you think back to the discussion of the difficulties of working with the arithmetic expression little language you will already know that this isn’t a small undertaking. Indeed early languages like Cobol tried to avoid the problem by getting the programmer to write things out in a long-form way. For example:

ADD 123 TO 42 GIVING TOTAL

In the original Cobol there were no arithmetic expressions just simple instructions to ADD, SUBTRACT, MULTIPLY and DIVIDE.

 

You can see that this effectively forced the programmer to do most of the work in translating arithmetic expressions into a form that a computer can carry out without worrying about the order that operations have to be carried out in. As we have already discovered in earlier chapters, FORTRAN was the first language to automatically take the rules of arithmetic expressions and convert them into a simple order of execution. Once this was a solved problem, however, all computer languages adopted the arithmetic expression as the way to get numerical work done.
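To make the difference concrete, here is a minimal sketch in Python that works out the same sum twice. The numbers and variable names are invented for the illustration. The first version leaves the order of operations to the language. The second mimics the long-form style, with one operation per step in an order chosen by the programmer.

```python
# Expression form: the language works out that the multiplication comes first.
total_from_expression = 2 + 3 * 4

# Long form: the programmer decides the order, one operation per step.
product = 3 * 4                  # like a MULTIPLY instruction
total_from_steps = 2 + product   # like an ADD instruction

print(total_from_expression, total_from_steps)  # prints: 14 14
```

Both give the same answer, but in the second version it is the programmer, not the language, who has to know that the multiplication must be done before the addition.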

Once you have figured out how to put arithmetic expressions into a language you can start to elaborate the idea and make things more powerful and simpler at the same time. The first extension of the simple arithmetic expression is to allow variables to be used within the expression. For example:

Value = 123
Total = Value  + 42

and in this case we now have something much more sophisticated. It is still a little language, but now this single instruction is equivalent to a whole mini-program. What it says is “retrieve the contents of Value, add 42 to it and then store the result in Total”. That’s a lot of computing for such a small instruction! You can see that arithmetic expressions have now moved on from their birth in mathematics to something more mature in programming.
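One way to picture the mini-program is to spell out each step by hand. The sketch below is in Python and uses a dictionary called memory as a stand-in for the named storage locations. This is purely an illustrative model, not a description of how any particular language actually carries out the instruction.

```python
# A made-up model of named storage locations.
memory = {"Value": 123}

retrieved = memory["Value"]   # retrieve the contents of Value
result = retrieved + 42       # add 42 to it
memory["Total"] = result      # store the result in Total

print(memory["Total"])        # prints: 165
```

Three separate steps are hidden inside the one short line that uses the equals sign.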

It is worth noticing that we now have two ways values can enter a program. Consider again:

Value = 123
Total = Value  + 42

The 42 on the right hand side is a value specified without the use of a variable. It is just 42. This seems so obvious that it hardly seems worth mentioning, but it is important enough to merit a name – a literal. Put simply, a literal is literally what it appears to be, i.e. 42 in this instance. Now consider the use of Value on the right-hand side – this is not a literal as Value is not what it appears to be. It is, of course, the value 123 as stored in it by the first instruction. It is a variable and when you write its name the value the variable contains is used – it is a non-literal. This distinction becomes ever more subtle as we progress.
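A short sketch in Python shows the difference in behavior. The second value, 200, is an arbitrary choice made for the example.

```python
Value = 123
Total = Value + 42   # Value stands for 123 here
first_total = Total

Value = 200          # store something new in Value
Total = Value + 42   # the same text now gives a different result
second_total = Total

print(first_total, second_total)  # prints: 165 242
```

The literal 42 means 42 both times, but what Value contributes depends on what was last stored in it.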



Last Updated ( Monday, 09 May 2022 )