Brackets are Trees
Written by Mike James   
Wednesday, 14 December 2011

Brackets are at the heart of programming. Understand brackets and you can rule the earth - no, seriously! Brackets, trees and stacks are all interconnected in a very deep and fundamental way.

Back in the early days of programming, long before Google started asking why manhole covers are round, one of the very common programming aptitude tests was to take a long mess of brackets and ask the candidate if they were valid.

Something like is:

(((((()))((()))))))((()))))))

a valid expression?

So brackets are fundamental to programming and being able to handle brackets is a fundamental programming skill. You might well think that this marks a great divide, a sign of progress in the programming world but the truth is that brackets really are central to all programmatic thought.

The simple reason why a bracket is so great is that it forms a tree structure as soon as you give it a chance. Hence the title - brackets are trees.

The fact that brackets are trees might not seem obvious so lets see how this works.

A single bracket is a useless thing. It only comes into its own when paired with a bracket of the opposite handedness. The reason that a ( goes with a ) is simply that together they form a container. Put simply you can put something in between the brackets - (data). Usually the data that you put in the brackets has some sort of structure but this isn't really relevant to the key property  of brackets. However even at this stage you can see that brackets can be used to represent an array:

(first) (second) (third) (fourth)

What makes brackets really useful is that the data that you place between a pair of brackets can be another pair of brackets. This is always the moment when a simple data structure turns into a complex one - let a one dimensional array element store another array and the result is a two dimensional and more array. If you allow a bracket pair to store another pair of brackets then the result is a tree.

For example:

(()())

corresponds to a binary tree with two terminal nodes:

  |
/\
(()())

You can now construct examples as complicated as you care to make them. For example:

        |        
/ \
/\ \
/ \ \
/ \ \
  /\ /\ \
((()())(()())())

The nesting of brackets simply gives the structure of the tree that the brackets correspond to.

Now you can see why many computer languages use a lot of brackets. Probably none more so than the infamous Lisp and at this point who could resist quoting the well known xkcd cartoon:

 

Lisp Cycles

For more xkcd cartoons click here.

 

Lisp can get away with adding very little additional machinery to the bracket to create a complete and powerful language.

The fact that brackets form a tree structure also explains the strange and arbitrary rules that you had to learn in school concerning arithmetic. The rule of arithmetic expressions form a "little" language the grammar of which can be expressed as a very simple tree structure with the rule that you always work out the expression by doing a left to right depth first walk.

Consider the expression 3+2*5. To make sense of this and evaluate it correctly we have to invoke the idea of operator precedence. In particular we have to say that multiply has a higher precedence than addition so that the expression is 2 times 5 plus 3.

However if this expression had been written as ((3)+((2)*(5))) then no operator precedence rules need to be used.

The bracketed expression corresponds to the tree:

      +
/ \
/ *
/ / \
((3)+((2)*(5)))

and you can see that walking the tree in depth first left to right order and performing the operations indicated at each of the nodes gives you the correct evaluation.

In other words if you are prepared to put all the brackets needed into the expression to make the syntax tree of the expression clear and unambiguous you don't need to introduce the idea of operator precedence.

Of course we prefer to leave out brackets and complicate things by claiming that multiplication has to be done before addition. In fact we leave out so many brackets that we have to use the "My Dear Aunt Sally" rule - i.e. Multiplication and Division before Addition and Subtraction. 

Perhaps the biggest use of brackets today is in the form of all those markup languages that generate hierarchies i.e. trees of visual objects. For example what is HTML other than a bracketing syntax -

<open tab>content</close tag>

Just think of each open tag as a "(" and each closing tag as a ")".  The same is true of XAML and all the other object instantiation languages. They all create tree structures. In a more general context XML is a bracketing system that generates general tree structures consisting of arbitrary data. For another example consider the nesting of control structures such as for, if, do, while and so on. These too are a bracketing language, often using curly brackets {}, and they generate a tree structure which the compiler has to work out to successfully parse the language.

Once you notice the way bracketing generates hierarchies and general tree structures you start noticing them more or less everywhere.

Perhaps now you will agree that brackets are fundamental to programming and testing the ability to work with them is probably a good way to see if some one is going to make the grade as a programmer.

We have one question remaining - what arrangements of brackets are legal?

The simple answer is only those arrangements that correspond to complete tree structures and there are only two ways in which a set of brackets can fail to do so. The first is just not having the same number of opening and closing brackets. You can't have half a container and so all valid bracketing structures have to match numbers of opening and closing brackets. This is a minimal condition for legal brackets.

The second condition is that the pairs of brackets always occur in the right order - that is you always have () and never )(. Put as simply as this you would think that this condition is trivial but it is very easy to hide a pair of brackets in the wrong order.

For example,

())(()()

This bracketing structure is clearly wrong but there is no single answer to exactly what is wrong with it. For example you might say that it corresponds to any of the following groupings:

() )( ()()
() )( ( )( )
( ))(()( )

In other words there is no single correct way to parse an incorrect bracketing structure.

So how do we check for a valid bracketing structure?

There is more than one answer to this but the simplest is to see if it is possible to walk the tree that the brackets describe. To do this you need a stack. All you have to do is scan the bracket expression from left-to-right. Each time you encounter an opening bracket push it on the stack. Each time you encounter a closing bracket pop the top of the stack - if there is nothing on the top of the stack to pop then you have an invalid set of brackets. If you reach the end of the scan without trying to pop an empty stack then the stack will be empty if the expression was valid.

For example in the case of:

((()()))

the stack contents are:

  stack    remainder 
0 . ((()())).
1 .( (()())).
2 .(( ()())).
3 .((( )())).
4 .(( ())).
5 .((( ))).
6 .(( )).
7 .( ).
8 . .

As the stack is empty when the scan is complete the expression was valid.

The real question is can you see that this algorithm works?

The answer is that the stack algorithm performs a depth-first left-to-right tree walk and this can only be completed if the brackets really do specify a tree.

Notice that we are only considering brackets that are unlabeled - that is any closing bracket will pair with any opening bracket. Things become a little more difficult if you allow named brackets as in HTML, XML or program language structures. Then there are other ways to get things wrong such as <div><span></div></span>. You can easily modify the stack algorithm to detect such errors - all you have to do is make sure that the popped bracket matches the closing bracket that caused the pop operation.

Brackets, stacks and trees go together perfectly.

The final word however should go to xkcd and the observation in its blog of a halcyon time when Wikipedia had a sense of humor. The blog quotes from the Wikipedia article on Brackets:

 

Parentheses may also be nested (with one set (such as this) inside another set). This is not commonly used in formal writing [though sometimes other brackets (especially parentheses) will be used for one or more inner set of parentheses (in other words, secondary {or even tertiary} phrases can be found within the main sentence)]

 

Sadly the Wikipedia entry no longer contains this paragraph...

 

Banner

 

To be informed about new articles on I Programmer, subscribe to the RSS feed, follow us on Google+, Twitter or Facebook or sign up for our weekly newsletter.

Last Updated ( Saturday, 17 December 2011 )
 
 

   
RSS feed of all content
I Programmer - full contents
Copyright © 2014 i-programmer.info. All Rights Reserved.
Joomla! is Free Software released under the GNU/GPL License.