Programmer's Python Data - A Custom Data Class
Written by Mike James   
Monday, 15 January 2024
Article Index
Programmer's Python Data - A Custom Data Class
Node Numbering
Equality and Hash
The Complete Tree

Node Numbering

You certainly don’t have to know how the tree indexing actually finds a node to understand how we have implemented indexing and slicing – it is complicated, but the solution is interesting.

The key idea is that given any index number then you can easily work out which row or level the node is in. As there are 2row nodes in each row you can see that a node with an index of k is in row = int(log2(k+1)). The plus one is needed to allow for rounding.

Now that we know the row that the node Tree[k] is in, we can work out how to find it moving down the tree from the root. If you look at the diagram showing the value of k for each node you can see that if you are in row 1 then k=1, odd, means go left and k=2, even, means go right. If you look at all of the nodes you can see that odd nodes are left nodes and even nodes are right nodes.

A less obvious fact is that, given node tree[k] in row r, you can work out the index of the node’s parent as int((k-1)/2). For example, tree[7] has node tree[int((7-1)/2), i.e. tree(3), as its parent. You can see that this is true for each node k in any row. That is, the parent of node tree[k] is tree[int((k-1)/2)].

Using this relationship you can work your way from a node to each of its predecessors until you reach the root. For example, starting from tree[9] you get tree[4], tree[1] and finally tree[0]. You can also see that this means that to get to node tree[9] you start from the root and go node left, node right, node left as determined by the odd or evenness of each node index.

Putting all this together means that you can take a node index and work out the index of each of its precursors. Reading these from the root we can work out how to follow left and right nodes to reach the target node simply by looking to see if the parent’s index is odd or even.

Using the formula that gets from an index in row r to the parent’s index in row r-1 to work out a formula that gives the index of the parent in any earlier row gives:

index of parent in row r of node k = int( (k+1)/2row-r – 1)

where row= int(log2(k+1)).

For example:

parent of node 7 in row 1 is int(8/22-1) = 1

parent of node 7 in row 2 is int(8/21-1) = 3

Using these formulas we can now write:

def getNode(self,key):
        if key>=len(self) or key<0:
            raise IndexError('Index out of range') 
        if key==0:
            return self.root
        row=int(log2(key+1))
        currentNode=self.root
        for r in range(1,row+1):
            k=int((key+1)/2**(row-r)-1)
            if k%2:
                currentNode=currentNode.left
            else:
                currentNode=currentNode.right
        return currentNode

At the start of the function we raise the IndexError exception if the key is greater than the current length of the Tree. We have to deal with key==0 as a special case because the log of zero is undefined. After this, the general algorithm works for any index. The search for the indexed node starts from the root and moves to the left or right node depending on whether the computed index of the parent in each subsequent node is odd or even. If any of the nodes on the way turn out to be None then the node we are seeking doesn’t exist in the tree and an exception is raised.

Notice that enumerating the nodes using this scheme is equivalent to a breadth-first left-to-right traverse.

Custom Operators

While some languages make it hard or almost impossible to implement custom operators for custom data types, this is something that Python makes easy.

There are a range of __operator__ functions which are called when an object is involved in an arithmetic expression. For example, __add__ is called whenever your custom object appears in an arithmetic expression like myObject + number. The number is passed to the object as a parameter and the return value is the result of the expression. That is, x + y is implemented as x.__add__(y) and the expression y + x is implemented as x.__radd(y)__ if y.__add(x)__ can’t perform the addition.

There are methods for all of the standard mathematical operators, bitwise operators and relational operators. If you want your custom data type to engage in expressions then defining these methods is a good idea, but it is time-consuming.

In the case of our Tree example it is difficult to see how to add or multiply two trees, but multiplying a tree by a number seems to have a reasonable interpretation in multiplying each of its node values by that number. As this is best implemented as an “in place” operation and its result isn’t a numeric value it makes sense to define the *= operator method __imul__:

def __imul__(self,m):
    if isinstance(m,numbers.Number):
        for i in range(0,len(self)):
            self.getNode(i).value*= m
    return self

After adding this to the Tree class you can write things like tree*=10 to see all of the tree’s values multiplied by 10. You could define similar operators for addition, subtraction and division. One of the problems with defining operators for custom data is that the user tends to assume that if one operator is defined then similar operators can also be used.

 



Last Updated ( Monday, 15 January 2024 )