From Data To Objects
Written by Alex Armstrong   
Thursday, 03 January 2019

For the record

Arrays are a fundamental tool, but there is another: the record, or “structure”, which is in many senses another generalization of the array.

A record can be thought of as a mixed collection of data, whereas an array is composed of data all of the same type – e.g. a table of ages.

You can see the problem with the record: its Storage Mapping Function (SMF) is going to be complicated because the elements aren't all the same size. However, a record isn't generally accessed by index, so it needs a different kind of SMF anyway, one that maps field names rather than index values to storage locations.
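As a rough sketch of what an SMF for a record looks like, the following uses Python's standard ctypes module to lay out a card record and look up the byte offset of each field by name. The type name Card, the helper field_offset and the field widths (20, 40 and 12 bytes) are assumptions made purely for illustration.

```python
import ctypes

# Hypothetical fixed-width layout; the widths are assumptions for illustration.
class Card(ctypes.Structure):
    _fields_ = [
        ("name", ctypes.c_char * 20),
        ("address", ctypes.c_char * 40),
        ("telephone", ctypes.c_char * 12),
    ]

def field_offset(record_type, field_name):
    """Return the byte offset of a named field from the start of the record."""
    return getattr(record_type, field_name).offset

# The "storage mapping function" of a record is a table from name to offset,
# not a formula in an index as it is for an array.
offsets = {name: field_offset(Card, name) for name, _ in Card._fields_}
```

Because the fields are different sizes, the offsets are not evenly spaced, which is why a simple "base plus index times size" calculation no longer works.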

The archetypal record is the name and address card.

Here you have a data structure which is composed of a name, an address and a telephone number, say, and these are three different types of data, not the same thing repeated three times as in an array.

Usually a record is defined using qualified names or fields rather than an index. For example, the record JOHN might consist of three “fields”:

JOHN.NAME="John Doe"
JOHN.ADDRESS="1 Fortran Drive"
JOHN.TELEPHONE="12345"

You can think of this as a sort of array made up of three variables. The entire record is just called “JOHN” and you can refer to a specific field by adding the field name to the record name:

JOHN.TELEPHONE=IPROG.TELEPHONE
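The same qualified-name access can be sketched in Python using types.SimpleNamespace, which behaves like a record with no declared type. The contents of the iprog record are made up for the example.

```python
from types import SimpleNamespace

# A record built field by field, each field accessed by its qualified name.
john = SimpleNamespace()
john.name = "John Doe"
john.address = "1 Fortran Drive"
john.telephone = "12345"

# A second record; these values are invented for illustration.
iprog = SimpleNamespace(name="I Programmer", address="2 Cobol Court",
                        telephone="67890")

# Copy a single field from one record to the other.
john.telephone = iprog.telephone
```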

You can see that the record is just the computer equivalent of the old-fashioned card record that you would find, in fact do still find, in almost any office.

You may notice that you can’t simply run through all of the fields of a record as you can with an array, but this doesn’t usually matter in practice because, as each field is different, you generally don’t want to do the same thing to each one. That is, you don't often want to iterate through a record and you don't want to enumerate a record either.

Record fields are generally processed within a program one at a time by name. You can think of this as "random access only" if it helps. It is also worth saying that the ability to process a record's fields in sequence is sometimes highly desirable, and it's a big pain when the language in use doesn't support it.
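For a language that does support stepping through a record, here is a minimal Python sketch using the standard dataclasses module. The names Card and describe are hypothetical.

```python
from dataclasses import dataclass, fields

@dataclass
class Card:
    name: str
    address: str
    telephone: str

def describe(record):
    """Visit every field of the record in declaration order."""
    return [f"{f.name}={getattr(record, f.name)}" for f in fields(record)]

lines = describe(Card("John Doe", "1 Fortran Drive", "12345"))
```

This is the sort of job where sequential access is useful: printing, saving or comparing a whole record without naming each field in turn.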

Records started off life as part of business-oriented languages such as Cobol but slowly they moved into mainstream general programming as ways of storing complicated items of data. 

For example, instead of using two variables, x and y, to store the co-ordinates of a point on the screen, it is common practice to define a record to do the same job:

point.x=10
point.y=20

This allows you to write statements such as:

point1=point2
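How whole-record assignment behaves depends on the language. In Python, for instance, plain assignment only gives the same record a second name, so copying the fields needs an explicit copy. The sketch below assumes a hypothetical Point type to show the difference.

```python
import copy
from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

point2 = Point(10, 20)

alias = point2               # a second name for the same record
point1 = copy.copy(point2)   # a separate record with the same field values

point2.x = 99                # visible through alias, but not through point1
```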

There was also another important innovation in the way that records were used. Before you could use a record you had to provide a definition. That is, if you wanted to work with a point record you first had to declare that it had two fields, one called x and one called y, both integers. Once you had the definition you could use it to create as many instances of the record as you wanted.

For example:

record pointType
 int x
 int y
end

might be used to define the new record type. Then when you needed an example or an instance of the type you would write something like 

pointType myPoint

and after this you can write:

myPoint.x=10

and so on.  You can use pointType to create as many instances of the record as you like. 
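The two steps, define the type and then create instances, might look like this in Python. This is only a sketch: PointType mirrors the pseudocode above, and the default values of 0 are an assumption added so that an instance can be created before its fields are set.

```python
from dataclasses import dataclass

# Step 1: define the new record type.
@dataclass
class PointType:
    x: int = 0
    y: int = 0

# Step 2: create as many instances of the type as needed.
my_point = PointType()
my_point.x = 10

other_point = PointType(3, 4)
```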

This is directly analogous to having a float or int variable type and then stamping out as many float or int variables as you need. In other words, you had to declare a record type before you could create instances of the type. This is the first place that the idea of extending the data types a language supports was introduced, and it is a very important idea. It is also important that it split the use of a custom record into two steps: first define the new record type and then create instances of it.

As long as the language that you are using makes full use of records, or structures as they tend to be called when used in this way, this can be a very useful way of working. What you might not realise is that this simple idea leads on to probably the most important concept in 21st century computing – even if it was thought up in the 20th century!

Objects

The idea is based upon the desire to integrate structures into the language and programs as if they really were part of the original language.

For example, if you define a point structure then you might well want to define an operation of showing a point on the screen. Something like:

show point

or as a function

show(point)

In traditional programming terms this corresponds to having a command “show” which can work with the new type of data you have just introduced, i.e. the point structure. The problem is: how do you add a “show” command that knows what to do with “points”?

There are a number of possible answers, but the best one thought up to date is to let the “point” structure know how to show itself. To do this we have to extend the idea of what a record is.

We are going to allow a record field to be a procedure or function and not just a chunk of data. In simple terms a procedure/function is a list of instructions or a small chunk of program.

For example:

point.show()

runs the small chunk of program defined as part of “point” that changes the colour of the pixel at point.x, point.y and hence “shows” it.
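Taken literally, a record field that holds a procedure can be sketched in Python by storing a function in a field. The screen set is a hypothetical stand-in for a real display, and show_point is a name invented for the example.

```python
from types import SimpleNamespace

screen = set()  # hypothetical display: the set of pixels that are lit

# An ordinary record with two data fields.
point = SimpleNamespace(x=10, y=20)

def show_point():
    """Light the pixel at the point's own coordinates."""
    screen.add((point.x, point.y))

# A third field that holds code rather than data.
point.show = show_point

point.show()
```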

This idea can be elaborated so that there is no need for any code that lives outside of a struct - all of the code in a program can be made part of a set of data structures.

Notice that we have changed

show(point)

into

point.show()

In the first case we have a general show function and we have to tell it which point to show. In the second we simply call the show method that belongs to the point. The show method works on the point that it belongs to. You can see that this is a very simple change in the way something is written, i.e. in syntax. However, thinking about things in this way is very useful.

This is a very clever idea and it was first thought of in the early 1960s by Ole-Johan Dahl and Kristen Nygaard as part of a new computer language called Simula. The language may not have caught on but the idea most certainly did.
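The two forms can be put side by side in a short Python sketch. The screen set is again a hypothetical stand-in for a display, and both versions do the same work; only the place where the code lives differs.

```python
from dataclasses import dataclass

screen = set()  # hypothetical display: the set of pixels that are lit

@dataclass
class Point:
    x: int
    y: int

    def show(self):
        """Method form: works on the point it belongs to."""
        screen.add((self.x, self.y))

def show(point):
    """Function form: has to be told which point to show."""
    screen.add((point.x, point.y))

p = Point(10, 20)
show(p)     # show(point)
p.show()    # point.show()
```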

If you haven’t recognized it then I’d better tell you that a record with procedural fields or "methods" is called “an object” and the whole idea is called “object-oriented programming”.

You might also recognize the way that the definition of the record and the instance of the record are generalized. We tend to call a record definition that has methods, i.e. fields that are code, a class, and an instance of the type is an object or instance of the class.

So the record, today more commonly called a struct, is the start not only of object-oriented programming but of the particular approach to it based on classes and on extending data types. This is not the only possible approach, but it is the dominant one we encounter today. Objects are data that know how to do things to themselves. You may know, or learn later, more sophisticated justifications for object-oriented programming in terms of modeling the real world as hierarchies of objects, but this is a much more basic reason for using objects. If you want a point to show itself, just ask it. If you want a circle to show itself, just ask it, and so on.
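The "just ask it" idea can be sketched with two classes that each carry their own show method. The Circle fields and the drawn list, which stands in for a display, are assumptions made for the example.

```python
from dataclasses import dataclass

drawn = []  # hypothetical display log: one entry per thing shown

@dataclass
class Point:
    x: int
    y: int

    def show(self):
        drawn.append(f"point at ({self.x}, {self.y})")

@dataclass
class Circle:
    x: int
    y: int
    radius: int

    def show(self):
        drawn.append(f"circle at ({self.x}, {self.y}) radius {self.radius}")

# The calling code doesn't need to know what kind of thing it is showing.
for shape in (Point(10, 20), Circle(0, 0, 5)):
    shape.show()
```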

So the truth is that objects are just records that allow code and data to be stored on an equal footing - I told you data structures were important. 


Last Updated ( Thursday, 03 January 2019 )