Just JavaScript - Type And Non-Type
Written by Ian Elliot
Thursday, 15 January 2015
What is all the fuss about strong typing really about? JavaScript doesn't make much use of type, so what is it missing? What is more difficult to do in JavaScript than in a typed language? Are there things that are easy in JavaScript because it isn't strongly typed?
Preface

Most books on JavaScript either compare it to the better known class-based languages such as Java or C++, or even go on to show you how to make it look like one of these. Just JavaScript is an experiment in telling JavaScript's story "just as it is" without trying to apologise for its lack of class or some other feature. The broad features of the story are very clear, but some of the small details may need working out along the way - hence the use of the term "experiment". Read on, but don't assume that you are just reading an account of Java, C++ or C# translated to JavaScript - you need to think about things in a new way. Just JavaScript is a radical look at the language without apologies.
JavaScript doesn't make much use of the idea of type at all, but this doesn't stop programmers trying to implement something that looks like type. In a dynamic language like JavaScript this isn't a good idea - the whole concept of type belongs in a very different language structure. This raises the question of what exactly type is and what problem it is supposed to solve. Before we can examine how JavaScript deals with the problems that type solves, we need to make sure that we are 100% clear what type and strong typing are all about. If you are happy that you understand type, type hierarchies and what they do for you, then skip to the next chapter, which deals with how JavaScript can cope without them.

Three Types Of Type

Even before we get started it is important to realize that the word "type" has a number of meanings. The most common usage means primitive data type; after this it refers to the "class" that defines an object. Exactly what these two meanings of type are all about will be explained in more detail later. For the moment it is assumed that you have a rough idea what a primitive type is, e.g. an int or a string, and that you have an idea what a class-based type is, i.e. an instance of a class. There is also a third meaning, which corresponds to algebraic data types. This is a much deeper, almost philosophical idea - the Curry-Howard correspondence - that really doesn't have much to do with JavaScript or the class-based languages that most of us know. Such ideas of data typing are part of pure functional programming as found in languages such as Haskell, and while there is a lot to say about them, they aren't really mainstream in the sense of languages like JavaScript, Java, C++, etc. For the rest of this article, type will be taken to mean either primitive data type or class-based type.

What Is Type For?

So what is type for? Put simply, type tells you the operations that you can perform.
By declaring the type of an object you specify exactly what operations you can use and what methods you can call. If you know that x is of type integer then you know that it is fine to perform x+3, and not only do you know it but the compiler knows it as well. This allows the compiler to detect incorrect code and flag type errors at compile time - thus saving you from the embarrassment of a run time error.

In the case of an object, knowing that it is of a particular type defines precisely what methods and properties it has. This allows the compiler to check at compile time that you aren't calling any methods or accessing any properties that the object doesn't support. Again it saves you from the embarrassment of a run time error.

This is a worthwhile idea, but it also limits what you can do and forces you to introduce other mechanisms to get over the restrictions strong typing brings with it. There is a trade-off. When you accept strong typing and type checking it becomes possible to find some kinds of error, but these errors are fairly easy to find in other ways. After all, you simply have to check that every operation on an object is legal. If this can be done at compile time then it can be done by reading the code. This approach is often referred to as "type inference", but you could just as well call it "property checking".

If you do adopt strong typing, what you lose in return are ways of working that are type free - for example generic algorithms - and most languages have to invent complicated ways of restoring these features, e.g. generics, covariance, contravariance and so on. In short, type checking finds errors that are mostly easy to find and places restrictions on what you can do. Because of languages such as Java, C#, C++ and so on, most programmers are taught that strong typing is nothing but good and that, in fact, you can't develop quality software without it.
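The "property checking" idea can be made concrete. The sketch below is my own illustration, not from the original text, and the function name supports is invented: the check that a strongly typed compiler performs at compile time can equally be expressed as a runtime property check in JavaScript.

```javascript
// Hypothetical helper: does this object support the named method?
function supports(obj, methodName) {
  return typeof obj[methodName] === "function";
}

// An example object with one method (invented for illustration).
const account = {
  balance: 100,
  deposit(amount) { this.balance += amount; }
};

if (supports(account, "deposit")) {
  account.deposit(50);          // safe to call: the property check passed
}
console.log(account.balance);   // 150

if (!supports(account, "withdraw")) {
  console.log("withdraw is not available"); // avoids a runtime TypeError
}
```

The check happens later than it would in a compiled language, but it is the same check: is this operation legal on this object?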
This point of view is far from proven, and to have a balanced view you really need to see both sides of the coin.

So we have two general meanings of the word type - primitive type and class-based type. The first and most basic relates to primitive data type. In many ways this is the least important meaning of type, but it leads on to the more sophisticated class-based data type.

Primitive Data Typing

Primitive data typing is so ingrained in most approaches to programming that it is difficult to see it afresh and consider its implications. In JavaScript the attempt to get away from primitive data typing is a bit of a half-finished mess which confuses beginners and experienced programmers alike. It gives rise to the seemingly complicated set of data coercion rules that cause data of apparently irreconcilably different types to be automatically converted without the programmer's intervention. For the moment, try to ignore the imperfect way JavaScript deals with primitive type and concentrate on the principles.

The idea of primitive type is deeply embedded in nearly all programming languages and hence in the minds of most programmers. As a result, the alternative position outlined below - trying to eradicate all primitive data types - is usually met with a great deal of resistance. Try to keep an open mind.

Historically this is where the whole idea of type originated, because it was necessary to make the distinction between different types of data for reasons of efficiency. As time has passed it has become less and less necessary to worry about how data is actually stored, and computer languages have become increasingly abstract and removed from the constraints of hardware. The point is that what we program should never depend on the low level detail of how the bits are stored - and as long as it does, we are still in a primitive state of development. The object of any high level language is to abstract away from the reality of hardware, bits and representations.
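To see why the coercion rules mentioned above confuse people, consider a few concrete cases - these are my own illustrations, not examples from the text. The same pair of operands can produce very different results depending on the operator, because + doubles as string concatenation while - and * only work on numbers.

```javascript
// + with a string operand concatenates, so the number becomes a string:
console.log("1" + 2);    // "12"
// - has no string meaning, so the string becomes a number:
console.log("1" - 2);    // -1
// * likewise coerces both operands to numbers:
console.log("3" * "4");  // 12
// loose equality coerces before comparing:
console.log(1 == "1");   // true
```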
In an ideal world we wouldn't worry too much about low level concepts such as primitive data type because the language would take care of everything. Instead of worrying about the format that data is stored in, you would concentrate on the operations that you apply to data.

This is a difficult idea to take on board because, as programmers, we know for a fact that a number is stored in one way and a string, say, is stored in another. In fact we even know that numbers - integers and floats - are stored in different ways. These differences are so ingrained that it is difficult to see, or agree with, the assertion that they aren't necessary. You may say - yes they are - because numbers are different from text and you need integers and reals and so on. No you don't - you simply need operators that do the right job.

Most ideas of primitive type are in fact the result of not defining operators in the correct way. For example, the addition operator would expect to work with two numbers and the concatenation operator would expect to work with two strings. The form that the data is stored in shouldn't matter, and in this sense 123 and "123" are both numbers - a string that happens to represent a number is a number. In the same way, the concatenation operator would treat any number as a string. What if you try to add a string that doesn't represent a valid number? You get an error - what else could happen? But throwing an error when the string does represent a valid number is ignoring what is in front of you. The representation of data should be irrelevant to the operation of a program.

This is how humans work with data, and we are the pinnacle of sophistication and abstraction. If I ask you to add 1 to 2 you don't worry about the data representation - are they strings or integers or what? Clearly they are valid numbers, so you get on and add them. Similarly, if I ask you to concatenate 123 to the end of "some string of words" then you just do it.
Again you don't worry about the difference between text and numeric values. It is not the data representation that matters but the operation. From the point of view of sophisticated abstraction, the complete beginner is close to the ideal when they complain that there really is no difference between 123 and "123".
In an ideal world, for example, the addition operator and the concatenation operator would be distinct - + and &, say. When you write a+b this would be addition irrespective of what a and b are, and a&b would be text concatenation. Notice that a+b might throw an error if either a or b could not be interpreted or coerced to a number - this is reasonable. Throwing an error in any other case is being too picky, and yet many programmers believe that this is what should happen: "you can't use a string as if it was a number". Sadly, in JavaScript things aren't so pure. The two operators are both represented by + and what a+b means depends on the primitive types of a and b - this is not good.

What about the logical distinction between integers and reals? Surely this is a data type distinction that is founded in mathematics? Mathematically there are integers, rationals and irrationals. Some math programming languages have all three types, but in most cases all we need is one type of number that can be either an integer or a decimal rational as the need arises. Integer and rational operations need only be built into the operators, not the data. This is how things work in JavaScript. There is a single numeric data type and how it is treated - integer or real - depends on the operator you use. The subject of JavaScript's unified number object is something that will be explored in a future chapter.

You also don't need notions of int32 or int64 to work with integer arithmetic - you simply need an integer division operator. In fact, ideas such as int32 and int64 reveal that there are many languages that are far too bound to the hardware to be regarded as modern. All we really need is the notion of number - any precision necessary - and text - any number of characters. We might also need a Boolean type, and here we get into very complicated matters best discussed in another chapter dealing with Truthy and Falsy.
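The ideal operators described above can be sketched as ordinary functions, since JavaScript doesn't let us define new operators. The names add, concat and intDiv are my own inventions for illustration: add throws only when an operand genuinely cannot be read as a number, concat treats anything as text, and intDiv gives integer behaviour from the operator rather than from an int type.

```javascript
// Addition that honours "a string representing a number is a number".
function add(a, b) {
  const x = Number(a), y = Number(b);
  if (Number.isNaN(x) || Number.isNaN(y)) {
    throw new TypeError("operand cannot be interpreted as a number");
  }
  return x + y;
}

// Concatenation that treats any operand, including a number, as text.
function concat(a, b) {
  return String(a) + String(b);
}

// Integer division as an operator - no int32/int64 type required.
function intDiv(a, b) {
  return Math.trunc(a / b);
}

console.log(add(1, "2"));        // 3 - "2" represents a valid number
console.log(concat(123, "abc")); // "123abc" - the number is treated as text
console.log(intDiv(7, 2));       // 3 - integer behaviour from the operator
```

add(1, "not a number") throws, which is the one case where an error is reasonable; everything else is decided by the operation, not the representation.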
This viewpoint - that primitive data types are irrelevant - isn't particularly popular, mainly due to many years of exposure to low level languages such as C and C++ that have institutionalized the notion of primitive type. In JavaScript and many other languages everything is an object, and expressions take objects, combine them and return a new object. The operators take the appropriate "value" of each object and use it to create the new object. If you want a fuller explanation see Just JavaScript - The Object Expression. From this viewpoint, and in an ideal world, there should be no primitive data types - just objects and operators.
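The idea that "operators take the value of each object" can be seen directly in JavaScript. In this sketch (the temperature object is invented for illustration) an ordinary object takes part in arithmetic by supplying valueOf, which the operators call to obtain the value they work on.

```javascript
// A plain object that tells operators what its value is.
const temperature = {
  degrees: 20,
  valueOf() { return this.degrees; }  // called automatically by + and *
};

console.log(temperature + 5);  // 25 - the operator asked the object for its value
console.log(temperature * 2);  // 40
```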
Last Updated ( Sunday, 10 May 2015 )