The Trick Of The Mind - Debugging As The Scientific Method
Written by Mike James
Monday, 12 December 2022
Page 3 of 4
The Debugger

You can see that tests involve checking what is happening as the program runs. The simplest way of doing the job is to add instructions that display the values of variables and trace out exactly where the program is at any point in its execution. The big problem with this approach is that, after putting lots of debug statements into a program, you have to remove them when the bug is found. This is error-prone and is likely to introduce yet more bugs.

A better idea is to use a special program called a “debugger” to run the program. This executes the program under your control. You can generally single-step the program, i.e. carry out one instruction at a time, and view the values of variables before and after each step. You can also generally mark an instruction at which the program will pause, a “breakpoint”, so that you can examine variables.

Running a program using a debugger can be both useful and educational. You can learn a lot from tracing through a program you think you understand. Most modern languages have a debugger and not using one if available is potentially a huge waste of time.

Of course, debuggers don’t exist in the wider world of non-programming problems. Solutions to problems that are not expressed as programs have no really general systematic way of being debugged, but in any real world problem you need to find ways of investigating what is actually happening. This is usually called “instrumenting the problem” or “instrumentation” and it is the generalization of the program debugger. When designing a system to solve a problem, it is usually worth spending extra time to work out how instrumentation can be included or added later should “debugging” prove necessary.

Test Driven Design

There is another, uniquely program-oriented, philosophy related to the debugging paradigm.
Some programmers are of the opinion that whenever any routine is written it should be furnished with a set of tests that it has to pass to ensure that it is working. You can think of this as a modification or addition to top-down design. In this case each time you define a subroutine at any level of the hierarchy you also write tests that prove that it does what you claim it does. This is usually referred to as Test Driven Design or TDD (an abbreviation more commonly expanded as Test Driven Development).

Why is this a good idea? It is useful because if in the future any changes are made to the program at any level then the tests can be rerun and if any of them fail you know that the changes have invalidated your original program, i.e. there are bugs.

This is fine and seems very sensible, but there are problems, not often recognized by practitioners. The first is that creating tests that are more than trivial is difficult and requires as much time as implementing the routine to do the job, if not more. Then there is the issue of what sort of test constitutes useful protection. For example, if you have a function that adds two numbers together:

    function Add(A, B)
        Return A+B

you might think that a bulletproof test would be:

    assert A+B == Add(A, B)

but this offers little protection. It really only tests that the function is using + and not some other operator. If, for some deep reason, the function is wrong it will still pass the test, which only provides a different way of performing the addition operation. A better test is:

    assert if A>0 and B>0 Then Add(A, B) > A and Add(A, B) > B

This simply tests to make sure that the addition of two positive values gives a larger value and doesn’t depend on doing addition to check the addition in the function. This is subtle because the test must ascertain that the function is doing something right but can’t make use of anything that might be subject to the same errors as the function. Writing good tests when you are not looking for a specific fault is difficult.
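The idea of testing a property of the result, rather than recomputing the result, can be sketched in Python. This is only an illustration under assumptions: the names `add` and `check_add_properties` are hypothetical, the inputs are restricted to integers, and the second check (argument order not mattering) is an extra property that the text does not mention.

```python
def add(a, b):
    """Function under test, mirroring the Add(A, B) example in the text."""
    return a + b


def check_add_properties(add_fn, a, b):
    """Check properties of an addition function without using + in the test."""
    result = add_fn(a, b)
    # From the text: adding two positive values must give a larger value.
    if a > 0 and b > 0:
        assert result > a and result > b
    # Extra assumption, not from the text: argument order should not matter.
    assert result == add_fn(b, a)


# Run the checks over a handful of hypothetical integer inputs.
for a, b in [(1, 2), (3, 7), (10, 250), (5, 5)]:
    check_add_properties(add, a, b)
```

A deliberately broken version that subtracts instead of adding fails the first property for the inputs 3 and 7, which is the kind of fault the tautological `A+B == Add(A, B)` style of test is poorly placed to reveal if the test shares the function's mistake. Passing these checks still proves very little, which is the point the text is making.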
You can put a lot of effort into writing tests that give a false sense of security and the effort put into creating tests might well be better spent elsewhere. However, when programs are being constructed by a large team of programmers, devising and passing tests is often regarded as essential, as a basic “sanity” check to make sure that some part of the programming group hasn’t made changes that break the program in some fundamental way. Even in this case, passing the suite of tests isn’t proof that the program works.

Writing tests as you create a program is a lot like bottom-up design. You are essentially writing a selection of tests that might prove useful, which is very similar to building a toolkit of subroutines which may or may not be used. Tests created in debugging are more like top-down design as you are creating only the tests you need to find the problem.

Edge Cases

So where are the places to spend the extra effort? One answer is to look to “edge cases”. What are edge cases? They are the conditions under which the program is operating at its extremes, something that takes into account that programs generally work for inputs that are “usual” but fail for inputs that are less common.

For example, when the binary search function works for 10 books, it is likely to work for 11 books, but it isn’t so obvious that it will work for 0 books or even for 1 book. The values 0 and 1 are the edge cases because a shelf with one book on it doesn’t really have a set of books to its left or its right. For zero books we have an extremely out of the ordinary condition: how can you even test for your target book with zero books?

Programs that you know will work for a single non-edge case generally will work for non-edge cases that are close to the case you have examined. The only real possibility of failure is that the program runs out of resources.
For example, a binary search that works for 1,000,000 books might not work for 1,000,001 books because it runs out of shelf space. This is an example of an obscure edge case that you have to work hard to notice. Standard edge cases are much easier to spot because they don’t conform to the assumptions used to create the program.

As well as edge cases, there may be conditions that go beyond the expected operating conditions of the program in more than one way. These are “corner cases”, situations where multiple conditions are encountered at the same time, usually in combinations that the programmer hadn’t considered.

For example, consider a simple game involving a ball that bounces around the screen. When the ball hits a horizontal boundary its vertical velocity is reversed. When the ball hits a vertical boundary its horizontal velocity is reversed. These two simple rules ensure that the ball bounces correctly whenever it meets a barrier. They are literally edge conditions and when testing the program you would be much more interested in what happens at the edges than in the main body of the screen. If the ball behaves correctly at one point in the middle area of the screen it is likely to continue to behave correctly at other points that are not edge points, as the movement across the screen is essentially homogeneous and boring. Only when the ball reaches the edge of the screen does something different, and hence potentially error-prone, occur. Could this be where the term “edge” case originated?

When you test this program, where do you think the corner cases are? Yes, at the corners. At a corner both conditions are triggered and what happens is less clear. If you write the conditions correctly then the ball will bounce at the corner and both its vertical and horizontal direction will be reversed. Is this where the term “corner” case originated?

Once you start to notice edge and corner cases they become ever more visible to you.
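The two bounce rules, and the corner where both apply at once, can be sketched in Python. This is a minimal illustration, not the game the text has in mind: the screen size, the `Ball` class, the `step` function and the integer coordinates are all assumptions made for the example.

```python
from dataclasses import dataclass

WIDTH, HEIGHT = 100, 60  # hypothetical screen size


@dataclass
class Ball:
    x: int
    y: int
    vx: int
    vy: int


def step(ball):
    """Move the ball one tick, reversing a velocity at each boundary hit."""
    vx, vy = ball.vx, ball.vy
    nx, ny = ball.x + vx, ball.y + vy
    # Vertical boundary (left or right wall): reverse horizontal velocity.
    if nx < 0 or nx > WIDTH:
        vx = -vx
        nx = ball.x + vx
    # Horizontal boundary (top or bottom wall): reverse vertical velocity.
    # A separate "if", not "elif", so both rules can fire in a corner.
    if ny < 0 or ny > HEIGHT:
        vy = -vy
        ny = ball.y + vy
    return Ball(nx, ny, vx, vy)


# Middle of the screen: nothing interesting happens.
assert step(Ball(50, 30, 1, 1)) == Ball(51, 31, 1, 1)
# Edge case: only the right wall is hit.
assert step(Ball(100, 30, 1, 1)) == Ball(99, 31, -1, 1)
# Corner case: both walls are hit on the same tick.
assert step(Ball(100, 60, 1, 1)) == Ball(99, 59, -1, -1)
```

Writing the second condition as `elif` would pass the middle and edge tests but fail the corner test, which is exactly the kind of mistake that only a corner case exposes.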
In the real world, away from programming, the ability to notice edge and corner conditions and ponder what happens is an important skill. It also accounts for why there are some annoying people who can crash a program under test in no time at all. They just notice the exceptional cases that the programmer has likely missed.
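One way to play the part of that annoying tester yourself is to probe the empty and single-book shelves from the earlier binary search example. The sketch below is an assumption-laden illustration: `binary_search` is a generic textbook version written for this example, not the routine developed elsewhere in the series, and the shelf contents are made up.

```python
def binary_search(books, target):
    """Return the index of target in the sorted list books, or -1 if absent."""
    low, high = 0, len(books) - 1
    while low <= high:
        mid = (low + high) // 2
        if books[mid] == target:
            return mid
        if books[mid] < target:
            low = mid + 1   # target can only be to the right
        else:
            high = mid - 1  # target can only be to the left
    return -1


# A "usual" case: ten books, target somewhere in the middle.
shelf = list("ABCDEFGHIJ")
assert binary_search(shelf, "G") == 6

# Edge cases: the ends of the shelf, one book, and no books at all.
assert binary_search(shelf, "A") == 0
assert binary_search(shelf, "J") == 9
assert binary_search(["Emma"], "Emma") == 0
assert binary_search(["Emma"], "Ulysses") == -1
assert binary_search([], "Emma") == -1
```

In this version the empty shelf is handled only because the loop condition `low <= high` is false from the start. A variant that examined `books[mid]` before checking the bounds would raise an error on the zero-book case while passing every “usual” test.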
Last Updated (Monday, 12 December 2022)