Programmer's Guide To Theory - Where Do The Big Os Come From
Written by Mike James
Monday, 16 March 2020
Finding the Best Algorithm

We have reached a stage where the reasoning undergoes a subtle change that is often neglected in textbooks. Up until now we have been talking about algorithms belonging to P, but we would really like to answer a slightly more general question: is it possible to find an algorithm that achieves some result in polynomial time, or is such a thing impossible? This shifts the emphasis from finding the order of any given algorithm to finding the order of the best algorithm for the job, which is much more difficult.

You can also ask what the best algorithm is once you specify your computing resources - a Turing machine, for example, or a finite state machine, and so on. There are algorithms that run in P for some classes of machine but not for others. This is where things can get complicated and subtle. For the moment we will ignore the type of computer used for the algorithm, apart from saying that it is a standard machine with random access memory - a RAM machine. It is interesting to note, however, that algorithms that run in sub-polynomial time can change their complexity when run on different types of machine.

How Fast Can You Multiply?

So the really interesting questions are not about how fast any particular algorithm is, but about the fastest algorithm for any particular task. For example, it isn't difficult to show that multiplying two numbers together using the usual "shift and add" algorithm is O(n^2), where n is the number of bits used to represent the numbers. This means that multiplication is certainly in P, but is O(n^2) the best we can do? Is there an O(n) algorithm? Certainly you can't produce an algorithm that is faster than O(n), because you at least have to look at each bit, but can you actually reach O(n)?

Before you spend a lot of time on the problem, I'd better tell you that it is difficult, and the best practical algorithm that anyone has come up with is the odd-looking O(n log n log log n), which is a lot better than O(n^2) but still not as good as O(n). A recent paper has proposed an O(n log n) algorithm, but for it to be worth using n has to be a huge value, well beyond anything reasonable - see Galactic Algorithms in Chapter 17. Is there a faster practical algorithm? Who knows? You would either need a proof that it is impossible to perform multiplication faster than the stated algorithm, or you would need to demonstrate a faster algorithm.

You can see that, in a very deep sense, the order of the best algorithm tells you a lot about the nature of the task being performed. Is it just an easy task that happens to look difficult in this form? Or is it irreducibly difficult? This relates back to Kolmogorov complexity and finding the smallest program that generates a sequence. In this case we need the fastest program.

Prime Testing

The example of multiplication is easy to understand and it is easy to prove that it is in P by describing an O(n^2) algorithm. So, while we might want to argue about how fast it can be done, there is no question that it is in P. Now consider an equally innocent-looking problem - proving that a number is prime, that is, that it has no factors. For smallish numbers this is a trivial job. For example, 12 isn't a prime because it can be divided by 2 and 3, to name just two candidates, but 13 is prime because it has no factors. The simplest method of proving that any number, z say, is prime is to try to divide it by 2, 3 and then all the odd numbers up to √z.
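To make the method concrete, here is a minimal sketch of trial division in Python. The function name and structure are my own illustration, not something prescribed by the text:

def is_prime_trial_division(z):
    """Prove z prime by trying every possible factor up to sqrt(z)."""
    from math import isqrt
    if z < 2:
        return False
    if z == 2:
        return True
    if z % 2 == 0:
        return False              # even numbers greater than 2 are not prime
    # Only odd candidates need testing, and only up to sqrt(z),
    # because any factor above sqrt(z) pairs with one below it.
    for d in range(3, isqrt(z) + 1, 2):
        if z % d == 0:
            return False          # found a factor, so z is not prime
    return True

print(is_prime_trial_division(12))  # False - divisible by 2 and 3
print(is_prime_trial_division(13))  # True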
If you analyze this method you will discover that it is exponential: for an n-bit number z the loop has to try roughly √z, i.e. about 2^{n/2}, candidate divisors, so prime testing using this algorithm isn't in P. But this doesn't prove that testing for primes is or isn't in P, because there might be a better way of doing the job. Given the apparently exponential cost of factoring a number, it seems very reasonable that a polynomial time algorithm for testing primality isn't possible. However, the problem has since been solved and we now know that testing a number for primality is in P - i.e. there is a polynomial time algorithm that will tell you whether or not a number is prime. This wasn't an easy result and we still don't know if there is an even faster algorithm. The current fastest deterministic algorithm is O((log z)^6), where z is the number being tested - that is, polynomial in the number of bits, since log z is proportional to n.

You should be able to see the problem: if you can find an algorithm for a task that runs in polynomial time, then you have proved that the problem is in P, but if you can't, it might be that you just haven't looked hard enough. In this case we tackled proving primality by searching for factors, which takes exponential time, but it turned out that there was a very non-obvious algorithm that could decide primality without factoring the number at all.

It is also worth thinking for a moment about what this means for the related task of factoring a number. To prove that a number is non-prime you simply have to find a factor, and searching for a factor seems to require exponential time. If the number is a prime then you have to search through all of the possible factors to prove that it doesn't have one. At first sight proving a number prime seems harder than proving that a number isn't prime, and yet we have just claimed that proving a number prime is in P. This is very strange, as the property of being prime is defined in terms of factors, and yet somehow you can prove it without finding any factors at all. All the proof of primality gives you, as far as factoring is concerned, is a clear decision on how many factors there are - zero for a prime and at least two for a non-prime. It tells you nothing about what the factors are, and this is good news for the many cryptographic procedures that rely on the fact that factoring is hard. To summarize: finding factors takes exponential time, but the question "does this number have zero or more than zero factors?" is answerable in polynomial time.

The route that we took to a polynomial time test for primality is also interesting. At first we invented probabilistic tests for primality, i.e. tests that only prove that a number is prime with a given probability. This is an unusual idea that deserves further explanation. Currently the most useful of the probabilistic tests is Rabin's strong pseudoprimality test. If the number we want to test is n, assumed odd, first we have to find d and s such that n - 1 = 2^s d, where d is odd (this is easy to do - try it). Next we pick an x smaller than n and form the sequence:

x^d, x^{2d}, x^{4d}, x^{8d}, ..., x^{2^{s-1}d}

all calculated modulo n. Once you have the sequence of values - called a "Miller-Rabin sequence" - you can be sure that if n is a prime, the sequence either starts with a 1 or contains a -1 before the end. Since once a term is 1 or -1 all of the terms that follow it are 1, you can state the condition as:

For a prime, the Miller-Rabin sequence is either all ones or it has a minus one followed by all ones.
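In case you want to experiment, here is a minimal sketch of one round of the test in Python. The helper name miller_rabin_round and its structure are my own illustration of the procedure just described, not part of the original test's specification:

import random

def miller_rabin_round(n):
    """One round of the Miller-Rabin test for an odd n > 3.
    Returns True if the sequence satisfies the prime condition,
    False if it proves n composite."""
    # Write n - 1 = 2^s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    x = random.randrange(2, n - 1)   # random witness, 2 <= x <= n - 2
    v = pow(x, d, n)                 # first term of the sequence: x^d mod n
    if v == 1 or v == n - 1:         # sequence starts with 1, or -1 appears at once
        return True
    for _ in range(s - 1):           # remaining terms by repeated squaring
        v = (v * v) % n
        if v == n - 1:               # a -1 before the end: condition satisfied
            return True
    return False                     # condition failed: n is certainly composite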
This sounds conclusive, but some non-prime numbers also have sequences that satisfy this condition. Note that if the sequence doesn't satisfy the condition, you have a certain proof that the number isn't a prime - it is the case where the condition is satisfied that is difficult. At this point it looks like a hopeless task to say any more about the number: it might be prime or it might not. However, there is a proof that if n isn't prime, then at least three quarters of the x values that you could pick produce sequences that fail the condition. To put it the other way around, if the number isn't prime, a single test with a random x satisfies the condition with probability at most 25%. So after one test that satisfies the condition you can conclude that there is at most a 25% chance that the number isn't a prime. After another test that satisfies the condition, the chance that the number isn't prime falls to at most 6.25% or, putting it the other way, the probability that it is prime is at least 93.75%. If you keep testing you can make the probability as close to one as you want. For example, if you get sequences that satisfy the condition with 100 random values, then the probability that you have a non-prime number is (1/4)^100, which is roughly 1/10^60 - so exceedingly unlikely that you are safe in concluding that the number is indeed prime.

Such is the strange reasoning behind randomized testing for primes, and many other randomized algorithms. They never produce certainty, but what is the practical difference between an error probability of 1/10^60 and a certain proof? There is more chance of a machine malfunction giving you the wrong answer while computing a certain proof of primality than of the strong pseudoprimality test failing!

The existence of a probability-based algorithm that ran in polynomial time was an indication that there might just be a deterministic algorithm in P, and this turned out to be true. Interestingly, the probability-based test is still in use because in practice it is so much quicker than the deterministic test for moderate n. A great deal of computer science is only useful as theory, even when it appears to be practical.
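To see the amplification at work, here is a short sketch that reuses the hypothetical miller_rabin_round from the previous sketch and repeats it k times, driving the error bound down to (1/4)^k. The name is_probably_prime and the default k = 100 are my own choices for illustration:

def is_probably_prime(n, k=100):
    """Miller-Rabin with k independent random witnesses.
    A composite n passes all k rounds with probability at most (1/4)^k."""
    if n in (2, 3):
        return True
    if n < 2 or n % 2 == 0:
        return False
    return all(miller_rabin_round(n) for _ in range(k))

# With k = 100 the chance of mistaking a composite for a prime
# is at most (1/4)**100, roughly 10**-60.
print(is_probably_prime(2**61 - 1))  # True: a Mersenne prime
print(is_probably_prime(2**61 + 1))  # False (composite: divisible by 3)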