Introduction To The Genetic Algorithm
Written by Mike James   
Thursday, 18 February 2021
Article Index
Introduction To The Genetic Algorithm
Fitness
Some Examples
Learning to walk

Genetic algorithms pop up all over computer science and applied computing. They are simple, easy to apply and easy to understand. What mystery remains is why they work at all? How can something seemingly so random home in on a one in a million plus solution? 

 

Banner

 

The idea behind the “Genetic Algorithm” approach to improving an existing solution to a problem is to allow them to show how good modifications to the solution are by making them compete.

The good news is that the basic ideas and methods aren’t difficult to understand.

The bad news is that even after you understand them you will probably still be wondering why they work at all!

Evidence that the genetic algorithm proper does work is all around you. Just look at how excellently animals fit their ecological niches. Over millions of years the process of genetic selection produces organisms that there exploit their environment almost to perfection. 

It is the desire to turn this natural design principle to our own projects that motivates the sometime strange seeming procedures of the artificial genetic algorithm. 

In short the artificial genetic algorithm tries to implement what you see around in nature as gene pools change under evolutionary pressure by sexual reproduction and the action of the survival of the fittest. 

Genes

The basic idea of evolution, and hence the genetic algorithm, is very simple.

First you start off with a population of “units”. The units could be almost anything you care to think of – humans, programs, computers, electronic circuits, solutions to maths problems, etc..

The next requirement is some measure of “quality” of the units.

It has to be possible for you to rate each member of the population according to its “fitness”, i.e. quality. The quality function usually isn’t a problem because, given you are looking for a procedure to make things better, you usually have some idea of what “better” means. You might have difficulty in expressing this "better" in a simple numerical way but you can generally find something that at least correlates with "better". Given you are going to have to evaluate a complete population at each generation you also would like the measure of fitness to be quick to calculate. 

The next requirement is usually a much bigger problem.

You have to give the units something that corresponds to a gene. The artificial gene has to have a coding that controls the make up of the unit in a natural way. This sounds easy enough because anything that specifies the unit could be considered to be a gene.

For example, if you have a population of electronic circuits then the circuit diagram specifies a unit precisely and naturally. Unfortunately in this case the word “naturally” is the problem.

We need a gene representation that can be used for reproduction and this is more difficult and in many ways often seems less “natural” than the obvious representation. To make sense of this we need to look at what reproduction is all about.

Reproduction

To implement the genetic algorithm we need the units in our population to breed.

Obviously if we simply allow them to reproduce we end up with a growing population with no novel units - i.e. they are all the same.

The solution to this is to make the units reproduce sexually. That is two units are selected and their genetic material is mixed in some way to create a new unit which isn’t an identical copy of either parent.

Breeding is used to create a new population and in most cases the units in the original population are discarded – i.e. they die after breeding.

If you design an algorithm based on the previously described steps then the result is just a population which changes randomly at each generation.

To introduce the benefits of evolution to the process you have to apply a “survival of the fittest” law. In the natural environment organisms that are better at utilising it tend to survive and organisms that survive tend to breed more successfully.

This means that genes that code for good survival tend to be passed on and as a result the population gets better at utilising the environment.

In the case of the genetic algorithm all we have to do is assign a probability of breeding that is proportional to the measure of fitness we have. With this modification the population tends to improve on average with each generation and if we wait long enough eventually we will succeed in breeding a better circuit, program or whatever.

To summarise:

To implement a genetic algorithm we need

  1. A population of units with a suitable fitness function.
  2. A breeding mechanism that involves the exchange of genes.
  3. The probability of breeding has to be proportional to fitness of the unit.

Mutation?

Some accounts of the genetic algorithm also include “mutation” but this is of dubious advantage in most practical problems.

Mutation is simply the occasional random change to a gene so that the process of improvement has a chance to get “unstuck”. The idea is that a steady improvement in some aspect of the system may miss a completely new and novel approach.

A bit like fishes getting better and better at breathing water and then suddenly a mutation occurs by chance alone which produces a fish that breathes air.

The problem with this idea is that the vast majority of mutations are completely useless and are more likely to produce a fish that can’t breath at all than something better.

In the case of nature, where there are huge resources to be wasted and a huge amount of time to devote to the problem, mutation is occasionally useful. In practical applications of the genetic algorithm, where time is always short, mutation tends to just be a nuisance factor.

On the other hand mutation can help a population that is stuck in a suboptimal local maximum escape and go on to find a better local maximum or even the global maximum fitness. In the case of the fish the genetic algorithm would tend to carry on refining its solution produces better and better fishes but without mutation the chance of leaving the sea and finding even bigger rewards on the land would be low. 

One possible solution to the disruptive effects of mutation is to start the simulation off with a high mutation rate and gradually reduce it as the population converges on a single optimal solution. This is essentially the application of simulated annealing to the genetic algorithm

A simple example

All this sounds very good, perhaps even very exciting, but there is still an air of mystery about it all.

In particular what is a gene in the case of a general genetic algorithm?

At first you might think that creating a genetic specification is going to be hard but it actually isn’t quite as difficult as it sounds.

To see how it all works let’s take a look at a simple but unlikely example and apply the genetic algorithm to the problem of finding the maximum of x squared in the integer range 0 to 64.

There are lots of more direct ways of solving this problem than the genetic algorithm and no doubt readers can easily work out that the maximum is at x=64 without any help from calculus.

To solve this simple problem using the genetic algorithm is very instructive, however, and demystifies a lot of the procedures.



Last Updated ( Thursday, 18 February 2021 )