Algorithms of the Intelligent Web
Authors: Haralambos Marmanis & Dmitry Babenko

Publisher: Manning, 2009
Pages: 368
ISBN: 978-1933988665
Aimed at: Java web developers
Rating: 5
Pros: Excellent and practical coverage of AI for the web
Cons: Benefits from prior knowledge of statistics
Reviewed by: Mike James

If you are looking for an introduction to the way that some basic AI techniques can be applied to the web, then this is a good place to start. It provides good accounts of searching with Lucene, similarity measures, clustering and classification. The methods covered in detail include the calculation of PageRank, k-means clustering, Bayesian classification, neural networks and rule-based programming. Even if you know the basics of any of these you will probably learn something, because the book presses on into slightly more specialised and practical issues such as combining classifiers, bagging and boosting, and even explains significance testing.

All of the techniques are introduced with reference to how you might use them on the web, complete with examples, all in Java, showing how to data-mine Digg and Netflix. The final chapter puts it all together with an example of building an intelligent news portal. This is a simple example, but it begins to give you some idea of how you could add intelligence to a website.

The authors use whatever libraries are needed to implement the ideas without the need to reinvent the wheel - Lucene, Drools, DBSCAN etc. This allows the book to present solutions to problems in fewer lines of code, and in practice you would want to use such libraries in your own code anyway. In addition there are lots of references to more academic works, further reading and ways of implementing things when you want to go beyond a "small" example.

The only negative effect of this practical introduction to the ideas is that sometimes you have to work hard to keep the principles in view while you tackle the specifics of the subject matter. To get the best from the book it would probably help to have some knowledge of the basic statistical ideas that lie behind the techniques, but this isn't essential if you are prepared to put the work in. If you do have a theoretical background then prepare to be amazed that your theoretical knowledge can actually have practical value!

If you are interested in bringing some intelligence to the web then this is the book to read. Highly recommended.

Last Updated ( Saturday, 29 August 2009 )