Machine Learning in Python

Author:  Michael Bowles 
Publisher: Wiley
Date: May 15, 2015
Pages: 360
ISBN: 978-1118961742
Print: 1118961749
Kindle: B00VOY1I98
Audience: Python programmers with data to analyze
Rating: 3.5
Reviewer:  Mike James 

Python is a good language to use to implement machine learning, but what exactly is machine learning? 

I was quite surprised by the selection of topics covered by this book. Not so long ago techniques such as regression, stepwise regression and even ridge regression would have been the subject of a book on statistics. A book with a title like "machine learning" would have started off with the perceptron algorithm and Bayes' theorem and worked its way through a collection of methods that mostly didn't have much statistical basis. Machine learning used to be dynamic and mostly heuristic-based. Now, with the advent of "big data" and "data science", you can write a book on machine learning without much in the way of statistics or AI, and the subtitle of Michael Bowles' book reveals that his focus is "Essential Techniques for Predictive Analysis".

This is a book that takes a static data set and proceeds to find ways to either predict or classify some dependent variable. It isn't quite AI and it isn't quite statistics. 

Of course this isn't a problem if you are working in some sort of analytical capacity and want a new way of dealing with data. Readers who want a mainstream machine learning book will be disappointed - there are no perceptrons, neural networks or Support Vector Machines. If you are looking for a book on statistical data analysis then again you need to look elsewhere. There are no discussions of significance, confidence intervals, principal components or anything similar. 

In fact there isn't a lot of theory in this book at all. The few equations are there simply to make the model or method clear. A lot of readers will welcome this, but machine learning is a mathematical pursuit and so you probably need to master some of the deeper ideas to do a good job. 

 

 

 

OK, what is the book about then? 

Chapter 1 introduces the two main ideas in the book: penalized or constrained regression and ensemble methods. This is a little strange because ensemble methods aren't really prediction/classification methods but ways to improve other prediction/classification methods. This doesn't really matter too much because both ideas are worth knowing about. Overall, however, this choice makes the subject matter a little narrow. 

Chapter 2 looks at basic data exploration and we encounter the usual ideas of graphical displays to help you get a feeling for your data. Here things are explained quite well but there is still a lot of "magic" going on. Why take a log transformation of the data? What exactly are we trying to do? 

Chapter 3 moves on to predictive model building. This introduces some of the problems of modeling, mostly overfitting, which is a theme throughout the book. The main topic is regression modeling, and forward stepwise regression is introduced as a way of finding a parsimonious model. There is no mention of backward or full stepwise regression. 
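To make the idea of forward stepwise regression concrete, here is a minimal sketch in plain Python. It is not taken from the book: the helper names (`solve`, `fit_ols`, `mse`, `forward_stepwise`), the synthetic data and the choice to rank candidates by training error while reporting held-out error for each model size are all assumptions made for illustration.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def fit_ols(X, y, cols):
    """Least-squares coefficients (intercept first) using only the columns in cols."""
    rows = [[1.0] + [x[c] for c in cols] for x in X]
    k = len(cols) + 1
    XtX = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    Xty = [sum(r[a] * t for r, t in zip(rows, y)) for a in range(k)]
    return solve(XtX, Xty)


def mse(X, y, cols, coef):
    """Mean squared error of the fitted model on (X, y)."""
    errs = [(coef[0] + sum(w * x[c] for w, c in zip(coef[1:], cols)) - t) ** 2
            for x, t in zip(X, y)]
    return sum(errs) / len(errs)


def forward_stepwise(X_train, y_train, X_test, y_test):
    """Add one column at a time, always the one that most reduces training error."""
    remaining = list(range(len(X_train[0])))
    chosen, path = [], []
    while remaining:
        scored = []
        for c in remaining:
            coef = fit_ols(X_train, y_train, chosen + [c])
            scored.append((mse(X_train, y_train, chosen + [c], coef), c))
        _, best = min(scored)
        chosen.append(best)
        remaining.remove(best)
        coef = fit_ols(X_train, y_train, chosen)
        # Held-out error is what tells you when extra columns stop helping.
        path.append((list(chosen), mse(X_test, y_test, chosen, coef)))
    return path


# Synthetic data (an assumption): column 0 matters a lot, column 1 a little,
# column 2 is pure noise.
rng = random.Random(0)


def make_data(n):
    X = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n)]
    y = [4.0 * x[0] + 0.5 * x[1] + rng.gauss(0, 0.1) for x in X]
    return X, y


X_train, y_train = make_data(60)
X_test, y_test = make_data(30)
path = forward_stepwise(X_train, y_train, X_test, y_test)
for cols, err in path:
    print(cols, round(err, 4))
```

The printed path lists the columns in the order they were added along with the held-out error at each size. A parsimonious model is the smallest one after which that error stops improving noticeably.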

Chapter 4 introduces penalized or constrained regression as a way of avoiding overfitting and as an alternative to stepwise regression. If you read the explanations carefully you might discover why adding a constraint to the usual least squares fit makes the algorithm find solutions with sparse parameter vectors, but it isn't as clear as it could be. We also learn about some of the special algorithms invented to speed up constrained regression - LARS and Glmnet. Chapter 5 applies some of the ideas to sample data sets. 
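The sparsity point is easier to see in a small example. The following is a sketch, not the book's code, that fits the same toy data with an L1 penalty (lasso) and an L2 penalty (ridge) using cyclic coordinate descent. The function names, the four-row orthogonal design and the penalty weight of 0.5 are assumptions chosen so the arithmetic is easy to follow. The L1 update goes through a soft-threshold step that sets small coefficients to exactly zero, whereas the L2 update only scales them down.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; return exactly 0.0 when |z| <= gamma."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def coordinate_descent(X, y, lam, penalty, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 plus lam*||b||_1 ("l1")
    or (lam/2)*||b||^2 ("l2"), one coefficient at a time."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0
            norm = 0.0
            for i in range(n):
                # Prediction from every column except j.
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                norm += X[i][j] ** 2
            rho, norm = rho / n, norm / n
            if penalty == "l1":
                beta[j] = soft_threshold(rho, lam) / norm
            else:
                beta[j] = rho / (norm + lam)
    return beta


# Toy data (an assumption): three mutually orthogonal, centred columns.
# The target depends strongly on column 0 and only weakly on column 2.
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [3.0 * row[0] + 0.2 * row[2] for row in X]

lasso = coordinate_descent(X, y, 0.5, "l1")
ridge = coordinate_descent(X, y, 0.5, "l2")
print("lasso:", lasso)  # weak and irrelevant coefficients are exactly 0.0
print("ridge:", ridge)  # weak coefficient is shrunk but stays non-zero
```

With the L1 penalty any coefficient whose correlation with the residual falls below the penalty weight is dropped from the model entirely, which is why penalized regression can stand in for stepwise selection.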

 


 

Chapter 6 moves on to the second major topic of the book - ensemble methods. The idea that using more than one imperfect rule gives an improved performance is introduced, but it isn't really explained. In particular, the important fact that the models in the ensemble need to be independent is mentioned, but this isn't discussed or emphasized enough. What is more, exactly how the models that are explained are constructed to be independent is underplayed. As part of the ensemble idea the binary decision tree is introduced as an easy way to get a set of complex independent models. The ensemble methods described include bagging, gradient boosting and random forests. Chapter 7 applies the ideas to some data sets. 
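Why independence matters can be shown with a short calculation. This is an illustrative sketch rather than anything from the book, and the function name `majority_vote_accuracy` and the 60% figure are assumptions. If each of n classifiers is right with probability p and their errors are independent, the chance that a majority vote is right is a binomial tail sum.

```python
from math import comb


def majority_vote_accuracy(p, n):
    """Probability that more than half of n independent voters are right,
    when each is right with probability p (n should be odd)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(n // 2 + 1, n + 1))


# Each individual rule is only modestly better than chance.
for n in (1, 5, 25, 101):
    print(n, round(majority_vote_accuracy(0.6, n), 3))
```

The accuracy climbs as n grows, but only because the errors are assumed to be independent. If every model made the same mistakes, the vote would be no better than a single model, which is why bagging and random forests go to some trouble to make their trees differ.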

One feature of the book is that it is full of Python code implementing the different methods. You might regard this as good, but it doesn't make use of libraries such as NumPy and there is a lot of repeated code. Arguably this is not the way to implement real data analysis routines, and the value of the code is restricted to making sure that you understand how things work. On the other hand, relating the code to the theory isn't that easy.

If you want a practical introduction to penalized regression, binary decision trees, random forests and so on then this might give you a starting point. However, you will need to read something with a more theoretical approach if you are to make any progress in the field. 

 

For an alternative title see Mike James's review of Machine Learning in Action, which also uses Python.

 

For more books on Python see Books for Pythonistas in Programmers Bookshelf 


Last Updated ( Monday, 02 November 2015 )
 
 

   
Copyright © 2017 i-programmer.info. All Rights Reserved.