Python Machine Learning, 2nd Ed

Authors: Sebastian Raschka and Vahid Mirjalili
Publisher: Packt
Pages: 622
ISBN: 978-1787125933
Print: 1787125939
Kindle: B0742K7HYF
Audience: Python developers interested in machine learning
Rating: 4.8
Reviewer: Mike James
Python and machine learning are made for each other.

Machine learning is important now and can only become more important in the future. Python is widely used in data processing and scientific computing, which makes it a natural fit for machine learning. This book introduces the ideas in a practical way using the standard libraries that have accumulated around Python. This second edition also has much expanded coverage of TensorFlow, which seems to have become the most used package for this sort of computation.

This book introduces machine learning in the broad sense. Many of the techniques would have been called statistics not so long ago. It does cover neural networks and deep learning, but this isn't the only topic. If you are looking for something that focuses on deep learning then you probably need a different book.

The book assumes that you know Python and don't need any explanations of how to get a program up and running. It presents a lot of code and practical examples. You can download the code and try things out. The technical level isn't that high, but be warned there are lots of equations. This isn't a deep theory book but it isn't for the complete beginner either. The ideas are explained reasonably well and as long as you have some idea about math and programming you will get something out of it.

After the usual introductory chapter on what machine learning is and setting up the Python packages you need, the book moves on to look at the first machine learning technique - the Perceptron. You get to implement one in Python, along with its near neighbour, the much less well-known Adaline. It is nice to see classic datasets being used - Fisher's Iris data must have been used to teach a lot of machine learning practitioners over the years!
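
For readers who have not met it, the perceptron learning rule is short enough to sketch in plain Python. This is a minimal illustration rather than the book's code: the function names, the learning rate `eta` and the toy logical-AND dataset are all assumptions made for this example, whereas the book works with the Iris data.

```python
def train_perceptron(samples, labels, eta=0.5, epochs=10):
    """Perceptron learning rule; labels must be +1 or -1."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            activation = bias + sum(w * xi for w, xi in zip(weights, x))
            prediction = 1 if activation >= 0.0 else -1
            # The update is zero when the sample is classified correctly.
            update = eta * (target - prediction)
            weights = [w + update * xi for w, xi in zip(weights, x)]
            bias += update
    return weights, bias


def predict(weights, bias, x):
    """Classify one sample with the learned weights and bias."""
    activation = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if activation >= 0.0 else -1


# Toy, linearly separable data (logical AND); an assumption for illustration.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, -1, -1, 1]

weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
```

On this toy data the rule settles on a separating line within the ten epochs, so `predictions` matches `labels`. On data that is not linearly separable the weights never settle, which is one reason to look at Adaline as well.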

Chapter 3 uses scikit-learn to investigate some classical techniques - logistic regression, SVM and decision trees. Chapter 4 is about working with data - always the most difficult and specific aspect of any project. Oddly, at the end of the chapter we have a discussion of L1 and L2 regularization and random forests - surely these should be in another chapter?

Chapter 5 covers another classical set of techniques that so many machine learning books ignore. Dimension reduction is important but the techniques are far less well known than more recent approaches such as the auto-encoder. After dealing with standard Principal Component Analysis (PCA) we have an account of Linear Discriminant Analysis (LDA) and finally kernel PCA to account for non-linearities. LDA in particular is almost a forgotten approach, but it is a powerful technique that can give you insights into your data. There is no mention of multidimensional scaling as a dimension reduction method, but this is less important.

From here we move on to model evaluation and how to use cross-validation and the various forms of performance measurement. Chapter 7 introduces the interesting idea that two or more classifiers are better than one - ensemble learning. Chapter 8 is a sort of case study on using machine learning for sentiment analysis and Chapter 9 converts this into a web application using PythonAnywhere.
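
The claim that several classifiers beat one is easy to see with majority voting. The sketch below is an assumption-laden toy, not taken from the book: the three "classifiers" are just hand-written lists of hypothetical predictions, each wrong on a different sample, and `majority_vote` is a name invented here.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label that most of the classifiers predicted."""
    return Counter(votes).most_common(1)[0][0]


# Hypothetical predictions from three classifiers on five samples.
true_labels = [1, 0, 1, 1, 0]
clf_a = [0, 0, 1, 1, 0]  # wrong on sample 0
clf_b = [1, 1, 1, 1, 0]  # wrong on sample 1
clf_c = [1, 0, 1, 0, 0]  # wrong on sample 3

# Combine the three predictions for each sample by majority vote.
ensemble = [majority_vote(votes) for votes in zip(clf_a, clf_b, clf_c)]
```

Each individual classifier gets four out of five right, but because their mistakes fall on different samples the vote recovers all five labels. The benefit depends on the errors not being strongly correlated, and with an odd number of voters and two classes there are no ties to break.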

 

Chapter 10 gets back to the main subject with a closer look at linear regression. The chapter goes into regularized regression, ridge and lasso. It also covers polynomial regression, but not step-wise regression. For some strange reason this most useful technique hardly ever seems to be covered in machine learning books. What is more, there is a step-wise version of discriminant analysis that is likewise ignored, even though it is very useful in feature selection problems.
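
The difference between ridge and lasso is clearest with a single feature and no intercept, where both have closed-form answers. The sketch below assumes ridge minimizes the sum of squared errors plus `lam * w**2` and lasso minimizes it plus `lam * abs(w)`; libraries often scale these objectives differently, so the `lam` values here are not directly comparable with theirs. The function names and the data are made up for illustration.

```python
def _sums(xs, ys):
    """Return sum(x*x) and sum(x*y), the only statistics needed here."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxx, sxy


def ridge_weight(xs, ys, lam):
    """Minimize sum((y - w*x)**2) + lam * w**2 for a single weight w."""
    sxx, sxy = _sums(xs, ys)
    return sxy / (sxx + lam)


def lasso_weight(xs, ys, lam):
    """Minimize sum((y - w*x)**2) + lam * abs(w) via soft thresholding."""
    sxx, sxy = _sums(xs, ys)
    shrunk = max(abs(sxy) - lam / 2.0, 0.0)
    return (shrunk if sxy >= 0 else -shrunk) / sxx


# Toy data lying exactly on y = 2x; an assumption for illustration.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

plain = ridge_weight(xs, ys, 0.0)          # no penalty: ordinary least squares
ridge = ridge_weight(xs, ys, 14.0)         # shrinks the slope towards zero
lasso = lasso_weight(xs, ys, 14.0)         # also shrinks the slope
lasso_strong = lasso_weight(xs, ys, 56.0)  # a large penalty zeroes the slope
```

With these numbers the unpenalized slope is 2, ridge with `lam = 14` halves it to 1, lasso with the same `lam` gives 1.5, and lasso with `lam = 56` sets the weight to exactly zero. That last behaviour is why lasso doubles as a feature selection method, while ridge only ever shrinks a weight without eliminating it.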

Chapter 11 is a basic introduction to cluster analysis. It's not very complete, but enough for you to decide if you might need to use it. If you do then my recommendation for the best book on this topic is still Cluster Analysis 5th Edition by Brian S. Everitt, et al.

Last Updated: Saturday, 12 May 2018