Data Smart (Wiley)

Author: Jordan Goldmeier
Publisher: Wiley
Date: November 2023
Pages: 448
ISBN: 978-1119931386
Print: 111993138X
Kindle: B0CJPP7XZM
Audience: Excel users
Level: Introductory
Category: Data Science
Rating: 4.5
Reviewer: Kay Ewbank

This is an updated edition of a well-regarded title which looks at accessible ways to combine statistics and machine learning, along with Excel, to discover insights in your data.

It has been revised by Jordan Goldmeier, who wasn't the original author. He is a self-confessed Excel lover and a Microsoft MVP.


The book kicks off with a chapter titled 'everything you ever needed to know about spreadsheets but were too afraid to ask', in which Goldmeier introduces Excel tables and lookup formulas, pivot tables and array formulas. 

He then goes on to look at Power Query, Microsoft's data transformation and data preparation engine. The chapter considers how to use Power Query's graphical interface to retrieve data, and its editor to apply transformations and carry out the extract, transform, and load (ETL) processing of data.

Chapter three has the light-hearted title "Naïve Bayes and the Incredible Lightness of Being an Idiot." Goldmeier starts with what he says is the world's fastest intro to probability theory before going on to consider the chain rule, Bayes' rule, and how to use Bayes to create an AI model.

Two chapters on cluster analysis are next, starting with a look at using K-Means to segment your customer base, then going on to network graphs and community detection. 

Goldmeier then looks at regression, which he describes as the granddaddy of supervised artificial intelligence. The concepts are explained well, and the examples are carefully chosen to make the ideas clear.

Next comes a chapter on ensemble models that Goldmeier describes as a whole lot of bad pizza. By this he's referring to an episode of the US version of the sitcom The Office in which the boss asks whether it's better to have a small amount of really good pizza or a lot of really bad pizza. He then goes on to extrapolate, saying many AI implementations are closer to the 'lots of bad pizza' model.

A chapter on forecasting starts from the premise that there's no point worrying because you can't win, and Goldmeier backs up his assertion with a statement saying that the only guarantee in forecasting is that your forecast is wrong. He then goes on to say this doesn't mean you shouldn't try forecasting and that you'll still end up knowing more than nothing. 

Chapters on optimization modeling and outlier detection consider whether these techniques could be described as data science. 

Goldmeier then looks at how to go beyond spreadsheets with a chapter on R.

Overall, this is a good introduction to data analysis using straightforward tools and mainstream techniques. I suspect most developers would find it more useful to use R and go further, but the book could help you get started with data analysis. Worth reading. 

 



SQL Query Design Patterns and Best Practices

Author: Steve Hughes et al
Publisher: Packt Publishing
Pages: 270
ISBN: 978-1837633289
Print: 1837633282
Kindle: B0BWRD7HQ7
Audience: Query writers
Rating: 2.5
Reviewer: Ian Stirk

This book aims to improve your SQL queries using design patterns. How does it fare?



Machine Learning For Dummies, 2e (Wiley)

Author: John Paul Mueller
Publisher: For Dummies
Date: January 2021
Pages: 464
ISBN: 978-1119724018
Print: 1119724015
Kindle: B08SZHJGJW
Audience: General, but not too dumb
Rating: 4
Reviewer: Mike James
Dummies probably need machine learning to cope...

