R For Everyone, 2nd Ed

Author: Jared P. Lander
Publisher: Addison Wesley
Date: June 2017
Pages: 560
ISBN: 978-0134546926
Print: 013454692X
Kindle: B071X9KT1D
Audience: Would-be R programmers
Rating: 4.5
Reviewer: Mike James

R really isn't for everyone, but you know what the book title means.

I've probably said this before, but R is a difficult language to write a book about. The problem is whether to target the statistics or the programming. Quite a few R books that I've looked at are about what stats you can do using R, and they don't really help because they don't teach you the language and they aren't very good at statistical theory. This particular book, on the other hand, is quite well balanced. It doesn't take the programmer's view or the statistician's view. Instead it tries for a nice balance between the two and it mostly works.

 


Before I carry on saying mostly nice things I need to get one grumble over and done with. This is a beautiful book: it's in color. At first sight it really does look impressive, but when you actually try to read it things are less than perfect. The problem is that some parts of the text are printed in a light grey. All you need is a sunny day and you can't read it as easily as you could with good black ink. I'm all in favour of design, but not when it impacts readability to this extent. This isn't a deal breaker and overall I still recommend the book, but it could have been so much easier to read.

 

 

 


Chapter 1 is about getting and starting R. Chapter 2 deals with using the R environment - RStudio and Visual Studio. Chapter 3 is about using packages and this is important because the book shows no tendency to stick to the mainstream of R. It loads packages to do things right from the start.

Chapter 4 is where we get started with R proper. By the end we have encountered vectors, functions, missing data and pipes. This is where you first discover that the author has a tendency to introduce something and then, just when you think he isn't going to, explain it. Just have the courage to keep reading and you will be rewarded. The coverage isn't particularly programmer-oriented but enough is explained for you to get the ideas.

Chapters 5 and 6 are about data. First we learn about data frames, lists, matrices and arrays and then move on to how to read data from files: CSV, Excel, databases, JSON and so on.

Chapter 7 is the first on stats, but it is about simple charts using ggplot2.

Chapter 8 is back to R and explains how to write a function. Personally I would have put this before charts. Then, in Chapter 9, on to control statements: if-else and the more functional ifelse.
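
The difference between the two is easiest to see side by side. What follows is a minimal sketch of my own, not an example from the book, and the variable names and values are made up for illustration: `if`/`else` tests a single condition, whereas `ifelse()` is vectorised and returns one result per element.

```r
# A single condition: if/else picks exactly one branch.
x <- 7
if (x > 5) {
  size <- "big"
} else {
  size <- "small"
}

# A whole vector of conditions: ifelse() chooses element by element.
values <- c(2, 7, 4, 9)            # illustrative data only
sizes <- ifelse(values > 5, "big", "small")
print(sizes)
```

Passing a vector of length greater than one to a plain `if` is an error in recent versions of R, which is why `ifelse()` is the usual choice for element-wise decisions.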

Chapter 10 is on loops and has the very accurate title "Loops, the Un-R Way to Iterate". In R, ideally, you make use of vector operations that make loops unnecessary but, however hard you try, sooner or later you will need an explicit loop.
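
To make the point concrete, here is a small sketch of my own (not taken from the book, with made-up data) that computes the same result with an explicit loop and with a vector operation, followed by a case where each step depends on the previous one and a loop is the natural way to write it.

```r
x <- c(1, 2, 3, 4, 5)              # illustrative data only

# The "un-R" way: an explicit loop that fills a result vector.
squares_loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares_loop[i] <- x[i]^2
}

# The R way: the operator is applied to the whole vector at once.
squares_vec <- x^2

# A running calculation where each value depends on the one before it;
# here an explicit loop is the straightforward choice.
balance <- numeric(length(x))
balance[1] <- x[1]
for (i in 2:length(x)) {
  balance[i] <- balance[i - 1] * 1.05 + x[i]
}
```

The two versions of the squares give the same answer; the vectorised one is simply shorter and is what most R code aims for.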

 

 

The next four chapters are all about data manipulation, first using apply and then using two packages, dplyr and purrr, for iterating. The final chapter of the set on data is about strings and how to work with them, including regular expressions.

Chapter 17 is where the stats part of the book begins, with distributions. Following this we have a sequence of chapters on some fairly advanced stats. Chapter 18 may start off with summary statistics, but it quite quickly gets to t-tests and ANOVA. The ANOVA theme continues with linear models, generalized linear models and model diagnostics.

Chapter 22 is on regularization and shrinkage, which is advanced stuff. Chapter 23 is on non-linear models including decision trees and random forests. Chapter 24 is about time series and Chapter 25 is about clustering.

This brings the stats part of the book to a close and there are lots of statistical techniques not included - principal components, discriminant analysis, canonical correlation and so on. It also doesn't cover the latest craze of neural networks, but then R isn't necessarily the best choice for this.

The remainder of the book deals with what you might call important side tasks: creating reports with knitr, using RMarkdown, building interactive dashboards with Shiny and creating R packages.

Conclusion

This isn't the best possible book on R, but it is good enough for you to think about adding to your collection. It is a good book for the R beginner, but be warned there are some advanced topics when it comes to the stats. Overall the stats is explained as practice rather than theory. If you hope to use any of the techniques, and reason about them, you will need to read another book. The chapters on data manipulation are particularly useful.

Recommended to R beginners who know enough stats not to worry about the advanced material.




Last Updated ( Saturday, 28 April 2018 )