Programming Skills for Data Science
Written by Mike James   

Authors: Michael Freeman and Joel Ross
Publisher: Addison-Wesley
Date: December 2018
Pages: 384
ISBN: 978-0135133101
Print: 0135133106
Kindle: B07KMDCHT2
Audience: Would-be data scientists
Rating: 5
Reviewer: Mike James
If you are looking for a programmer's guide to R, this might be it.

As I've said before, the big problem with writing a book on R is whether it should concentrate on the programming language or the statistical procedures via the use of the language. This particular book is more toward the programming language with some simple statistical procedures - mostly with graphics acting as examples.

It starts off, in Part I Chapter 1, with setting up your computer and very sensibly covers IDEs, including RStudio, which is the obvious one to use. Don't try using just a text editor to program in R - it will cost you a lot of time unless you are already an expert, and even then a good IDE will save you from mistakes. It also covers setting up GitHub, which plays a moderately central role in the rest of the book. Collaboration is common in statistical work but it still isn't clear to me that Git or GitHub is a key component - it certainly makes things more complicated at first.

Chapter 2 shows you how to use R from the command line, including navigating the file system. Useful stuff, but I think it should be in an appendix.

Part II is about using Git, but mainly via GitHub, to manage your projects. If you don't plan on using GitHub, skip on to Part III. Chapter 4 also covers Markdown as a way of creating simple documentation, another useful skill.

 


Part III gets to grips with R. It goes through the basics of variables, functions, conditionals and lists. Personally I think it should cover Data Frames as the ultimate R data structure, but this is postponed until Part IV. All of the descriptions are good and easy to read. There is a lot of intelligent writing in this part of the book - in fact there is a lot of intelligent writing in most of the book. This isn't a dummies book and you need to read it carefully.

Part IV moves towards statistics but it is still mostly about using the R language to manipulate data. After a brief look at the generalities of data, the book moves on to Data Frames and then to manipulating data, mostly using the dplyr and tidyr functions. Chapter 13 is a short introduction to accessing a SQL database. Chapter 14 covers REST and accessing web data, including JSON.

Part V is much more about stats but only simple graphs and charts. Here you learn to plot with ggplot2, plotly, rbokeh and leaflet. Part VI returns to programming aspects of using R. Chapter 18 deals with dynamic reports using Markdown, Chapter 19 is about websites using Shiny and Chapter 20 returns to the idea of using GitHub for collaboration. The final chapter provides some guidance on learning statistics, other languages and so on.

This book will not teach you much about statistics apart from some very basic ideas about data. It will teach you quite a lot about R. For my tastes not quite enough about R, but it does a better job than other books I have reviewed. The writing style is, as I said earlier, "intelligent". There are plenty of comments and asides to set the scene and it is all easy to read.

Highly recommended as an introduction to R and the programming practices that surround it. You will still need to teach yourself statistics, but that is another, and much bigger, problem.

 




Last Updated ( Tuesday, 30 July 2019 )