Foundations of Deep Reinforcement Learning
Authors: Laura Graesser and Wah Loon Keng
This book is excellent, and to cut straight to the conclusion: if you have any interest in reinforcement learning, buy it, read it and learn.
Reinforcement learning is a simple idea - give the system a reward when it does well and let it adjust its behavior to maximize the reward. However, if you start looking into it, things get surprisingly mathematical very quickly. The reason is that the models of reinforcement learning we use are very mathematical. You need to know about Markov decision processes, and hence a fair amount of probability and statistics. Then there are the Bellman equations, not to mention backpropagation and all of the complex implementation details of designing and training a neural network.
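The reward-driven loop described above can be sketched in a few lines of Python. This is a toy two-armed bandit I've invented for illustration - the payout probabilities, learning rate and exploration scheme are all assumptions, not anything from the book:

```python
import random

# A hypothetical two-armed bandit: arm 1 pays off more often than arm 0.
def pull(arm):
    return 1.0 if random.random() < (0.8 if arm == 1 else 0.2) else 0.0

# Estimated value of each arm; the agent adjusts these from rewards alone.
values = [0.0, 0.0]
alpha = 0.1  # learning rate

random.seed(0)
for step in range(1000):
    # Mostly exploit the best-looking arm, occasionally explore at random.
    if random.random() < 0.1:
        arm = random.randrange(2)
    else:
        arm = max((0, 1), key=lambda a: values[a])
    reward = pull(arm)
    # Move the value estimate toward the observed reward.
    values[arm] += alpha * (reward - values[arm])

# The agent should have settled on the better-paying arm 1.
print(max((0, 1), key=lambda a: values[a]))
```

The whole of reinforcement learning is, in a sense, this loop with more structure: states, sequences of actions and delayed rewards.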
What is nice about this book is that the authors take you through much of the mathematics at a level that should be followable even if your math isn't genius level - although it still has to be reasonable. Rather than leaving you to figure out why one equation follows from another, there is an explanation of how each one is derived. To be clear, I'm not saying that this makes the math trivial, but it certainly helps, and it isn't something you will find in other books on such a technical subject.
The book starts off with an overview of the reinforcement learning model - the Markov Decision Process - and how to approach it. My only comment is that it would have been nice to have the Bellman equations included and explained at the same level.
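For readers wanting that missing piece, the Bellman equation for the value of a state under a policy is standardly written (this is the usual textbook form, not a quotation from the book) as:

```latex
V^{\pi}(s) = \sum_{a} \pi(a \mid s) \sum_{s',\, r} p(s', r \mid s, a)\,\bigl[\, r + \gamma\, V^{\pi}(s') \,\bigr]
```

In words: the value of a state is the expected immediate reward plus the discounted value of wherever you end up next - a recursion that most of the algorithms in the book exploit in one way or another.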
Part I of the book is called Policy-Based and Value-Based Algorithms and it explains the standard algorithms in this area - REINFORCE, SARSA and DQN. For most readers it is DQN (Deep Q-Networks) that is going to be of most interest, as this is what Google's DeepMind used to play games. The section closes with a look at how to use a DQN to play Atari games.
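At the heart of DQN is the one-step Q-learning update, which DQN approximates with a neural network instead of a table. In tabular form it can be sketched as follows - a toy corridor environment of my own invention, not the book's code, with made-up parameters:

```python
import random

# A toy corridor: states 0..4, actions move right (+1) or left (-1),
# reward 1.0 for reaching the goal state 4, which ends the episode.
N_STATES, GOAL = 5, 4
ACTIONS = (+1, -1)

def env_step(state, action):
    nxt = min(max(state + action, 0), GOAL)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.1  # learning rate, discount, exploration

random.seed(1)
for episode in range(200):
    s, done = 0, False
    while not done:
        # Epsilon-greedy action selection.
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])
        s2, r, done = env_step(s, a)
        # Q-learning target: reward plus discounted value of the best next action.
        target = r + (0.0 if done else gamma * max(Q[(s2, x)] for x in ACTIONS))
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        s = s2

# The learned greedy policy should move right toward the goal in every state.
print([max(ACTIONS, key=lambda x: Q[(s, x)]) for s in range(GOAL)])
```

DQN replaces the dictionary `Q` with a network trained on the same target, plus the tricks (experience replay, target networks) that the book explains in detail.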
Part II is called Combined Methods and it is something of a blend of old and new. We start with Actor-Critic and move on to A2C (Advantage Actor-Critic). Next we look at Proximal Policy Optimization and finally how to speed things up using parallel processing.
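For reference, the clipped surrogate objective at the heart of Proximal Policy Optimization is standardly written (notation follows the wider literature and may differ slightly from the book's) as:

```latex
L^{\mathrm{CLIP}}(\theta)
  = \hat{\mathbb{E}}_t\!\left[ \min\!\left( r_t(\theta)\,\hat{A}_t,\;
    \operatorname{clip}\!\left(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\right)\hat{A}_t \right) \right],
\qquad
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}
```

The clipping keeps each policy update close to the old policy, which is what makes PPO stable enough to be a default choice in practice.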
Part III is called Practical Details - not that practice has been ignored up to this point. If you are a programmer then this part will be of less interest because it covers things you should already know - unit tests, code quality, Git and so on. There is specific advice on testing your algorithms. It also covers using SLM Lab to try things out and what neural network architectures to use. The final part considers what hardware you should use.
The final section is called Environment Design, i.e. how to recognize the components of the model in the real world, and this is a tough area. In practice, getting a reinforcement learning system off the ground is as much a matter of identifying what your inputs and outputs are as of anything else. How you set up the states, actions and rewards can seem like an impossible task in many cases.
If you want to master reinforcement learning then this is the book you need. It is a guide to both the theory and the practice. It isn't an easy book to read and it will take you some time to work through, because the subject matter is difficult and the book does its best to explain and motivate it. Even so, as already stated, you are going to have to be happy reading quite complicated equations and understanding them. The only way this could have been avoided is by not telling you how things work, and that is not the intention of this book.
Excellent - if you are in, or want to be in, the field read it.
Last Updated: Wednesday, 04 March 2020