Author: Kristina Chodorow
Publisher: O'Reilly 2013
Aimed at: developers working with MongoDB
Reviewed by: Kay Ewbank
MongoDB is rising in popularity, recently overtaking Access in the DB-Engines.com popularity rankings. This means there’s an increasing likelihood you’ll want (or need) to use it in an application at some point soon. Even if you know about relational database development, MongoDB’s document-based approach is quite different, so there’s definitely a need for a book like this.
Kristina Chodorow worked on the MongoDB core for five years before joining Google, so she knows what she’s talking about. As you’d expect, she’s enthusiastic and knowledgeable about MongoDB, and this shows throughout the book.
The book is split into six parts, and starts from the assumption that you know nothing about MongoDB. Chodorow opens with a good introduction to MongoDB – why it was created, the goals it is trying to accomplish, and why you might want to use it for a project. Given how different MongoDB is to a traditional database, this is a much-needed chapter. Next, the book goes on to the core concepts, the database and the shell. The final two chapters in this part of the book look at the basics of working with MongoDB – reading and writing documents, and creating queries.
Part III of the book is dedicated to replication, starting with setting up a replica set, then working through the components – syncing, connecting to a replica set from your application, and replica set administration. Part IV does a similar job for sharding, with chapters introducing sharding, configuring your shards, choosing a shard key, and sharding administration. The chapter on choosing the right shard key was particularly interesting. The shard key is a field (or fields) that you use to split up the data across your shards. As Chodorow points out, once you have more than a few shards it’s almost impossible to change your shard key, so you need to get it right first time. This chapter is not one to skim-read; it could make the difference between a database that flies and one that crawls.
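To make the idea of a shard key concrete, here is a minimal Python sketch of range-based placement. It is not taken from the book and is not MongoDB's actual implementation; the `username` field, the split points, and the function names are all assumptions chosen for illustration.

```python
import bisect

# Hypothetical split points (an assumption for this sketch):
# shard 0 holds keys < "h", shard 1 holds "h" <= key < "p", shard 2 holds the rest.
SPLIT_POINTS = ["h", "p"]


def shard_for(document, shard_key="username"):
    """Return the index of the shard whose key range contains this document."""
    return bisect.bisect_right(SPLIT_POINTS, document[shard_key])


def distribute(documents, shard_key="username"):
    """Group documents by the shard their shard-key value falls into."""
    shards = {i: [] for i in range(len(SPLIT_POINTS) + 1)}
    for doc in documents:
        shards[shard_for(doc, shard_key)].append(doc)
    return shards


users = [{"username": name} for name in ["alice", "bob", "henry", "maria", "zoe"]]
placement = distribute(users)
```

In this toy model, a key whose values cluster in one range would send most documents to a single shard, which is one way to see why the choice of key matters so much.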
Application Administration is the topic of Part V of the book. This is a separate topic to server administration (which is covered next). Instead, by application administration Chodorow means being able to answer questions such as ‘what queries is MongoDB running?’, ‘how much data is being written?’ and ‘what is MongoDB actually doing?’. The first chapter in this section shows you how to see the current operations, how to kill problematic operations, and how to use the system profiler. There’s a chapter on data administration that covers some admin topics – setting up authentication, creating and deleting indexes, along with a discussion of preheating data. If you restart (or start) a server, MongoDB won’t necessarily have the most relevant data loaded from disk into memory, and this section looks at how you can get your data into RAM before bringing the server officially online. Journaling is the subject of the final chapter in this section. MongoDB’s journal is a list of the exact disk location and bytes changed when you perform writes. It contains around 60 seconds of write data, as the data files are flushed to disk every 60 seconds. Chodorow discusses setting up and using the journal for durability, and the choices you make and their tradeoffs.
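The journal description above (a list of disk locations and the bytes changed, cleared when the data files are flushed) can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not how MongoDB implements journaling: the class name `ToyJournaledFile` and its methods are hypothetical, and the journal is simply assumed to survive a crash.

```python
class ToyJournaledFile:
    """Toy model: a data file plus a journal of (offset, bytes) changes."""

    def __init__(self, size):
        self.on_disk = bytearray(size)    # data file contents as of the last flush
        self.in_memory = bytearray(size)  # current view, including unflushed writes
        self.journal = []                 # (offset, changed bytes) since the last flush

    def write(self, offset, data):
        """Record the change in the journal, then apply it in memory."""
        self.journal.append((offset, bytes(data)))
        self.in_memory[offset:offset + len(data)] = data

    def flush(self):
        """Periodic flush: persist the data file and drop the journal entries."""
        self.on_disk[:] = self.in_memory
        self.journal.clear()

    def recover(self):
        """After a crash, replay the journal on top of the last flushed data."""
        restored = bytearray(self.on_disk)
        for offset, data in self.journal:
            restored[offset:offset + len(data)] = data
        self.in_memory = restored
        return restored


f = ToyJournaledFile(8)
f.write(0, b"ab")
f.flush()                    # "ab" is now in the data file; the journal is empty
f.write(2, b"cd")            # only in memory and in the journal
f.in_memory = bytearray(8)   # simulate a crash that loses the in-memory state
recovered = f.recover()      # journal replay restores the unflushed write
```

The tradeoff the sketch hints at is that every write is recorded twice, once in the journal and once in the data, in exchange for being able to rebuild unflushed writes after a crash.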
The final section of the book covers the more usual server administration tasks – starting and stopping MongoDB, monitoring, making backups, and deploying it in production.
I was impressed by this book. Chodorow knows her stuff and, equally importantly, can explain it well. If you don’t know anything about MongoDB you could read this and make a decent attempt at development. If you already know a bit about it, some of the more technical sections should improve your understanding and the performance of your database. A good read.