Professional Hadoop

Authors: Benoy Antony, Konstantin Boudnik, Cheryl Adams, Branky Shao, Cazen Lee and Kai Sasaki
Publisher: Wrox
Date: May 2016
Pages: 216
ISBN: 978-1119267171
Print: 111926717X
Kindle: B01F69Z1PS
Audience: Professional database developers
Rating: 4
Reviewer: Kay Ewbank

This book is a good introduction to the Hadoop infrastructure if you're trying to work out how everything fits together.

This is a relatively slim volume. What is covered is well explained, but given the complexity of the Hadoop infrastructure, there isn't the space to go more deeply into the individual topics.

The book starts with an introduction to the Hadoop components, and to what HDFS, MapReduce, YARN, ZooKeeper and Hive all do. It also looks at data integration and Hadoop, all at an overview level.

Storage, and HDFS in particular, is covered in rather more detail in the next chapter. You're shown how to set up HDFS clusters, what the file formats are, and what the command line interface can be used for. More advanced features of HDFS such as snapshots and tiered storage are also introduced, along with some features still under development, such as Erasure Coding (currently in the alpha 2 release of Hadoop 3).

 

Computation in the form of MapReduce is the topic of the next chapter. Once the architecture of MapReduce has been introduced, you're shown how to set up a MapReduce job, and the difference between Spark and MapReduce jobs is explained.

A chapter on user experiences is next, looking at how Hive, Pig, Hue and Oozie can be used by people who aren't MapReduce experts. The chapter goes into a reasonable amount of detail on each topic.

Integration with other systems gets a chapter of its own, with explanations of the role of Sqoop, Flume, Kafka, Storm, Trident, and stream processing.

A chapter on Hadoop security then shows how you can secure a Hadoop cluster, the data in the cluster, and the applications running in it. This is obviously only an overview, but useful all the same, giving details from perimeter security inwards, and looking at the advantages and drawbacks of Kerberos, SASL, service level authorization, and ways of restricting access.

The ecosystem at large is the topic of the next chapter, looking at Hadoop with Apache Bigtop. Bigtop is a project for developing packaging of the Hadoop ecosystem, so you can package a Hadoop stack and deploy it on a variety of data platforms. The chapter describes what Bigtop is, the specifics of different open source data processing stacks, how to create your own stack including Hadoop, and how to deploy and manage a configuration.

The final chapter looks at in-memory computing in the Hadoop stack. It covers the performance improvement you can get if you use MapReduce with Apache Ignite, the use of HDFS caching, and advanced use of Ignite for state sharing.

I started off thinking this book was too short to be really useful. However, having read it in more detail, I think it does a very good job. The Hadoop and Apache alphabet soup is a confusing mix to get to grips with, and this book makes sense of it while giving enough detail to be useful.

Hadoop is one of the topics covered in Reading Your Way Into Big Data, an article on Programmer's Bookshelf in which Ian Stirk provides a roadmap of the reading required to take you from novice to competent in areas relating to data science.

Related Reviews

Field Guide to Hadoop

Hadoop Essentials

Hadoop: The Definitive Guide (4th ed)

Hadoop for Finance Essentials

Data Analytics With Hadoop

 



Last Updated ( Saturday, 25 February 2017 )
 
 

   
Copyright © 2017 i-programmer.info. All Rights Reserved.