Google Open Sources C/C++ MapReduce Framework
Written by Kay Ewbank   
Friday, 06 March 2015

MapReduce for C (MR4C) lets you run native code in Hadoop, so you can use image processing libraries developed in C and C++ on data held in Hadoop.


The framework was originally developed at Skybox Imaging, a satellite imagery company that Google acquired in June 2014, for large-scale satellite image processing and geospatial data science. There are a number of proprietary systems that execute native code in MapReduce frameworks, but MR4C is designed to be more flexible and, as it is open source, can be freely used and further developed.

According to the blog post by Ty Kennedy-Bowdoin about the release, MR4C has a few simple concepts that make it easier to move your native code to Hadoop. Algorithms are stored in native shared objects that access data from the local filesystem or any uniform resource identifier (URI), while input/output datasets, runtime parameters, and any external libraries are configured using JavaScript Object Notation (JSON) files. Splitting mappers and allocating resources can be configured with Hadoop YARN-based tools, or at the cluster level for MRv1.
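
To make the idea of a JSON-driven job description concrete, here is a minimal Python sketch. It is not MR4C's configuration schema: the field names, the `libndvi.so` library, and the dataset URIs are all hypothetical, chosen only to illustrate how a shared object, its datasets and its runtime parameters might be described in one file.

```python
import json

# Hypothetical job description. These field names are illustrative only
# and are NOT taken from MR4C's actual configuration schema.
EXAMPLE_CONFIG = """
{
  "algorithm": {"name": "ndvi", "library": "libndvi.so"},
  "inputs":  {"scene": "hdfs://cluster/imagery/scene-001"},
  "outputs": {"index": "file:///tmp/ndvi-out"},
  "params":  {"threshold": 0.3}
}
"""

def load_job(text):
    """Parse a job description and check that the required sections exist."""
    job = json.loads(text)
    for key in ("algorithm", "inputs", "outputs"):
        if key not in job:
            raise ValueError(f"missing section: {key}")
    job.setdefault("params", {})  # runtime parameters are optional in this sketch
    return job

def uri_schemes(job):
    """Return the set of URI schemes used by the job's input and output datasets."""
    uris = list(job["inputs"].values()) + list(job["outputs"].values())
    return {uri.split("://", 1)[0] for uri in uris}

job = load_job(EXAMPLE_CONFIG)
print(job["algorithm"]["library"], sorted(uri_schemes(job)))
```

The point of the sketch is the separation of concerns: the native algorithm stays in a shared library, while where the data lives and how the job is parameterised are kept in a text file that can be changed without recompiling.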


You can also string workflows of multiple algorithms together using an automatically generated configuration. There are callbacks in place for logging and progress reporting, and the reports can be viewed using the Hadoop JobTracker interface. Your workflow can be built and tested on a local machine using exactly the same interface employed on the target cluster.
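
The combination of chained algorithms and progress callbacks can be pictured with a short Python sketch. This is a generic illustration, not MR4C's API: `run_workflow`, the step names and the callback signature are all assumptions made for the example.

```python
def run_workflow(steps, data, on_progress=None):
    """Run named steps in order, feeding each step's output into the next.

    `steps` is a list of (name, function) pairs; `on_progress`, if given,
    is called after each step with (name, step_number, total_steps).
    """
    total = len(steps)
    for number, (name, func) in enumerate(steps, start=1):
        data = func(data)
        if on_progress is not None:
            on_progress(name, number, total)
    return data

# Two toy "algorithms" standing in for native image-processing steps.
steps = [
    ("scale", lambda values: [v * 2 for v in values]),
    ("clip", lambda values: [min(v, 5) for v in values]),
]

log = []
result = run_workflow(
    steps,
    [1, 2, 3],
    on_progress=lambda name, n, total: log.append(f"{name}: {n}/{total}"),
)
print(result, log)
```

Because the runner only depends on the list of steps and the callback, the same workflow description can be exercised on a laptop or handed to a cluster, which is the property the article highlights.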

The blog post says that the goal of the project is to abstract away the important details of the MapReduce framework so that users can focus on developing valuable algorithms.

There’s more information on the MR4C GitHub page: https://github.com/google/mr4c

More Information

MR4C on GitHub

MapReduce for C: Run Native Code in Hadoop

Skybox Imaging

Related Articles

Google Moves On From MapReduce, Launches Cloud Dataflow 

Agile Data Science (book review) 

Big Data Analytics (book review) 

 

Last Updated ( Friday, 06 March 2015 )