Apache Spark MapR Connector Provides JSON Support
Written by Kay Ewbank   
Monday, 05 June 2017

There's a new Native Spark Connector for MapR-DB JSON that gives developers APIs to access MapR-DB JSON documents from Apache Spark, using the Open JSON Application Interface (OJAI) API.

Apache Spark is an open source big data processing framework, which is used for analytics on streaming and batch workloads. MapR-DB is a high performance NoSQL database, which supports two primary data models: JSON documents and wide column tables. A Spark connector is available for each data model. With the Spark/MapR-DB connectors, you can use MapR-DB as a data source and as a data destination for Spark jobs.

The Native Spark Connector for MapR-DB JSON supports loading data from a MapR-DB table as a Spark Resilient Distributed Dataset (RDD) of OJAI documents and saving a Spark RDD into a MapR-DB JSON table. (An RDD is the base format for storing data for use by Spark.)
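The load-and-save round trip described above can be pictured with a small, Spark-free sketch. This is plain Python and does not use the connector: the dictionaries stand in for MapR-DB JSON tables, and `load_table` and `save_table` are hypothetical names chosen for illustration, not the connector's API.

```python
import json


def load_table(table):
    """Parse every stored JSON string into a dict (one dict per document)."""
    return [json.loads(raw) for raw in table.values()]


def save_table(docs, table):
    """Write each document back as a JSON string, keyed by its _id field."""
    for doc in docs:
        table[doc["_id"]] = json.dumps(doc, sort_keys=True)


# In-memory stand-in for a source JSON table (assumption: keyed by _id).
source_table = {
    "u1": '{"_id": "u1", "name": "Ada", "visits": 3}',
    "u2": '{"_id": "u2", "name": "Lin", "visits": 12}',
}

# Load the documents, keep the frequent visitors, and tag them.
docs = load_table(source_table)
frequent = [dict(doc, frequent=True) for doc in docs if doc["visits"] >= 10]

# Save the transformed documents into a second stand-in table.
target_table = {}
save_table(frequent, target_table)
```

In a real job the list of dictionaries would be a distributed RDD of OJAI documents, but the shape of the work is the same: load, transform, save.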

The connector includes a set of APIs that enable MapR users to write applications that consume MapR-DB JSON tables and use them in Spark. It is a companion to the MapR-DB Binary Connector for Apache Spark, which can be used to write applications that consume HBase binary tables and use them in Spark.

The connector has two APIs that let you load data from a MapR-DB JSON table to a Spark RDD or save a Spark RDD to a MapR-DB JSON table. It also provides support for Scala bean classes, has a custom partitioner that allows you to partition data for better performance, and supports data locality. When the connector reads data from MapR-DB, it uses the data locality feature of MapR-DB to spawn the Spark executors.
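To give a feel for what a custom partitioner does, here is a minimal sketch in plain Python. It assumes range partitioning on the document `_id` with user-supplied split keys; the article does not say how the connector's partitioner works internally, and `RangePartitioner` and `partition_documents` are hypothetical names, not part of the connector.

```python
from bisect import bisect_right


class RangePartitioner:
    """Route keys to partitions using sorted split keys (assumed scheme)."""

    def __init__(self, split_keys):
        self.split_keys = sorted(split_keys)

    @property
    def num_partitions(self):
        # N split keys divide the key space into N + 1 ranges.
        return len(self.split_keys) + 1

    def partition_for(self, key):
        # Keys below the first split go to partition 0; a key equal to a
        # split key goes to the partition that starts at that split.
        return bisect_right(self.split_keys, key)


def partition_documents(docs, partitioner):
    """Group documents into one bucket per partition, by their _id."""
    buckets = [[] for _ in range(partitioner.num_partitions)]
    for doc in docs:
        buckets[partitioner.partition_for(doc["_id"])].append(doc)
    return buckets
```

Keeping documents with neighbouring keys in the same partition is one common way to line Spark partitions up with how a table is physically split, which is the kind of performance gain a custom partitioner is aiming for.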

The Native Spark Connector also supports the Spark DataFrame and Dataset APIs, so HBase and MapR-DB binary tables can be queried directly with Spark. The advantage is that this removes any intermediary layers, making it easier to build faster data pipelines and to reduce the latency associated with data movement.
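As a rough picture of what querying through a DataFrame-style API means, the sketch below projects JSON-like documents onto a fixed list of columns and then runs a select-with-filter over the resulting rows. It is plain Python, not Spark, and `to_rows` and `select_where` are hypothetical helpers made up for this illustration.

```python
def to_rows(docs, columns):
    """Project each document onto the given columns; missing fields become None."""
    return [tuple(doc.get(col) for col in columns) for doc in docs]


def select_where(rows, columns, wanted, predicate):
    """Return the wanted columns of every row for which predicate(row_dict) is true."""
    positions = [columns.index(col) for col in wanted]
    return [
        tuple(row[i] for i in positions)
        for row in rows
        if predicate(dict(zip(columns, row)))
    ]
```

For example, with columns `["_id", "name", "age"]`, a document that has no `age` field becomes a row with `None` in that position, and a filter such as "age over 30" simply skips it.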

More Information

MapR-DB OJAI Documentation


 


Last Updated ( Monday, 05 June 2017 )