Google Launches Cloud Dataproc
Written by Kay Ewbank   
Friday, 02 October 2015

Google has launched a beta version of Google Cloud Dataproc, a managed service intended to make running Hadoop and Spark quicker and easier.

Google continues to expand its range of cloud services for working with Big Data (see Google Announces Big Data the Cloud Way). Now available in beta, Cloud Dataproc is a managed Spark and Hadoop service that lets you use open source data tools for batch processing, querying, streaming, and machine learning. The aim is to let you create clusters quickly, manage them easily, and save money by turning clusters off when you don't need them.

Clusters can range in size from three nodes up to hundreds of nodes, and the service is priced at 1 cent per virtual CPU in your cluster per hour, on top of the usual cost of running virtual machines and data storage. Clusters can include preemptible instances, which have lower compute prices, and you're charged by the minute with a ten-minute minimum billing period. Google claims you'll be able to start, scale, and shut down a cluster in 90 seconds or less.
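To make the pricing concrete, here is a minimal Python sketch of the Cloud Dataproc fee alone, based only on the figures quoted above. The function name `dataproc_fee`, the choice of US dollars, and the rounding of partial minutes up to the next whole minute are assumptions made for illustration, not Google's published billing rules, and the result excludes the separate virtual machine and storage charges.

```python
import math

# Figures quoted in the article; the currency (US dollars) is an assumption.
RATE_PER_VCPU_HOUR = 0.01   # 1 cent per virtual CPU per hour
MINIMUM_BILLED_MINUTES = 10  # ten-minute minimum billing period


def dataproc_fee(vcpus: int, runtime_minutes: float) -> float:
    """Estimate the Cloud Dataproc fee for one cluster run.

    Hypothetical helper: covers only the per-vCPU Dataproc charge,
    not the underlying virtual machine or storage costs.
    """
    if vcpus < 0 or runtime_minutes < 0:
        raise ValueError("vcpus and runtime_minutes must be non-negative")
    # Assumption: partial minutes are rounded up to the next whole minute.
    billed_minutes = max(MINIMUM_BILLED_MINUTES, math.ceil(runtime_minutes))
    fee = vcpus * billed_minutes * RATE_PER_VCPU_HOUR / 60
    return round(fee, 4)


if __name__ == "__main__":
    # A cluster with 24 virtual CPUs in total, run for 45 minutes.
    print(dataproc_fee(24, 45))
    # The same cluster run for 3 minutes is billed for the 10-minute minimum.
    print(dataproc_fee(24, 3))
```

Under these assumptions, a cluster with 24 virtual CPUs running for 45 minutes would incur a Dataproc fee of roughly 18 cents, and the same cluster run for only three minutes would be billed as if it had run for ten.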

The service comes with built-in integration with other Google Cloud Platform services, such as BigQuery, Cloud Storage, Cloud Bigtable, Cloud Logging, and Cloud Monitoring. You can interact with clusters and Spark or Hadoop jobs through the Google Developers Console, the Google Cloud SDK, or the Cloud Dataproc REST API. When you're done with a cluster, it can be turned off to save money, and data is safe because Cloud Dataproc is integrated with Cloud Storage, BigQuery, and Cloud Bigtable. A free 60-day trial of the Google Cloud Platform is available.

Because the service is based around Spark and Hadoop, along with other elements of the ecosystem such as Pig and Hive, developers will be able to begin work without needing to learn new tools or APIs, and existing projects or ETL pipelines can be moved to the new service without redevelopment.


More Information

Google Cloud Dataproc

Related Articles

Google Announces Big Data the Cloud Way 

Google Moves On From MapReduce, Launches Cloud Dataflow

Google Cloud Dataflow SDK 

Google BigQuery Service

Major Update to Google BigQuery

BigQuery Now Open to All 

 
