Google Launches Cloud Dataproc
Written by Kay Ewbank   
Friday, 02 October 2015

Google has launched a beta version of Google Cloud Dataproc, a service that provides an alternative way to run and manage Hadoop and Spark more quickly and easily.

Google continues to expand its range of cloud services for working with Big Data (see Google Announces Big Data the Cloud Way). Now available in beta, Cloud Dataproc is a managed Spark and Hadoop service that lets you use open source data tools for batch processing, querying, streaming, and machine learning. The aim is to let you create clusters quickly, manage them easily, and save money by turning clusters off when you don't need them.

Clusters can range in size from three nodes up to hundreds of nodes, and the service is priced at 1 cent per virtual CPU in your cluster per hour, on top of the usual cost of running virtual machines and data storage. Clusters can include preemptible instances that have lower compute prices, and you're charged using minute-by-minute billing with a ten-minute minimum billing period. The claim is that you'll be able to start, scale, and shut down a cluster in 90 seconds or less.
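
To make the pricing concrete, here is a minimal Python sketch of how the Cloud Dataproc fee alone might be estimated from the figures quoted above. The function name, the rounding of partial minutes up to the next whole minute, and the example cluster size are assumptions made for illustration, not part of Google's announcement. The cost of the virtual machines and storage themselves is not included.

```python
import math

# Figures quoted in the article.
RATE_PER_VCPU_HOUR = 0.01   # USD per virtual CPU per hour
MINIMUM_BILLED_MINUTES = 10  # ten-minute minimum billing period


def dataproc_surcharge(vcpus, minutes_running):
    """Estimate the Cloud Dataproc fee in USD (hypothetical helper).

    Assumes partial minutes are rounded up to whole minutes; the
    article only says billing is minute-by-minute.
    """
    billed_minutes = max(math.ceil(minutes_running), MINIMUM_BILLED_MINUTES)
    return vcpus * RATE_PER_VCPU_HOUR * billed_minutes / 60


if __name__ == "__main__":
    # Hypothetical cluster: 3 nodes with 4 virtual CPUs each.
    vcpus = 3 * 4
    print(f"90 minutes: ${dataproc_surcharge(vcpus, 90):.2f}")
    print(f" 5 minutes: ${dataproc_surcharge(vcpus, 5):.2f}")
```

Under these assumptions a five-minute job is billed as ten minutes, so very short jobs pay the minimum, while longer jobs pay only for the minutes the cluster actually runs.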

The service comes with built-in integration with other Google Cloud Platform services, such as BigQuery, Cloud Storage, Cloud Bigtable, Cloud Logging, and Cloud Monitoring. You can interact with clusters and Spark or Hadoop jobs through the Google Developers Console, the Google Cloud SDK, or the Cloud Dataproc REST API. When you're done with a cluster, it can be turned off to save money, and data is safe because Cloud Dataproc is integrated with Cloud Storage, BigQuery, and Cloud Bigtable. A free 60-day trial of the Google Cloud Platform is available.

Because the service is based around Spark and Hadoop and other elements of the ecosystem such as Pig and Hive, developers will be able to begin work without needing to learn new tools or APIs, and existing projects or ETL pipelines can be moved to the new service without redevelopment.

More Information

Google Cloud Dataproc

Related Articles

Google Announces Big Data the Cloud Way 

Google Moves On From MapReduce, Launches Cloud Dataflow

Google Cloud Dataflow SDK 

Google BigQuery Service

Major Update to Google BigQuery

BigQuery Now Open to All 

 
