Project Daytona
Written by Kay Ewbank   
Tuesday, 19 July 2011

Microsoft has developed an iterative MapReduce runtime for Windows Azure, code-named Daytona, and it might become open source.

MapReduce is a programming model for processing very large data sets in parallel across many machines. It was originally created by Google to make it easier to develop large-scale web search apps in data centers. MapReduce systems are designed to overcome the difficulty of making large volumes of data available to web-based or cloud applications.

Daytona makes use of compute and storage services in Azure and streams the data that is being accessed by applications. It provides dynamic data partitioning based on Azure’s cloud storage services. According to the download site, Daytona is designed to support a wide class of data analytics and machine-learning algorithms. It can scale to hundreds of server cores for analysis of distributed data. Project Daytona was developed as part of the eXtreme Computing Group’s Cloud Research Engagement Initiative.

In MapReduce, the map step takes a request for a computing task and splits it into smaller sub-tasks. These are distributed to worker nodes, which carry out the processing and pass their part of the answer back to the master node. In the reduce step, the master node combines the answers from all the worker nodes to produce the answer to the original problem.
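To make the map and reduce steps concrete, here is a minimal single-process sketch in Python that counts words. It is not Daytona code and makes no use of Azure. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and the "worker nodes" are simulated by an ordinary loop.

```python
from collections import defaultdict


def map_phase(chunk):
    """Worker-side map: emit a (word, 1) pair for every word in one chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped):
    """Group the emitted values by key so each key can be reduced on its own."""
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all the partial answers for one key."""
    return key, sum(values)


def word_count(chunks):
    # Each chunk stands in for a sub-task; a real runtime would hand
    # these to separate worker nodes instead of looping over them here.
    mapped = [map_phase(chunk) for chunk in chunks]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


chunks = ["the cloud stores data", "the workers process data"]
counts = word_count(chunks)
print(counts)
```

Splitting the input into chunks corresponds to creating the sub-tasks, and the final dictionary is the combined answer that the master node would return.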



You can use Daytona in your apps by submitting models written as map and reduce functions to the Daytona service. Daytona then executes your algorithm across multiple Azure virtual machines. You have to write a map function and a reduce function, along with a controller, which is essentially embedded application control code that handles tasks such as job configuration, submission and management.
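The division of labor between map function, reduce function and controller can be sketched as follows. This is plain Python run in one process, not the Daytona API, and every name in it (`map_fn`, `reduce_fn`, `controller`, `max_iterations`, `tolerance`) is hypothetical. It assumes a simple one-dimensional clustering job, chosen only because it needs repeated map-and-reduce rounds, which is the kind of iterative algorithm the runtime is aimed at.

```python
def map_fn(point, centroids):
    """Map: tag one data point with the index of its nearest centroid."""
    nearest = min(range(len(centroids)), key=lambda i: abs(point - centroids[i]))
    return nearest, point


def reduce_fn(points):
    """Reduce: average the points that were assigned to one centroid."""
    return sum(points) / len(points)


def controller(data, centroids, max_iterations=20, tolerance=1e-9):
    """Hypothetical controller: runs map/reduce rounds and decides when to stop."""
    for _ in range(max_iterations):
        # Map round: a real runtime would spread these calls over many VMs.
        groups = {}
        for point in data:
            key, value = map_fn(point, centroids)
            groups.setdefault(key, []).append(value)

        # Reduce round: a centroid with no assigned points keeps its old value.
        updated = [
            reduce_fn(groups[i]) if i in groups else centroids[i]
            for i in range(len(centroids))
        ]

        # Stop once the centroids no longer move between rounds.
        if all(abs(a - b) <= tolerance for a, b in zip(updated, centroids)):
            break
        centroids = updated
    return centroids


result = controller([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [0.0, 5.0])
print(result)
```

In this sketch the controller owns the loop and the stopping rule, while the map and reduce functions stay small and stateless, so they could be run on any worker.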


The best-known and most widely used MapReduce implementation is Hadoop, which is open source. Microsoft has suggested that Daytona may also go open source "pending community feedback".

For more information and to download:

Project Daytona: Iterative MapReduce on Windows Azure

 

 


Last Updated ( Tuesday, 19 July 2011 )