Apache Samza Adds Container Placements API
Written by Kay Ewbank   
Thursday, 09 July 2020

Samza, Apache's distributed stream processing framework, has been updated to version 1.5. The release brings a simplified job submission workflow that improves security, and the ability to move containers without having to restart an application.

 

Samza was originally developed by LinkedIn, alongside Kafka, before being made open source and taken over by the Apache Software Foundation. It was designed to provide a simple way to develop and run stream processing jobs that can be used by non-programmers as well as developers. It uses Apache Kafka for messaging, Apache Hadoop YARN for fault tolerance, processor isolation, security, and resource management, and RocksDB for local state support. It has a simple callback-based “process message” API comparable to MapReduce, and supports managed state via snapshotting and restoration of a stream processor’s state.
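To make the callback idea concrete, here is a minimal sketch in plain Python. It is not the Samza API: the class WordCountTask, its process, snapshot and restore methods, and the run helper are all hypothetical names chosen for illustration. It only shows the general pattern of a per-message callback that updates state which can be snapshotted and later restored.

```python
from typing import Callable, Dict, List, Tuple


class WordCountTask:
    """Toy stream task (hypothetical, not a Samza class)."""

    def __init__(self) -> None:
        # Local state owned by the task: running count per word.
        self.counts: Dict[str, int] = {}

    def process(self, message: str, emit: Callable[[Tuple[str, int]], None]) -> None:
        # Called once per incoming message, much like a map step.
        for word in message.split():
            self.counts[word] = self.counts.get(word, 0) + 1
            emit((word, self.counts[word]))

    def snapshot(self) -> Dict[str, int]:
        # Copy the state so later updates do not alter the snapshot.
        return dict(self.counts)

    def restore(self, snapshot: Dict[str, int]) -> None:
        # Replace the current state with a previously taken snapshot.
        self.counts = dict(snapshot)


def run(task: WordCountTask, messages: List[str]) -> List[Tuple[str, int]]:
    # Stand-in for the framework: feed each message to the callback.
    output: List[Tuple[str, int]] = []
    for message in messages:
        task.process(message, output.append)
    return output
```

In this sketch, a replacement task that calls restore with an earlier snapshot carries on counting from where the original task left off, which is the behavior managed state is meant to provide.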

This release adds a Container Placements API, which lets you move or restart one or more containers (either active or standby) of your cluster-based applications from one host to another without restarting the application. The API can also be used to build maintenance, balancing and remediation tools.
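As an illustration of the kind of maintenance tool that could sit on top of such an API, the sketch below plans which containers to move off a host that is about to be taken down. It is a toy model only: MoveRequest, plan_drain and the host and container names are hypothetical, and nothing here calls or reproduces the actual Samza Container Placements API.

```python
from typing import Dict, List, NamedTuple


class MoveRequest(NamedTuple):
    """Hypothetical record describing one container move."""
    container_id: str
    source_host: str
    destination_host: str


def plan_drain(
    placements: Dict[str, str],
    host_to_drain: str,
    spare_hosts: List[str],
) -> List[MoveRequest]:
    """Plan moves for every container on host_to_drain.

    placements maps container id to its current host. Each container
    is sent to the spare host that currently has the fewest containers.
    """
    if not spare_hosts:
        raise ValueError("at least one spare host is required")
    if host_to_drain in spare_hosts:
        raise ValueError("the drained host cannot also be a spare host")

    # Current number of containers on each candidate destination.
    load = {host: 0 for host in spare_hosts}
    for host in placements.values():
        if host in load:
            load[host] += 1

    moves: List[MoveRequest] = []
    for container_id in sorted(placements):
        if placements[container_id] != host_to_drain:
            continue  # Containers on other hosts are left untouched.
        # Pick the least loaded host, breaking ties by host name.
        target = min(spare_hosts, key=lambda h: (load[h], h))
        load[target] += 1
        moves.append(MoveRequest(container_id, host_to_drain, target))
    return moves
```

A real tool would then submit each planned move through the placements API and wait for it to complete; that step is left out here because its details are not covered in this article.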

Job Runner has been simplified and now just submits the Samza job to the YARN Resource Manager (RM) without executing any user code. Job planning happens on the ClusterBasedJobCoordinator instead. The developers say this simplified workflow addresses security requirements where job submissions need to be isolated in order to execute user code. It also eases the operational problem of deployment failures that could happen at multiple places.

Samza now has better facilities for managing and monitoring local state, with the addition of key-value (KV) store metrics for RocksDB and a fix that makes getting store names return the correct names when side inputs are present. The Samza SQL API has been improved with support for subqueries in joins, and the argument types of Samza SQL UDFs are now validated at the execution planning phase. There's also a new system producer for Azure blob storage.

 

More Information

Samza Site

Related Articles

Apache Samza Adds SQL

Apache Bigtop Adds OpenJDK 8 Support 

Apache Fluo Improves Spark Integration

Kafka 1 Becomes More Tolerant

Comparing Kafka To RabbitMQ

Apache Kafka Adds New Streams API

GoKa Stream Processing For Kafka

 


Last Updated ( Thursday, 09 July 2020 )