Apache Flink ML 2.0 Released
Written by Kay Ewbank   
Thursday, 27 January 2022

Flink ML 2.0.0 has been released. Flink ML is a library that provides APIs and infrastructure for building stream-batch unified machine learning algorithms that are easy to use and performant, with (near-) real-time latency.

Apache Flink is an open source platform for distributed stream and batch data processing, with a streaming dataflow engine for data distribution and distributed computations over data streams.


The updated version of Flink ML is described as a major refactor of the earlier Flink ML library. Its new features extend the Flink ML API and the iteration runtime, and include support for stages with multiple inputs and multiple outputs, graph-based stage composition, and a new stream-batch unified iteration library.

The developers have also added five algorithm implementations in this release, which is the start of a long-term initiative to provide a large number of off-the-shelf algorithms in Flink ML.

The new support for stages requiring multi-input multi-output means that algorithm developers can assemble a machine learning workflow as a directed acyclic graph (DAG) of pre-defined stages. This workflow can then be configured and deployed without users knowing the implementation details of this graph. This improvement could considerably expand the applicability and usability of Flink ML.
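To make the idea concrete, here is a minimal sketch in plain Python of a workflow built as a DAG of stages, where a stage may take several inputs and produce several outputs. It is a toy model and not the Flink ML API: the `Graph` class, the stage functions and the `(upstream name, output index)` wiring convention are all assumptions made up for illustration.

```python
# Toy illustration only; none of these names come from Flink ML.
# A stage is a function: list of input tables -> list of output tables.

def split_stage(inputs):
    """One input, two outputs: even values and odd values."""
    (rows,) = inputs
    evens = [r for r in rows if r % 2 == 0]
    odds = [r for r in rows if r % 2 == 1]
    return [evens, odds]


def scale_stage(inputs):
    """One input, one output: multiply every value by 10."""
    (rows,) = inputs
    return [[r * 10 for r in rows]]


def union_stage(inputs):
    """Two inputs, one output: merge both tables and sort."""
    left, right = inputs
    return [sorted(left + right)]


class Graph:
    """Hypothetical DAG of stages, wired by (upstream name, output index)."""

    def __init__(self):
        self.stages = {}  # stage name -> callable
        self.wiring = {}  # stage name -> list of (upstream name, output index)

    def add_stage(self, name, fn, inputs):
        self.stages[name] = fn
        self.wiring[name] = inputs

    def run(self, sources, output):
        """Evaluate only what `output` needs; callers never see the wiring."""
        results = dict(sources)  # name -> list of output tables

        def resolve(name):
            if name not in results:
                ins = [resolve(up)[idx] for up, idx in self.wiring[name]]
                results[name] = self.stages[name](ins)
            return results[name]

        upstream, index = output
        return resolve(upstream)[index]


graph = Graph()
graph.add_stage("split", split_stage, [("source", 0)])
graph.add_stage("scale", scale_stage, [("split", 0)])
graph.add_stage("union", union_stage, [("scale", 0), ("split", 1)])

result = graph.run({"source": [[1, 2, 3, 4]]}, ("union", 0))
print(result)  # [1, 3, 20, 40]
```

The caller only supplies the source table and names the output it wants, which mirrors the point that a workflow can be deployed without its users knowing how the graph is put together.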

The next improvement is the addition of support for online learning with APIs exposing model data. The support has been added to handle situations where there's a long-running job that keeps processing training data and updating a machine learning model. The traditional Estimator/Transformer paradigm does not provide APIs to expose this model data in a streaming manner, meaning users have to repeatedly call fit() to update model data, which is very inefficient. The new release means model data can be exposed as an unbounded stream, and algorithm users can then transfer the model data to web servers in real-time and use the up-to-date model data to do online inference.
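The difference between calling fit() repeatedly and exposing model data as a stream can be sketched with a plain Python generator. This is an illustration of the pattern only, not Flink ML code: `model_stream`, `training_data` and the `Server` class are hypothetical names, and the "model" is a single weight fitted by stochastic gradient descent.

```python
from itertools import islice

# Toy illustration only; none of these names come from Flink ML.


def training_data():
    """Unbounded training source: endlessly repeats samples of y = 2x."""
    while True:
        for x in (1.0, 2.0, 3.0):
            yield x, 2.0 * x


def model_stream(training_stream, learning_rate=0.1):
    """Yield a new model version (a weight w, for y = w * x) per example."""
    w = 0.0
    for x, y in training_stream:
        # One stochastic gradient step on the squared error.
        w += learning_rate * (y - w * x) * x
        yield w


class Server:
    """Stand-in for a web server that serves the most recent model."""

    def __init__(self):
        self.w = 0.0

    def update(self, w):
        self.w = w

    def predict(self, x):
        return self.w * x


server = Server()
# The stream never ends, so this demo takes only the first 200 versions.
for w in islice(model_stream(training_data()), 200):
    server.update(w)

print(round(server.predict(4.0), 6))
```

The training job never finishes and never has to be restarted: each model update is pushed downstream as soon as it exists, and inference always uses the latest one.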

Other improvements include simpler parameter handling for algorithms, and new tools for composing a DAG of stages into a new stage. There's also a new stream-batch unified iteration library that can transmit records back to preceding operators and track the progress of rounds inside the iteration.
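The two capabilities of the iteration library, feeding records back to earlier operators and tracking rounds, can be pictured with a small plain-Python loop. This is a sketch of the concept under simplifying assumptions (a single process, bounded input), not the Flink ML iteration API; `iterate` and `halve_until_small` are hypothetical names.

```python
# Toy illustration only; none of these names come from Flink ML.


def iterate(initial_records, body, max_rounds=100):
    """Run `body` on every record, round by round.

    `body(record, round_index)` returns (feedback, outputs). Feedback
    records re-enter the loop in the next round; outputs leave it.
    """
    records = list(initial_records)
    outputs = []
    rounds = 0
    while records and rounds < max_rounds:
        feedback = []
        for record in records:
            fed_back, emitted = body(record, rounds)
            feedback.extend(fed_back)
            outputs.extend(emitted)
        records = feedback
        rounds += 1  # every record of this round has been processed
    return outputs, rounds


def halve_until_small(value, round_index):
    """Feed the value back, halved, until it drops below 1.0."""
    if value < 1.0:
        return [], [(value, round_index)]
    return [value / 2.0], []


outputs, rounds = iterate([8.0, 3.0], halve_until_small)
print(outputs)  # [(0.75, 2), (0.5, 4)]
print(rounds)   # 5
```

Each record carries on circulating for as long as it needs, and the loop knows when a round is complete because every record belonging to that round has been handled.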

Flink ML 2.0 is available now.


More Information

Flink website

Related Articles

Apache Flink 1.9 Adds New Query Engine

Apache Flink 1.5.0 Adds Support For Broadcast State

Flink Gets Event-time Streaming

Flink Reaches Top Level Status




Last Updated ( Thursday, 27 January 2022 )