Apache Flink 1.9 Adds New Query Engine

Written by Kay Ewbank

Tuesday, 27 August 2019

Significant features on this path are batch-style recovery for batch jobs and a preview of the new Blink-based query engine for Table API and SQL queries.

Apache Flink is an open source platform for distributed stream and batch data processing. It consists of a streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations over data streams. Flink includes several APIs, including the DataSet API for static data embedded in Java, Scala, and Python; the DataStream API for unbounded streams embedded in Java and Scala; the Table API with a SQL-like expression language embedded in Java and Scala; and the streaming SQL API that enables SQL queries to be executed on streaming and batch tables, with a syntax is based on Apache Calcite.

flinklogo

The main improvements to the new version start with the addition of batch-style recovery for batch jobs, and a preview of the new Blink-based query engine for Table API and SQL queries.

The new batch-style recovery has significantly reduced the time to recover a batch job from a task failure. This covers DataSet, Table API and SQL jobs. Until this version, if a task failed, the recovery of a batch job involved canceling all tasks and restarting the whole job, voiding all progress. You can now configure Flink to limit the recovery to only those tasks that are in the same failover region, the set of tasks that are connected via pipelined data exchanges.

The Blink-based query engine preview is a development of the donation of Blink to Apache Flink. Blink’s query optimizer and runtime for the Table API and SQL have been integrated into Flink, and the query planner has been extended so that there are now two choices of pluggable query processors to execute Table API and SQL statements: the pre-1.9 Flink processor and the new Blink-based query processor. The Blink-based query processor offers better SQL coverage and improved performance for batch queries because it has more extensive query optimization including cost-based plan selection and more optimization rules. Because the query processor is still not fully integrated, in this release the original processor is still the default choice, though you can enable the Blink processor.

Elsewhere, the State Processor API is now fully available and can be used to read and write savepoints with Flink DataSet jobs. Finally, Flink 1.9 includes a reworked WebUI and previews of Flink’s new Python Table API and its integration with the Apache Hive ecosystem.

flinklogo

More Information

Flink website

Apache Flink 1.5.0 Adds Support For Broadcast State

Flink Gets Event-time Streaming

FLink Reaches Top Level Status

To be informed about new articles on I Programmer, sign up for our weekly newsletter, subscribe to the RSS feed and follow us on Twitter, Facebook or Linkedin.

SQL and Assembly Language In Decline In TIOBE Index
09/06/2025

This month's TIOBE Index is out and the headline asks the question "Where is SQL Going?" Its chart provides the answer of "down", a fate that also applied to Assembly Language. The language that is mo [ ... ]

+ Full Story

Angular 20 Improves Reactivity
10/06/2025

Angular 20 has been released, and the new version has moved a number of APIs to stable, along with new features in the template compiler to align it with TypeScript expressions and to improve the deve [ ... ]

+ Full Story

More News

Comments

or email your comment to: comments@i-programmer.info

Recent Articles

Recent Book Reviews

Popular Articles

More Information

Related Articles

Comments