Apache Ignite Adds Spark DataFrames Support
Written by Kay Ewbank   
Monday, 29 March 2021

Apache Ignite, a distributed database for high-performance computing with in-memory speed, has been updated with support for Spark DataFrames and machine learning.

Ignite can be used as a traditional SQL database via JDBC drivers, ODBC drivers, or its own native SQL APIs. By default, it runs purely in-memory, but clusters can be configured to run on a mix of disk and memory.

It supports co-located compute in Java, Scala, Kotlin, C#, and C++, meaning that you can create the equivalent of stored procedures in modern JVM languages, C#, or C++ to develop and execute custom tasks across a distributed database. Ignite can operate in a strongly consistent mode that provides full support for distributed ACID transactions across multiple cluster nodes, caches, tables, and partitions.

Ignite supports continuous queries. Rather than using a trigger to react to specific events, you write and run continuous queries in languages such as Java or C# that process streams of changes on both the database side and the application side.
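To make the idea concrete, here is a toy model in plain Python. It is not Ignite's API: the ToyCache class and the continuous_query, remote_filter and local_listener names are hypothetical, chosen only to illustrate a filter that runs where the data changes and a listener that runs in the application.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

# One change event: (key, old_value, new_value).
Event = Tuple[str, Any, Any]


@dataclass
class ToyCache:
    """In-memory stand-in for a cache. Hypothetical, not an Ignite class."""

    data: Dict[str, Any] = field(default_factory=dict)
    queries: List[Tuple[Callable[[Event], bool], Callable[[Event], None]]] = field(
        default_factory=list
    )

    def continuous_query(self, remote_filter, local_listener):
        # Register a standing query: the filter decides which changes matter,
        # the listener receives every change that passes the filter.
        self.queries.append((remote_filter, local_listener))

    def put(self, key, value):
        event = (key, self.data.get(key), value)
        self.data[key] = value
        for remote_filter, local_listener in self.queries:
            if remote_filter(event):   # "database side" in this toy model
                local_listener(event)  # "application side" in this toy model


cache = ToyCache()
alerts = []

# Only changes whose new value exceeds 100 reach the application.
cache.continuous_query(
    remote_filter=lambda e: e[2] > 100,
    local_listener=alerts.append,
)

cache.put("sensor-1", 42)    # filtered out
cache.put("sensor-2", 150)   # delivered
cache.put("sensor-1", 101)   # delivered, old value 42 is included

print(alerts)
```

Unlike a trigger, the query stays registered and keeps delivering matching changes for as long as it is open; in this sketch that is simply the lifetime of the ToyCache object.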

Highlights of the latest version of the distributed database are the general availability of its machine learning feature and support for Spark DataFrames.

Ignite machine learning has TensorFlow integration as well as built-in algorithms and tools, so it can be used to build scalable machine learning models without having to transfer data out of Ignite. You can train, deploy, evaluate, and update your machine learning and deep learning models continuously and at scale.

Perhaps even more strikingly, the latest version adds support for Spark DataFrames. The announcement emphasizes that this isn't a joke or misprint, saying that many early Ignite adopters have been hoping for this for years. A Spark DataFrame is a distributed collection of data organized into named columns, a structure that lets Spark use the Catalyst query optimizer to produce more efficient query execution plans.

The developers say that Ignite expands DataFrame, simplifying development and improving data access times whenever Ignite is used as memory-centric storage for Spark. The support means Ignite users can share data and state across Spark jobs by writing and reading DataFrames to and from Ignite. They can also improve the performance of SparkSQL queries by optimizing Spark query execution plans with the Ignite SQL engine, which includes advanced indexing.
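The performance claim can be illustrated with a toy model in plain Python. This is neither Spark nor Ignite code: ToyTable and both query functions are hypothetical names, and the sketch only assumes that handing a filter to a store with an index lets the store skip rows that a scan-then-filter plan would have to read.

```python
from collections import defaultdict


class ToyTable:
    """Rows with named columns plus one index. Hypothetical, not an Ignite class."""

    def __init__(self, rows, indexed_column):
        self.rows = rows
        self.indexed_column = indexed_column
        self.index = defaultdict(list)  # column value -> row positions
        for pos, row in enumerate(rows):
            self.index[row[indexed_column]].append(pos)
        self.rows_read = 0  # counts how many rows the store had to touch

    def full_scan(self):
        self.rows_read += len(self.rows)
        return list(self.rows)

    def lookup(self, value):
        positions = self.index.get(value, [])
        self.rows_read += len(positions)
        return [self.rows[p] for p in positions]


def query_without_pushdown(table, column, value):
    # The engine pulls every row out of the store, then filters.
    return [r for r in table.full_scan() if r[column] == value]


def query_with_pushdown(table, column, value):
    # The filter is handed to the store, which answers from its index.
    if column == table.indexed_column:
        return table.lookup(value)
    return query_without_pushdown(table, column, value)


rows = [{"id": i, "city": "Paris" if i % 4 == 0 else "Oslo"} for i in range(1000)]
table = ToyTable(rows, indexed_column="city")

slow = query_without_pushdown(table, "city", "Paris")
reads_slow = table.rows_read

table.rows_read = 0
fast = query_with_pushdown(table, "city", "Paris")
reads_fast = table.rows_read

print(len(fast), reads_slow, reads_fast)
```

Both plans return the same 250 rows, but the pushed-down plan touches 250 rows instead of 1000. The real saving in a cluster depends on the data, the indexes, and what the optimizer decides to push down.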

More Information

Apache Ignite Website

Related Articles

Spark 3 Improves Python and SQL Support

Apache Superset Reaches Top Level Project Status

Apache Daffodil Now Top Level Project

Facebook Apollo NoSQL Database  

Last Updated ( Monday, 29 March 2021 )