Snowflake Support For Apache Iceberg Goes GA
Written by Nikos Vaggalis
Thursday, 29 August 2024
Snowflake has added support for the Iceberg table format and, as a result, can now work with data commonly found in data lakes and warehouses.
Lately there's a lot of talk around Apache Iceberg. What makes it so special? Enterprises frequently go beyond relational data stores, also hosting data on object stores suitable for their data lakes. If you use Amazon S3 as the underlying object store, you can store virtually any amount of data on it, all the way to exabytes. Iceberg, then, is an open table format specification that enables S3 data to be queried like SQL tables. It's important to note that Iceberg is not a query or storage engine; it's a specification. In place of the query engine put Snowflake. Iceberg allows Snowflake to work on those files with:
An example from Postgres, which natively cannot work with such formats, is the pg_lakehouse extension that enables Postgres to work with Iceberg by assuming the role of DuckDB. DuckDB is, of course, the alternative to SQLite for analytical workloads: local-first, embeddable and suitable for data science work. With pg_lakehouse, PostgreSQL is powered up with those high-performance analytical query engine capabilities too.
We first met Snowflake in 2022, see Snowflake Improves Developer Support. Now it's Snowflake's turn to embrace Iceberg too. It does so by treating Iceberg-compatible files as Snowflake tables and providing capabilities to interact directly with the underlying data. Iceberg tables combine the performance and query semantics of regular Snowflake tables with external cloud storage that customers manage. As such, they're deemed ideal for existing data lakes that customers cannot, or choose not to, store in Snowflake. Snowflake connects to your storage location using an external volume, and Iceberg tables incur no Snowflake storage costs.
To create an Iceberg table, you first create an external volume, which you then reference in the table's CREATE statement:
CREATE OR REPLACE ICEBERG TABLE customer_iceberg (
After that you can perform SQL DML operations on it.
While the addition of Iceberg is new, there's a lot of work going on, as laid out in the roadmap ahead.
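To make the table-creation steps described above more concrete, here is a minimal sketch of what the sequence might look like when Snowflake itself acts as the Iceberg catalog. It is not the article's original listing: the volume name, table name, columns, bucket and IAM role are all hypothetical placeholders, and the cloud-side trust setup between Snowflake and the S3 bucket is assumed to be in place already.

```sql
-- Hypothetical names throughout; replace the <...> placeholders with your own values.

-- 1. Define an external volume that points at customer-managed S3 storage.
CREATE OR REPLACE EXTERNAL VOLUME demo_iceberg_volume
  STORAGE_LOCATIONS = (
    (
      NAME = 'demo-s3-location'
      STORAGE_PROVIDER = 'S3'
      STORAGE_BASE_URL = 's3://<my_bucket>/'
      STORAGE_AWS_ROLE_ARN = '<my_iam_role_arn>'
    )
  );

-- 2. Create an Iceberg table that references the external volume.
--    CATALOG = 'SNOWFLAKE' assumes Snowflake manages the table metadata.
CREATE OR REPLACE ICEBERG TABLE demo_orders_iceberg (
  order_id      INTEGER,
  customer_name STRING
)
  CATALOG = 'SNOWFLAKE'
  EXTERNAL_VOLUME = 'demo_iceberg_volume'
  BASE_LOCATION = 'demo_orders_iceberg';

-- 3. Ordinary SQL DML then works against the table.
INSERT INTO demo_orders_iceberg (order_id, customer_name)
  VALUES (1, 'Alice');

UPDATE demo_orders_iceberg
  SET customer_name = 'Bob'
  WHERE order_id = 1;

SELECT order_id, customer_name
  FROM demo_orders_iceberg;
```

In this sketch the data and metadata files end up under the BASE_LOCATION path inside the bucket named by the external volume, which is why the table is billed by the cloud storage provider rather than as Snowflake storage.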
To conclude, Iceberg use is increasing, with vendors integrating it, or planning to integrate it, into their products. For Snowflake in particular, Iceberg support is handy since some organizations with regulatory or other constraints either are not able to store all of their data in Snowflake or prefer to store data externally in open formats. Snowflake may also be a good choice for those who already use the platform or those looking for a fully managed query engine.
More Information
Open, Interoperable Storage with Iceberg Tables, Now Generally Available
Related Articles
Snowflake Improves Developer Support
Last Updated (Wednesday, 04 September 2024)