Amazon Releases AWS Glue 5
Written by Kay Ewbank   
Monday, 10 February 2025

Amazon has announced the general availability of AWS Glue 5.0, with improved performance, enhanced security, and support for Amazon SageMaker Unified Studio and SageMaker Lakehouse.

AWS Glue is a serverless data integration service that Amazon says makes it easier to discover, prepare, move, and integrate data from multiple sources for analytics, machine learning and application development.

Glue includes a collection of libraries, engines, and tools developed by the open source community. AWS Glue consists of a Data Catalog, which is a central metadata repository; an ETL engine that can automatically generate Scala or Python code; a flexible scheduler that handles dependency resolution, job monitoring, and retries; and AWS Glue DataBrew for cleaning and normalizing data with a visual interface.

The performance and security improvements in AWS Glue 5.0 come largely from upgrading the engine to Apache Spark 3.5.2, Python 3.11, and Java 17. Amazon says that Glue 5.0 uses the AWS performance-optimized Spark runtime, which it claims is 3.9 times faster than open source Spark. According to Amazon, this and other changes mean Glue 5.0 is 32% faster than AWS Glue 4.0 and reduces costs by 22%.
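As a rough illustration of how a job would opt into the new runtime, the sketch below builds the keyword arguments for boto3's Glue `create_job` call with `GlueVersion` set to `"5.0"`. The job name, role ARN, script location, region, and worker settings are placeholders chosen for the example, not values from Amazon's announcement.

```python
def glue5_job_definition(name, role_arn, script_location, workers=2):
    """Build keyword arguments for boto3's glue.create_job.

    All argument values are supplied by the caller; nothing here is
    taken from Amazon's announcement except the version string.
    """
    return {
        "Name": name,
        "Role": role_arn,
        "GlueVersion": "5.0",  # selects the Glue 5.0 runtime
        "Command": {
            "Name": "glueetl",  # Spark ETL job type
            "ScriptLocation": script_location,
            "PythonVersion": "3",
        },
        "WorkerType": "G.1X",  # assumed worker size for the example
        "NumberOfWorkers": workers,
    }


def create_glue5_job(job_definition, region_name="us-east-1"):
    """Submit the definition to AWS; needs boto3 and valid credentials."""
    import boto3  # imported lazily so the builder above runs without it

    glue = boto3.client("glue", region_name=region_name)
    return glue.create_job(**job_definition)["Name"]


# Hypothetical values, shown only to exercise the builder.
job = glue5_job_definition(
    name="example-glue5-job",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
    script_location="s3://example-bucket/scripts/job.py",
)
print(job["GlueVersion"])
```

Keeping the definition as a plain dictionary means it can be inspected or version-controlled before anything is sent to AWS.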

Glue 5.0 also updates its open table format support to Apache Hudi 0.15.0, Apache Iceberg 1.6.1, and Delta Lake 3.2.0. This means users get stronger tools for improving performance, cost, governance, and privacy in their data lakes.
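To give a feel for what using one of these table formats involves, here is a minimal sketch that assembles Glue job arguments for Apache Iceberg. It assumes the `--datalake-formats` job parameter and the Spark catalog settings described in the AWS Glue and Iceberg documentation; the catalog name `glue_catalog` and the warehouse path are placeholders.

```python
def iceberg_job_arguments(warehouse_path, catalog_name="glue_catalog"):
    """Return Glue job arguments that enable Iceberg for a Spark job.

    catalog_name and warehouse_path are illustrative; the property
    keys follow the Iceberg Spark/AWS configuration conventions.
    """
    prefix = f"spark.sql.catalog.{catalog_name}"
    spark_conf = {
        "spark.sql.extensions":
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.warehouse": warehouse_path,
        f"{prefix}.catalog-impl": "org.apache.iceberg.aws.glue.GlueCatalog",
        f"{prefix}.io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
    }
    # Glue takes a single --conf argument, so further settings are
    # chained inside its value with additional " --conf " separators.
    pairs = [f"{key}={value}" for key, value in spark_conf.items()]
    return {
        "--datalake-formats": "iceberg",
        "--conf": " --conf ".join(pairs),
    }


args = iceberg_job_arguments("s3://example-bucket/warehouse/")
for key, value in args.items():
    print(key, value)
```

The resulting dictionary would typically be passed as the `DefaultArguments` of a job definition.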

AWS Glue 5.0 also adds Spark-native fine-grained access control with AWS Lake Formation, meaning users can apply table-, column-, row-, and cell-level permissions on Amazon S3 data lakes.
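A column-level permission of this kind might be expressed as follows. This is a sketch only: it builds the request for boto3's Lake Formation `grant_permissions` call, and it assumes the Glue job is then run with the `--enable-lakeformation-fine-grained-access` parameter set to `true`, as the AWS documentation describes. The principal ARN, database, table, and column names are invented for the example.

```python
def column_select_grant(principal_arn, database, table, columns):
    """Build keyword arguments for lakeformation.grant_permissions
    that allow SELECT on the listed columns only."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


def apply_grant(grant, region_name="us-east-1"):
    """Send the grant to AWS; needs boto3 and valid credentials."""
    import boto3  # imported lazily so the builder above runs without it

    boto3.client("lakeformation", region_name=region_name).grant_permissions(**grant)


# Job argument assumed to switch on fine-grained access control in Glue 5.0.
fgac_job_arguments = {"--enable-lakeformation-fine-grained-access": "true"}

# Hypothetical principal and table, for illustration only.
grant = column_select_grant(
    principal_arn="arn:aws:iam::123456789012:role/ExampleAnalystRole",
    database="sales_db",
    table="orders",
    columns=["order_id", "order_date"],
)
print(grant["Resource"]["TableWithColumns"]["ColumnNames"])
```

Row-level and cell-level restrictions are handled in Lake Formation through data filters rather than a column list, so they are not covered by this sketch.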

Glue 5.0 also adds support for SageMaker Lakehouse, which lets customers unify their data across Amazon Simple Storage Service (Amazon S3) data lakes and Amazon Redshift data warehouses. Its aim is to let organizations build analytics and AI/ML applications on a single copy of data. SageMaker Lakehouse can also be used to access and query data in place with any Apache Iceberg-compatible tools and engines.

AWS Glue 5 is available now.

More Information

Amazon Glue Webpage

Related Articles

Amazon Announces AWS Glue Data Quality

AWS Glue 4 Adds Pandas Support

Amazon Open Sources Python Library for AWS Glue
