Amazon Redshift Updates
Written by Kay Ewbank   
Thursday, 05 December 2019

Amazon has announced a number of updates to Redshift, its cloud-based data warehouse service.

Redshift data can be analyzed using ‘normal’ SQL-based tools and business intelligence applications. The service is designed to be easy to set up and manage - clusters can be created with a few clicks in the AWS Management Console, and queries can be distributed and parallelized across multiple nodes. To make Redshift easier to administer, Amazon has automated most of the common tasks associated with provisioning, configuring, monitoring, backing up, and securing a data warehouse. Redshift is based on technology that Amazon licensed from ParAccel, a company that was acquired in 2013 by Actian (formerly known as Ingres).

The updates announced at Amazon's re:Invent conference start with support for data lake export in Apache Parquet format. You can now unload the result of an Amazon Redshift query to your Amazon S3 data lake as Apache Parquet. According to Amazon, the Parquet format is up to twice as fast to unload and uses up to six times less storage in Amazon S3 than text formats.
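As a rough illustration, the export is driven by Redshift's UNLOAD command with a FORMAT AS PARQUET clause. The Python sketch below only assembles the statement text; it does not connect to a cluster. The helper name `build_unload_parquet`, the table, the bucket and the IAM role are all made-up placeholders.

```python
def build_unload_parquet(query: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Return an UNLOAD statement that writes a query result to S3 as Parquet.

    Nothing is executed here; the caller would run the returned text
    through whatever SQL client it already uses.
    """
    # UNLOAD takes the query as a quoted string, so single quotes inside
    # the query have to be doubled.
    escaped_query = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped_query}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS PARQUET;"
    )


# Hypothetical names, for illustration only.
statement = build_unload_parquet(
    query="SELECT * FROM sales WHERE region = 'EU'",
    s3_prefix="s3://example-data-lake/sales/eu_",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleUnloadRole",
)
print(statement)
```

The S3 value is a prefix rather than a file name, because the unload is normally split across several output files.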

The next improvement to be announced is a preview of support for federated querying. The Amazon Redshift Federated Query feature lets you query and analyze data across operational databases, data warehouses, and data lakes. With Federated Query, you can now integrate queries on live data in Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL with queries across your Amazon Redshift and Amazon S3 environments.
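To give a feel for how this might look in practice, the sketch below builds the two pieces of SQL involved: a CREATE EXTERNAL SCHEMA statement that maps a PostgreSQL schema into Redshift, and a query that joins a live PostgreSQL table to a local Redshift table. The statement shape follows the form in Amazon's documentation as best I understand it, and the preview syntax may differ. Every name here (schemas, tables, host, ARNs and the helper `build_external_schema`) is a hypothetical placeholder.

```python
def build_external_schema(
    local_schema: str,
    pg_database: str,
    pg_schema: str,
    host: str,
    iam_role_arn: str,
    secret_arn: str,
    port: int = 5432,
) -> str:
    """Return DDL that exposes a PostgreSQL schema to Redshift queries."""
    return (
        f"CREATE EXTERNAL SCHEMA {local_schema}\n"
        "FROM POSTGRES\n"
        f"DATABASE '{pg_database}' SCHEMA '{pg_schema}'\n"
        f"URI '{host}' PORT {port}\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"SECRET_ARN '{secret_arn}';"
    )


# Hypothetical connection details, for illustration only.
ddl = build_external_schema(
    local_schema="live_orders",
    pg_database="shop",
    pg_schema="public",
    host="example-aurora.cluster-abc.us-east-1.rds.amazonaws.com",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleFederatedRole",
    secret_arn="arn:aws:secretsmanager:us-east-1:123456789012:secret:example",
)

# Once the external schema exists, live PostgreSQL rows can be joined
# with a table stored in Redshift (here the assumed analytics schema).
federated_query = (
    "SELECT o.order_id, o.status, h.lifetime_value\n"
    "FROM live_orders.orders AS o\n"
    "JOIN analytics.customer_history AS h\n"
    "  ON h.customer_id = o.customer_id;"
)

print(ddl)
print(federated_query)
```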

Another improvement to queries in Redshift is the preview of Advanced Query Accelerator (AQUA) for Amazon Redshift. This is a new distributed and hardware-accelerated cache that Amazon says lets Redshift run up to ten times faster than any other cloud data warehouse. AQUA attempts to avoid the bottleneck caused by moving data from centralized storage to compute clusters for processing, where the network bandwidth needed to move the data limits performance. Instead, AQUA does a substantial share of data processing in place on its hardware-accelerated cache. Data-intensive tasks such as filtering and aggregation are carried out closer to the storage layer, minimizing data movement between where the data is stored and the compute clusters.
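The general idea of filtering and aggregating beside the data can be shown with a toy model. This is not how AQUA is implemented, and it says nothing about the hardware acceleration; it simply counts how many items would cross the network under each approach. The `Row` type, the function names and the sample data are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    region: str
    amount: int


def scan_then_filter(storage, region):
    """Ship every row to the compute layer, then filter and sum there."""
    moved = list(storage)  # every row crosses the network
    total = sum(row.amount for row in moved if row.region == region)
    return total, len(moved)


def filter_near_storage(storage, region):
    """Filter and sum beside the data; only one partial result is moved."""
    total = sum(row.amount for row in storage if row.region == region)
    return total, 1


storage = [Row("EU", 10), Row("US", 25), Row("EU", 5), Row("APAC", 40)]

print(scan_then_filter(storage, "EU"))     # (15, 4): four rows moved
print(filter_near_storage(storage, "EU"))  # (15, 1): one result moved
```

Both functions return the same total; only the amount of data moved differs, which is the cost that grows with table size.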

The final improvement to Redshift is support for materialized views - again, this is in preview. Materialized views can speed up query performance for repeated and predictable analytical workloads. They store pre-computed results of queries and maintain them by incrementally processing the latest changes made to the source tables. Any query that uses the materialized views gets the pre-computed results much faster. Materialized views can be created based on one or more source tables using filters, projections, inner joins, aggregations, grouping, functions and other SQL constructs.
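Incremental maintenance can be pictured with a small stand-in. The class below keeps pre-computed totals for something like SELECT key, SUM(value) ... GROUP BY key and, on refresh, applies only the changes made since the last refresh instead of rescanning the source. It is a simplified sketch of the principle, not Redshift's mechanism, and `IncrementalSumView` and its methods are hypothetical names.

```python
class IncrementalSumView:
    """Toy materialized view holding SUM(value) per key."""

    def __init__(self):
        self.totals = {}   # the stored, pre-computed result
        self.pending = []  # source-table changes not yet applied

    def record_insert(self, key, value):
        self.pending.append((key, value))

    def record_delete(self, key, value):
        # Deleting a row subtracts its value from the group total.
        self.pending.append((key, -value))

    def refresh(self):
        """Apply only the pending changes and return how many were applied."""
        applied = len(self.pending)
        for key, delta in self.pending:
            self.totals[key] = self.totals.get(key, 0) + delta
        self.pending.clear()
        return applied

    def query(self, key):
        """Answer from the stored totals without touching the source."""
        return self.totals.get(key, 0)


view = IncrementalSumView()
view.record_insert("EU", 10)
view.record_insert("US", 25)
view.refresh()

view.record_insert("EU", 5)
view.record_delete("US", 25)
applied = view.refresh()  # processes 2 changes, not the whole table

print(applied, view.query("EU"), view.query("US"))  # 2 15 0
```

Sums are easy to maintain this way because each change adjusts a total independently; other constructs, such as MIN or MAX after a delete, would need more bookkeeping.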

More details of all the new features can be found on the Redshift website. 

More Information

Amazon Redshift

Related Articles

Amazon Releases PartiQL, A One Stop Query Language

Amazon Updates Data Offerings

Amazon Redshift Ready For Data

Amazon Redshifts Big Data

New AWS Managed Services

Amazon RDS Adds Replication Feature

 

 

Last Updated ( Friday, 06 December 2019 )