Apache Druid Adds Ranger Integration
Written by Kay Ewbank   
Monday, 24 August 2020

Apache Druid has been updated with better performance, easier and more flexible data ingestion, and Apache Ranger authorization integration.

Druid is a cloud-native, stream-native analytics database designed for workflows where fast queries and fast ingestion really matter. It is built for instant data visibility, ad-hoc queries, operational analytics, and high concurrency, and provides an open source alternative to data warehouses. Druid was originally developed at a startup called Metamarkets to power an all-in-one analytics solution for programmatic digital advertising.


The performance improvements come from vectorized query support, which was introduced in Druid 0.16 and is now turned on by default because it is considered stable and "battle-tested". GroupBy and Timeseries query types can run in vectorized mode, which speeds up query execution by processing rows in batches rather than one at a time, leading to a performance gain of between two and five times.
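As a rough illustration of how vectorization can be controlled per query, the sketch below builds a request body for Druid's SQL API with the `vectorize` query context parameter, which accepts `true`, `false` or `force`. The helper name `build_sql_request` and the `wikipedia` datasource are hypothetical, used only for this example; the body would normally be sent as a POST to the cluster's `/druid/v2/sql` endpoint.

```python
import json

# Values the "vectorize" query context parameter accepts.
ALLOWED_VECTORIZE = {"true", "false", "force"}


def build_sql_request(sql: str, vectorize: str = "true") -> str:
    """Return a JSON request body for Druid's SQL API.

    This helper is hypothetical: it only assembles the payload and
    does not contact a Druid cluster.
    """
    if vectorize not in ALLOWED_VECTORIZE:
        raise ValueError(f"vectorize must be one of {sorted(ALLOWED_VECTORIZE)}")
    payload = {
        "query": sql,
        # "force" makes the query fail if it cannot be vectorized,
        # which helps when checking whether a query benefits from it.
        "context": {"vectorize": vectorize},
    }
    return json.dumps(payload)


if __name__ == "__main__":
    # "wikipedia" is a placeholder datasource name.
    body = build_sql_request(
        "SELECT channel, COUNT(*) AS edits FROM wikipedia GROUP BY channel",
        vectorize="force",
    )
    print(body)
```

Setting the value to `false` for one run and `force` for another is a simple way to compare timings for the same GroupBy or Timeseries query.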

Easier and more flexible ingestion is another improvement. More data sources are supported: specifically, a newly added SqlInputSource lets you ingest data from MySQL and Postgres databases. Native batch ingestion has also been improved to support Avro Object Container Files. Until now, data from those sources had to be translated into intermediate file formats that Druid could consume. With this release, if you have data in MySQL, Postgres, or Avro format, you can load it directly into Druid in a single step.
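To give an idea of what the SQL input source looks like in practice, here is a minimal sketch that assembles the `inputSource` fragment of a native batch ingestion spec. The field names (`type`, `database`, `connectorConfig`, `connectURI`, `sqls`) follow the layout described in the Druid documentation as best I recall it, so check them against the docs for your version. The helper name, connection URI, credentials and table name are all placeholders, and the corresponding MySQL or Postgres extension is assumed to be loaded on the cluster.

```python
import json
from typing import Dict, List


def build_sql_input_source(
    db_type: str,
    connect_uri: str,
    user: str,
    password: str,
    sqls: List[str],
) -> Dict:
    """Build the inputSource fragment for a SQL-based batch ingestion.

    Hypothetical helper: it only returns a dictionary and does not
    submit anything to Druid.
    """
    if db_type not in ("mysql", "postgresql"):
        raise ValueError("db_type must be 'mysql' or 'postgresql'")
    if not sqls:
        raise ValueError("at least one SQL query is required")
    return {
        "type": "sql",
        "database": {
            "type": db_type,
            "connectorConfig": {
                "connectURI": connect_uri,
                "user": user,
                "password": password,
            },
        },
        # Each query's result set is read as a batch of input rows.
        "sqls": list(sqls),
    }


if __name__ == "__main__":
    # All connection details below are placeholders.
    source = build_sql_input_source(
        db_type="mysql",
        connect_uri="jdbc:mysql://db.example.com:3306/sales",
        user="druid_reader",
        password="change-me",
        sqls=["SELECT * FROM orders"],
    )
    print(json.dumps(source, indent=2))
```

The resulting fragment would sit inside the `ioConfig` section of an ingestion spec, alongside the usual `dataSchema` that names the target datasource, timestamp column and dimensions.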

Apache Ranger authorization integration has been added to this release. Ranger is an open-source security solution for the Hadoop ecosystem, and the new integration with Druid means cluster administrators can restrict access to data sources by granting read-only or read-write permissions.

Cloud platform support has also been improved. Druid now supports Alibaba Object Storage Service, the native object storage solution offered by Alibaba Cloud, as deep storage. Another cloud improvement is that the Druid Overlord now supports autoscaling using managed instance groups on the Google Compute Engine platform.


More Information

Druid Home Page

Related Articles

Apache Druid Improves Compaction

Kafka 2 Adds Support For ACLs

Kafka Graphs Framework Extends Kafka Streams

Amazon Introduces Kinesis Analytics

Cloudera Extends Apache HBase To Use Amazon S3

Hadoop 3 Adds HDFS Erasure Coding

Amazon Redshift Updates

 


Last Updated ( Monday, 24 August 2020 )