Apache Druid Adds Ranger Integration
Written by Kay Ewbank
Monday, 24 August 2020
Apache Druid has been updated with better performance, easier and more flexible data ingestion, and Apache Ranger authorization integration.

Druid is a modern cloud-native, stream-native, analytics database, designed for workflows where fast queries and fast ingestion really matter. It is designed for instant data visibility, ad-hoc queries, operational analytics, and handling high concurrency, and provides an open source alternative to data warehouses. Druid was originally developed at a startup called Metamarkets to power an all-in-one analytics solution for programmatic digital advertising.

The performance improvements come from vectorized queries. Support for these was introduced in Druid 0.16, and it is now turned on by default because it has stabilized and is considered "battle-tested". GroupBy and Timeseries query types can run in vectorized mode, which speeds up query execution by processing rows in batches rather than one at a time, giving a performance gain of between two and five times.

Easier and more flexible ingestion is another improvement. More data sources are supported: specifically, there is a newly added SqlInputSource that allows you to ingest data from MySQL and Postgres databases. Native batch ingestion has also been improved to support Avro Object Container Files. Until now, data from those sources had to be translated into an intermediate file format that Druid could consume. With this release, if you have data in MySQL, Postgres, or Avro format, you can load it directly into Druid in a single step.

Apache Ranger authorization integration has been added in this release. Ranger is an open-source security solution for the Hadoop ecosystem, and the new integration with Druid means cluster administrators can restrict access to data sources by granting read-only or read-write permissions.

Cloud platform support has also been improved. Druid now supports Alibaba Object Storage Service as Druid deep storage.
This is the native object storage solution offered by Alibaba Cloud. Another cloud improvement means that the Druid Overlord now supports autoscaling using managed instance groups on the Google Compute Engine platform.
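To make two of these features more concrete, here is a minimal Python sketch that builds the JSON payloads involved. It is an illustration rather than anything taken from the release itself. The first helper builds a native Timeseries query whose query context sets `vectorize`, which is how vectorized execution can be switched on or off per query. The second builds an `inputSource` of type `sql` for a native batch ingestion spec. The helper names, the data source name and the connection details are hypothetical, and the field names reflect our understanding of the Druid documentation, so check them against the docs for the version you run.

```python
import json


def timeseries_query(datasource, interval, vectorize="true"):
    """Build a native Timeseries query that counts rows per hour.

    The "vectorize" context value is assumed to accept "true", "false"
    or "force", as described in the Druid query context documentation.
    """
    if vectorize not in ("true", "false", "force"):
        raise ValueError("vectorize must be 'true', 'false' or 'force'")
    return {
        "queryType": "timeseries",
        "dataSource": datasource,
        "granularity": "hour",
        "intervals": [interval],
        "aggregations": [{"type": "count", "name": "rows"}],
        "context": {"vectorize": vectorize},
    }


def sql_input_source(db_type, connect_uri, user, password, sqls):
    """Build the inputSource part of a native batch ingestion spec.

    Assumes the matching database extension is loaded on the cluster.
    """
    if db_type not in ("mysql", "postgresql"):
        raise ValueError("db_type must be 'mysql' or 'postgresql'")
    return {
        "type": "sql",
        "database": {
            "type": db_type,
            "connectorConfig": {
                "connectURI": connect_uri,
                "user": user,
                "password": password,
            },
        },
        "sqls": list(sqls),
    }


if __name__ == "__main__":
    # Hypothetical data source, interval and credentials, for illustration only.
    query = timeseries_query("events", "2020-08-01/2020-08-02", vectorize="false")
    source = sql_input_source(
        "mysql",
        "jdbc:mysql://localhost:3306/demo",
        "druid_reader",
        "secret",
        ["SELECT * FROM events WHERE ts >= '2020-08-01'"],
    )
    print(json.dumps(query, indent=2))
    print(json.dumps(source, indent=2))
```

The query payload would normally be sent to a Broker and the input source embedded in a full ingestion spec submitted to the Overlord. Both steps are left out here so the sketch runs without a cluster.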
Last Updated ( Monday, 24 August 2020 )