Apache Pinot 1.0 Released
Written by Kay Ewbank   
Tuesday, 26 September 2023

Apache Pinot 1.0 has been released. The real-time distributed OLAP datastore has been purpose-built for low-latency, high-throughput analytics.

Pinot was originally developed at LinkedIn in 2013 to let the company run a variety of queries, such as showing users who had viewed their profile. Highlights of the 1.0 release are an extension to the multi-stage query engine, new upsert capabilities (delete, metadata TTL, segment preloading and segment compaction), NULL value support in queries, support for SPI-based pluggable indexes, and improvements to the Spark 3 connector.


Pinot can perform typical analytical operations such as slice and dice, drill down, roll up, and pivot on large scale multi-dimensional data.

Apache describes Pinot as being ideal for ingesting and immediately querying data from streaming or batch data sources, including Apache Kafka, Amazon Kinesis, Hadoop HDFS, Amazon S3, Azure ADLS, and Google Cloud Storage.

Pinot is a columnar data store with several smart indexing and pre-aggregation techniques, and it provides ultra low-latency analytics even at extremely high throughput. The developers say it offers consistent performance based on the size of your cluster and an expected queries per second (QPS) threshold.

Features of the current version of Pinot include real-time support for upsert mutations (if exists update, else insert), which are used when it's not clear whether the respective row is already present in the database. It also supports native query-time JOINs through its multi-stage query engine, which efficiently manages complex analytical queries, including JOIN operations. The developers say this engine alleviates computational burdens by offloading tasks from brokers to a dedicated intermediate compute stage. Pinot can also handle semi-structured or unstructured data, and the team says it offers "improving" ANSI SQL compliance.
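To make the "if exists update, else insert" behaviour concrete, here is a minimal in-memory sketch in Python. It is not Pinot's implementation: the `Row` and `UpsertTable` names are hypothetical, and the rule that a newer timestamp wins over an older one is an assumption added for illustration.

```python
from dataclasses import dataclass


@dataclass
class Row:
    key: str       # primary key (hypothetical column)
    ts: int        # comparison value, assumed here to be an event timestamp
    payload: dict  # remaining column values


class UpsertTable:
    """Toy table that keeps only the latest row per primary key."""

    def __init__(self):
        self._rows = {}

    def upsert(self, row):
        current = self._rows.get(row.key)
        # Insert when the key is new; otherwise replace the stored row only
        # if the incoming one is at least as recent (assumed tie-break rule).
        if current is None or row.ts >= current.ts:
            self._rows[row.key] = row
            return True
        return False

    def get(self, key):
        return self._rows.get(key)

    def __len__(self):
        return len(self._rows)


table = UpsertTable()
table.upsert(Row("order-1", 100, {"status": "placed"}))   # new key: insert
table.upsert(Row("order-1", 200, {"status": "shipped"}))  # existing key: update
table.upsert(Row("order-1", 150, {"status": "packed"}))   # late event: ignored
```

After these three calls the table holds a single row for `order-1`, so a reader sees one current state per key rather than every event that arrived.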

The team says the original query engine works very well for simpler filter-and-aggregate queries, but the broker could become a bottleneck for more complex queries. The new engine resolves this by introducing intermediary compute stages on the query servers, and brings Apache Pinot closer to full ANSI SQL semantics.
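The idea of moving work out of the broker and into an intermediate stage can be sketched as a partitioned hash join. This is a conceptual illustration only, not Pinot's code: the function names, the two-worker setup and the sample tables are all assumptions.

```python
from collections import defaultdict


def partition_by_key(rows, key, num_workers):
    """Shuffle step: route each row to a worker based on its join key."""
    parts = [[] for _ in range(num_workers)]
    for row in rows:
        parts[hash(row[key]) % num_workers].append(row)
    return parts


def hash_join(left, right, key):
    """Join one partition: build a hash index on the right, probe with the left."""
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def multi_stage_join(left, right, key, num_workers=2):
    # Rows with the same key land on the same worker, so each worker can
    # join its own partition independently (the intermediate stage).
    left_parts = partition_by_key(left, key, num_workers)
    right_parts = partition_by_key(right, key, num_workers)
    joined = []
    for lp, rp in zip(left_parts, right_parts):
        joined.extend(hash_join(lp, rp, key))
    # The broker's remaining job is only to collect the per-worker results.
    return joined


# Hypothetical sample data.
orders = [
    {"user_id": 1, "amount": 30},
    {"user_id": 2, "amount": 15},
    {"user_id": 1, "amount": 5},
]
users = [
    {"user_id": 1, "country": "DE"},
    {"user_id": 2, "country": "US"},
]
result = multi_stage_join(orders, users, "user_id")
```

Because matching keys always hash to the same partition, no single node has to hold both tables in full, which is the bottleneck the broker-only design runs into.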

Apache says that for application developers, Pinot works well as an aggregate store that sources events from streaming data sources, such as Kafka, and makes them available for querying with SQL. You can also use Pinot to aggregate data across a microservice architecture into one easily queryable view of the domain.
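As a rough sketch of what querying such an aggregate store from an application might look like, the Python below builds an HTTP request carrying a SQL statement. It assumes a broker reachable at `http://localhost:8099` that accepts SQL as JSON on `/query/sql`; the `profile_views` table, its columns and the `build_query_request` helper are hypothetical. Check the Pinot documentation for the details of your own deployment.

```python
import json
import urllib.request


def build_query_request(sql, broker_url="http://localhost:8099"):
    """Build (but do not send) a POST request carrying a SQL query as JSON."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        url=f"{broker_url}/query/sql",  # assumed broker query endpoint
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical table and columns, for illustration only.
sql = (
    "SELECT country, COUNT(*) AS views "
    "FROM profile_views "
    "GROUP BY country "
    "ORDER BY views DESC "
    "LIMIT 10"
)
request = build_query_request(sql)

# Sending is left commented out because it needs a running cluster:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Keeping request construction separate from sending makes the query easy to inspect or log before it reaches a real cluster.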

Apache Pinot is available now.


More Information

Apache Pinot Website

Related Articles

Apache Iceberg Improves Spark Support

Spark BI Gets Fine Grain Security

Spark Announcements

Kafka Adds KRaft-Based Authorizer  

Kafka 3.1 Adds OIDC Support

Kafka 3.0 Released With KRaft 


Last Updated ( Tuesday, 26 September 2023 )