Apache Kudu Improves Web Interface
Written by Kay Ewbank   
Monday, 26 June 2017

Apache Kudu 1.4 has been released with improvements to the usability of the Kudu web interfaces, and a new file system check utility. 

Apache Kudu was originally a Cloudera project that is now part of the Apache Hadoop ecosystem. Apache says it can be used to enable fast analytics on fast data. In practical terms, Kudu is a columnar storage engine that fills the gap between the Hadoop Distributed File System (HDFS) and the HBase NoSQL database.

Kudu tables have a primary key made up of one or more columns, and Kudu uses techniques such as run-length encoding, differential encoding, and vectorized bit-packing to combine efficient use of storage with fast data reading. It is intended for structured data that needs low-latency random access together with efficient analytical access patterns. For "NoSQL"-style access, you can choose between Java, C++, or Python APIs.
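To give a feel for why run-length encoding suits columnar data, here is a minimal Python sketch. It is an illustration of the general technique only, not Kudu's actual implementation; the function names and the sample column are made up for the example.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse each run of consecutive equal values into a (value, run_length) pair."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


# A hypothetical column in which neighbouring rows often share a value,
# as tends to happen when rows are ordered by primary key.
column = ["US", "US", "US", "DE", "DE", "FR"]
encoded = rle_encode(column)
decoded = rle_decode(encoded)
```

Six stored values shrink to three pairs here, and decoding restores the column exactly. The saving grows with the length of the runs, which is why the technique pays off when a single column is stored contiguously.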

Explaining Kudu's role as a 'good citizen' in a Hadoop cluster, the developers say you can stream data into Kudu from live real-time data sources using the Java client, and then process it immediately upon arrival using Spark, Impala, or MapReduce. You can even transparently join Kudu tables with data stored in other Hadoop storage such as HDFS or HBase. It can share data disks with HDFS DataNodes, and can operate in a RAM footprint as small as 1 GB for light workloads.

 

The C++ and Java client libraries have been updated in the new version so they can alter storage attributes such as encoding and compression, as well as the default value, of existing columns. The C++ client library comes with an experimental KuduPartitioner API which you can use to map rows to their associated partitions and hosts efficiently. The Java client library has also been updated to support enabling fault tolerance on scanners.

Kudu now includes the optional ability to compute, store, and verify checksums on all pieces of data stored on a server. Prior versions only performed checksums on certain portions of the stored data. 
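The compute, store, and verify cycle can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not Kudu's on-disk format: it uses `zlib.crc32` from the standard library as a stand-in checksum, and the 4-byte trailer layout and the function names are invented for the example.

```python
import zlib


def store_block(data: bytes) -> bytes:
    """Compute a CRC32 over the data and append it as a 4-byte trailer."""
    checksum = zlib.crc32(data)
    return data + checksum.to_bytes(4, "big")


def load_block(stored: bytes) -> bytes:
    """Verify the trailer against the data; raise ValueError on a mismatch."""
    if len(stored) < 4:
        raise ValueError("block too short to contain a checksum")
    data, trailer = stored[:-4], stored[-4:]
    if zlib.crc32(data) != int.from_bytes(trailer, "big"):
        raise ValueError("checksum mismatch: block is corrupt")
    return data


# Hypothetical payload standing in for a piece of stored data.
stored = store_block(b"row data")
recovered = load_block(stored)
```

Verifying on every read costs some CPU time, which is a plausible reason for the feature being optional. The benefit is that silent disk corruption surfaces as an error instead of being returned as valid data.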

The usability of the Kudu web interfaces has been improved, particularly for the case where a server hosts many tablets or a table has many partitions. Pages that list tablets now include a top-level summary of tablet status and show the complete list under a toggleable section.

The Maintenance Manager has also been improved. It makes better use of the configured maintenance threads, and will now aggressively schedule flushes of in-memory data when memory consumption crosses 60% of the configured process-wide memory limit.
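The 60% rule amounts to a simple threshold test. The following Python sketch shows only that check, with a made-up function name and memory figures; how Kudu actually measures memory and schedules flushes is not described here.

```python
def should_flush_aggressively(consumed_bytes: int, memory_limit_bytes: int,
                              threshold_percent: int = 60) -> bool:
    """Return True once consumption exceeds the given percentage of the limit.

    Integer arithmetic is used so the comparison avoids floating-point rounding.
    """
    return consumed_bytes * 100 > memory_limit_bytes * threshold_percent


GIB = 1024 ** 3
limit = 4 * GIB  # hypothetical process-wide memory limit

below = should_flush_aggressively(2 * GIB, limit)  # 50% of the limit
above = should_flush_aggressively(3 * GIB, limit)  # 75% of the limit
```

With a 4 GiB limit, the check stays false at 2 GiB of consumption and becomes true at 3 GiB, since the threshold sits at 2.4 GiB.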

The Kudu command line tool has also been improved with new advanced administrative commands.

More Information

Kudu Website

Related Articles

Apache Arrow Adds Streaming Binary Format 

HBase Adds MultiWAL Support

Apache Kafka Adds New Streams API

Apache Beam Moves To Top Level

Spark BI Gets Fine Grain Security


 

Last Updated ( Monday, 26 June 2017 )
 
 

   
Copyright © 2017 i-programmer.info. All Rights Reserved.