AsterixDB – Big Data Management System

Written by Kay Ewbank

Wednesday, 17 December 2014

A new database designed specifically for managing semi-structured information has been made available in beta form.

The AsterixDB Big Data Management System (BDMS) has been developed over the last four years by researchers at UC Irvine, UC Riverside, and UC San Diego. The project is sponsored by the National Science Foundation (NSF) and is designed for ingesting, storing, managing, indexing, querying, and analyzing vast quantities of semi-structured information.

The researchers have taken ideas from three distinct areas (semi-structured data, parallel databases, and data-intensive computing, a.k.a. today’s Big Data platforms) and put them together to create what the developers describe as “a next-generation, open-source software platform that scales by running on large, shared-nothing commodity computing clusters”.

The semi-structured information that the project is aimed at managing can be anything from data that is well-typed and highly regular, to more irregular data where the data values may be textual, and the ultimate schema for the various data types involved may be hard to anticipate up front.

The team has been concentrating on solutions to the problems that big data sets give rise to, such as the need for highly scalable data storage and indexing. It has also been researching semi-structured query processing on very large clusters. Another area of research has been how to combine parallel database techniques with modern data-intensive computing techniques in the hope of finding solutions to the problem of storing and analyzing semi-structured information effectively.

The team has now released a beta version of the AsterixDB system that encapsulates their research.

AsterixDB has a semi-structured, NoSQL-style data model (ADM) that results from extending JSON with object database ideas. It offers basic transactional capabilities for concurrency and recovery that are akin to those of a NoSQL store.
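To make the idea of a JSON-like model with typed records more concrete, here is a minimal Python sketch. It is an illustration only, not ADM syntax or any AsterixDB API: the `RecordType` class, its `is_open` flag, and the field names are all hypothetical. It shows a record type that insists on a few declared fields while optionally tolerating extra, undeclared ones, which is one way to cover both highly regular and more irregular data.

```python
from dataclasses import dataclass


@dataclass
class RecordType:
    """Hypothetical record type: declared fields plus an open/closed flag."""
    name: str
    fields: dict          # field name -> expected Python type
    is_open: bool = True  # open types tolerate undeclared extra fields

    def validate(self, record: dict) -> list:
        """Return a list of problems; an empty list means the record conforms."""
        problems = []
        for field_name, expected in self.fields.items():
            if field_name not in record:
                problems.append(f"missing field: {field_name}")
            elif not isinstance(record[field_name], expected):
                problems.append(f"wrong type for field: {field_name}")
        if not self.is_open:
            # A closed type rejects anything that was not declared up front.
            for extra in sorted(set(record) - set(self.fields)):
                problems.append(f"undeclared field: {extra}")
        return problems


# Two made-up types over the same declared fields.
user_type = RecordType("UserType", {"id": int, "name": str})
strict_type = RecordType("StrictUserType", {"id": int, "name": str}, is_open=False)

regular = {"id": 1, "name": "Ann"}
irregular = {"id": 2, "name": "Bob", "nickname": "bobby"}
```

In this sketch the irregular record passes the open type but is rejected by the closed one, so the schema only needs to pin down the fields that are known in advance.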

The query language (AQL) is described as expressive and declarative, and it supports a broad range of queries and analysis over semi-structured data. Queries can access externally stored data (e.g., data in HDFS) as well as data stored natively by AsterixDB.
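The flavor of a declarative query over irregular records can be approximated in plain Python. The sketch below is not AQL and does not use any AsterixDB interface; the records, field names, and helper functions are invented for illustration. It selects records by keyword and then counts them per author, while tolerating records that lack some fields.

```python
from collections import Counter

# Hypothetical semi-structured records: not every record has every field.
messages = [
    {"id": 1, "author": "ann", "text": "big data systems", "location": "Irvine"},
    {"id": 2, "author": "bob", "text": "parallel databases"},
    {"id": 3, "author": "ann", "text": "big clusters", "location": "Riverside"},
]


def messages_mentioning(records, keyword):
    """Select records whose text contains the keyword as a whole word."""
    return [r for r in records if keyword in r.get("text", "").split()]


def count_by_author(records):
    """Group records by author and count them, skipping records with no author."""
    return Counter(r["author"] for r in records if "author" in r)


hits = messages_mentioning(messages, "big")
per_author = count_by_author(hits)
```

A declarative query language lets you state the selection and grouping and leaves the execution strategy, including parallelism across a cluster, to the system.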

The parallel runtime query execution engine, Hyracks, has been scale-tested on up to 1000+ cores and 500+ disks. AsterixDB also supports partitioned LSM-based data storage and indexing, which is designed to enable efficient ingestion and management of semi-structured data. Secondary indexing options include B+ trees, R-trees, and inverted keyword (exact and fuzzy) index types, and you can also run fuzzy and spatial queries. The data types supported include spatial and temporal data in addition to integer, floating point, and textual.
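For readers unfamiliar with LSM (log-structured merge) storage, the general technique can be sketched in a few lines of Python. This is a toy model of the idea, not AsterixDB's implementation: the `TinyLSM` class and its `memtable_limit` parameter are hypothetical, and everything lives in memory rather than on disk. Writes go to a small mutable buffer, full buffers are flushed as immutable sorted runs, and reads check the newest data first.

```python
import bisect


class TinyLSM:
    """Toy LSM index: an in-memory buffer plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.memtable = {}                    # newest writes, mutable
        self.runs = []                        # flushed runs, newest last

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Write the buffer out as one sorted, immutable run.
        self.runs.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key, default=None):
        if key in self.memtable:
            return self.memtable[key]
        # Search runs from newest to oldest so later writes win.
        for run in reversed(self.runs):
            # Rebuilding the key list keeps the sketch short; a real
            # system would keep a searchable structure per run.
            keys = [k for k, _ in run]
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return run[i][1]
        return default

    def compact(self):
        """Merge all runs into one, keeping the newest value per key."""
        merged = {}
        for run in self.runs:  # oldest first, so newer runs overwrite
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []
```

Because incoming writes only touch the buffer and flushes are sequential, this style of storage suits heavy ingestion; the cost is that reads may have to consult several runs until they are merged.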

Writing about the beta, the researchers say they “are hoping that the arrival of AsterixDB will mark the beginning of the ‘BDMS era’”. They hope AsterixDB will be useful for a much broader class of problems than can be addressed with any one of today’s Big Data platforms and related technologies such as Hadoop, Pig, Hive, HBase, MongoDB, and so on. That’s quite an ambition, and it’ll be interesting to see whether they succeed.

More Information

AsterixDB

National Science Foundation (NSF)

Related Articles

MariaDB Enterprise Updated

Apache CouchDB 1.6.0 Released

Facebook Apollo NoSQL Database

MongoDB 2.6 Released


