IBM Big SQL Sandbox
Written by Kay Ewbank   
Tuesday, 19 September 2017

IBM has released a sandbox version of Big SQL for desktop use. The Sandbox comes as a single-node Docker image, and is designed to let you get started with Big SQL and Hortonworks Data Platform.

Each Sandbox download comes preconfigured with sample data, a tutorial and an exercise for you to complete, and IBM says you'll be up and running in 30 minutes.

IBM Big SQL is IBM's SQL engine for Hadoop. IBM has worked with Hortonworks to integrate HDP (Hortonworks Data Platform) with IBM Big SQL. Big SQL 5 extends the capabilities of Hive, and makes use of HBase and Spark to provide an integrated analytics option.

 


Big SQL makes use of IBM Fluid Query to virtualize data from many different data stores, such as Hive, HBase, Spark, DB2, Oracle, SQL Server, Netezza, Informix, Teradata, WebHDFS and object stores.

IBM Fluid Query was introduced in 2015. It is powered by Netezza technology, and can be used to create federated queries where the data is drawn from a variety of sources, without the users of the data needing to deal with managing multiple data stores or query systems. Fluid Query can also be used to carry out and control bulk data movement between data repositories. Netezza created the first data warehouse appliance, and as an independent company also developed advanced analytics applications. It was bought by IBM in 2010.
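The idea of a federated query can be illustrated with a small stand-in. The sketch below does not use Big SQL or Fluid Query; it assumes Python's built-in sqlite3 module and two in-memory SQLite databases playing the role of separate data stores. The table names, the `warehouse` alias and the sample rows are all hypothetical. What it shows is a single SQL statement joining data that lives in two different stores, which is the experience a federation layer aims to give its users.

```python
import sqlite3


def build_stores():
    """Return one connection that can see two separate in-memory databases."""
    conn = sqlite3.connect(":memory:")
    # 'warehouse' is a hypothetical second store, attached under its own name.
    conn.execute("ATTACH DATABASE ':memory:' AS warehouse")

    # Customers live in the first store (SQLite calls it 'main').
    conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO main.customers VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace")],
    )

    # Orders live in the second store.
    conn.execute("CREATE TABLE warehouse.orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO warehouse.orders VALUES (?, ?)",
        [(1, 20.0), (1, 5.0), (2, 12.5)],
    )
    return conn


def total_per_customer(conn):
    """Join across both stores in one query and total the orders per customer."""
    return conn.execute(
        """
        SELECT c.name, SUM(o.amount)
        FROM main.customers AS c
        JOIN warehouse.orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()


if __name__ == "__main__":
    for name, total in total_per_customer(build_stores()):
        print(name, total)
```

The person writing the query only needs to know the table names, not where each table is stored or how to move data between the two stores. A real federation engine does the same across different database products and over the network, which this toy example does not attempt.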

Big SQL offers bi-directional integration with Spark, and supports synthesis between Spark executors and Big SQL worker nodes. Alongside the big data support, it also supports SQL dialects from other offerings such as IBM DB2, IBM Netezza data warehouse appliances and Oracle Database, including built-in support for Oracle’s SQL and PL/SQL dialects. IBM's hope is that applications written against Oracle will be moved to run in Big SQL, because they can be moved across with minimal changes.

Big SQL also offers YARN integration through Slider. YARN (Yet Another Resource Negotiator) is Apache's cluster management technology, while Slider extends Hadoop and YARN to let other databases run in YARN without modification. Obviously thinking they hadn't included enough big data names and technologies, IBM has added a new technology to Big SQL called “Elastic Boost”. IBM says this can improve Big SQL's performance by up to 50% by enabling allocation of multiple workers per node for more efficient CPU and memory utilization.

Big SQL also comes with an ANSI-compliant SQL parser that can run all 99 TPC-DS queries without the need for query modifications, and it supports structured streaming with new APIs.


More Information

Big SQL Sandbox

IBM Big SQL

Related Articles

SQL At Hadoop Scale 

Hadoop Adds In-Memory Caching

Apache Spark With Structured Streaming

 


Last Updated ( Tuesday, 19 September 2017 )
 
 

   
Copyright © 2017 i-programmer.info. All Rights Reserved.