SQL Server 2016 & Azure Data Lake
Written by Kay Ewbank   
Monday, 02 November 2015

Microsoft has announced CTP 3 of SQL Server 2016 along with Azure Data Lake.


The latest Community Technology Preview (CTP) of SQL Server 2016 was announced at Microsoft’s PASS Summit. The new version keeps data continuously encrypted, whether in the database or while being transferred. Support for R statistical analysis has been added, and the PolyBase facility is now included within SQL Server. PolyBase lets you use T-SQL statements to access data stored in Hadoop or Azure Blob Storage and query it in an ad-hoc fashion.

It also lets you query semi-structured data and join the results with relational data sets stored in SQL Server. The other main addition is the ability to archive cold data to Azure. This preview also includes new Business Intelligence (BI) capabilities for SQL Server Analysis Services and SQL Server Reporting Services. Support for mobile BI hasn’t made it into the current preview; the intention is that this will be added in the next few months.
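To make the idea of joining semi-structured external data with relational tables more concrete, here is a toy sketch in Python. It is not PolyBase and not T-SQL: Python's built-in sqlite3 module stands in for the relational store, and a list of JSON strings stands in for files held in Hadoop or Azure Blob Storage. The table names, field names and data are invented for illustration. Unlike PolyBase, which queries the external data where it lives, this sketch copies the records into a temporary table first.

```python
import json
import sqlite3

# Hypothetical stand-in for semi-structured records held outside the database
# (for example, log lines in Hadoop or Azure Blob Storage).
EXTERNAL_LINES = [
    '{"customer_id": 1, "page": "/pricing"}',
    '{"customer_id": 2, "page": "/docs"}',
    '{"customer_id": 1, "page": "/signup"}',
]


def join_external_with_relational(lines):
    """Join parsed JSON records with a relational customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")]
    )
    # Expose the external records to SQL as a temporary table.
    conn.execute("CREATE TEMP TABLE clicks (customer_id INTEGER, page TEXT)")
    records = [json.loads(line) for line in lines]
    conn.executemany(
        "INSERT INTO clicks VALUES (?, ?)",
        [(r["customer_id"], r["page"]) for r in records],
    )
    # One SQL statement now spans both the relational and the external data.
    rows = conn.execute(
        "SELECT c.name, COUNT(*) FROM customers AS c "
        "JOIN clicks AS k ON k.customer_id = c.id "
        "GROUP BY c.name ORDER BY c.name"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(join_external_with_relational(EXTERNAL_LINES))
```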

Writing about the encryption feature on the SQL Server blog, Joseph Sirosh, Corporate Vice President of the Data Group at Microsoft, described it as an industry first, saying it:

“is based on technology from Microsoft Research and helps protect data at rest and in motion. Using Always Encrypted, SQL Server can perform operations on encrypted data and – best of all – the encryption key resides with the application in the customers’ trusted environment.”

The improvements to SQL Server Analysis Services (SSAS) and SQL Server Reporting Services (SSRS) include an improved version of DirectQuery that means you can access external data sources like SQL Server Columnstore. This improves the use of SSAS as a semantic model over your data for consistency across reporting and analysis without storing the data in Analysis Services.

SQL Server Reporting Services 2016 has improved the way it paginates reports, and has updated tools for designing reports. You can now pin paginated report items to the Power BI dashboard to make them easier to share. New Mobile BI capabilities are also going to be added to Reporting Services over the coming months.

The inclusion of support for the open source R statistical language has been achieved following Microsoft’s acquisition of Revolution Analytics earlier this year. The company had a commercial version of R, and also provided services for the language. The CTP of SQL Server is integrated with the Revolution Analytics R package, meaning that you can run R analytics within SQL Server.

The final improvement is the inclusion of Stretch Database, a feature that handles the archiving of historical data transparently and securely in the Microsoft Azure cloud. When Stretch Database is enabled, it silently migrates your historical data to an Azure SQL Database. The idea is that you get local server performance for hot data and cloud storage for old data without needing to modify your applications. A typical use of the feature would be in a table that has a mix of a small amount of hot data that is frequently accessed or used in queries, and a large amount of old data that is used less frequently but is still occasionally needed.
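The hot/cold split described above can be pictured with a small toy model in Python. This is only a sketch of the concept, not how Stretch Database is implemented or configured: two in-memory sqlite3 databases stand in for the local server and the Azure store, and the `StretchedTable` class, its methods and the `orders` table are all hypothetical names chosen for illustration.

```python
import sqlite3


class StretchedTable:
    """Toy model: recent rows stay local, older rows move to a 'remote' store."""

    def __init__(self):
        self.local = sqlite3.connect(":memory:")   # stands in for the local server
        self.remote = sqlite3.connect(":memory:")  # stands in for the cloud database
        for db in (self.local, self.remote):
            db.execute(
                "CREATE TABLE orders (id INTEGER PRIMARY KEY, order_year INTEGER)"
            )

    def insert(self, order_id, order_year):
        """New rows always land in the local store."""
        self.local.execute("INSERT INTO orders VALUES (?, ?)", (order_id, order_year))

    def migrate_older_than(self, cutoff_year):
        """Move rows with order_year < cutoff_year from local to remote."""
        cold = self.local.execute(
            "SELECT id, order_year FROM orders WHERE order_year < ?", (cutoff_year,)
        ).fetchall()
        self.remote.executemany("INSERT INTO orders VALUES (?, ?)", cold)
        self.local.execute("DELETE FROM orders WHERE order_year < ?", (cutoff_year,))
        return len(cold)

    def select_all(self):
        """Callers see one table; the split between the two stores is hidden."""
        query = "SELECT id, order_year FROM orders"
        rows = self.local.execute(query).fetchall()
        rows += self.remote.execute(query).fetchall()
        return sorted(rows)

    def local_count(self):
        """Number of rows still held locally."""
        return self.local.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

In this model a caller that only uses `select_all()` gets the same rows before and after `migrate_older_than()` runs, which mirrors the claim that applications do not need to be modified.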

Another major announcement at PASS involved Azure Data Lake. As we reported at the time, this was discussed at this year’s Build conference as a hyper-scale data store for big data analytic workloads. More details have now been revealed. Azure Data Lake combines analysis options with an exabyte-scale big data store as a fully managed service. It is part of the Cortana Analytics Suite and has three elements. The first is the Data Lake Store, a single repository for data of any size that is accessible for processing and analytics from HDFS applications and tools.

 


The second element of Data Lake is Azure Data Lake Analytics. This is a new service built on Apache YARN that dynamically scales. It includes U-SQL, a language that, according to a post about Data Lake on the SQL Server blog:

“unifies the benefits of SQL with the expressive power of user code.” 

U-SQL can be used to create scalable distributed queries, so you can analyze data in the store and across SQL Servers in Azure, Azure SQL Database and Azure SQL Data Warehouse.
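The general idea of mixing declarative SQL with user-written code can be illustrated by analogy in Python. This is not U-SQL syntax and says nothing about how Azure Data Lake Analytics executes queries: it uses the sqlite3 module's `create_function` to make an ordinary Python function callable from a SQL statement. The `domain_of` and `count_by_domain` functions and the `users` table are hypothetical names for this sketch.

```python
import sqlite3


def domain_of(email):
    """User code: extract the lower-cased domain part of an email address."""
    return email.rsplit("@", 1)[-1].lower()


def count_by_domain(emails):
    """Group rows by a value computed in user code, using plain SQL for the rest."""
    conn = sqlite3.connect(":memory:")
    # Register the Python function so SQL statements can call it by name.
    conn.create_function("domain_of", 1, domain_of)
    conn.execute("CREATE TABLE users (email TEXT)")
    conn.executemany("INSERT INTO users VALUES (?)", [(e,) for e in emails])
    rows = conn.execute(
        "SELECT domain_of(email), COUNT(*) FROM users "
        "GROUP BY domain_of(email) ORDER BY domain_of(email)"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(count_by_domain(["a@Example.com", "b@example.com", "c@test.org"]))
```

The SQL handles grouping, counting and ordering, while the string handling lives in ordinary code, which is the kind of division of labour the quote describes.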

The third element is Azure HDInsight, Microsoft’s fully managed Apache Hadoop cluster service that comes with a number of open source analytics engines including Hive, Spark, HBase and Storm. Microsoft has announced the general availability of managed clusters on Linux.

For developers, Data Lake can be accessed using Azure Data Lake Tools for Visual Studio, which let you write, debug and tune Azure Data Lake Analytics queries, including U-SQL scripts, from within Visual Studio.

The final major announcement in the group is a public preview of In-Memory OLTP and general availability of Operational Analytics in Azure SQL Database. In-Memory OLTP improves transaction processing performance, and can be used in combination with in-memory analytics (columnstore) and traditional relational store in the same database.


Last Updated ( Monday, 02 November 2015 )