SQL Server 2017 Developer's Guide
Page 3 of 3
Author: Dejan Sarka et al
Chapter 13 Supporting R in SQL Server
R is a popular language used in data science, with extensive mathematical functionality. It also has data visualization capabilities that make reporting easier. SQL Server has integrated R support into its database engine. The chapter opens with an overview of where R is used (e.g. machine learning), followed by a useful tutorial on the basics of the R language. Next, various data structures are introduced, including arrays and matrices, factors, data frames, and lists, with examples of data manipulation. The chapter moves on to show how data can be expressed visually, using simple graphs and histograms. This is followed by the use of various statistical functions (e.g. mean, min, quartile, standard deviation). It ends with a look at the SQL Server R service, which provides statistical functionality. Overall this is a useful, if brief, introduction to the use of R and its place within SQL Server.
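To give a feel for the kind of statistical functions the Chapter 13 summary lists (mean, min, quartile, standard deviation), here is a minimal sketch. It is written in Python using only the standard library rather than in R, it is not taken from the book, and the `sales` figures are invented purely for illustration. Quartile cut points depend on the interpolation convention, so other tools may report slightly different values for the same data.

```python
import statistics

# Hypothetical sample: sales figures invented for this illustration.
sales = [12, 15, 11, 19, 22, 15, 30, 18]

summary = {
    "mean": statistics.mean(sales),
    "min": min(sales),
    "max": max(sales),
    "median": statistics.median(sales),
    # quantiles(n=4) returns the three quartile cut points (Q1, Q2, Q3).
    "quartiles": statistics.quantiles(sales, n=4),
    # Sample standard deviation (n - 1 denominator).
    "stdev": statistics.stdev(sales),
}

# Print each statistic on its own line.
for name, value in summary.items():
    print(f"{name}: {value}")
```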
Chapter 14 Data Exploration and Predictive Modeling with R in SQL Server
This chapter builds on the basics provided in the previous chapter, and shows how to use R in advanced data exploration, statistical analysis, and predictive modelling. These features can greatly enhance SQL Server’s analysis capabilities. The chapter explores some intermediate-level statistics (chi-squared test, null hypothesis, linear regression, ANOVA), providing example R code to illustrate the theory. This is followed by data mining/machine learning techniques (supervised and unsupervised), examining Principal Components Analysis (PCA), Exploratory Factor Analysis (EFA), and K-means clustering. Similarly, predictive analysis and decision trees are discussed from a theoretical viewpoint. The chapter ends with a look at some useful advanced graphing. I suspect that if your maths knowledge is limited, you’ll have to look elsewhere to get a better understanding of the underlying maths discussed. If you’re familiar with the maths, you’ll find this chapter a useful overview of the capabilities of R in data analysis.
The previous 14 chapters cover features that are largely based on SQL Server 2016. The next 3 chapters relate to features introduced in SQL Server 2017.
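For readers wondering what the linear regression mentioned in the Chapter 14 summary boils down to, the following is a rough sketch of an ordinary least squares fit of a straight line. It is plain Python rather than the R code the book uses, and the function names (`fit_line`, `r_squared`) and the experience/salary data are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data, invented for this illustration.
years_experience = [1, 2, 3, 4, 5]
salary_k = [32, 35, 41, 44, 48]

slope, intercept = fit_line(years_experience, salary_k)
fit_quality = r_squared(years_experience, salary_k, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={fit_quality:.3f}")
```

The slope estimates how much the response changes per unit of the predictor, and the R-squared value indicates how much of the variation the line explains.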
Chapter 15 Introducing Python
Although the R language is a favourite of data scientists, Python is also used extensively, perhaps because it is one of the most popular languages developers learn before coming to data science. This chapter opens with a look at how to install Python and associated machine learning tools, before outlining what Python can do, and providing a brief look at the Python language. Two data science libraries (NumPy and Pandas) are discussed, together with examples to show their capabilities (e.g. you can get impressive graphs with very little code). This chapter provides a useful, if brief, overview of the Python language, considering both its specialised data science libraries and its use in SQL Server. How Python is used with SQL Server in the future will be interesting (e.g. should Python code run on the database or from separate application servers?).
Chapter 16 Graph Database
The rise of social media has led to an increased interest in mapping the many-to-many relationships between objects (e.g. LinkedIn users). These relationships are easily mapped in graph databases, but are very difficult to map in relational databases. The chapter opens with a general and basic discussion of graph theory, explaining nodes (objects of interest) and edges (relationships between objects) along the way. The use cases where graph databases are useful are briefly discussed (e.g. managing hierarchical relationships), before examining some of the popular graph databases, including:
After setting out the background, the chapter looks at SQL Server 2017’s graph database, discussing node and edge tables, and the use of the MATCH clause to specify search criteria for the graph. Various graph functions are discussed with helpful example code. The chapter ends with a look at some of the current limitations of the graph database. Overall, this chapter provides a useful introduction to graph databases, together with some helpful example code. As with many Microsoft sub-technologies, the first release has limited functionality; look for enhancements in the near future.
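The node and edge tables described above can be pictured with a small conceptual sketch. This is plain Python, not SQL Server syntax and not code from the book: the `person` node table, the `friend_of` edge table, and the `friends_of_friends` helper are all hypothetical names invented for illustration. The function mimics the sort of two-hop pattern (person to friend to friend-of-friend) that a graph query would express as a search pattern.

```python
# Node table: one row per person (ids and names are invented for illustration).
person = {1: "Ann", 2: "Bob", 3: "Cid", 4: "Dee"}

# Edge table: each row links a source node id to a target node id.
friend_of = [(1, 2), (2, 3), (2, 4), (3, 4)]


def friends_of_friends(start_id):
    """Names reachable in exactly two hops: start -> friend -> friend-of-friend."""
    direct = {dst for src, dst in friend_of if src == start_id}
    two_hops = {dst for src, dst in friend_of if src in direct}
    # Exclude the start node and anyone who is already a direct friend.
    return sorted(person[p] for p in two_hops - direct - {start_id})


print(friends_of_friends(1))
```

In a relational design the same question needs a self-join on the edge table for every hop, which is why multi-hop relationship queries quickly become awkward without graph support.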
Chapter 17 Containers and SQL on Linux
Virtualisation facilitates easier deployment. However, configuration of the hardware, OS, and applications can limit the degree of deployment automation. Containers allow this admin overhead to be reduced by packaging subsets of code and infrastructure modules, in essence improving the portability of applications across OS and physical/virtual environments. The chapter opens with a look at how Microsoft takes advantage of containers to run SQL Server on Windows, Linux, or macOS. Next, the installation of Docker (popular container software) is shown, followed by the creation of an example container. The chapter extends this discussion of containers to show how SQL Server runs on Linux, with a brief look at the underlying architecture. The chapter ends with a discussion of some of the features that are limited or missing in SQL Server on Linux (e.g. transactional replication). It is suggested that this rather long list will be reduced significantly in future releases. This chapter provides a very useful introduction to containers, showing how they provide abstraction to allow easier deployment. This is illustrated by running SQL Server on Linux.
Conclusion
The book’s title is misleading. This book is essentially the SQL Server 2016 version of the book, amended to take into account SQL Server 2017. I would estimate 75% of the book relates to SQL Server 2016, and 25% to 2017. In some ways I felt a bit cheated. That said, if you’re unfamiliar with both SQL Server 2016 and 2017, this book is an excellent starting place, and would merit a rating of 4.8 (out of 5).
This book aims to introduce you to the salient new and enhanced features in SQL Server 2016 and 2017, and succeeds. It is generally easy to read and well written, with useful explanations, tips, example code, and diagrams. Whilst the book doesn’t cover all the new features (e.g. PolyBase), it does cover the major ones. I suspect the more you know about SQL Server already, the more useful this book will be. Since the book focuses on new and enhanced features, large subject areas are omitted (e.g. database design). Sometimes, before discussing an extended feature (e.g. In-Memory OLTP), a large amount of background information is given with reference to previous editions of SQL Server. Whilst this may be useful if you don’t know the feature, it could be argued it is unnecessary if you already know SQL Server 2014. It might have been useful to include a section discussing SQL Server changes in terms of some of the industry’s wider trends (e.g. Big Data, Social Media, the Cloud).
Overall, if you want to know more about the new and enhanced features in SQL Server 2016 and 2017 together, I can recommend this well-written book.
Last Updated (Tuesday, 22 May 2018)