SQL Query Design Patterns and Best Practices |
Page 3 of 3
Chapter 11: Handling JSON Data in SQL Server JSON is a very common method of describing data, in many ways it is a successor to the often much wordier XML. The basics of JSON are discussed, describing its 2 main components: Objects and Arrays, before moving onto the various JSON functions (e.g. isjson, json_value). Each of these functions is described with helpful example code. Unlike XML, which has its own data type in SQL Server, JSON is stored in a (n)varchar data type, which can make indexing/performance troublesome. Useful code is provided for various common scenarios (e.g. extracting JSON data from SQL Server, modifying JSON data from SQL Server, modifying JSON data in SQL Server). I found this chapter a very useful introduction to JSON processing within SQL Server. The examples provided should help you in your own projects. Chapter 12: Integrating File Data and Data Lake Content with SQL Getting data directly from different data sources is a common requirement, it save time, effort, and complexity compared with ETL jobs. Here we look at obtaining data from a file within a data lake. The function used is OPENROWSET, it can be used to obtain files in various formats (e.g. parquet, CSV). Useful example code is provided on how to get a file. The importance of security (via RBAC and ACL) is noted. This brief chapter ends with a look at using External Tables, to obtain data in Hadoop. The setup seems to be very similar to using external tables within SQL Azure to query data across databases. While the chapter may be useful if you must get data from a data lake, I’m not sure how common this currently is, especially with people new to SQL Server (the presumed readers of the book judging from the earlier chapters). I’m not sure this chapter is needed in this book. Chapter 13: Organizing and Sharing Your Queries with Jupyter Notebooks Jupyter Notebooks allow you to organise and share your commented SQL queries. They are popular with other languages (e.g. Python), and are mostly used by data scientists. They are integrated into Azure Data Studio (ADS). Creating a simple notebook is illustrated with a brief walkthrough (mostly menu options). Adding an existing notebook to a book is a little more complex, as shown. Adding queries and comments to your notebook, via a simple editor is shown, as is connecting the notebook to a database, and running it. ADS has the advantage of having a built-in feature to display relevant query output as a chart. Results can be exported as CSV, Excel, JSON etc. The chapter ends with a look at using GitHub to allow collaboration on your notebooks. A useful overview of using Jupyter notebooks, but is this needed in an introductory book on querying SQL Server? Appendix: Preparing Your Environment This appendix walks you through setting up your environment, allowing you to follow along with the book’s examples. Instructions are provided for installing:
A helpful contribution to those that are new to these tools. Conclusion The title of this book sounds quite grand, however it is really an introduction to various areas of querying SQL Server. I suspect the title of this book is wrong, “An introduction to querying SQL Server” might be better. That said, most chapters are easy to read, instructive, providing useful topic discussions together with helpful template code. There’s a good flow within each chapter with later ideas building on earlier examples The book feels like a mishmash of topics, with little cohesion between the chapter topics. I’m not sure how the chapter topics where decided, but there seems to be plenty of omissions. For example, there’s a whole chapter introducing the CASE statement, but nothing about IF/ELSE, WHILE etc. Similarly, CTEs are discussed, but nothing is said about user-defined function, table variables, temporary tables, etc. I would expect an introductory book to follow a well-defined set of topics, rather than cherry pick them. There are also some topics included, that are inappropriate in an introductory text for query writers, these include data lakes, index creation etc. Several chapters provide only the briefest introduction to the topic e.g. security, and some lack basic content e.g. query plans. Azure Data Studio is discussed, but little is said about the most common SQL Server tool, SQL Server Management Studio (SSMS). Nothing is said about the often-troubled relationship between code readability and performance (e.g. with CTEs). With small data volumes, as with the database used in the examples, performance may be adequate, but with large data volumes, performance can be terrible. The book’s current title is too grandiose, the book is essentially an introduction to query using SQL Server, with some very selective topics. Some example template code is given – but it falls short of design patterns and best practices. Perhaps the editors could have looked at other similar “An introduction to querying SQL Server” books, to understand the scope and depth of the content expected. For recommendations of SQL Server Books, see Pick Of The Shelf - SQL Server in Programmer's Bookshelf. To keep up with our coverage of books for programmers, follow @bookwatchiprog on Twitter or subscribe to I Programmer's Books RSS feed for each day's new addition to Book Watch and for new reviews.
|
|||||||
Last Updated ( Tuesday, 18 July 2023 ) |