Page 1 of 2
Authors: Itzik Ben-Gan, Adam Machanic, Dejan Sarka & Kevin Farlee
Publisher: Microsoft Press
Audience: DBAs and developers
Reviewer: Ian Stirk
This book from SolidQ aims to give you a deeper understanding of T-SQL functionality. With an emphasis throughout on performance, how does it fare?
This book is both an update to, and a combination of, three previous books (T-SQL Querying – published 2009, T-SQL Programming -2009, and T-SQL Using Window Functions - 2012). New material from SQL Server 2012 and 2014 is discussed, including: Window Functions, the new cardinality estimator, sequences, columnstore technology, and in-memory OLTP. It comes from SolidQ and is written by some of the most respected names in the world of SQ Server.
Aimed primarily at developers and administrators, it offers a detailed view of internal architecture and a comprehensive programming reference. It aims to provide a practical approach together with techniques to optimize code. This is not a book for the beginner, it assumes at least one year's solid experience of T-SQL programming, and tuning basics.
Below is a chapter-by-chapter exploration of the topics covered.
Chapter 1 Logical query processing
This chapter opens with an observation about experts - they build their knowledge upon a solid understanding of the basics. So this chapter starts with a look at the basics, logical query processing i.e. the logical order the various parts of a T-SQL statement are executed. Each step is discussed in detail, these steps are:
FROM phase (include virtual table for each of the possible types of join)
WHERE phase (can’t use column aliases here – SELECT not yet applied)
GROUP BY phase (nulls considered equal here)
HAVING phase (filter applied to GROUP BY)
SELECT phase (2 steps: evaluate expressions then apply distinct clause)
ORDER BY phase
Apply the TOP or OFFSET-FETCH filter
The chapter continues with a look at further aspects of logical query processing, including table operators (join, apply, pivot, unpivot), windows functions (work with set of data per underlying row, use OVER clause) and UNION, EXCEPT, and INTERSECT operators.
This chapter provides instructive detail on the logical order that the various parts of a T-SQL statement are executed. Helpful examples are provided to illustrate the points made. The discussions are easy to read. Plenty of useful tips are provided along the way (e.g. a CASE expression with no ELSE clause has an implicit ELSE NULL). Useful links to other chapters in this book and to various websites are provided. All these points apply to the whole book.
Chapter 2 Query tuning
The previous chapter concentrated on the logical. This chapter now moves on to the physical. The chapter looks at query plans together with data access methods, so it’s important to know about the internal data structures.
The internals section takes a look at pages, extents, heaps, B-trees, and nonclustered indexes. The content, structure, and organisation of each is discussed and illustrated. Typical tools used to measure performance are discussed, including: execution plan (estimated and actual), Statistics IO, Statistics Time, and Extended Events (XEs). The impact of a hot/cold cache on performance is noted.
The chapter next discusses various access methods (i.e. how to get the data), in each case, sample code is provided, and the query plan examined and discussed. Access methods examined include:
Table scan/unordered clustered index scan (want all data unsorted or no suitable index)
Unordered covering nonclustered index scan (get required data from index)
Ordered covering nonclustered index scan
Nonclustered index seek + range scan + lookups (typically for small range queries)
Clustered index seek + range scan
The section continues with a look at cardinality estimates (i.e. how the optimizer estimates the number of rows returned), and factors that affect it. While the new cardinality estimator in 2014 produces a better plan in most cases, there are situations where it produces less good query plans, trace flags can be set within the SQL to swap between the cardinality estimators used.
Next some common sense tuning advice is given, tune the longest running queries. Various methods of getting these longest running queries are discussed, including: use of XEs, and use of Dynamic Management Views (DMVs).
This chapter provides detailed discussions of SQL Server’s various physical structures and access methods, a casual reading of these will show this book is not for beginners. Various features are introduced and discussed briefly here, and expanded upon in other sections of the book. If the first chapter represents meaning, this chapter represents implementation.
Chapter 3 Multi-table queries
This chapter looks at various features that are used in joining multiple tables. The chapter opens with a look at subqueries, where one query operates on another. Self-contained subqueries are independent of the outer query – these tend to be easier to debug. Next, correlated subqueries are examined, here the inner query references a column in the outer query. This is followed by a very interesting section on problems due to misbehaving subqueries.
The chapter continues with a look at the various table expressions. In each case the advantage of each is explained. The types examined are:
Derived tables – defined in FROM clause of outer query
Common table expressions (CTEs)
Views – query definition
Table-value functions (TVFs) – like views but accept parameters
The final section of the chapter examines joins, including cross join, inner join, outer join and self-join. Some of the more unusual aspects of joins are also examined, including non-equi joins. The last section looks at the use of the UNION, EXCEPT, and INTERSECT operators.
Again this chapter is full of detailed but readily readable explanations, with lots of useful incidental tips provided too. There’s some very useful code for generating numbers (and used in various examples). Brief mention is given to the use of INNER JOIN or just JOIN – the important things is to be consistent!
Chapter 4 Grouping, pivoting, and windowing
This chapter looks at different grouping features. The chapter opens with a look at the various windows functions – these apply to a window of rows, allowing you to provide summary/ordering data at a lower level of detail. The following types of functions are examined, with useful examples, query plans, outputs and discussions:
Aggregate (e.g. SUM/AVG)
Ranking (e.g. ROW_NUMBER, RANK, NTILE)
Offset (e.g. FIRST_VALUE, LAST_VALUE)
Statistics (e.g. PERCENT_RANK, PERCENT_CONT)
The section ends with an interesting look at various gaps and islands problems i.e. finding gaps in a sequence. If you know the work of Ben-Gan you’ll be familiar with his use of puzzles in solving interesting problems.
The chapter continues with a look at pivoting, this involves rotating rows to columns. Additionally, details are provided to unpivot, where columns are converted to rows. As always, useful code is provided, output examined, performance impact considered, all in an easily digested read.
The chapter next looks at various methods that provide custom aggregations. Methods discussed include using a cursor, using FOR XML, and using string concatenation.
The chapter ends with a look at grouping. This section examines grouping sets (provides multiple grouping in one query), CUBE, ROLLUP, grouping set algebra, and dynamic ORDER BY.
This chapter provided a very instructive discussion on the various windows functions, pivoting, and grouping. In all cases helpful code was provided and the output discussed. Performance concerns were highlighted where applicable.