SQL Server Query Tuning and Optimization (Packt)
Article Index
SQL Server Query Tuning and Optimization (Packt)
Chapters 2 - 6
Chapters 7 - 9
Chapters 10 - 12, Conclusion

Author: Benjamin Nevarez
Publisher: Packt Publishing Pages: 446
ISBN: 9781803242620
Print: 1803242620
Kindle: B0B42SVBFY
Audience: Intermediate to advanced DBAs and developers
Rating: 4.7
Reviewer: Ian Stirk 

This book aims to give you the tools and knowledge to get peak performance from your queries, how does it fare?

The book, which is largely an update to the author’s Microsoft SQL Server 2014 Query Tuning & Optimization published by McGraw-Hill, with two new chapters, covers query tuning and optimization, aiming to give you the tools and knowledge to get peak performance from your queries and applications. The optimizer creates an execution plan, detailing how the query will be fulfilled, the quality of the plan often depends on the data given to it (e.g. database design, indexes available etc). This book looks at these factors with a view to improving query performance.

The book is aimed at SQL Server professionals, database developers, DBAs, and data architects. It assumes some knowledge of SQL Server, and a familiarity with the SQL language. It covers all supported versions of SQL Server, including SQL Server 2022.

Below is a chapter-by-chapter exploration of the topics covered.

Banner

Chapter 1 An Introduction to Query Tuning and Optimization 

The chapter opens with the idea that the more you know about how SQL Server works, the more you’re in a better position to fix SQL problems. It continues with a brief look at the major architectural components: the storage engine (controls getting data, concurrency, and integrity), and the relational engine (aka the query processor, this creates a plan, executes the query and returns the results). This chapter looks at how the query processor works.

The query processing process is broken down and its steps discussed, namely: 

  • Parsing and binding (parsing ensure SQL syntax is valid, binding is mostly name resolution)

  • Query optimization:

  • Generating candidate execution plans (want good-enough plan. often potentially lots of plans, uses rules and heuristics, need to limit optimization time)

  • Assessing the cost of each plan (cost of physical operator [CPU, IO etc] together with estimated rows [cardinality] reflects total cost)

  • Query execution and plan caching (execution plan stored in plan cache. Cache checked first to see if plan exists, since optimization can be relatively expensive) 

The chapter next focuses on execution plans, their components, and how to read them. They are the primary method of interacting with the query processor. The different formats of the plans are discussed, namely: graphical, XML and text – it is noted the latter being deprecated. Various plan attributes are discussed (e.g. STATEMENTOPTMLEVEL). Plans often contain warning messages (e.g. NoJoinPredicate), which are useful indications of perhaps a problematic or suboptimal plan. The ability to obtain the plan from a trace or the plan cache is discussed together with the relevant tool usage. Finally, the use of SET STATISTICS TIME and IO are discussed with examples, providing another useful tuning method.

This chapter provides a very good overview of how the query processor works (parsing, binding, optimization, execution), as well as providing a useful overview of concepts that are used in the remainder of the book. There are useful diagrams showing the query processing, and compilation and recompilation processes. There’s some useful code for creating an Extended Events (XE) session for recording showplan information.

The chapter does have some shortcomings. Firstly, it should have been updated to refer to the execution plans held in Query Store. Secondly, in obtaining the top 10 most expensive queries by CPU usage, the author sorts by average CPU – this is wrong, you need to sort by total_worker_Time. This is the second consecutive book review where I’ve seen this error. The explanation for my correction is: if a query takes 20 mins of CPU to run, and runs twice, its average CPU usage is 20 mins, and its total CPU usage is 40 mins, if another query takes 2 mins of CPU to run, and is run 1000 times, its average CPU usage is 2 mins, and its total CPU usage is 2,000 mins. If you sort the results by average CPU usage, the 20min query appears top, even though it is not the query that is using the most CPU. 

There’s a useful tip about examining the schema associated with the XML plan to discover any additional plan properties (e.g. StatementOptmEarlyAbortReason, NonParallelPlanReason). There is good example code to illustrate the concepts discussed, good links to related chapters, and links to website for further information – these traits apply to the whole book. 



Last Updated ( Wednesday, 23 November 2022 )