Page 1 of 3
Author: Benjamin Nevarez
Publisher: McGraw-Hill Osborne
Audience: Intermediate to advanced DBAs and developers
Reviewer: Ian Stirk
This book aims to give you the tools and knowledge to get peak performance from your queries, how does it fare?
The book covers query tuning and optimization, aiming to give you the tools and knowledge to get peak performance from your queries and applications. The optimizer creates an execution plan, detailing how the query will be fulfilled, the quality of this plan depends on various factors (e.g. database design, indexes available etc). This book looks at the factors that affect the plan, with a view to improving query performance.
The book is aimed at SQL Server professionals, database developers, DBAs, and data architects. It assumes some knowledge of SQL Server, and a familiarity with the SQL language.
Below is a chapter-by-chapter exploration of the topics covered.
Chapter 1 An Introduction to Query Tuning and Optimization
The chapter opens with the idea that the more you know about how SQL Server works, the more you’re in a better position to understand and fix SQL problems.
It continues with a brief look at the major architectural components: the storage engine (controls getting data, concurrency, and integrity), and the relational engine (aka the query processor, this creates a plan, executes the query and returns the results). This chapter provides an overview on how the query processor works.
The query processing process is broken down and its steps discussed, namely:
Parsing and binding (parsing ensures SQL syntax is valid, binding is mostly name resolution)
Generating candidate execution plans (want good-enough plan. often potentially lots of plans, uses rules and heuristics, there is limited optimization time)
Assessing the cost of each plan (cost of physical operator [CPU, IO etc] together with estimated rows [cardinality] reflects total cost)
Query execution and plan caching (execution plan stored in plan cache. Cache checked first to see if plan exists, since optimization can be relatively expensive)
The chapter next focuses on execution plans, their components, and how to read them. They are the primary method of interacting with the query processor. The different formats of the plans are discussed, namely: graphical, XML and text – it is noted the latter will be deprecated. Plans often contain warning messages (e.g. NoJoinPredicate), which are useful indications of perhaps a problematic or suboptimal plan. The ability to obtain the plan from a trace or the plan cache is discussed together with the relevant tool usage. Finally, the use of SET STATISTICS TIME and IO are discussed with examples, providing another useful tuning tool.
This chapter provides a very good overview of how the query processor works (parsing, binding, optimization, execution), as well as providing a useful overview of concepts that are used in the remainder of the book. There are useful diagrams showing the query processing, and compilation and recompilation processes. There’s some useful code for creating an Extended Events (XE) session for recording plan information.
There’s a useful tip about examining the schema associated with the XML plan to discover any new plan elements (e.g. StatementOptmEarlyAbortReason, NonParallelPlanReason). There is good example code to illustrate the concepts discussed, good links to related chapters, and links to website for further information – is it is throughout the book.
Chapter 2 Troubleshooting Queries
This chapter builds on the tools discussed in the first chapter, giving you additional tools and techniques to monitor, troubleshoot, and tune problem queries.
The chapter opens with a more detailed look at several useful Dynamic Management Views (DMVs) and Functions (DMFs), specifically:
sys.dm_exec_sessions (shows accumulative resource usage for session)
sys.dm_exec_requests (shows current resource usage of running SQL)
sys.dm_exec_query_stats (shows historical accumulative resource usage)
Useful code is provided to find the most expensive individual queries, as defined by average-CPU use and total_worker_time. Although SQL Trace will be deprecated, it’s still widely used and is very familiar tool. A section briefly discusses setting up monitoring using the SQL trace.
The next section discusses Extended Events (XEs), these are the preferred lightweight monitoring tool. Events, predicates, actions, and targets are briefly explained. Some very useful code is provided that converts SQL Trace events to XEs (sys.trace_xe_event_map joined to sys.trace_events). This is followed by a walkthrough that sets up an XE session using the New Session Wizard GUI. Data can be examined in real-time using the Watch Live Data facility, XQuery code is also provided to extract data from the XE session. Code to find the wait states associated with a given query is also provided.
The chapter ends with a look at the Data Collector. This is typically used for long running monitoring. A walkthrough is provided showing how it is set up, and how to use its reports (for %CPU, memory, disk I/O, and network usage).
This was a very helpful chapter, discussing various tools (DMVs, SQL Trace, XEs, Data Collector), showing how and why each should be used. The mapping of SQL Trace events to XE events should prove useful to database professionals that are new to XEs.
Some immediately useful code was provided, including identifying the most expensive queries, and finding the wait states for a given query – however the reason for using the latter was not provided.
Chapter 3 The Query Optimizer
This chapter expands on the query optimizer steps discussed earlier. The optimizer generates various execution plans, within the limited time, and selects the plan with the lowest cost.
In overview, the steps can be outlined as:
- trivial plan
- stats loading
- heuristic join reordering
- exploration stage 0
- exploration stage 1
- exploration stage 2
- convert to executable plan
The first 4 steps do not need access to the database content.
The DMV sys.dm_exec_query_optimizer_info is discussed, it contains useful information about the various stages of the optimization process. Although accumulative, with careful manipulation, it can be used with single queries or workloads. The specific optimizer steps are discussed next.
Parsing checks the SQL syntax, not the objects. Binding ensures the objects/columns exist, together with appropriate permissions. The output of this step is an algebrizer tree, which contains logical operators that relates to the underlying SQL. The trees are not documented, but it is possible to get some understanding via the DMV sys.dm_xe_map_values and trace flags – examples are given.
Simplification involves making the trees simpler, enabling easier optimization. Steps include: converting subqueries to joins, removing redundant joins, and contradiction detection. Useful example code and discussion is provided for each.
The purpose of the trivial plan step is to identify any obvious best plan, if there is, the optimization process can be terminated. The StatementOptmLevel element of the plan shows if a trivial plan has been used.
The optimizer uses transformation rules, based on relational algebra, to generate alternative but equivalent plans, some of which are cheaper. These alternatives are stored in the Memo structure. The DMV sys.dm_exec_query_transformation_stats gives some information on the various rules. It’s possible to enable/disable transformation rules globally via undocumented DBCC RULEON/RULEOFF, and at the statement level using OPTION QUERYRULEON/OFF -examples are provided of their usage and their impact on plans. The optimizer needs statistics for cardinality estimates, undocumented trace flags 9292 and 9204 can be used to discover what statistics are loaded.
If a trivial plan is not created, then full optimization is undertaken. This has 3 stages, with different rules applied at each stage. Since many potential plans can be generated, heuristics are used to reduce the potential number. If a good enough plan is found at any stage, the optimization process is terminated, and the best plan selected. The reason for exiting the optimization process is recorded in the StatementOptmEarlyAbortReason element of the plan. The stages are:
Search 0 (tries to find plan quickly without complex transformations. Ideal for small queries)
Search 1 (called quick plan. Limited join reordering, can look if parallel query better)
Search 2 (called full optimization. For complex to very complex queries. Larger set of transformations rules, advanced optimization strategies. Plan must be found here)
This was an interesting chapter, containing lots of insight (and some inference) on how the optimizer works. Many undocumented trace flags and options are given to gain a greater insight into the optimization process. The author rightly warns these should be used for educational purposes.
This chapter does a good job of explaining the various steps in the optimization process. Useful code is provided to support the assertions made. There’s a useful diagram showing the optimization process. The use of DMV snapshots to make inferences is a useful technique. The example code relating to contradiction detection and foreign key join elimination were particularly insightful.
Part of the problem with this chapter is the optimization process is only partly documented – so the chapter has the do the best with what is available, and does an admirable job.