Microsoft SQL Server 2014 Query Tuning & Optimization
Article Index
Microsoft SQL Server 2014 Query Tuning & Optimization
Chapters 4 -7
Chapters 8 - 10, Conclusion

Chapter 4 Query Operators

The query operators in the plan perform the actual work, while there are many operators, here the discussion concentrates on the major ones: data access, aggregation, joins, parallelism, and updates.

The chapter opens with a look at data access operators, these access tables or indexes. The specific operators discussed, with examples, are: 

  • Scans (read the entire structure)

  • Seeks (looks for one or a few rows)

  • Bookmark lookups (data not in index, so goes to underlying clustered index row)

Next aggregates, which summarise information, are examined. There are 2 types, stream aggregates and hash aggregates. The specific operators discussed, with examples, are: 

  • Sorting and hashing (both build hash table, sorting can spill to tempdb)

  • Stream aggregate (from plan, stream aggregation of AVG = COUNT / SUM, checking for 0)

  • Hash aggregate (wants sorted input. big tables, not sorted.)

  • Distinct sort (distinct can be rewritten as a GROUP BY) 

The section on joins discusses the 3 join types, and states that no one type is best – but depends on the particular circumstances. The specific operators discussed, with examples, are: 

  • Nested loop (small input, for each outer row loop around all rows in inner table. So cost proportional to size of outer input multiplied by size of inner input)

  • Merge (wants sorted input, can sort if cheaper, eqi-join only. Max cost is sum of both inputs)

  • Hash (similar to merge join but input don’t need to be sorted. Cost is sum of both inputs) 

Next parallelism is examined. If there are at least 2 CPUs, running the query with parallelism might be cheaper. The “max degree of parallelism” and “cost threshold for parallelism” server configuration options are discussed. Additionally, parallelism can be restricted/increased at the query level using the MAXDOP hint. Useful examples are provided.

The last section looks at the update process, this may involve: updating indexes, validating constraints/RI, running trigger, and running cascading updates. The process can become quite complex as the number of tables increases. Per-row and per-index plans are discussed, together with the how the Halloween problem is overcome.

This was another insightful chapter, concentrating on the major operators that perform the actual work. This chapter should prove useful in deciphering your own plans. There’s some instructive code showing how similar SQL can produce differing plans based on row count estimates. One example shows either a nested loop join or a merge join being used, another example shows a key lookup or a clustered index scan being used – based on the row count estimates of the supplied input.

It’s instructive to see how indexes and row counts can influence the operators used, showing the importance of indexes and statistics – these are discussed in the next 2 chapters.

Chapter 5 Indexes

Indexes are often the primary means of improving query performance. This chapter discusses the different types of indexes, their structure, how they affect performance, together with missing indexes, and the effect of index fragmentation.

The chapter opens with a summary of the different types of indexes. Heaps, clustered indexes, non-clustered indexes, unique indexes, and primary keys are discussed with useful examples. The properties of clustered indexes (i.e. unique, narrow, static, and increasing) are discussed with examples. Both covered and filtered indexes are discussed.

Next, index operations are examined, including navigating through B-trees to find data via a scan or seek. Various predicates are examined (e.g. =, <, > IN, BETWEEN, !=), followed by a helpful discussion showing why an index might not be used.

The Database Engine Tuning Advisor (DTA) is discussed next. This tool takes a query or workload as its input, and determines which indexes should be created or removed, to enhance the query’s performance. The examples show how the plan cache can be used (in addition to a file or table), to identify which indexes should be created/removed. An interesting section on using the DTA to offload the tuning overhead from a production server to a test server is discussed.

Next, missing indexes are examined, details of these are included within individual execution plans, and also accumulatively via the missing index DMVs. There are some limitations, and these are discussed in a website link provided.

Index fragmentation occurs when the logical row order doesn’t match the physical row order, it can be detrimental to the performance of scans. The DMF sys.dm_db_index_physical_stats can be used to determine the amount of defragmentation, typically indexes with >30% defragmentation are rebuilt, and those between 10% and 30% are reorganised. Lastly, code is provided to identify unused indexes by querying the DMV sys.dm_db_index_usage_stats, these can be considered for removal since while they are not used for retrieval they are likely to degrade the performance of updates.

This was a more traditional chapter, detailing information that’s mostly available elsewhere, but needed here for completeness, besides it very well written. The author makes the valid point of ensuring you test any recommendation on your own system instead of blindly applying them. The section using a test server to tune a production workload was very useful.

I was surprised the section on unused indexes didn’t warn readers that the index might be needed quarterly etc, so ensure further analysis is done before you remove indexes.

Chapter 6 Statistics

Database statistics contain details about the distribution and density of column values, they are used as input to the optimizer to provide a cardinality estimate (number of rows), and thus affect the plan cost. It is very important the statistics are up to date. The chapter opens with a discussion about the importance of having useful statistics to help the optimizer produce a good plan.

The section on creating and updating statistics shows the importance of keeping the default database configuration options for auto_create_stats and auto_update_stats. It is noted that index rebuilds result in up to date statistics with 100% sampling. Statistics are typically automatically updated after a given threshold is exceeded (e.g. ~20% of the underlying data changes), however this can cause problems for newer data, it is possible to use trace flag 2371 to lower this threshold.

There’s a useful section on inspecting stats objects using the DBCC SHOW_STATISTICS command. The objects contain 3 pieces of information: density, histogram and string statistics, which are examined in detail, and examples relate it to the cardinality estimate in the plan.

SQL Server 2014 includes a new cardinality estimator that can give better performance, especially if the columns are dependent. If performance degrades, it is possible to use the previous cardinality estimator – examples are provided of using both.

Incremental statistics, filtered statistics, statistics on ascending columns, and statistics on computing columns are all described with helpful examples. Perhaps one of these will solve an unusual performance problem you might have.

Since statistics are vitally important in creating a good plan, it is important that they are kept up to date. The default automatic updating of statistics may not be adequate. It is possible to update your statistics more regularly, and provide a bigger sample.

This chapter provided a comprehensive discussion concerning all aspects of database statistics. In addition to being wide ranging, useful example code is provided throughout to illustrate the concepts. The detailed example showing how the cardinality estimate is derived from the statistics was very useful. It might have been helpful to have provided a routine that shows the current state of the statistics.

Chapter 7 In-Memory OLTP aka Hekaton

This chapter starts with a review of the history of database architecture. Initially, memory was expensive, so relatively small amount of RAM were used, and data persisted to disk. However, memory is now much cheaper, meaning this model can be changed, enabling the use of more RAM – this is especially useful for OLTP systems, which can typically fit into memory.

In-Memory OLTP is optimized for memory, and relates largely to compiled in-memory tables and indexes, compiled stored procedures, and a lock and latch free concurrency model – all of which can increase performance significantly. Unlike other vendors, Microsoft’s In-Memory OLTP technology is fully integrated into the existing architecture (i.e. can use old and new together).

The chapter examines In-Memory OLTP tables and indexes. These can be accessed quickly via natively compiled stored procedures, and the more traditional interpreted SQL. The lock and latch free multiversion concurrency control (MVCC) model permits multiple versions of a row to exist, but only one is eventually valid, the rest will be removed via the garbage collector. Indexes are never stored on disk or logged, and are rebuilt after a restart.

The chapter continues with example code to create an In-Memory OLTP filegroup, tables and indexes. The data can be persisted to disk (this is the default), for recoverability purposes, via the schema_and_data durability setting. In-Memory OLTP tables and natively compiled stored procedures have limitations, these are briefly discussed. The structure of the row is briefly discussed, being a row header and payload. There are 2 types of index, hash indexes and range indexes.

Hash indexes are used in equality predicates and lookups, you need to supply all the indexes’ columns to use this index. Hash indexes have a bucket_count that pre-allocates the number of buckets for the hashes – it is important to get this value correct, too few will mean duplicates and too many will waste space. Information about hash indexes can be obtained from the DMV sys.dm_db_xtp_hash_index_stats.

Range indexes are used in equality or inequality predicates, they can return sorted data, and are typically better for inequality searches and scans. It is possible to search using some of the (leading) columns of the index. A new Bw-tree structure is used to navigate the index. Useful example code shows how index access is linked to the plan content.

Natively compiled stored procedures can only access In-Memory OLTP tables. They are compiled, and use fewer CPU instructions, giving improved performance. The syntax for creating natively compiled stored procedures is discussed, and the impact of isolation level is explained. Statistics on In-Memory OLTP tables/indexes are not automatically updated, to take advantage of new data the statistics need to be updated and the stored procedures dropped and recreated. Various limitations of tables and natively compiled stored procedures are described.

The chapter ends with a brief look at the Analysis, Migrate, and Report (AMR) tool. This tool uses a recorded workload, to determine which tables and stored procedures are the best ones to migrate. There’s a useful walkthrough on how to use the AMR tool.

This chapter provides useful information about the In-Memory OLTP engine, which is probably the main reason to migrate to SQL Server 2014 - since it potentially offers big performance gains. There are some useful diagrams included e.g. index structures.

Last Updated ( Tuesday, 16 December 2014 )