Microsoft SQL Server 2014 Query Tuning & Optimization |
Page 3 of 3
Author: Benjamin Nevarez Publisher: McGraw-Hill Osborne ISBN: 978-0071829427 Print: 0071829423 Kindle: B00N9IC9PY Audience: Intermediate to advanced DBAs and developers Chapter 8 Plan Caching After going through the optimization process, the optimizer produces a plan. Creating this plan can be expensive, so it makes sense to reuse a plan where possible. This chapter discusses plan caching, plan reuse and problems associated with plan reuse. The plans are stored in the plan cache. The chapter opens with a look at the steps in the compilation and recompilation process. There’s a good diagram of the compile and recompile process. Performance Monitor (perfMon) can be used to determine the number of recompilations, if there are too many, SQL trace or XEs can be used to find their causes – a useful short example of this is provided. The plan cache can be explored using the DMV sys.dm_exec_query_stats, having accumulated IO, CPU, etc. While this DMV has limitations, it tends to be quick and easy to use. The use of the DMV sys.dm_exec_cached_plans is also discussed. The section ends with a review of the methods of removing plans from the cache, these include: clearing all of the cache, the cache for a given database, and for a given query. The chapter moves on to discussing parameterization. Queries that are the same, typically reuse the same plan, however SQL Server tends to be quite conservative when the SQL differs slightly. It is possible to encourage/ force SQL Server to relax these rules and thus reuse the same plans for related queries. The section discusses these methods including: autoparameterization, the “optimize for ad hoc workloads” setting, forced parameterization, and use of stored procedures. The plan is typically created based on the value of the parameters used, this is called parameter sniffing, and is generally a good thing. If these values are typical, then good reuse should occur. However, if the values are atypical, subsequent runs can be slow. Various methods of overcoming the problems associated with parameter sniffing are discussed including:
This chapter provided a good overview of how a plan is cached and reused, how it can be encouraged to be reused, the problems parameter sniffing can cause and options for how it can be resolved. There’s a useful diagram on the compile and recompile process, and various decision steps. Chapter 9 Data Warehouses Most of what’s been discussed in previous chapters is applicable to both OLTP and Data Warehouse systems. This chapter now concentrates on optimizations that relate to Data Warehouse systems. The chapter opens with a comparison between OLTP and Data Warehouse systems. Data Warehouse systems typically use historical data, with complex long running queries (often performing aggregations). The database is typically denormalized and often more suitable for ad hoc queries. The discussion continues with reference to star schema, fact and dimension tables, and additional normalization giving a snowflake schema. Big Data is briefly mentioned, with reference to Microsoft’s products (Parallel Data Warehouse, PolyBase, and HDInsight). Star join queries join fact and dimension tables, typically aggregating and applying filters. Heuristics are used to identify the fact and dimension table allowing the query to be optimized. Bitmap filtering can be applied, for parallel plans, which does a semi-join early, allowing rows to be dropped before entering the parallel processing step. Examples of this are provided and related to the plan content. Columnstore indexes offer an order of magnitude of performance improvement. Introduced in SQL Server 2012, they are now updatable, and can involve a clustered index. Data is stored in columns instead of rows, and compressed to improve performance further. Performance benefits include reduced IO, segment elimination and batch mode processing. In batch mode processing, instead of processing/getting one row at a time (the default), the data is processed in batches of 1000 rows. In SQL Server 2012 only a few operators could take advantage of batch mode processing, this has been increased in SQL Server 2014. The plan shows both the estimated and actual mode used. SQL code is provided to create columnstore indexes using both clustered and non-clustered indexes. This chapter explains some data warehouse system specifics, notable batch mode processing and use of columnstore indexes. There’s a useful table comparing OLTP and data warehouse systems. There’s a helpful reference to big data, with reference to Microsoft products, together with a reference to Kimball’s standard data warehouse book. Chapter 10 Query Processor Limitations and Hints This chapter starts with a look at the history of optimizer research (starting in the 1970s), and then looks at how join order and query complexity affect the optimizer, together with its limitations. Finally the use of hints, and the various types of hints are discussed. Join order affects the amount of data flowing between operators. Joins can be redefined using relational mathematics, to give alternatives that give the same results. In particular, the left-deep and bushy trees are examined here. As the number of joins increases, so the number of potential plans increases dramatically. The optimizer needs to balance the cost of optimization time with the production of a quality plan. Heuristics are used to reduce the number of potential plans. Often it’s possible to break down a complex query to produce a better plan. Having multiple, but simpler queries, gives the optimizer more options, also having intermediary temporary tables allows up to date statistics to be created and used in the plan. Breaking a query down into more steps is preferable to using hints. Hints should only be used as a last resort since they limit the number of transformations that can be applied, reducing the optimizer’s functionality. Hints need more maintenance, and their impact may change with each version or service pack of SQL Server. Before hints are used, ensure the problem doesn’t relate to IO or cardinality estimates (ways of updating the stats have been discussed earlier). The last section discusses, with examples, the various types of hints, including: joins, aggregations, force order, indexseek, forcescan, forseek, fast n, and noexpand. Lastly the creation and use of plan guides is examined. This chapter provided some interesting reading, especially the section on breaking down a query (there’s a link to the author’s paper on the same topic provided). The comment about using hints as a last resort is useful. The chapter includes helpful sample code that illustrates what happens when you use a particular hint.
Appendices The appendices contain sections relating to: References, White papers, Articles, Research papers and Books. After reading and digesting a worthy tome, many readers wonder where they should go next to get more information about the subject matter - this section of the book provides the answers. I enjoyed this section of the book, I wish other books contained something similar. There are some great resources listed for investigating topics further. It might have been useful for the author to comment briefly why he included the named articles etc, and perhaps include them at the end of the relevant chapter. Conclusion This book aims to give you the tools and knowledge to get peak performance from your queries and applications. It differs from other performance related books by concentrating on the factors that influence the production of a good execution plan, and thus improve performance. The book is generally easy to read, this is quite an achievement since some of the concepts are deep and undocumented. There are good links between the chapters, helpful website links for further information, good diagrams to support the discussions, and helpful example code to support the assertions made. The appendices should prove very useful. I enjoyed the author’s pragmatic approach e.g. use hints as a last resort, and always test recommendations on your own system. It might be argued that the book is not as diverse as it could be, since it doesn’t cover all aspects of performance e.g. RAID levels, SQLDIAG, wait stats and queues. I think this misses the point, the book concentrates on those factors that influence the quality of the execution plan. Besides, the additional references provided will cover these other areas. The book is suitable for intermediate to advanced level DBAs/developers, and should take your level of understanding of performance tuning in general, and the optimizer in particular, from level 5 to 8 or 9. Although the book is primarily for SQL Server 2014, it typically indicates in which version of SQL Server a given feature was introduced, so should prove useful in SQL Server 2012 and 2008. I enjoyed reading this book. If you spend time troubleshooting performance problems, or you want a deeper understanding of the optimizer and the factors that influence it, I can heartily recommend this excellent book.
|
|||||||
Last Updated ( Tuesday, 16 December 2014 ) |