Expert Performance Indexing in SQL Server 2019 |
Page 1 of 2 Author: Jason Strate This book discusses indexes, a primary means of improving performance in SQL Server, how does it fare? Indexes are typically a dry subject matter, however they are fundamental to both understanding the structure of data and helping improve query performance. I did wonder why anyone would want to write a book solely about indexes. After all, if you wanted to know about a given make of car, would you buy a book that’s only about the car’s engine? The answer I think, from a performance perspective, is yes. Before reading the book, I had a list of questions that I hoped would be covered, this would let me know how detailed the book was. These included:
It successfully answered most of my questions – showing me the book had both depth and quality. The book, the 2rd edition of the title, is an update on the 2014 edition, with there being three new versions of SQL Server since that time (i.e. SQL Server 2016, 2017, 2019). Below is a chapter-by-chapter exploration of the topics covered. Chapter 1 Index Fundamentals The chapter provides a comprehensive overview of what indexes are, the different types, and why they are important. The types of index discussed include heaps (sic), clustered, non-clustered, columnstore, xml, spatial, hash, range, and full-text search. Also covered are primary keys, unique indexes, included columns, partitioned indexes, filtered indexes, and compression. The various options used for creating an index are covered in good detail. These options include fillfactor, pad_index, sort_in_tempdb, ignore_dup_key, statisitics_norecompute, drop_existing, online, allow_row_locks, allow_page_locks, maxdop, data_compression, on partitions. A brief overview of the system tables that contain index metadata is provided, including: sys.indexes, sys.index_columns, sys.xml_indexes, sys.spatial_indexes, sys.column_store_dictionaries, and sys.column_store_segments. In SQL Server 2019, index rebuilds are resumable, detail on this are described briefly. The Table of Contents lists sys.selective_xml:index_paths and sys.selective_xml:index_namespaces - which are nonsense, helpfully the underlying chapter correctly uses SYS.SELECTIVE_XML_INDEX_PATHS and SYS.SELECTIVE_XML_INDEX_NAMESPACES. Useful discussions, diagrams, practical example code, and website links are given throughout. These traits apply to the whole of the book. Chapter 2 Index Storage Fundamentals There is great detail in this chapter, probable more than you would need to know. Parts of it are academic, however it does give you a detailed understanding of how data is physically stored and related. Rows, pages and extents are discussed, before delving deep into the various page types. The different types of page tie the data structures together, and physically implement the database. Page types covered are: File header, boot, PFS, GAM, SGAM, DCM, BCM, IAM, Data, Index, and LOB (text/image). Index organization in terms of heaps, b-trees and columnar is described. Various DBCC commands are discussed, and example SQL code provided, to look at the content of individual physical pages. I’ve used this in the past when DBCC CHECKDB has reported corruption (unfortunately for me, the typical response of a database restore was not possible due to missing trans logs, but luckily the corruption was limited to a non-clustered index that could easily be rebuilt :-). The chapter continues with a discussion of a major cause of index degradation – fragmentation. It covers forwarded records (for heaps only) and page splits. Chapter 3 Index Metadata and Statistics The chapter opens with a look at index statistics i.e. information about the density and distribution of column values in an index, these are used by the optimizer to create an efficient access path to the data. The chapter provides a detailed look at what index statistics are and why they are important. Stale statistics can make SQL that should run in a few seconds take many minutes (or hours) to run. Helpfully, SQL is provided to examine the state of your index statistics. Usage stats in relation to the DMO (Dynamic Management Object) sys.dm_db_index_usage_stats are discussed next. This DMO contains metrics about how an index is being used (i.e. seek, scan, lookup), in terms of the number of batches that use this index (rather than the number of rows accessed via each method). It also provides associated last accessed dates for each index access method. This can be useful in determining how, when and if an index is used. Operational stats in relation to the DMO sys.dm_db_index_operational_stats are then discussed. This DMO contains metrics about how an index is being used (i.e. seek, scan, lookup), in terms of the number of rows accessed via each method. Additional useful columns relate to row/page locks, lock escalations, latch contention, and compression are discussed in detail. Finally, index metrics relating to space and fragmentation as discussed in the DMO sys.dm_db_index_physical_stats. Fragmentation can result in data taking longer to read since the data is not in the required logical/physical sequence. Fragmentation is typically removed during a maintenance period. While it is great having lots of details about the various DMOs, and how the indexes are being used, I wanted more information about how to interpret the data (some is given), and what to do next. Chapter 4 XML Indexes Chapter 5 Spatial Indexing Chapter 7 Full-Text Indexing These are atypical indexes, relating to unstructured data. I think they’re included for completeness rather than for detail (after all they contain the word ‘index’). That said, in each case a satisfactory overview is given with examples to get you started in your investigations. XML indexes allow you to query XML by storing path, tags, and value details in an index. Spatial data relates to the geometry and geography data types. Full-text indexing relates to indexing text based content/documents. In the book’s previous edition, Full-Text Indexing was Chapter 6 - it is now Chapter 7. This seems to be an error in editing, since it makes more sense to group the ‘peripheral’ index chapters together. Indeed the book’s introduction has grouped it contiguously. Chapter 6 Indexing Memory-Optimized Tables In-Memory tables are stored in memory, unlike their on-disk counterparts. They are the primary reason for upgrading to SQL Server 2014, providing significant performance improvements. The chapter describes what In-Memory tables are, and how to create them. The chapter next discusses the two indexing options associated with In-Memory tables: hash indexes are optimized for key lookups, and range indexes are useful for searching for a range of data. Example code is given to illustrate the usage of both types of index. In later editions of SQL Server (from 2014) the many restrictions associated with using In-Memory processing have been progressively removed. The chapter says there is a maximum limit of 8 indexes for in-memory tables, this is incorrect, starting from SQL Server 2017 there is no longer any limit. Similarly, it states in-memory indexes can’t be changed after they’re been created, but the ALTER TABLE statement can be used with in-memory tables to change these indexes. This chapter needed better updating. |
|||
Last Updated ( Tuesday, 14 January 2020 ) |