Expert SQL Server In-Memory OLTP (2nd Ed)
Written by Ian Stirk
Author: Dmitri Korotkevitch
Chapter 5 Nonclustered Indexes Nonclustered indexes, like their on-disk counterparts, are optimized for searching a range of data values. The chapter has example code to create a nonclustered index; again, the index is cooked into the table definition itself. Nonclustered indexes are good for index seeks that locate a subset of data. The SARGability rules are similar to those for on-disk indexes; however, the index is unidirectional, so if you want to search an index in both directions, you need to create two indexes. Various example code is provided. The chapter continues with a look at internals. A variation on the traditional B-tree is used, the Bw-tree, and this is described. Index delta records are created for inserts and deletes; an update creates two delta records, a delete and an insert. SQL Server needs to traverse and analyze all delta records when accessing an index page, and long chains can degrade performance. Delta records are periodically consolidated into new index pages, and the old page is marked for collection by the Garbage Collector. The chapter next looks at how to obtain information about nonclustered indexes, primarily via the DMVs sys.dm_db_xtp_index_stats and sys.dm_db_xtp_nonclustered_index_stats; example code showing their usage is discussed. The chapter ends with a look at when to use hash or nonclustered indexes. Hash indexes are best for point lookups with equality predicates, and are a good fit if the workload and data are relatively static. Nonclustered indexes have much wider usage: as well as being useful for identifying a subset of rows, they may help if a hash index has long chains due to an incorrect bucket_count; code is given that illustrates this. Overall, the author suggests nonclustered indexes may be a safer choice, especially if it is difficult to estimate the bucket count. This chapter provides a useful discussion of nonclustered indexes, describing their internal structure, how they’re created, their SARGability rules, how to obtain information about them, and how they compare to hash indexes.
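As a minimal sketch of the syntax described here (the table and column names, and the bucket count, are my own illustration rather than the book’s examples), a nonclustered index on a memory-optimized table must be declared inline with the table, and its usage can then be inspected via the DMV the chapter covers:

  -- Memory-optimized table with an inline nonclustered (Bw-tree) index
  CREATE TABLE dbo.Customers
  (
      CustomerId INT NOT NULL
          PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
      LastName VARCHAR(64) NOT NULL
          INDEX IX_LastName NONCLUSTERED,   -- unidirectional: efficient in key order only
      FirstName VARCHAR(64) NOT NULL
  )
  WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

  -- SARGable range seek; an efficient reverse-order scan would need a
  -- second index declared with a DESC key
  SELECT CustomerId, LastName
  FROM dbo.Customers
  WHERE LastName >= 'S';

  -- Index usage statistics for the table
  SELECT i.name, s.scans_started, s.rows_returned
  FROM sys.dm_db_xtp_index_stats s
      JOIN sys.indexes i ON s.object_id = i.object_id AND s.index_id = i.index_id
  WHERE s.object_id = OBJECT_ID('dbo.Customers');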
Chapter 6 Memory Consumers and Off-Row Storage This chapter discusses how In-Memory OLTP allocates memory for various objects, and specifically how off-row column data is stored. The chapter opens with a look at varheaps, the data structures that track memory allocations made by memory consumers (the various database objects). The three common types of varheap are discussed: hash index, range index, and table heap. Next, both in-row and off-row storage are discussed, with example code. The chapter then considers the performance impact of off-row storage. Generally, off-row storage is fast at creating versioned rows; however, it is relatively slow at performing modifications (i.e. inserts/updates) and at selecting data. As a general rule, it is recommended to avoid off-row storage where possible.
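As a hedged illustration (my own example, not the book’s), a (MAX) column is stored off-row and is backed by its own varheap, which can be seen in the memory consumers DMV:

  -- Table with one in-row column and one off-row column
  CREATE TABLE dbo.Documents
  (
      DocId INT NOT NULL
          PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 65536),
      Title VARCHAR(100) NOT NULL,   -- small enough to be stored in-row
      Body  VARCHAR(MAX) NULL        -- (MAX) data is stored off-row, in its own varheap
  )
  WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

  -- One row per memory consumer (varheap) belonging to the table
  SELECT memory_consumer_type_desc, memory_consumer_desc,
         allocated_bytes, used_bytes
  FROM sys.dm_db_xtp_memory_consumers
  WHERE object_id = OBJECT_ID('dbo.Documents');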
Chapter 7 Columnstore Indexes While In-Memory OLTP technology is primarily targeted at OLTP environments, OLAP columnstore indexes can also take advantage of the technology. The chapter discusses how columnstore indexes store index data in a column-wise fashion instead of the more usual row-wise pattern; this allows considerable space saving through compression, giving improved performance. Performance can be enhanced further if a SQL query requires only a limited number of columns. Next, the chapter looks at other factors that can impact performance; in particular, the COMPRESSION_DELAY index option can significantly reduce the storage of old versions of data. The chapter briefly looks at some of the limitations of columnstore indexes.
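As a sketch of the COMPRESSION_DELAY option mentioned above (the table is illustrative, and 60 minutes is an arbitrary value, not a recommendation from the book):

  -- Memory-optimized table with a clustered columnstore index;
  -- COMPRESSION_DELAY keeps recently modified rows out of compressed
  -- row groups, reducing the storage of old row versions
  CREATE TABLE dbo.OrderFacts
  (
      OrderId   INT   NOT NULL PRIMARY KEY NONCLUSTERED,
      OrderDate DATE  NOT NULL,
      Amount    MONEY NOT NULL,
      INDEX CCI_OrderFacts CLUSTERED COLUMNSTORE
          WITH (COMPRESSION_DELAY = 60 MINUTES)
  )
  WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);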
The chapter ends with a brief look at the various catalog views and DMVs that provide useful information about columnstore indexes. This chapter provides a useful look at improvements that can be applied to OLAP/data warehouse environments.
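One DMV the chapter is likely to cover is sys.dm_db_column_store_row_group_physical_stats; assuming the dbo.OrderFacts table from the sketch above, row-group state and size give a feel for compression behaviour:

  SELECT row_group_id, state_desc, total_rows, deleted_rows, size_in_bytes
  FROM sys.dm_db_column_store_row_group_physical_stats
  WHERE object_id = OBJECT_ID('dbo.OrderFacts');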
Chapter 8 Transaction Processing in In-Memory OLTP This chapter opens with a look at what a transaction is, and then describes its attributes: Atomic, Consistent, Isolated, and Durable (ACID). The transaction isolation level can be changed to alter the concurrency, but this can also introduce concerns, including dirty reads, non-repeatable reads, and phantom reads; each of these is discussed. On-disk processing uses locking to address concurrency problems (unless snapshot isolation is used), a pessimistic concurrency model. In-Memory OLTP uses row versioning and an optimistic concurrency model, performing validation when trying to commit data. Helpful example code is provided that shows the impact of the isolation level on concurrency. The chapter continues with a look at cross-container transactions, which are used by interpreted SQL to access In-Memory tables. It’s possible to have different isolation levels for on-disk and In-Memory tables, but not all combinations are supported. Various code examples are provided. A useful general rule is given: where possible, use read committed/snapshot combinations in cross-container transactions for regular work. The chapter ends with a look at the transaction lifetime, discussing important In-Memory elements, including: Global Transaction Timestamp, TransactionId, and BeginTs/EndTs. A useful diagram shows the transaction lifetime under various conditions (e.g. data modification, successful commit, and validation error/rollback). This chapter provides a useful discussion of transaction processing in In-Memory OLTP. Helpful diagrams and code examples are provided to support the text.
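As a minimal sketch of the read committed/snapshot rule (the table names are illustrative), in a cross-container transaction the isolation level for the In-Memory table is supplied with a table hint:

  SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
  BEGIN TRANSACTION;
      -- On-disk table read under read committed; the In-Memory table is
      -- read under snapshot via the WITH (SNAPSHOT) hint (alternatively,
      -- the MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT database option can be ON)
      SELECT o.OrderId, c.LastName
      FROM dbo.OrdersOnDisk o
          JOIN dbo.Customers c WITH (SNAPSHOT)
              ON o.CustomerId = c.CustomerId;
  COMMIT;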
Chapter 9 In-Memory OLTP Programmability This chapter opens with a look at native compilation. In-Memory tables can be used by interpreted SQL via interop, often giving a 2 to 4 times performance improvement compared with on-disk tables. Interpreted SQL often requires thousands of CPU instructions; natively compiled SQL reduces this number significantly. Natively compiled SQL routines (e.g. stored procedures) can now be altered; previously, in SQL Server 2014, they needed to be dropped and recreated. A useful diagram describes the compilation process. Native code is very efficient and gives a major performance improvement when accessing In-Memory tables. Example code is provided for creating natively compiled stored procedures, and the syntax options are discussed (a sketch is given after the Chapter 10 summary below). Additionally, the main T-SQL features permitted are discussed (some features are currently not permitted). Parameter sniffing is absent; instead, optimize for unknown is used. There is a reminder that if the data changes significantly, you should refresh the SQL routines. The chapter continues with a look at how interpreted SQL uses In-Memory tables; they behave in a similar manner to on-disk tables, and most T-SQL is supported, with some exceptions listed. A useful comparison section follows, with code examples illustrating how inserts/deletes/updates/selects compare when running as interpreted or native code. It’s noted that no parallel plans are created, and thus In-Memory OLTP may be less useful in data warehousing, which requires large scans and complex aggregations. The chapter ends with a look at In-Memory table types and table variables. These are stored in memory, not tempdb, and they can be used as a replacement for table variables and temp tables. This chapter provides useful discussions concerning native compilation, how it can be used, its advantages, and its limitations. Example code that compares performance is especially useful.
Chapter 10 Data Storage, Logging, and Recovery This chapter opens with a look at the durability option of In-Memory tables. This ensures changes are logged to disk, allowing for recovery. While on-disk tables contain only the latest data, In-Memory tables can contain multiple versions of a row; changes are written to Checkpoint File Pairs (CFPs). A CFP consists of a data file and a delta file, covering a range of Global Transaction Timestamp values; inserts go into the data file and deletes into the delta file (remember, an update is an insert and a delete). Using a separate delta file for deletes avoids random IO, since both files are append-only, and this improves performance. The various states of the CFP are discussed. The chapter continues with a look at transaction logging. Logging of In-Memory tables is much more efficient than on-disk logging. Index changes are not logged, since indexes are recreated when the table is recovered. Changes are logged at commit time, so uncommitted data is not logged, and no undo information is written. There are useful examples showing the log content when inserting data into an on-disk and an In-Memory table, via the undocumented sys.fn_dblog function. Next the chapter looks at recovery. The recovery process is described: the In-Memory OLTP engine uses the most recent checkpoint inventory to get a list of the active CFPs and loads data from them, before applying any changes in the tail of the transaction log. The chapter ends with a look at table alterations, and the impact of metadata-only changes versus regular table changes.
It’s noted that SQL Server 2016 Service Pack 1 contains several useful performance enhancements for table alterations. This chapter provides a good discussion of the recovery process, including the various steps. Useful examples show the log content via the undocumented sys.fn_dblog function.
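To make the Chapter 9 material concrete, here is a hedged sketch of a natively compiled stored procedure (the WITH options and ATOMIC block are the required syntax; the procedure itself is my own illustration against the dbo.Customers table used earlier):

  CREATE PROCEDURE dbo.GetCustomerById
      @CustomerId INT
  WITH NATIVE_COMPILATION, SCHEMABINDING
  AS
  BEGIN ATOMIC WITH
      (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
      -- Compiled to native code; no parameter sniffing, optimize for unknown is used
      SELECT CustomerId, FirstName, LastName
      FROM dbo.Customers
      WHERE CustomerId = @CustomerId;
  END;

And, in the spirit of Chapter 10’s logging examples, a look at the log via the undocumented sys.fn_dblog function (In-Memory OLTP log records appear as LOP_HK operations; being undocumented, the function’s output may change between versions):

  SELECT [Current LSN], Operation, [Log Record Length]
  FROM sys.fn_dblog(NULL, NULL)
  WHERE Operation LIKE '%HK%';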
Last Updated ( Tuesday, 01 May 2018 )