Expert SQL Server In-Memory OLTP
Author: Dmitri Korotkevitch

Chapter 6 In-Memory OLTP Programmability

This chapter opens with a look at native compilation. In-Memory tables can be used by interpreted SQL via the interop engine, often giving a 2 to 4 times performance improvement compared with on-disk tables. Interpreted SQL often requires thousands of CPU instructions per statement, while natively compiled SQL reduces this significantly. Natively compiled stored procedures cannot be altered; instead they need to be dropped and recreated. A useful diagram describes the compilation process. Native code is very efficient and gives a major performance improvement when accessing In-Memory tables. Example code is provided for creating natively compiled stored procedures, and the syntax options are discussed (illustrated in the sketch below). Additionally, the main T-SQL features permitted are discussed (some features are currently not permitted). Parameter sniffing is absent; instead, optimize for unknown is used. There is a reminder that if the data changes significantly, you should update the statistics, then drop and recreate the stored procedure.

The chapter continues with a look at how interpreted SQL uses In-Memory tables: they behave in a similar manner to on-disk tables, and most T-SQL is supported, although some exceptions are listed. A useful comparison section follows, with code examples illustrating how inserts/deletes/updates/selects compare when running as interpreted or native code. It's noted that no parallel plans are created, and thus In-Memory OLTP may be less useful in data warehousing, which requires large scans and complex aggregations.

The chapter ends with a look at In-Memory table types and table variables. These are stored in memory, not tempdb, and they can be used as a replacement for table variables and temp tables.

This chapter provides useful discussions concerning native compilation, how it can be used, its advantages, and its limitations. Example code that compares performance is especially useful.
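For flavour, the creation syntax looks broadly like this (my own minimal sketch rather than the book's example; it assumes a memory-optimized table dbo.Customers with Id and Name columns already exists):

-- Minimal natively compiled stored procedure (SQL Server 2014 syntax).
-- Assumes a memory-optimized table dbo.Customers(Id, Name) exists.
CREATE PROCEDURE dbo.GetCustomerName
(
    @Id INT NOT NULL
)
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH
(
    TRANSACTION ISOLATION LEVEL = SNAPSHOT,
    LANGUAGE = N'English'
)
    SELECT Name FROM dbo.Customers WHERE Id = @Id;
END;

NATIVE_COMPILATION, SCHEMABINDING, and the BEGIN ATOMIC block are all mandatory; and, as the chapter notes, to change such a procedure it must be dropped and recreated.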
Chapter 7 Transaction Processing in In-Memory OLTP

This chapter opens with a look at what a transaction is, and then describes its attributes: Atomic, Consistent, Isolated, and Durable (ACID). The transaction isolation level can be changed to alter the concurrency, but this can also introduce concerns, including dirty reads, non-repeatable reads, and phantom reads; each of these is discussed. On-disk processing uses locking to address concurrency problems (unless snapshot isolation is used); this is a pessimistic concurrency model. In-Memory OLTP uses row versioning and an optimistic concurrency model, performing validation when trying to commit data. Helpful example code is provided that shows the impact of the isolation level on concurrency.

The chapter continues with a look at cross-container transactions, which are used by interpreted SQL to access In-Memory tables. It's possible to have different isolation levels for on-disk and In-Memory tables, but not all combinations are supported. Various code examples are provided. There's a useful general rule: where possible, use read committed/snapshot combinations in cross-container transactions for regular work (sketched below).

The chapter ends with a look at the transaction lifetime, discussing important In-Memory elements, including: Global Transaction Timestamp, TransactionId, and BeginTs/EndTs. A useful diagram shows the transaction lifetime under various conditions (e.g. data modification, successful commit, and validation error/rollback).

This chapter provides a useful discussion of transaction processing in In-Memory OLTP. Helpful diagrams and code examples are provided to support the text.
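To illustrate the general rule (my sketch, with illustrative table names dbo.OrdersDisk and dbo.OrdersMem, not the book's code), the on-disk table can run under read committed while the In-Memory table is accessed with a snapshot table hint:

-- Cross-container transaction: READ COMMITTED for the on-disk table,
-- SNAPSHOT (via a table hint) for the memory-optimized table.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

BEGIN TRANSACTION;
    SELECT COUNT(*) FROM dbo.OrdersDisk;                 -- on-disk, read committed
    SELECT COUNT(*) FROM dbo.OrdersMem WITH (SNAPSHOT);  -- In-Memory, snapshot
COMMIT;

-- Alternatively, the database option MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT = ON
-- applies the snapshot level to memory-optimized tables automatically.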
Chapter 8 Data Storage, Logging, and Recovery

This chapter opens with a look at the durability option of In-Memory tables. This ensures changes are logged to disk, allowing for recovery. While on-disk tables contain only the latest data, In-Memory tables can contain multiple versions of a row; changes are written to Checkpoint File Pairs (CFPs). A CFP consists of a data file and a delta file, covering a range of Global Transaction Timestamp values: inserts go into the data file and deletes into the delta file (remember an update is an insert plus a delete). Using a separate delta file for deletes avoids random IO, since both files are append-only; this improves performance. The various states of the CFP are discussed.

The chapter continues with a look at transaction logging. Logging of In-Memory tables is much more efficient than on-disk logging. Index changes are not logged, since indexes are recreated when the table is recovered. Changes are logged at commit time, so uncommitted data is not logged, and no undo information is written. There are useful examples showing the log content when inserting data into an on-disk and an In-Memory table via the undocumented sys.fn_dblog function.

The chapter ends with a look at recovery. The recovery process is described: the In-Memory OLTP engine uses the most recent checkpoint inventory to get a list of the active CFPs and loads data from them, before applying any changes in the tail of the transaction log.

This chapter provides a good discussion of the recovery process, including the various steps. Useful examples show the log content via the undocumented sys.fn_dblog function.
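The sys.fn_dblog examples can be reproduced along these lines (my sketch; the table is illustrative, and sys.fn_dblog is undocumented so its output may vary between builds):

-- Inspect the transaction log with the undocumented sys.fn_dblog function
-- (NULL, NULL = no LSN range filter). Repeat around an insert into an
-- on-disk table and an In-Memory table to compare the volume of records.
INSERT INTO dbo.TestTable (Id, Col) VALUES (1, 'A');  -- illustrative table

SELECT  [Current LSN], Operation, Context, [Log Record Length]
FROM    sys.fn_dblog(NULL, NULL)
ORDER BY [Current LSN];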
Chapter 9 Garbage Collection

This chapter opens with an overview of the Garbage Collection process. Updates create a new row rather than updating data in place, and deletes mark a row as deleted via the EndTs column. All these processes leave behind data that needs to be removed; this is done by the Garbage Collector. The Garbage Collector analyzes a list of completed transactions and, together with the Oldest Active Transaction Timestamp, determines which data can be collected. A helpful Garbage Collection workflow diagram is provided.

The chapter ends with a look at the various DMVs that can be used to investigate the Garbage Collection process, with example code provided.

This chapter provides an interesting discussion about the Garbage Collection process. Useful example code is provided to investigate the process. I note DMVs are described throughout the book as Data Management Views instead of Dynamic Management Views.
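For reference, SQL Server 2014 exposes several Dynamic Management Views in this area; queries along these lines can be used to observe the process (my choice of views, which may not exactly match the book's list):

-- Observing the Garbage Collection process (SQL Server 2014 DMVs).
SELECT * FROM sys.dm_xtp_gc_stats;           -- overall garbage collection activity
SELECT * FROM sys.dm_xtp_gc_queue_stats;     -- per-scheduler GC work queues
SELECT * FROM sys.dm_db_xtp_gc_cycle_stats;  -- GC cycles in the current database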
Chapter 10 Deployment and Management

This chapter opens with a look at hardware and how it impacts In-Memory OLTP. In-Memory OLTP typically uses hardware more efficiently, so it might be feasible to use mid-level servers. It's also highly scalable, so adding CPUs, memory, or a faster IO subsystem can have a noticeable impact on performance. The impact of increasing each of CPUs, memory, and the IO subsystem is discussed.

The chapter continues with a look at administration and monitoring. It's important to remember that In-Memory tables need to share resources with other SQL Server components. Resources can be allocated via the Resource Governor, and example code for this is given and discussed. Monitoring is largely done via DMVs, and again sample code is discussed.

The chapter ends with a look at the various new and amended system objects that can report on In-Memory OLTP structures. These include: catalog views, DMVs, Extended Events, and Performance Counters.

This chapter provides a useful discussion on how hardware impacts In-Memory OLTP. Additionally, admin and monitoring features are discussed. I'm not convinced deployment is discussed. Useful code is provided to identify hash indexes with suboptimal bucket counts. Code is supplied to identify the oldest active transactions, which are often the cause of problems, e.g. preventing the Garbage Collector from working; however, adding the transaction start time (or calculating the duration) would increase the code's usefulness.
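Such an extension might look like this (my sketch, joining two SQL Server 2014 DMVs; this is not the author's code):

-- Oldest active In-Memory OLTP transactions, with start time and duration.
SELECT  x.xtp_transaction_id,
        x.session_id,
        x.state_desc,
        t.transaction_begin_time,
        DATEDIFF(SECOND, t.transaction_begin_time, SYSDATETIME()) AS duration_sec
FROM    sys.dm_db_xtp_transactions AS x
JOIN    sys.dm_tran_active_transactions AS t
        ON t.transaction_id = x.transaction_id
ORDER BY t.transaction_begin_time;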
Chapter 11 Utilizing In-Memory OLTP

In many ways, this is the most important chapter of the book; all the previous chapters have been building towards it, using experience to provide best practices and recommendations. The chapter opens with a look at design considerations for using In-Memory OLTP. Existing systems may need to undergo changes before migrating to In-Memory OLTP. A cost/benefit analysis should be performed, together with adequate testing. It's noted that other options, instead of moving to In-Memory OLTP, can be considered, including: columnstore indexes, indexed views, and compression. The importance of monitoring is noted and example code to assist with this is provided, including: monitoring memory usage, analyzing data and recreating hash indexes if bucket counts need to be adjusted, updating statistics, and redeploying natively compiled stored procedures. Some tools that help with migration to In-Memory OLTP are included in SQL Server, and these are discussed in an appendix. Various limitations of In-Memory OLTP are listed, and some solutions, with code examples, are given.

There's an interesting section on using In-Memory OLTP for non-obvious use cases, including: processing batches of data, replacing temp and staging tables, and using In-Memory OLTP for session state.

The chapter ends with a discussion about using In-Memory OLTP with mixed workloads. It's suggested to split a table into two: one table containing the latest 'hot' data stored In-Memory, and another on-disk table containing the historic data. The two tables can be combined in a view to hide the storage detail from the client applications (sketched below). This approach also allows differing index strategies to be applied. Code examples are provided throughout.

This chapter provides a very useful insight into using In-Memory OLTP in a practical manner. The sections on overcoming limitations and on using In-Memory OLTP in mixed workloads are particularly useful.
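The pattern looks roughly like this (my sketch with illustrative names, not the book's code; it assumes the database already has a memory-optimized filegroup):

-- 'Hot' recent data in a memory-optimized table...
CREATE TABLE dbo.OrdersHot
(
    OrderId   INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1048576),
    OrderDate DATETIME2 NOT NULL,
    Amount    MONEY NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

-- ...historic data in a conventional on-disk table...
CREATE TABLE dbo.OrdersHistory
(
    OrderId   INT NOT NULL PRIMARY KEY,
    OrderDate DATETIME2 NOT NULL,
    Amount    MONEY NOT NULL
);
GO
-- ...combined in a view, so client applications see a single table
-- while each underlying table keeps its own storage and index strategy.
CREATE VIEW dbo.Orders
AS
    SELECT OrderId, OrderDate, Amount FROM dbo.OrdersHot
    UNION ALL
    SELECT OrderId, OrderDate, Amount FROM dbo.OrdersHistory;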
Appendices

The last appendix is especially useful, providing a step-by-step walkthrough on installing the Management Data Warehouse, collecting and analysing metrics, and using the various included tools to identify the tables and stored procedures that would benefit most from migration to In-Memory OLTP.

Conclusion

This book has well-written discussions, helpful examples, diagrams, website links, inter-chapter links, and useful chapter summaries. It contains plenty of practical code to help you understand the subject matter. In-Memory OLTP is the primary reason to move to SQL Server 2014; it typically provides much faster processing. The book has a good flow between chapters, discussing what In-Memory OLTP is and how to implement it, culminating in a collection of impressive recommendations and best practices.

Any criticisms I have are exceedingly minor. The book has around a 50% overlap with the author's broader Pro SQL Server Internals book (see my review), which, for me, is the best SQL Server internals book I've read.

If you want a practical discussion of In-Memory OLTP, including recommendations and best practices, I can highly recommend this erudite book.