Microsoft SQL Server 2012 Internals
Author: Kalen Delaney et al.
Chapter 4 Special databases
Each SQL Server instance comes with some special databases pre-installed. These are discussed here, and include master, model, msdb, and tempdb, together with the hidden resource database.
Details of how to move undamaged and damaged system databases (other than the master database) are given, with step-by-step instructions. Owing to its special nature, the master database is discussed separately.

There is a large section covering tempdb. This is appropriate: since tempdb can be used by all the databases on the server, as well as by SQL Server itself, it is often the most heavily used database. Objects stored in tempdb are divided into user objects (e.g. temp tables), internal objects (e.g. sort units), and the version store (e.g. old versions of uncommitted rows required for transaction consistency). A slow tempdb can affect the performance of every other database on the server, and with this in mind various best practices are given, including: use fast disks (SSDs?), split the files over several disks, keep transactions short, and don't interleave DDL and DML commands. Similarly, contention on tempdb can cause performance degradation; example code is provided to show and explain such contention, which should prove useful when investigating tempdb performance problems. Monitoring of tempdb space using DMVs is described, and some monitoring/capture SQL is provided.

A new feature of SQL Server 2012 is partially contained databases. Previously, restoring a database to another server often caused problems because several expected server-level objects were missing (e.g. logins, linked servers, jobs). The idea behind contained databases is to remove these server-level dependencies. This section shows how to configure a contained database and create contained users (so no instance-level login is required).

The chapter contains good descriptions of the system databases, together with their content and purpose. There are good links to other related chapters for further information. There are some useful lists of tempdb optimizations and best practices. I had expected some useful DMV queries that told me what was hogging tempdb, or what the longest-running transactions were (both typical tempdb-related problems), but none were given (a couple of sketches of my own follow below).
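These sketches are mine, not the book's, and the database, user and password names are placeholders. The first breaks tempdb space down into user objects, internal objects and version store, and then shows which sessions have allocated the most tempdb space:

-- tempdb space by category (user objects, internal objects, version store).
SELECT  SUM(user_object_reserved_page_count)     * 8 AS user_objects_kb,
        SUM(internal_object_reserved_page_count) * 8 AS internal_objects_kb,
        SUM(version_store_reserved_page_count)   * 8 AS version_store_kb,
        SUM(unallocated_extent_page_count)       * 8 AS free_space_kb
FROM    tempdb.sys.dm_db_file_space_usage;

-- Which sessions have allocated the most tempdb pages.
SELECT  session_id,
        user_objects_alloc_page_count     - user_objects_dealloc_page_count     AS user_object_pages,
        internal_objects_alloc_page_count - internal_objects_dealloc_page_count AS internal_object_pages
FROM    tempdb.sys.dm_db_session_space_usage
ORDER BY user_objects_alloc_page_count + internal_objects_alloc_page_count DESC;

The second shows the basic steps for enabling partial containment and creating a contained user, roughly as the chapter describes:

-- Allow contained database authentication at the instance level.
EXEC sp_configure 'contained database authentication', 1;
RECONFIGURE;

-- Mark a database as partially contained and create a user with no server-level login.
ALTER DATABASE SalesDb SET CONTAINMENT = PARTIAL;

USE SalesDb;
CREATE USER AppUser WITH PASSWORD = 'choose-a-strong-password';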
Chapter 5 Logging and recovery
This chapter describes the purpose of logging (i.e. recovery and rollback), together with plenty of detail on how this is achieved, and it makes good use of example code to illustrate the points made.

The chapter starts with an overview of transaction log internals, including writing changes to the log before the data file (write-ahead logging) and using checkpoints to write changed data to the data files. The chapter then delves deeper into the phases of recovery (i.e. analysis, redo, and undo). The relationship between data pages and the transaction log is explained, the two being linked by the Log Sequence Number (LSN). There's an interesting discussion of virtual log files (VLFs), which collectively form the log file itself. The states a VLF can be in are explained, together with details of how to view them. Automatic truncation of VLFs is also described.

The chapter provides a good understanding of how backups and restores work and how they interact with the log. There's a short discussion of the types of backup. There's a good description of the database recovery models, what they are (FULL, BULK_LOGGED, and SIMPLE) and why they're important. There's a useful list of minimally logged operations that you might find helpful if you're aiming to improve performance.

In summary, if you want to know how the transaction log works, its importance in recovery and rollbacks, and how it fits into the context of backups and restores, this chapter provides an excellent and detailed explanation. Although the chapter says log reading tools for 2012 do not yet exist, here is one from ApexSQL: www.apexsql.com/sql_tools_Log_features.aspx, which can be useful in determining who made a change and when (as well as rolling back the change).
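As a small supplement of my own (not from the chapter), the VLFs, recovery models and log records mentioned above can be inspected along these lines; DBCC LOGINFO and fn_dblog are undocumented, so treat them as diagnostic aids only:

DBCC LOGINFO;                 -- one row per VLF in the current database; Status 2 = active

SELECT  name,
        recovery_model_desc,  -- FULL, BULK_LOGGED or SIMPLE
        log_reuse_wait_desc   -- why the log cannot currently be truncated
FROM    sys.databases;

-- Peek at a few log records in the active portion of the current database's log.
SELECT TOP (10) [Current LSN], Operation, Context, [Transaction ID]
FROM   fn_dblog(NULL, NULL);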
Chapter 6 Table storage
Tables are central to relational databases and SQL Server. A table contains attributes (columns) describing an entity, stored in rows (tuples). Each row has a unique identifier (typically a primary key), and relationships between tables are expressed by means of primary key and foreign key dependencies.

The chapter starts with a look at the process of table creation, the SQL used, and the need to store this definition in a source control system (SCS). A small diversion into table and column naming practices follows, covering the avoidance of reserved words and following a naming standard, along with advice on choosing an appropriate data type. The use, range and detail of the various data types are described. The contentious use of NULL is discussed, with the practical recommendation of defaulting to NOT NULL where possible. Fixed-length versus variable-length data types, the importance of collation, and the processing of NULLs are all discussed.

The chapter proceeds by examining the identity property, which provides a simple unique counter for tables that have no natural key, or only an inefficient one. A variety of related processing options are discussed, including SET IDENTITY_INSERT, @@IDENTITY, SCOPE_IDENTITY(), and RESEED. SQL Server 2012 introduced the sequence object as an alternative to the identity property; it has the advantage that the incrementing sequence can be shared across tables, and it has a CYCLE property to restart when the maximum/minimum value is reached. Code is provided to show how the sequence object can be altered (a small sketch of my own appears below).

The next section is a deep dive into internal storage, detailing both the metadata views that track storage and how data is actually stored on the pages. The primary metadata view examined is sys.indexes, which contains an entry for each table, heap and index in the database. Joining sys.indexes with sys.partitions and sys.allocation_units (or rather the undocumented sys.system_internals_allocation_units) provides key data about the underlying data storage pages. The three types of data pages (in-row, row-overflow, and LOB [large object]) are described in great detail. Each page is 8KB in size and contains a page header, data rows, and the row offset array. The page header contains a wide range of metadata (e.g. PageID, nextPage, LSN, freeData etc.), while the row offset array points to the start of individual data rows on the page. Code is provided to examine the content of individual pages using the DBCC PAGE command. The structure of the data rows is also explained in detail, and the storage of fixed-length and variable-length rows is compared.

Constraints are examined as a way of enforcing the different types of data integrity (i.e. entity, domain and referential). The use of primary key, unique, foreign key, check and default constraints is examined.

This long chapter continues with the various ways of modifying a table. There are many things that can be changed, including adding or dropping columns or constraints, and changing a data type. Various examples and sample SQL are provided to illustrate these changes, and at a lower level, the effect of these changes on the internals is explained. The internals explained so far have largely related to tables and indexes; this last section explains, again in great detail, heap modification internals.
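To make the sequence-object discussion concrete, here is a minimal sketch of my own (the schema and object names are placeholders, not the book's examples):

-- A sequence that can be shared across tables, with CYCLE so it restarts
-- at MINVALUE once MAXVALUE is reached.
CREATE SEQUENCE dbo.OrderNumber AS int
    START WITH 1
    INCREMENT BY 1
    MINVALUE 1
    MAXVALUE 999999
    CYCLE;

SELECT NEXT VALUE FOR dbo.OrderNumber;            -- usable in DEFAULT constraints and INSERTs

ALTER SEQUENCE dbo.OrderNumber RESTART WITH 500;  -- one of the properties that can be altered

Similarly, the storage metadata views and DBCC PAGE can be combined along the following lines; this is a sketch under my own assumptions (the table, database name, file id and page id are placeholders), not code from the book:

-- Find the allocation units (and first pages) belonging to a table.
SELECT  i.name          AS index_name,
        au.type_desc,                   -- IN_ROW_DATA, ROW_OVERFLOW_DATA or LOB_DATA
        au.first_page                   -- binary file:page pointer
FROM    sys.indexes AS i
JOIN    sys.partitions AS p
        ON  p.object_id = i.object_id
        AND p.index_id  = i.index_id
JOIN    sys.system_internals_allocation_units AS au
        ON  au.container_id = p.partition_id
WHERE   i.object_id = OBJECT_ID('dbo.MyTable');

-- Dump one page in detail.
DBCC TRACEON(3604);                  -- route DBCC PAGE output to the client
DBCC PAGE('MyDatabase', 1, 246, 3);  -- database, file id, page id, print option 3 (row detail)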
This chapter contains a mix of gentle introductory material (e.g. how to create and modify a table, column naming standards) and low-level detail (e.g. the meaning of the contents of the bit array at the start of a data row). There are good links to other chapters for further information. There's a good overview of data types, their ranges, and advantages. Some practical example SQL is included.
Last Updated: Wednesday, 29 January 2014