SQL Query Design Patterns and Best Practices
Article Index
SQL Query Design Patterns and Best Practices
Chapters 5 - 10
Chapter 11 to end; Conclusion

Chapter 5: Using Common Table Expressions 

Common Table Expressions (CTEs) allow you to compartmentalise and organise your queries, to provide increased readability and hopefully easier maintenance. CTEs provide a dataset that an immediate subsequent query can then use. Example usage is provided, from simple CTEs to the more complex multiple CTEs. Another use of CTEs is in processing hierarchical data via its ability to recursively call itself. This is often useful in decoding parent/child relationships (e.g. organization structures). A useful example of this is provided.

The chapter provides a good overview of CTEs, what they are and how they can be used. I must admit, for me, I often find CTEs are a performance bottleneck, and will readily convert them to temp tables to improve performance.

Chapter 6: Analyze Your Data Using Window Functions 

Window functions are functions that act over a subset of rows, called a window. The basics of Window functions are explained together with a brief listing of the functions, including: row_number, rank, ntile, lag, and avg. Some useful example code is provided. More complex processing is described, using partitioning (splitting large results into smaller ones) and frames (allowing windows different from the partition). The rest of the chapter concentrates on fulfilling various business scenarios (e.g. days between orders, running totals, first n rows in every group). Hopefully these examples will provide useful templates for your own queries.

This chapter provided some useful examples of real-world business problems and how they can be answered using window functions. 

Chapter 7: Reshaping Data with Advanced Techniques 

Sometimes, there’s a need to change output, so it’s easier to read. One such case is to change rows to columns, or columns to rows. The PIVOT operator allows you to convert row output into column output. A simple example of this is provided, before looking at a more complex example using PIVOT dynamically – run using sp_executesql. Next, the UNPIVOT operator is discussed. This does the reverse of the PIVOT operator, rotating column output into row output.  The chapter ends with a look at how hierarchical data can be coded/decoded, making use of a column to store the level information via separators e.g. /, /1/, /1/1, /2/ etc.

This chapter provided a useful overview of the PIVOT/UNPIVOT operators.

Chapter 8: Impact of SQL Server Security on Query Results 

A lot could be written about SQL Server security and query results. Here, we look at a specific problem, where you pass your working SQL code to a colleague, and it doesn’t give them the same results – owing to security differences.

After setting the problem background, the chapter briefly discusses security at the server, database, schema, table, row, and column levels. The impact on security of GRANT and DENY is highlighted. It’s suggested you should check the security settings in SSMS (via user properties), and possibly talk to the DBA to gain further insight. I found this a peculiar chapter, it is concerned with a small area of security, where so much more exists.

Chapter 9: Understanding Query Plans 

SQL code describes logically how to retrieve/modify data. SQL Server uses your SQL code to create a query plan (also called an execution plan) that will physically implement these logical requirements. For example, it will determine if an index is used, and how it might be used (i.e. seek, scan, lookup). Since creating the query plan is CPU intensive, it makes sense to reuse the plans, and this is briefly discussed. The various phases of optimization are outlined (parsing, binding, optimization, execution). 

There’s a brief look at estimated and actual execution plans, and how they can be obtained (e.g. SET SHOWPLAN_XML ON). Execution plans can change with subsequent runs, which sometimes results in slower query performance, it’s possible to compare execution plans to determine what has changed, and might be the cause of slower performance. Lastly, there’s a look at the Analyze Actual Execution Plan, which can guide you to the problems with your execution plan.

I would expect this chapter to give me an overview of the various common operators in the query plan, but nothing is said. Also, the execution plan tells me where the time/resources are spent, giving me a clue as to what to optimize – again this is not discussed. The term ‘hint’ is used, but not explained. In this introductory book, I don’t think it makes sense to include the various step in optimization – what does it add for the reader? This chapter is a very selective and misplaced look at understanding query plans. 

Chapter 10: Understanding the Impact of Indexes on Query Design 

Indexes are the main way to improve query performance. There’s a useful discussion about the importance of indexes on columns that are used in WHERE, JOIN, HAVING, and ORDER BY clauses. There’s a discussion on how to create and drop indexes, together with examples. Index usage can be shown via the DMV sys.dm_db_index_usage_stats and this is discussed. Sometimes, important indexes are rarely used (e.g. quarterly), so I would expect some warning about being cautious when dropping indexes that appear to be unused.

Next, the chapter looks at the traditional topics of clustered and non-clustered indexes. The discussion of filter indexes is quite wordy – in essence it’s an index with a WHERE clause. The definition of a covering index is wrong, it says “Covering indexes are non-clustered indexes with the INCLUDE clause” – however, a covering index is one where all the columns needed by the query can be obtained from the index, it may or may not have an INCLUDE clause.

The last section looks at how the performance of indexes can degrade with time due to fragmentation. This can be corrected by rebuilding or reorganizing the indexes. This can be time consuming and is typically done outside normal business hours.

If this book is largely introductory, and aimed at people retrieving data, while the importance of indexes is noteworthy, do we really need to see info about creating/deleting indexes, and index maintenance? These are largely DBA tasks. Similarly, allow_page_locks and statistics_norecompute are mentioned, but I don’t think these are appropriate for the level of reader.

The section on non-clustered indexes says “You can create non-clustered indexes after a unique clustered index has been created on a view.” – but, you don’t create indexes on views. I suspect the author meant to use the word table instead. That said, you can create non-clustered indexes on tables that do not have a clustered index.

Banner



Last Updated ( Tuesday, 18 July 2023 )