Scans and Seeks

TSQL Tuesday
The reason I look at executions plans is improving query performance. As a first pass I look to improve query performance by modifying or adding indexes. The first thing I look for are scans:

Table scan operator icon Table Scan
Clustered index scan operator icon Clustered Index Scan
Nonclustered index scan operator icon Nonclustered Index Scan
My goal on the first pass is to improve these scans to seeks
Clustered index seek operator icon Clustered Index Seek
Nonclustered index seek operator icon Nonclustered Index Seek

For this T-SQL Tuesday post I decided to dive into these operators a bit deeper.

A table scan returns all the rows in a table.  Obviously this can be very expensive if you have a large table.  A clustered index scan reads the entire index.  A nonclustered index scan also reads the entire index.  With all three scans a where predicate will restrict the number of rows returned.

Whenever I see scans I take a look at execution plans and run the query through Database Engine Tuning Advisor (DTA), analyzing for the possibility of adding an index which will improve query performance through seeks.

A clustered index seek will have performance improvements over a scan by, "process[ing] only those rows that satisfy this SEEK:() predicate".  Performance is increased because all the data pages in the index are not read and the index key is used to look up the required values.

Because a clustered index stores actual data at the page level a clustered index seek or scan will outperform a nonclustered index seek or scan because there isn't an additional step to look up data. Additional performance will be seen if the index is a covering index. I'd like to say that you will receive a performance benefit if the order by clause matches the order of a clustered index, however set based theory provides no order guarantee.  Adding an order by clause to guarantee order has the potential to decrease performance regardless of the clustered index.

In addition to modifying indexes to improve query performance it's also important to look at the where clause and determine if the query is attempting to return too much data.  Cardinality, selectivity, and covering indexes should also be considered.  A non-covering index will result in a key lookup, another plan operator, which increases the number of reads.  It is also of critical importance that prior to testing you ensure that index and statistics maintenance have been performed.  I've seen many times where developers spend a lot of time delving into query modifications to improve performance on a database with heavily fragmented indexes and out of data statistics.  In some of these situations regular index and statistics maintenance was all that was required to increase performance significantly.