Implement clustered, nonclustered, filtered, and columnstore indexes

SQL Server would resort to do a table scan operation, which is essentially a full read of all rows in a table. A table scan is typical on tables with no clustered indexes, also known as heap tables. A table scan is the most inefficient read operation that SQL Server performs while all data pages in the table are read. Table scans increase memory, disk, and CPU resource utilization and cause queries to perform slow.

To illustrate the performance impact of table scans, compare the execution plans of a query executed against a table with a clustered index and a table without a clustered index (heap table). For this comparison, use the [Dimension].[City] table in the Wide World Importers DW database.




















Execute the following query to return all cities in the state of Florida from the [Dimension].[City] table:

SELECT [City], [State Province],[Country]
FROM [WideWorldImportersDW].[Dimension].[City]
WHERE [State Province] = 'Florida'

The execution plan and read operation tooltip with a clustered index


















Notice that 116,295 rows were read (Number of Rows Read) to retrieve 4,024 rows (Actual Numbers of Rows).

The following script provides an example of a nonclustered index that can be defined to cover the query in the previous example in the [Dimension].[City] table:

CREATE NONCLUSTERED INDEX NC_IX_StateProvince
ON [Dimension].[City] ([State Province])
INCLUDE ([City])

Notice that the key column is [State Province] and the included column is [City]. The key column is used in the WHERE clause and the included column for the SELECT portion.

The execution plan and read operation tooltip for the query with a covering index


















After the covering nonclustered index was added, SQL Server performed an index seek instead. After the covering nonclustered index was added, SQL Server performed an index seek instead.

Another feature of nonclustered indexes is the ability to add a filter to include or exclude rows based on a column filter predicate. These types of nonclustered indexes are referred to as filtered indexes.

SELECT [City]
FROM [WideWorldImportersDW].[Dimension].[City]
where [State Province] = 'Florida'
and [Latest Recorded Population] > 0

The definition of the filtered nonclustered index to cover the query is as follows:

CREATE NONCLUSTERED INDEX NC_IX_StateProvice2
ON [Dimension].[City] ([State Province])
INCLUDE ([City])
WHERE [Latest Recorded Population] > 0

As a general rule, query performance is improved by reducing the number of rows that SQL Server has to read. Table and index scans should be avoided during query execution for selective queries.

Comments

Popular Posts

Non-blocking, semi-blocking, and blocking transforms in SSIS

Implement additive, semi-additive, and non-additive measures

Implement an ETL solution that supports incremental data extraction