Implement sliding windows

A sliding window partitioning strategy is a common data management practice in most data warehouse environments with a data retention policy that establishes how long data is retained in operational database tables. The retention policy can establish how often data is purged or archived and how far back data should be retained in the operational database.

Highly regulated industries, such as healthcare and finance, have legal requirements to retain data for a certain period before it can be deleted. Organizations in these regulated industries with high volumes of data often put a data archival and deletion strategy in place that allows them to maintain compliance while keeping their operational databases streamlined with data that is actively accessed. A sliding window partition strategy provides a mechanism to move large volumes of data in and out of a table quickly and with minimal or zero disruption.

For example, consider a sliding window partition strategy where only data for the past three years (current year + previous two years) should be maintained in the [Fact].[Sale] table. This is considered the active data. Data older than three years should be moved into an archive table. On January 1st, 2016, the sliding window partition process kicks-off and performs the following three main tasks:
  1. Move the 2013 year data from the table partition into an archive table by using the SWITCH function.
  2. Merge the 2013 and 2014 year ranges in the partition function by using the MERGE function.
  3. Add a new empty 2016 year partition by modifying the partition function by using the SPLIT function.











First, define the partition function, partition scheme, and table that would be present until midnight on December 31, 2015.

Partition function
CREATE PARTITION FUNCTION [MostRecent3Years_PF](date) AS RANGE RIGHT
FOR VALUES (N’2013-01-01’, N’2014-01-01’, N’2015-01-01’);

Partition scheme
CREATE PARTITION SCHEME [MostRecent3Years_PS] AS PARTITION [MostRecent3Years_PF]
ALL TO ([USERDATA]);

Create table [Fact].[SaleModified]
CREATE TABLE [Fact].[SaleModified](
[Sale Key] [bigint] NOT NULL,
[Customer Key] [int] NOT NULL,
[Stock Item Key] [int] NOT NULL,
[Invoice Date Key] [date] NOT NULL,
[WWI Invoice ID] [int] NOT NULL,
[Quantity] [int] NOT NULL,
[Unit Price] [decimal](18, 2) NOT NULL,
[Lineage Key] [int] NOT NULL,
CONSTRAINT [PK_FactSaleModified] PRIMARY KEY ([Sale Key],[Invoice Date Key])
) ON [MostRecent3Years_PF]([Invoice Date Key])

The next step is to move the 2013 year data currently stored in PARTITION 2 out of the [Fact].[SaleModified] table and into an archive table by using the SPLIT function as follows:

Switch partition
ALTER TABLE [Fact].[SaleModified]]
SWITCH PARTITION 2 TO [Fact].[SaleModified_Archive]

PARTITION 2 of [Fact].[SaleModified] is empty now that the data has been switched out. PARTITION 1 was already empty because the partition function created this empty partition automatically. Now, you can proceed to merge PARTITION 2 with PARTITION 1. Because both partitions being merged are empty, there is no data movement and it is a metadata only change. Execute the MERGE function as follows:

Merge range in partition function
ALTER PARTITION FUNCTION MostRecent3Years_PF()
MERGE RANGE (‘2013-01-01’)

At this point, there is no partition to store 2016 year data. Next, you need to create a new empty partition by using the SPLIT function as follows:

Split function
ALTER PARTITION FUNCTION MostRecent3Years_PF()
SPLIT RANGE (‘2016-01-01’)

The staging table’s 2016 year data is stored in PARTITION 4, similar to the target table. The partition switch syntax is as follows:

Partition switch from the staging table to [Fact].[SaleModified] target table
ALTER TABLE [Fact].[SaleModified_Staging]
SWITCH PARTITION 4 TO [Fact].[SaleModified] PARTITION 4

For example, suppose that updated 2016 year data has been loaded into the staging table. Before data can be switched in from the staging table, you need to have an empty partition available in the target table. You can use the TRUNCATE TABLE statement to truncate the specific partition in [Fact].[SaleModified], in this case PARTITION 4, as follows:

Truncate table with partitions argument
TRUNCATE TABLE [Fact].[SaleModified] WITH (PARTITIONS (4))

A sliding window partition strategy allows you to manage a finite number of partitions in
large data warehouse tables by switching new data in and switching historical data out with minimal overhead. A sliding window partition can be managed automatically by creating an automatic sliding window partition that dynamically creates the necessary steps, such as a partition switches, partition merges, and partition splits. You can read more about how to implement an automatic sliding window in a partitioned table at https://technet.microsoft.com/en-us/library/aa964122. Although, this article was written for SQL Server 2005, most of the concepts still apply for SQL Server 2016.

Comments

Popular Posts

Non-blocking, semi-blocking, and blocking transforms in SSIS

Implement additive, semi-additive, and non-additive measures

Implement an ETL solution that supports incremental data extraction