Posts

Showing posts with the label SQL

Design a partition structure to support a data warehouse

Image
A table partition consists of a partition column, a partition function, and a partition scheme.  The partition column is the column in the table that holds the values that are used as boundary  values to group rows of data in a corresponding partition. Only one column can be defined  as the partition column.  SQL Server uses the partition column to determine which partition to store new and  existing rows in. SQL Server also uses it for partition elimination to determine which partitions  need to be read based on the WHERE clause or JOIN condition of a query. The partition function specifies a set of range boundaries based on the values of the partition  column. The boundaries can be defined as a RANGE LEFT or RANGE RIGHT. A boundary defined  as RANGE LEFT means that the boundary value belongs to its left partition. The boundary  value is the last value in the left partition. A boundary defined as RANGE RIGHT means that the  boundary value belongs to its right partition. The bou

Select appropriate indexes

Image
A query is considered selective if the execution plan utilizes a more efficient operation  such as an index seek rather than a table or index scan. SQL Server uses an internal cost-based  optimizer known as the SQL Query Optimizer to determine the execution plan with the lowest  cost while also taking in consideration the cost of finding candidate plans. The execution plan  operation chosen by the SQL Query Optimizer depends not just on its WHERE clause predicate  but also on other factors such indexes and column statistics. Examples of low selectivity queries include queries that aggregate sales data by year  or queries that average profit across products. Performance tuning of these non-selective queries  can be challenging using traditional rowstore indexes. For this matter, a columnstore index  is best suited because they are in-memory structures and outperform disk-based rowstore  index scans. For data warehouse workloads, a common index design approach is to implement a

Implement clustered, nonclustered, filtered, and columnstore indexes

Image
SQL Server would resort to do a table scan  operation, which is essentially a full read of all rows in a table. A table scan is typical on tables  with no clustered indexes, also known as heap tables. A table scan is the most inefficient read  operation that SQL Server performs while all data pages in the table are read. Table scans  increase memory, disk, and CPU resource utilization and cause queries to perform slow. To illustrate the performance impact of table scans, compare the execution plans of a query  executed against a table with a clustered index and a table without a clustered index (heap  table). For this comparison, use the [Dimension].[City] table in the Wide World Importers DW  database. Execute the following query to return all cities in the state of Florida from the [Dimension]. [City] table: SELECT [City], [State Province],[Country] FROM [WideWorldImportersDW].[Dimension].[City] WHERE [State Province] = '

Design an indexing solution

Image
A good understanding of the  data warehouse workload, its structure, and index types are all very important for designing a  proper indexing solution. The two main types of indexes that are part of almost every data warehouse index solution  are rowstore and columnstore indexes. Rowstore indexes store table or view data horizontally  based on rows of data. Columnstore indexes store table or view data vertically based on  column values. Rowstore indexes are the traditional indexes that have been present since the  early versions of SQL Server. Columnstore indexes are a newer type of in-memory index that  provide increased performance benefits over traditional rowstore indexes in data warehousing  workloads. Rowstore indexes Rowstore indexes can be defined as clustered or nonclustered row indexes. A clustered rowstore  index sorts and stores table or view data in rows in the order of the clustered index key.  The clustered index key is usually the table’s primary key, although

List all the columns in tables

For MS SQL Server: select * from information_schema . columns where table_name = 'tableName'

Nested Stored Procedure

You can nest stored procedures up to 32 levels. Error Msg 8164, Level 16, State 1, Procedure some_sp, Line 8 An INSERT EXEC statement cannot be nested. It can't nest. If  some_sp  tries to call  some_other_sp  with INSERT-EXEC, you will get an error message. Thus, you can only have one INSERT-EXEC active at a time. This is a restriction in SQL Server. Based on the requirement we can use OUTPUT parameter to pass data between SP.