Posts

Perform domain management in DQS

Image
A knowledge base consists of domains. Each domain represents the data in a data field. Each  value in a data field or domain is known as a domain value. DQS provides the ability to validate,  cleanse, match and deduplicate values from any dataset against domain values in the DQS  Knowledge Base. Domains are created by performing a domain management activity. To create a DQS domain  for valid supplier names in the Suppliers Knowledge Base, follow these steps: Open the Data Quality Services client and connect to the DQS instance Click on Open Knowledge Base button from the Knowledge Base Management section Select Suppliers from the Knowledge Base list Click on Domain Management from the Select Activity list, then click Next Click on the Create A Domain icon Type the following information on the Create Domain window: Domain Name: SupplierName Description: List of valid Supplier Names Select String from the Data Type drop-down list Check the Use Leading Values option C

Determine appropriate use cases for a DQS Knowledge Base

Data Quality Services can be used for a wide variety of use cases to ensure that the data is suited  for business usage. Data can be cleansed manually or computer-assisted by establishing matching  and cleansing rules or by leveraging cloud-based services of reference data providers. Incorrect or bad data is often a symptom of weak or lack of data validation rules during data  entry. Other forms of inconsistent data can arise from data integration processes and data corruption.  Bad, incorrect and inconsistent data create inefficiencies in the business processes and  can lead to loss of revenue and increased operational costs. Bad data is often amplified during  data analysis, reporting, data mining and data warehousing. The main use cases for a DQS Knowledge Base involves the need for business users, information  workers or IT professionals to create, maintain and execute data quality operations  resulting from data incompleteness, inconsistency, inaccuracy, invalidity, dupli

Create a Data Quality Services (DQS) knowledge base

Image
Install DQS A Data Quality Knowledge Base (DQKB), is a repository of knowledge and metadata that describes  and defines the data domains and rules to cleanse and match data. The DQKB is used in  an organization to identify and correct invalid or incorrect data to ensure that data is consistent,  and trustworthy. A DQKB stores domain values that have been validated along with terms,  spelling errors, business rules, and reference data that can be used to perform data quality  cleansing and matching actions on the data. A Data Quality Services knowledge base can be created as a new and empty knowledge  base or by importing an existing DQS knowledge base from a DQS (.dqs) file. In our example,  we will be creating a Suppliers DQS Knowledge base. To create a new and empty DQS knowledge  base, follow these steps: Open the Data Quality Services client and connect to the DQS instance Click on New Knowledge Base button from the Knowledge Base Management section Type the following

Install Data Quality Services

Image
The installation of DQS is a two-step process which consists of running the SQL Server 2016  Setup wizard and a command prompt based application called the Data Quality Server Installer.  To start the installation process, locate the SQL Server 2016 installation media and run  Setup.exe. This will open the SQL Server 2016 Installation Center. Click on Installation menu option  and then click on New SQL Server Stand-Alone Installation Or Add Features To An Existing  Installation to launch the SQL Server 2016 Setup wizard. Click Next. On the next window,  select the installation type option Perform A New Installation of SQL Server 2016, if SQL Server  2016 Database Engine has not been previously installed. In this case, you will also need to  install SQL Server 2016 Database Engine as it is required for Data Quality Services. If there is already an installation of SQL Server 2016 Database Engine and you would like to  add DQS to the existing instance, select Add Features To An Ex

Deploy SSIS packages by using the deployment utility, SQL Server, and file systems

Image
For project deployment, you  must ensure that all the related objects needed for the package execution are also part of the  package. For example, when a Connection Manager is created, it is specific for the package  within which it was created, unless it is promoted to a project-level Connection Manager by  right-clicking it and selecting Convert To Project Connection. Similarly, any parameter that you want to be included at the project level, such as, Server- Name, DatabaseName, FolderName, etc. can be created at the project level so it only needs to  be changed in one place and will be available for all the packages in the project. And once the  project has been deployed to a different server, for example to the Development, QA, or Production  server, the parameter value only needs to be changed in one place. The Project.params tab opened with project parameters, and  also under Connection Managers, you see the project level connections.

Create an SSIS catalog

Image
SQL Server 2012 introduced a new paradigm in storing and deploying SSIS packages. Before  that release, all packages were individually managed and deployed. The SSIS DB catalog and SSISDB database serves as a centralized place where deployed SSIS  packages can be stored along with everything related to it, including: Project and package parameters Configuring runtime values for packages Executing and troubleshooting packages Managing Integration Services server operations You find the SSISDB catalog in the folder labeled Integration Services Catalogs. This is where  you find the Projects folder containing each project with all its packages. And the Environments  folder contains the package configuration and environment values, which can be set  before executing the packages. Each SQL Server instance gets only one SSISDB. You also find another SSISDB in the Databases folder in SQL Server (Figure 2-113). This is essentially  a

Error handling at package level in SSIS

Image
There are many causes of errors and failures when a package is running, either in Debug Mode  or in Production. One of the best and simplest ways to manage error handling is with the Event  Handler tab in the SSIS Visual Studio design environment. Under the Event Handler drop-down  list, you can see all of the event types that can be triggered to fire off a notification, email, or  insert the error details into a file, or a database table. From this tab, you can choose from any (or all) of the components within your whole package  to contain an event handler. Although each component can have its own custom event  handler, generally the most common error handling method is to put the error handling at  the highest level, the package, so that all of the errors float up and can be processed from one  centralized location. In every other respect, you create Event Handlers just like any package—with Control Flows  and Data Flows. In Figure 2-110, we have added a