Clean data by using DQS knowledge

The DQS Client lets you run a Data Quality Project to perform a data cleansing activity. During cleansing, DQS compares the source values with the knowledge base (existing valid values, synonyms, and Term-Based Relations) and classifies each value as correct, corrected, suggested, new, or invalid. In the interactive cleansing step, you accept or reject the suggestions and corrections that DQS identified. Once all records have been processed, you can export either the cleansed data alone or the data together with the cleansing info.
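To make the value categories more concrete, here is a minimal Python sketch of how a value might be classified against a knowledge base. It is only an illustration and not DQS's actual algorithm: the supplier names, the `classify` function, the use of `difflib` for similarity, and the 0.8 / 0.7 thresholds are all assumptions made for this example.

```python
from difflib import SequenceMatcher

# Hypothetical knowledge base for a SupplierName domain (made-up values).
VALID_VALUES = {"Contoso Ltd", "Fabrikam Inc"}
SYNONYMS = {"Contoso Limited": "Contoso Ltd"}             # synonym -> leading value
INVALID_VALUES = {"N/A", "Unknown"}
TERM_RELATIONS = {"Incorporated": "Inc", "Limited": "Ltd"}  # term -> replacement

# Assumed similarity thresholds for this sketch.
AUTO_CORRECT_SCORE = 0.8
SUGGESTION_SCORE = 0.7


def apply_term_relations(value):
    """Replace each whole word that has a term-based relation."""
    return " ".join(TERM_RELATIONS.get(word, word) for word in value.split())


def similarity(a, b):
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def classify(value):
    """Return (status, output_value) for one source value."""
    if value in INVALID_VALUES:
        return ("Invalid", value)
    if value in VALID_VALUES:
        return ("Correct", value)
    if value in SYNONYMS:
        return ("Corrected", SYNONYMS[value])
    rewritten = apply_term_relations(value)
    if rewritten in VALID_VALUES:
        return ("Corrected", rewritten)
    # Fall back to the closest known valid value.
    best = max(VALID_VALUES, key=lambda known: similarity(value, known))
    score = similarity(value, best)
    if score >= AUTO_CORRECT_SCORE:
        return ("Corrected", best)
    if score >= SUGGESTION_SCORE:
        return ("Suggested", best)
    return ("New", value)


for name in ["Contoso Ltd", "Contoso Limited", "Fabrikam Incorporated",
             "Contso Ltd", "N/A", "Northwind Traders"]:
    print(name, "->", classify(name))
```

In this sketch, values that match the knowledge base exactly are Correct, values fixed through a synonym, a term-based relation, or a high-scoring match are Corrected, lower-scoring matches are only Suggested, and anything unrecognized is New.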

To cleanse data using DQS knowledge, follow the steps below:
  • Open the Data Quality Services client and connect to the DQS instance
  • Click the New Data Quality Project button in the Data Quality Projects section
  • Type a name and description for the Data Quality Project
  • Select Suppliers from the Use Knowledge Base drop-down list
  • Click on Cleansing under the Select Activity list, then click Next
  • Select Excel File from the Data Source drop-down list
  • Browse and select the SuppliersDomain.xlsx file
  • Select the SuppliersforDataCleansing worksheet from the Worksheet drop-down list
  • Under the Mappings section, select SupplierName (nvarchar) from the Source Column drop-down list, then select SupplierName from the Domain drop-down list in the corresponding row
  • Click Next, then click Start to run the data cleansing analysis. When it completes, click Next
  • Review the Corrected and Suggested values in the Profiler tab, then click Next
  • Review the records under each tab (Suggested, New, Invalid, Corrected, Correct). Here you can approve or reject the corrections or enter your own. Click Next
  • To export the cleansing results, select the Destination Type, the database or file name, and the Output Format, click the Export button, then click Finish
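Once the results are exported, a quick script can summarize them. The sketch below assumes the results were exported to a CSV file together with the cleansing info, and that the file contains a `SupplierName_Status` column; check the header of your own export, since the column names here are an assumption. The sample rows are made up for illustration, and `summarize_statuses` is a hypothetical helper, not part of DQS.

```python
import csv
import io
from collections import Counter

# Assumed name of the status column in the exported file.
STATUS_COLUMN = "SupplierName_Status"


def summarize_statuses(csv_file, status_column=STATUS_COLUMN):
    """Count how many exported rows fall under each cleansing status."""
    reader = csv.DictReader(csv_file)
    if status_column not in (reader.fieldnames or []):
        raise KeyError(f"Column {status_column!r} not found in export")
    return Counter(row[status_column] for row in reader)


# Made-up sample standing in for an exported file.
sample = io.StringIO(
    "SupplierName_Source,SupplierName_Output,SupplierName_Status\n"
    "Contoso Ltd,Contoso Ltd,Correct\n"
    "Contoso Limited,Contoso Ltd,Corrected\n"
    "Contso Ltd,Contoso Ltd,Corrected\n"
    "Northwind Traders,Northwind Traders,New\n"
)

summary = summarize_statuses(sample)
for status, count in sorted(summary.items()):
    print(f"{status}: {count}")
```

For a real export, replace the `io.StringIO` sample with `open("your_export.csv", newline="")`, where the file name is whatever you chose in the export step.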