Not all Data Flow transforms are created equal. In fact, some are much faster and require a lot less memory (non-blocking transforms). Others are slower while consuming much more memory (blocking transforms), and some are in between (semi-blocking transforms). We don’t always have a choice if we need to use a certain transform to do a certain specified task, but it is good to know the differences between these types of transforms so that we can better understand performance and resource utilization issues. Synchronous and asynchronous There is another related aspect about Data Flow transforms, which is their ability to quickly process a row as they are coming into the transform, independently of any other rows that came before or after (synchronous transforms). The other type of transform needs to be dependent on some or all of the rows that come before and after (asynchronous transforms). On the whole, non-blocking transforms a...
Control Flow It is called “control flow” because this is where you design the sequence of steps in your package and direct how they flow from one to the next. There are four main types of items and objects in the Control Flow tab. Control Flow Tasks Precedence Constraints Containers Connection Managers The most commonly used Control Flow tasks are grouped under the Common section in the SSIS Toolbox window. This task can be done against the Filesystem level, Database level, Analysis Services level, Package level, Hadoop level, etc. to consume, transform, and load data. Precedence Constraints You can control the sequence of tasks that get executed based on completion, success, or failure of a task. On Success (Green arrow) Proceeds to the next step only if the current step completes with Success. On Completion (Black arrow) Proceeds to the next step regardless if current step completes with a Success or Failure. On Failure (Red arrow) P...
A frequent problem that you may experience when you bring data from multiple sources that need to be unified and consolidated and/or needs to be deduplicated. In the case of an enterprise maintaining multiple customer or product sales information in for various business entities, e.g. retail and web sales, or possibly different CRM programs that are a result of a company merger, all of this information has to be brought under one roof within the data warehouse. SQL Server provides two fuzzy transformations to help with such scenarios: Fuzzy grouping Fuzzy lookup These transformations can be used independently or in concert to assist with unification and deduplication of your data. Both the fuzzy transformation algorithms create and utilize temporary tables created within SQL Server. Fuzzy grouping Fuzzy grouping is used primarily for deduplicating and standardizing values in column data. Fuzzy grouping has input parameters a...
Comments
Post a Comment