Informatica Data Quality is a suite of applications and components that we can integrate with Informatica PowerCenter to deliver enterprise-strength data quality capability in a wide range of scenarios.
IDQ has the following core components:
Data Quality Workbench: It is used to design, test, and deploy data quality processes. Workbench allows testing and executing plans as needed, enabling rapid data investigation and testing of data quality methodologies.
Data Quality Server: It is used to enable plan and file sharing and to run programs in a networked environment. The Data Quality Server supports networking through service domains and communicates with Workbench over TCP/IP.
Both Workbench and Server install with a Data Quality engine and a Data Quality repository. Users cannot create or edit programs with Server, although they can run a program on any Data Quality engine, independently of Workbench, through runtime commands or from PowerCenter.
Users can apply parameter files, which modify program operations, to runtime commands when running data quality projects on a Data Quality engine. Informatica also provides a Data Quality Integration plug-in for PowerCenter.
In Data Quality, a project is a self-contained set of data analysis or data enhancement processes.
A project is composed of one or more of the following types of component:
IDQ has been a front runner in the Data Quality (DQ) tools market. What follows is a brief look at the features its tools offer.
IDQ comes in two variants:
Informatica Analyst: It is a web-based tool that business analysts and developers can use to analyze, profile, cleanse, standardize, and scorecard data across an enterprise.
Informatica Developer: It is a client-based tool in which developers create mappings to implement data quality transformations or services. The tool offers an editor where objects can be built with a wide range of data quality transformations such as parser, standardizer, address validator, and match-merge.
Develop once & deploy anywhere: Both tools can be used to create DQ rules or mappings, which can then be deployed as web services. Once the DQ transformations are deployed as services, they can be used across the enterprise and across platforms.
Role of Dictionaries
Projects can use reference dictionaries to identify, repair, or remove inaccurate or duplicate data values. Informatica Data Quality projects can draw on three types of reference data:
Standard dictionary files: These files are installed with Informatica Data Quality and can be used by several types of component in Workbench. All dictionaries installed with Data Quality are text dictionaries, that is, plain-text files saved in .DIC file format. They can be created and edited manually.
Database dictionaries: Informatica Data Quality users with database expertise can design and specify dictionaries that are linked to database tables, so that the dictionaries are updated dynamically when the underlying data is updated.
Third-party reference data: These data files are supplied by third-party vendors and offered to Informatica customers as premium product options. Reference data from third-party vendors is typically in database format.
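To make the role of a text dictionary concrete, here is a minimal Python sketch of dictionary-based standardization. It is not IDQ code and does not use any Informatica API. The file layout is an assumption made for illustration (each line holds a label followed by the variants that map to it), not the documented .DIC specification, and the names load_dictionary and standardize are hypothetical.

```python
import io

# Assumed layout for illustration only: "label,variant1,variant2,..."
SAMPLE_DICTIONARY = """\
Street,St,St.,Str
Avenue,Ave,Ave.,Av
Boulevard,Blvd,Blvd.
"""

def load_dictionary(text_file):
    """Build a lookup from each lowercase label or variant to its label."""
    lookup = {}
    for line in text_file:
        entries = [item.strip() for item in line.split(",") if item.strip()]
        if not entries:
            continue  # skip blank lines
        label = entries[0]
        for entry in entries:
            lookup[entry.lower()] = label
    return lookup

def standardize(value, lookup):
    """Replace known tokens with their label; report tokens with no match."""
    standardized, unmatched = [], []
    for token in value.split():
        label = lookup.get(token.lower())
        if label is None:
            unmatched.append(token)
            standardized.append(token)  # leave unknown tokens untouched
        else:
            standardized.append(label)
    return " ".join(standardized), unmatched

lookup = load_dictionary(io.StringIO(SAMPLE_DICTIONARY))
result, unmatched = standardize("5th ave", lookup)
print(result, unmatched)
```

A database dictionary would follow the same lookup idea, with the label and variant pairs read from a table instead of a text file, which is why it can pick up changes to the underlying data without a manual edit.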
How to Integrate IDQ with MDM
Data cleansing and standardization are essential aspects of any MDM project. Informatica MDM Multi-Domain Edition (MDE) provides a reasonable number of cleanse functions out of the box (OOTB). However, some requirements go beyond what the OOTB cleanse functions can handle and call for more comprehensive functions, e.g., address validation or sequence generation. Informatica Data Quality (IDQ) provides an extensive array of cleansing and standardization options and can readily be used alongside Informatica MDM.
There are three methods to integrate IDQ with Informatica MDM.
1. Informatica Platform Staging
Starting with version 10.x of Informatica MDM Multi-Domain Edition (MDE), Informatica introduced a feature called "Informatica Platform Staging" within MDM to integrate with IDQ (Developer Tool). This feature makes it possible to stage or cleanse data with IDQ mappings directly into MDM's Stage tables, bypassing Landing tables.
2. IDQ Cleanse Library
IDQ allows us to create functions as operation mappings and deploy them as web services, which can then be imported into an Informatica MDM Hub implementation as a new type of cleanse library, the IDQ cleanse library. The imported IDQ cleanse functions can then be used just like any other out-of-the-box cleanse function. Informatica MDM Hub acts as a web service client application that consumes IDQ's web services.
3. Informatica MDM as target
3.1 Loading data to landing tables
Informatica MDM can be used as a target, with data loaded to the landing tables in Informatica MDM.
3.2 Loading data to staging tables (bypassing landing tables)
Informatica MDM can be used as a target, with data loaded directly to the staging tables in Informatica MDM, bypassing landing tables.
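The difference between options 3.1 and 3.2 can be pictured with a toy model in Python, using an in-memory SQLite database in place of the MDM schema. Everything here is a stand-in: the table names landing_customer and stage_customer, the cleanse function, and the plain-copy stage step are all hypothetical and say nothing about the real MDM table layout or stage process. The sketch only assumes that the same cleansing is applied on both paths, so that the paths differ in which table is written first.

```python
import sqlite3

def cleanse(name):
    """Stand-in for an IDQ mapping: collapse whitespace and title-case."""
    return " ".join(name.split()).title()

def new_hub():
    """Create a toy hub with one landing and one staging table (hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE landing_customer (src_id TEXT, name TEXT)")
    conn.execute("CREATE TABLE stage_customer (src_id TEXT, name TEXT)")
    return conn

def load_to_landing(conn, rows):
    """Option 3.1: the mapping writes its output to the landing table."""
    conn.executemany(
        "INSERT INTO landing_customer VALUES (?, ?)",
        [(src_id, cleanse(name)) for src_id, name in rows],
    )

def run_stage_step(conn):
    """Toy stage step: copy landing rows to staging (an assumption)."""
    conn.execute(
        "INSERT INTO stage_customer SELECT src_id, name FROM landing_customer"
    )

def load_to_staging(conn, rows):
    """Option 3.2: the mapping writes straight to staging, skipping landing."""
    conn.executemany(
        "INSERT INTO stage_customer VALUES (?, ?)",
        [(src_id, cleanse(name)) for src_id, name in rows],
    )

def staged(conn):
    return conn.execute(
        "SELECT src_id, name FROM stage_customer ORDER BY src_id"
    ).fetchall()

SOURCE_ROWS = [("1", "  aLICE   smith "), ("2", "BOB  JONES")]

via_landing = new_hub()
load_to_landing(via_landing, SOURCE_ROWS)
run_stage_step(via_landing)

direct = new_hub()
load_to_staging(direct, SOURCE_ROWS)
```

In this toy model both paths end with the same staging rows. The direct path saves one hop, while the landing path keeps a copy of the loaded data in the landing table.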