The Informatica Data Quality Methodology


A Framework to Achieve Pervasive Data Quality Through Enhanced Business-IT Collaboration

White Paper

This document contains Confidential, Proprietary and Trade Secret Information (“Confidential Information”) of Informatica Corporation and may not be copied, distributed, duplicated, or otherwise reproduced in any manner without the prior written consent of Informatica.

While every attempt has been made to ensure that the information in this document is accurate and complete, some typographical errors or technical inaccuracies may exist. Informatica does not accept responsibility for any kind of loss resulting from the use of information contained in this document. The information contained in this document is subject to change without notice.

The incorporation of the product attributes discussed in these materials into any release or upgrade of any Informatica software product, as well as the timing of any such release or upgrade, is at the sole discretion of Informatica.

Protected by one or more of the following U.S. Patents: 6,032,158; 5,794,246; 6,014,670; 6,339,775; 6,044,374; 6,208,990; 6,850,947; 6,895,471; or by the following pending U.S. Patents: 09/644,280; 10/966,046; 10/727,700.

This edition published May 2010.

Table of Contents

Executive Summary
Meeting the Data Quality Challenge
  The Importance of Business-IT Collaboration
  Role-Based Tools for Enhanced Collaboration
Step 1: Profile the Data for Content, Structure, and Anomalies
Step 2: Establish Data Quality Metrics and Define Targets
Step 3: Design and Implement Data Quality Business Rules
Step 4: Build Data Quality Rules into Data Integration Processes
Step 5: Review Exceptions and Refine Rules
Step 6: Monitor Data Quality Versus Targets
Conclusion

Executive Summary

The three elements of any data quality initiative are people, processes, and technology. A structured, well-defined methodology is essential to orchestrating these three elements to derive the greatest payback from a data quality initiative.

While the value of a data quality methodology may seem self-evident, too many organizations approach data quality initiatives with ill-defined plans that introduce risks of confusion, overlooked details, redundant efforts, and subpar results.

A strategic and systematic methodology enables you to properly scope your data quality project, engage business and IT stakeholders with clearly defined roles and responsibilities, and equip them with the right technology and tools to tackle the data quality challenge.

This white paper examines the implications of poor data quality and introduces the Informatica data quality methodology, a six-step framework that extends from initial profiling to continuous monitoring, toward the objective of making high-quality data pervasive throughout the enterprise. It shows you how your business and IT users (business analysts, data stewards, and IT developers and administrators) can collaboratively use the Informatica data quality solution through each of the six steps to embed data quality across all data domains and applications throughout the extended enterprise.

Meeting the Data Quality Challenge

The performance of your business is tied directly to the quality and trustworthiness of its data. With high-quality data, your business is poised to operate at peak efficiency. High-quality data improves your competitive advantage and enhances your ability to:

- Acquire and retain customers
- Optimize sales and financials
- Run efficient supply chain and production processes
- Eliminate costly operational errors
- Make smart, timely business decisions
- Rapidly penetrate new markets

While most businesses recognize the theoretical importance of data quality, many wait until poor-quality data takes a bite out of operational efficiency and profitability before taking action. Consequences can range from customer service degradation, supply chain mistakes, and financial reporting errors to major operational failures that can cost millions of dollars a year.

Similarly, organizations often take an ad hoc approach to data quality, implementing quick fixes at a departmental or functional level that fail to comprehensively address data quality weaknesses across the enterprise and are ultimately short-sighted and unsustainable.

The costs are high. More than 140 companies surveyed by the analyst firm Gartner estimated they were losing an average of $8.2 million a year because of poor data quality. Losses of more than $20 million a year were cited by 22 percent of respondent organizations, and 4 percent put annual losses at more than $100 million. [1]

“While losses of millions of dollars are significant, we believe these estimates understate the true financial impact on most organizations: the actual magnitude of the problem is typically far greater (by orders of magnitude) than is perceived by business and IT leaders,” Gartner’s report says.

[1] Gartner Inc., “Findings from Primary Research Study: Organizations Perceive Significant Cost Impact from Data Quality Issues,” August 2009.

To attack this problem, organizations need to invest in the people, processes, and technologies necessary to transform flawed data into trusted, actionable business information available to all stakeholders whenever and wherever they need it. The best data quality initiatives have these four characteristics:

- Collaborative. Business and IT share responsibility for data quality, with clearly defined roles and technology suited to the unique skills and perspectives of business analysts, data stewards, and IT developers and administrators.
- Proactive. Business and IT recognize that all organizations suffer some degree of poor data quality and proactively profile data to identify and correct problems before they materially impact business performance.
- Reusable. Data profiling and cleansing business rules can be reused across any number of applications to streamline and accelerate processes and help ensure high standards of quality.
- Pervasive. The data quality environment will extend to all stakeholders, data domains, projects, and applications regardless of where the data resides, whether on premise, with partners, or in the cloud.

For data quality to be most effective, it needs to be driven by a methodology that incorporates the characteristics defined above. Ideally, the methodology will be overseen and implemented by a data governance body, or it may be formalized in a center of excellence.

Informatica’s six-step methodology is designed to help guide data quality from the initial step of profiling to the ongoing discipline of continuous monitoring and optimization. Over nearly 10 years, the Informatica data quality methodology has evolved to become a mature and proven framework that has helped guide implementations in organizations around the world.

The methodology aligns with the Informatica data quality solution, which delivers the full range of data quality capabilities that your company needs to ensure that all its data is complete, consistent, accurate, and current.
The solution consists of several packages optimized for specific uses: Informatica Data Quality, Informatica Data Explorer, and Informatica Identity Resolution.

- Informatica Data Explorer. With role-based tools to promote collaboration between business and IT, this data profiling software discovers and analyzes the content, structure, and deficiencies of any type of data, in any source.
- Informatica Data Quality. The software executes cleansing, parsing, standardization, and matching processes and enables ongoing monitoring in visual scorecards or dashboards. As with Informatica Data Explorer, it features role-based tools to enable business and IT to work together.
- Informatica Identity Resolution. This software enables organizations to search and match identity data from more than 60 countries in batch and real time, across multiple enterprise or third-party applications.

The Importance of Business-IT Collaboration

A lack of collaboration between business and IT is a key reason why many data quality projects fail to live up to their potential. The two camps have traditionally relied on spreadsheets, documents, emails, and other tedious and imprecise mechanisms to communicate on data quality requirements.

Inevitably, it’s difficult for business analysts and data stewards to outline data quality business requirements in clear-cut terms that IT can understand. Misinterpretation, delays, high costs, and subpar results are common simply because business and IT are speaking two different languages, with no common framework. Critical details can be lost in translation.

Greater business-IT collaboration is increasingly recognized as essential to data quality and related data management initiatives. For instance, 64 percent of respondents to a survey by The Data Warehousing Institute reported that collaboration is an issue for data integration in their organizations. [2]

“More business people are getting their hands on data integration,” TDWI research senior manager Philip Russom wrote in TDWI’s What Works magazine. “Stewardship for data quality has set a successful precedent. This form of collaboration ensures that data integration truly supports the needs of the business.”

Role-Based Tools for Enhanced Collaboration

The Informatica data quality solution provides a foundation for collaboration between business and IT. It features role-based tools engineered to enable business analysts, data stewards, and IT developers and administrators to make the most of their unique skill sets and communicate with all stakeholders in the process.

These role-based tools present different views of the same data tailored to both business and IT. For instance, the IT developer sees technical versions of data and rules in a development environment. The business analyst sees a nontechnical rendering of the same data in a browser-based tool. Business and IT can work with identical data and rules, in terms that each understands, in a common environment that promotes joint ownership.

Shareable bookmarks and notes to communicate findings, requirements, results, and status enable team members to accelerate and streamline data quality processes, across multiple project groups, geographic locations, and time zones. Rules can be developed from these communications and viewed as part of the profiling results, greatly reducing the risk of misunderstandings about requirements.

[2] The Data Warehousing Institute, “Collaborative Data Integration,” TDWI What Works, August 2009.

Three role-based tools (Informatica Analyst, Informatica Developer, and Informatica Administrator) are common to both Informatica Data Explorer and Informatica Data Quality.

- Informatica Analyst: For Business Analysts and Data Stewards. By rendering data in semantic terms, the browser-based tool equips business analysts and data stewards with capabilities to profile data, create and analyze quality scorecards, manage exception records, develop and use rules, and collaborate with IT.
- Informatica Developer: For IT Developers. The Eclipse-based environment allows developers to discover, access, analyze, profile, and cleanse data, regardless of its location. Developers can model logical data objects, combine data quality rules with sophisticated transformation logic, and conduct mid-stream profiling to validate and debug logic as it’s developed.
- Informatica Administrator: For IT Administrators. This tool gives IT administrators centralized configuration and management capabilities. Administrators can monitor and manage security, user access, data services, and grid and high-availability configurations.

With an understanding of the components in the Informatica data quality solution, we can examine the six steps of the Informatica data quality methodology and how stakeholders can use the technology in each step. Figure 1 illustrates these six steps.

[Figure 1: The Informatica Data Quality Life Cycle, a cycle of six steps: (1) Profile the Data; (2) Establish Metrics and Define Targets; (3) Define and Implement Data Quality Rules; (4) Build Data Quality Rules into Data Integration Processes; (5) Review Exceptions and Refine Rules; (6) Monitor Data Quality Versus Targets. Each step is labeled with a role, either Data Steward or IT Developer.]

Figure 1. The Informatica data quality methodology extends from an initial profiling phase to ongoing monitoring and optimization.

Step 1: Profile the Data for Content, Structure, and Anomalies

The first step is to profile data to discover and assess your data’s content, structure, and anomalies. Profiling identifies strengths and weaknesses in data and helps you define your project plan. A key objective is to pinpoint data errors and problems, such as inconsistencies and redundancies, which can put business processes at risk.

A thorough data profiling exercise gives you a foundation for data quality success. By identifying problems up front, you avoid costly and time-consuming remediation down the road. As problems are identified, IT and business personnel investigate each data attribute and generate metadata that describes it. This metadata, or data about data, is used to cleanse data downstream or during transformation processes.

Business analysts, data stewards, and IT developers can and should collaborate on data profiling. Informatica Data Explorer helps to bridge the collaboration gap with role-based data profiling technology. Business analysts and data stewards use Informatica Analyst to assess data quality, identify anomalies, build business rules, and create scorecards.

Developers use Informatica Developer to work with the output from business users, or to generate their own data profiles. The tool gives developers greater flexibility and functionality to, for instance:

- Build, deploy, and centrally manage reusable data quality rules
- Provision data physically or virtually at any latency
- Leverage prebuilt rules for matching and address cleansing
- Reuse profiling and rules specifications across any application
- Access all data quickly to accelerate data quality projects

Figure 2 illustrates the Informatica Data Analyst interface for data profiling.

Figure 2. Informatica Data Analyst provides a browser-based environment for data profiling by business users.

Bank Avoids $1.5 Million Cost with Informatica Data Explorer

Informatica Data Explorer accelerated an effort to profile data from 32 legacy applications to create a customer data warehouse at the Banco Nacional de Costa Rica. Bank officials estimate the software saved $1.5 million in labor costs that would have been necessary with manual coding.

With business and IT collaboration, the profiling exercise laid the foundation for a data quality initiative that generated accurate and trusted data from disparate sources to improve customer relationship management and profitability.

“Informatica Data Explorer is marvelous for discovering the quality of data, because the results are obtained quickly and the only limits to what you can do are in your mind; the tool always offers more,” says Sergio Rodriguez, the bank’s director of databases and strategic information.
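As a rough illustration of the kind of column profiling that Step 1 describes, the following Python sketch counts missing values, distinct values, and value "shapes" for a single field. It is not how Informatica Data Explorer works internally; the function names `value_pattern` and `profile_column` and the sample phone numbers are hypothetical, chosen only to show how profiling can surface inconsistent formats and placeholder values.

```python
import re
from collections import Counter


def value_pattern(value: str) -> str:
    """Reduce a value to its shape: digits become 9, letters become A."""
    shape = re.sub(r"[0-9]", "9", value)
    return re.sub(r"[A-Za-z]", "A", shape)


def profile_column(values):
    """Summarize one column: row count, missing values, distinct values, shapes."""
    non_null = [v for v in values if v is not None and v.strip() != ""]
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "patterns": Counter(value_pattern(v) for v in non_null),
    }


# Hypothetical sample data: a phone number field with mixed quality.
phones = ["555-0100", "555-0101", "5550102", "", None, "555-0100", "n/a"]
profile = profile_column(phones)

print(profile["rows"], profile["nulls"], profile["distinct"])
for pattern, count in profile["patterns"].most_common():
    print(pattern, count)
```

In this sample, the dominant shape `999-9999` suggests the expected format, while the rarer shapes (`9999999` and `A/A`) point to records a data steward might want to investigate.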

Step 2: Establish Data Quality Metrics and Define Targets

Next, you need to define metrics to measure the quality of data within your key application data fields and define individual data quality targets for each data field. The metrics should be based on the six dimensions of data quality:

1. Completeness: What data is missing or unusable?
2. Conformity: What data is stored in nonstandard formats?
3. Consistency: What data values give conflicting information?
4. Duplicates: What data records or attributes are redundant?
5. Integrity: What data is not referenced or otherwise compromised?
6. Accuracy: What data is incorrect or out of date?

As with profiling, establishing metrics and defining data quality targets should be a collaborative and iterative effort. Informatica’s data quality solution gives business and IT a common platform to build and refine metrics that can be tracked in data quality scorecards, which may be readily shared among stakeholders by emailing a URL.

You can also define custom data quality dimensions applicable to your business requirements. For example, you can establish metrics to reflect the dimension of timeliness (when data is available versus when it is expected to be available), or currency (how up to date the information is).

Tie your metrics to the business impact of data quality. For instance, correlate such business issues as stock turnover and customer shipments with data quality dimensions that can affect them (consistency and accuracy of inventory data, or duplicate customer data).

The metrics can also be viewed in a Web dashboard, which offers robust drill-down and analytic reporting. Both scorecards and dashboards enable you to monitor data quality on an ongoing basis, and as you define metrics, you’ll also want to establish data quality thresholds that will trigger an email alert if breached.

Figure 3 depicts a customer data quality scorecard with metrics performance on key data quality dimensions.

Figure 3. Informatica Analyst provides a metrics scorecard that tracks performance on key dimensions of data quality.

Data Quality Helps Medical Supplier Save $1.4 Million in Mailing Costs

The use of metrics and scorecards in Informatica Data Quality was a key ingredient in data quality success for Smith & Nephew, the London-based global healthcare company. The metrics track the company’s success in an enterprise-wide initiative to cleanse and integrate data from multiple SAP instances.

In all, the Informatica-based data quality solution helped Smith & Nephew save $1.4 million in mailing costs by cleansing customer data and reduced SKU by 50 percent by increasing visibility through metrics and scorecards.

“We wanted to invest in robust data quality tools, which were scalable and which could handle large volumes of data. We also wanted to work with one vendor who continues to support robust business solutions with business metadata management and point of entry,” says Barbara Latulippe, Smith & Nephew enterprise data architect. “Time and again, it was only Informatica that could comfortably provide solutions in all these areas.”
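To make the idea of metrics, targets, and thresholds in Step 2 more concrete, here is a minimal Python sketch that scores three of the dimensions (completeness, conformity, and duplicates) and flags any metric that falls below its target. It is an illustration only, not the Informatica scorecard feature: the sample records, the function names, the five-digit postcode format, and the target values are all assumptions, and a `print` call stands in for the email alert.

```python
import re

# Hypothetical customer records; one has no email, one has a malformed
# postcode (letter O instead of zero), and one repeats an earlier record.
records = [
    {"id": 1, "email": "ana@example.com", "postcode": "10101"},
    {"id": 2, "email": "", "postcode": "10102"},
    {"id": 3, "email": "luis@example.com", "postcode": "1O103"},
    {"id": 4, "email": "ana@example.com", "postcode": "10101"},
]


def completeness(rows, field):
    """Share of rows in which the field is populated."""
    return sum(1 for r in rows if r[field]) / len(rows)


def conformity(rows, field, pattern):
    """Share of populated values that match the expected format."""
    populated = [r[field] for r in rows if r[field]]
    if not populated:
        return 1.0
    return sum(1 for v in populated if re.fullmatch(pattern, v)) / len(populated)


def uniqueness(rows, fields):
    """Share of rows that remain after collapsing repeats on the key fields."""
    keys = [tuple(r[f] for f in fields) for r in rows]
    return len(set(keys)) / len(keys)


scores = {
    "email completeness": completeness(records, "email"),
    "postcode conformity": conformity(records, "postcode", r"\d{5}"),
    "customer uniqueness": uniqueness(records, ["email", "postcode"]),
}

# Assumed targets; in practice business and IT would agree on these per field.
targets = {
    "email completeness": 0.95,
    "postcode conformity": 0.70,
    "customer uniqueness": 0.90,
}

breaches = sorted(name for name, score in scores.items() if score < targets[name])
for name in breaches:
    # Stand-in for the threshold-triggered alert described above.
    print(f"ALERT: {name} is {scores[name]:.0%}, target {targets[name]:.0%}")
```

Keeping each metric as a small function over a named field mirrors the advice to define individual targets per data field, and makes it straightforward to add custom dimensions such as timeliness or currency later.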

Step 3: Design and Implement Data Quality Business Rules

The next step is to define your data quality business rules: reusable business logic that governs how data is cleansed and parsed out to populate target application fields. Business and IT teams achieve the best results when working together to design, test, refine, and implement data quality business rules using role-based functionality.

For instance, business analysts and data stewards can use Informatica Analyst to profile, analyze, and create data quality scorecards. They can drill down to specific records with poor data quality to determine impact on the business and how to fix issues. The tool enables business users to share data quality metrics and reports by simply emailing a URL to IT colleagues; it also lets business users work with developers to specify, validate, configure, implement, and test data quality rules.

IT s
