Process For Data Quality Assurance


“Good decisions require good data”
Process for Data Quality Assurance at Manitoba Centre for Health Policy (MCHP)
Mahmoud Azimaee, Data Analyst at ICES

Literature and Resources
– CIHI Data Quality Framework (2009 edition)
– UK’s NHS Data Quality Reports
– Handbook on Data Quality Assessment Methods and Tools (European Commission)
– Handbook on Improving Quality by Analysis of Process Variables (European Commission)
– Data fitness (Australian National Statistical Service)

Data Quality at MCHP
1. Data Quality Indicators
2. Rating System
– CIHI Data Quality Framework (2009 edition)
3. Data Quality Report
– UK’s NHS Data Quality Reports
4. Practical Approach
5. Automation
– Cody’s Data Cleaning Techniques Using SAS (by Ron Cody)

Data Quality Indicators and Rating System
Example:
– Completeness: Rate of missing values for all data elements.
– Consistency: Agreement with registry database.
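
As a minimal sketch of the two example indicators, the Python below computes the rate of missing values per data element and the share of linkable records that agree with a registry. The record layout, the field names (`phin`, `sex`, `birth_year`), and the choice to treat `None` and empty strings as missing are illustrative assumptions, not MCHP's actual definitions.

```python
def missing_rates(records, fields):
    """Return the fraction of missing values for each data element."""
    total = len(records)
    rates = {}
    for field in fields:
        # Assumption: None and "" both count as missing.
        missing = sum(1 for r in records if r.get(field) in (None, ""))
        rates[field] = missing / total if total else 0.0
    return rates


def registry_agreement(records, registry, key, field):
    """Return the share of linkable records whose field matches the registry."""
    compared = agreed = 0
    for r in records:
        reg = registry.get(r.get(key))
        if reg is None:
            continue  # record cannot be linked to the registry
        compared += 1
        if r.get(field) == reg.get(field):
            agreed += 1
    return agreed / compared if compared else None


# Hypothetical sample data, for illustration only.
records = [
    {"phin": "A1", "sex": "F", "birth_year": 1970},
    {"phin": "A2", "sex": "", "birth_year": 1985},
    {"phin": "A3", "sex": "M", "birth_year": None},
    {"phin": "A4", "sex": "M", "birth_year": 1990},
]
registry = {"A1": {"sex": "F"}, "A2": {"sex": "F"}, "A3": {"sex": "F"}}

print(missing_rates(records, ["sex", "birth_year"]))
print(registry_agreement(records, registry, "phin", "sex"))
```

Here a missing value counts as a disagreement with the registry; a real rating system might instead exclude missing values from the consistency check.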

MCHP Data Quality Framework: Data Quality Assurance
[Diagram; labels truncated in extraction are marked [...]]
Database Level (In Data [...])
– [...] Correctness (Invalid codes, Invalid Dates, Out of Range, Outliers [...])
– [...] Consistency
– Stability [...]
– [...]its of Analysis (Persons, Places, Things, ...)
– Level of Agreement With the Literature and available reports
– Timeliness: Time to Acquisition, Time to Release, Currency of Data
Research Level (In a Specific Research Project)
– Interpretability: Availability and Quality of Documents, Policies and Procedures, Formats Libraries, Metadata, Data [...]
– [...] Level of Bias
– Degree of Problems with Consistency
– Reliability
– Level of Agreement With Other Databases

Data Management Process at MCHP
1. Formulate the Request and Receive the Data
– Check the data sharing agreements
– Liaise with the source agency to acquire available data, data model diagram, data dictionary, documentation about historical changes in data content, format, and structure, and data quality reports
– Prepare the data request letter
– Receive the data and associated documentation
2. Become Familiar with Data Structure and Content
– Review provided documentation
– If required, create a data model for the original data
– If receiving test data, test it and send feedback to the source agency
3. Apply SAS Programs
– Apply Normalization or De-normalization as required (Normalization can be defined as the practice of optimizing table structures by eliminating redundancy and inconsistent dependency)
– Apply data field and SAS format standards
– Install on SPD server (this includes indexing, sorting and clustering)
– Create Metadata
– If there is a problem, liaise with the source agency
4. Evaluate Data Quality
– Test the installed data using standardized protocol
– Identify solutions to address deficiencies in data quality
– Prepare data quality report for addition to standard documentation
5. Document Data
– Including original documents, data model diagram, SPDS data dictionary, history, file variations and structural changes, revisions and common problems, and data quality report, where available
6. Release Data to Analyst(s) and Researcher(s)
– Meet with programmer(s) and researcher(s) to present data structure and content

How to Present Data Quality Results?
CIHI Data Quality Report
UK’s NHS Data Quality Report
– VODIM Test Analysis Methodology: Valid, Other, Default, Invalid, Missing
Valid, Invalid, Missing, Outlier: VIMO!

VIMO Table
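
The slide's table itself did not survive transcription. As a rough sketch of what one row of a VIMO table might hold, the Python below assigns each value of a data element to one of the four categories (Valid, Invalid, Missing, Outlier) and reports percentages. The function names, the rule parameters (`valid_codes`, `low`, `high`), and the sample values are hypothetical; the real VIMO macro is written in SAS and its rules are not given in the slides.

```python
from collections import Counter


def classify(value, valid_codes=None, low=None, high=None):
    """Assign one VIMO category to a single value."""
    if value is None or value == "":
        return "Missing"
    if valid_codes is not None:
        # Coded element: anything outside the code list is invalid.
        return "Valid" if value in valid_codes else "Invalid"
    if not isinstance(value, (int, float)):
        return "Invalid"
    if (low is not None and value < low) or (high is not None and value > high):
        return "Outlier"
    return "Valid"


def vimo_row(values, **rules):
    """Return the percentage of values in each VIMO category."""
    counts = Counter(classify(v, **rules) for v in values)
    total = max(len(values), 1)  # avoid dividing by zero on empty input
    return {cat: 100.0 * counts[cat] / total
            for cat in ("Valid", "Invalid", "Missing", "Outlier")}


# Hypothetical data elements, for illustration only.
sex = ["M", "F", "X", None, "F"]
age = [34, 51, 212, "", "n/a", 47, 29, 60, 18, 75]

print("sex", vimo_row(sex, valid_codes={"M", "F"}))
print("age", vimo_row(age, low=0, high=120))
```

In this sketch the outlier limits are passed in by hand; in practice they would come from one of the outlier rules described under Operational Approaches.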

(1) I just discovered that the data system we have been working on for the last five years has major data quality problems.
(2) That is why I treat data systems the same way I do sausage – I do not want to know what is inside either one.
(3) Ouch!! That is why I am a vegetarian!
Conversation from: Data Quality and Record Linkage Techniques, Thomas N. Herzog, et al. 2007, Springer

Operational Approaches
Example 1: Identifying Outliers/Extreme Observations:
1. Standard Deviation (Mean ± 2*SD)
2. Trimmed Standard Deviation (MeanTrimmed10% ± 2*1.49*SDTrimmed10%)
3. Interquartile Range (Q1 – k*IQR, Q3 + k*IQR), k = 2.5
– Ordered statistics for calculating quartiles is very memory intensive
– P² method to approximate the quartiles (using QMETHOD=P2 in PROC MEANS) [piecewise-parabolic (P²) algorithm invented by Jain and Chlamtac (1985)]
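
The three outlier rules can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the OUTLIER macro itself: "Trimmed10%" is read here as dropping 10% of observations from each tail, quartiles are computed exactly with `statistics.quantiles` (default method) rather than approximated with the P² algorithm, and the sample values are made up.

```python
import statistics


def sd_limits(values, k=2.0):
    """Rule 1: mean +/- k standard deviations."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return mean - k * sd, mean + k * sd


def trimmed_sd_limits(values, trim=0.10, k=2.0, correction=1.49):
    """Rule 2: trimmed mean +/- k * 1.49 * trimmed standard deviation.

    Assumption: `trim` is the fraction removed from EACH tail.
    """
    ordered = sorted(values)
    cut = int(len(ordered) * trim)
    core = ordered[cut:len(ordered) - cut] if cut else ordered
    mean = statistics.fmean(core)
    sd = statistics.stdev(core)
    return mean - k * correction * sd, mean + k * correction * sd


def iqr_limits(values, k=2.5):
    """Rule 3: (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, limits):
    """Return the values that fall outside the (low, high) limits."""
    low, high = limits
    return [v for v in values if v < low or v > high]


# Hypothetical sample with one extreme observation.
values = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 500]

for rule in (sd_limits, trimmed_sd_limits, iqr_limits):
    print(rule.__name__, flag_outliers(values, rule(values)))
```

Note how the plain standard deviation is inflated by the extreme value itself, which is the usual motivation for the trimmed and IQR-based rules.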

Operational Approaches
Example 2: Stability Across Time
Based on CIHI guideline:
– Trend analysis is used to examine changes in core data elements over time
– No change across years may also be an indication of a problem if the data is expected to naturally trend upward or downward
– Changes in methodology or inclusion/exclusion criteria should be taken into account to determine whether the observed changes were real or not.
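
A very simple first pass at this kind of trend check is to compute year-over-year changes and flag both unusually large changes and years with no change at all. The sketch below assumes a yearly count keyed by year; the 20% threshold, the function names, and the counts are hypothetical choices for illustration, not part of the CIHI guideline.

```python
def yearly_changes(series):
    """Return (year, percent change from the previous year) pairs."""
    years = sorted(series)
    changes = []
    for prev, cur in zip(years, years[1:]):
        base = series[prev]
        pct = None if base == 0 else 100.0 * (series[cur] - base) / base
        changes.append((cur, pct))
    return changes


def flag_changes(series, max_pct=20.0):
    """Flag years with no change or with a change larger than max_pct."""
    flags = {}
    for year, pct in yearly_changes(series):
        if pct is None:
            flags[year] = "undefined change (previous value is 0)"
        elif pct == 0:
            flags[year] = "no change"
        elif abs(pct) > max_pct:
            flags[year] = "large change"
    return flags


# Hypothetical yearly record counts.
counts = {2005: 1000, 2006: 1040, 2007: 1040, 2008: 1500, 2009: 1530}

print(flag_changes(counts))
```

A flag here only marks a year for review; whether the change is real still depends on known changes in methodology or inclusion/exclusion criteria.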

Example 2: Stability Across Time (Continued)
Identify unusual changes
– Outlier analysis
Outlier analysis requires a model
– How to choose an appropriate model in an automated fashion?
Fit a series of common models:
– Simple Linear: Y = β0 + β1X
– Quadratic: Y = β0 + β1X²
– Exponential: Y = β0 + β1exp(X)
– Logarithmic: Y = β0 + β1log(X)
– SQRT: Y = β0 + β1√X
– Inverse: Y = β0 + β1(1/X)
– Negative Exponential: Y = β0 + β1exp(–X)

Example 2: Stability Across Time (Continued)
Choose the best model with the minimum MSE
Re-fit the chosen model on the data
Do an outlier analysis
– Estimate Studentized residuals for each observation (with the current observation deleted)
Flag significant observations as potential outliers
Flag observations with no changes over time
How about Small Cell Size Policy? (0 < Frequency < 6)
– Use the actual values in modeling but flag and then force them to 3 in the report
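
The model-selection and outlier steps can be sketched as follows. Each candidate is treated as a simple regression of Y on a transformed X, the candidate with the smallest MSE is kept, and deleted (externally) Studentized residuals are computed from the usual leverage formula. Several things here are assumptions rather than the TREND macro's actual behaviour: X is a positive time index, "significant" is replaced by a fixed cutoff of 3, the small-cell range is read as frequencies 1 to 5, and all names and data are hypothetical.

```python
import math

# Candidate transforms g(X) for the model Y = b0 + b1 * g(X).
TRANSFORMS = {
    "linear": lambda x: x,
    "quadratic": lambda x: x ** 2,
    "exponential": math.exp,
    "logarithmic": math.log,
    "sqrt": math.sqrt,
    "inverse": lambda x: 1.0 / x,
    "negative_exponential": lambda x: math.exp(-x),
}


def fit(xs, ys, transform):
    """Ordinary least squares fit of Y on the transformed X."""
    z = [transform(x) for x in xs]
    n = len(z)
    z_bar, y_bar = sum(z) / n, sum(ys) / n
    szz = sum((zi - z_bar) ** 2 for zi in z)
    b1 = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, ys)) / szz
    b0 = y_bar - b1 * z_bar
    resid = [yi - (b0 + b1 * zi) for zi, yi in zip(z, ys)]
    sse = sum(e * e for e in resid)
    leverage = [1.0 / n + (zi - z_bar) ** 2 / szz for zi in z]
    return {"b0": b0, "b1": b1, "resid": resid, "sse": sse,
            "mse": sse / (n - 2), "leverage": leverage}


def best_model(xs, ys):
    """Fit every candidate and return the one with the minimum MSE."""
    fits = {name: fit(xs, ys, g) for name, g in TRANSFORMS.items()}
    name = min(fits, key=lambda k: fits[k]["mse"])
    return name, fits[name]


def deleted_studentized_residuals(model):
    """Studentized residuals with the current observation deleted."""
    n = len(model["resid"])
    result = []
    for e, h in zip(model["resid"], model["leverage"]):
        # Error variance of the fit that leaves this observation out.
        s2_deleted = (model["sse"] - e * e / (1.0 - h)) / (n - 3)
        result.append(e / math.sqrt(s2_deleted * (1.0 - h)))
    return result


def flag_outliers(xs, ys, cutoff=3.0):
    """Return the chosen model name and indexes of potential outliers."""
    name, model = best_model(xs, ys)
    t = deleted_studentized_residuals(model)
    return name, [i for i, ti in enumerate(t) if abs(ti) > cutoff]


def reported_frequency(freq):
    """Small-cell rule (assumed range 1..5): report 3 and set a flag."""
    return (3, True) if 0 < freq < 6 else (freq, False)


# Hypothetical series: a linear trend with one unusual year (index 5).
xs = list(range(1, 11))
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.3, -0.3, 0.1]
ys = [100 + 5 * x + e for x, e in zip(xs, noise)]
ys[5] += 60

print(flag_outliers(xs, ys))
print(reported_frequency(4))
```

A fixed cutoff keeps the sketch dependency-free; a closer match to "flag significant observations" would compare each residual with a t distribution on n - 3 degrees of freedom.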

Automation
MCHP’s data repository includes over 65 health and other administrative databases (linkable using a common encrypted individual identifier).
Annual updates for most of the databases in its repository.
Designing an automated process became a must!

Automation
A SAS Macro based application package was developed (16 Macros)
– Pre Data Quality Macro (1)
– Main Macros (6)
– Intermediate Macros (9)

GETFORMAT Macro
(Continued)
OUTLIER Macro
Special Features:
– Can handle standalone and clustered tables
– Can validate postal and municipal codes

Automation
GETNOBS Macro
LINK Macro
(Continued)

Macro(Continued)

PHINCHECK Macro
– Checks the 3rd and 5th positions of PHINs, which must be 0 and 9
– Compares the distribution of the first position with the corresponding PHINs from registry files
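
The two checks can be illustrated with a short Python sketch. It reads "must be 0 and 9" as "3rd position is 0 and 5th position is 9", which is an interpretation of the slide, and it compares first-position distributions with a simple maximum difference in proportions. The function names and the PHIN-like strings are made up for illustration and do not reflect the actual SAS macro.

```python
from collections import Counter


def phin_positions_ok(phin):
    """Check that the 3rd character is '0' and the 5th is '9'."""
    text = str(phin)
    return len(text) >= 5 and text[2] == "0" and text[4] == "9"


def first_position_distribution(phins):
    """Return the share of each first character among the given PHINs."""
    counts = Counter(str(p)[0] for p in phins if str(p))
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}


def max_distribution_gap(data_phins, registry_phins):
    """Largest absolute difference in first-position shares."""
    a = first_position_distribution(data_phins)
    b = first_position_distribution(registry_phins)
    return max((abs(a.get(ch, 0.0) - b.get(ch, 0.0)) for ch in set(a) | set(b)),
               default=0.0)


# Made-up identifiers, not real PHINs.
data_phins = ["120495678", "220495678", "120495679", "320495678"]
registry_phins = ["120495678", "220495678", "120495679", "120495670"]

print([p for p in data_phins + ["123456789"] if not phin_positions_ok(p)])
print(max_distribution_gap(data_phins, registry_phins))
```

A large gap between the two distributions would suggest that the identifiers in the data were not produced the same way as those in the registry.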

Non-Automated Indicators
– Internal Consistency
– Timeliness

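Data Quality Assurance
[Diagram mapping indicators to macros; labels truncated in extraction are marked [...], and the exact pairing of macros with indicators is not recoverable from the text]
Database Level (In Data [...])
– [...] Correctness (Invalid codes, Invalid Dates, Out of Range, Outliers and Extreme Observations)
– Internal Validity
– Internal Consistency
– Stability [...]
– [...]its of Analysis (Persons, Places, Things, ...)
– Level of Agreement With the Literature and available reports
– Timeliness: Time to Acquisition, Time to Release
Macros shown: VIMO Macro, TREND Macro, LINK Macro, PHINCHECK Macro, AGREEMENT Macro
Research Level (In a Specific Research Project)
– Interpretability: Availability and Quality of Documents, Policies and Procedures, Formats Libraries, Metadata, Data [...]
– [...] Level of Bias
– Degree of Problems with Consistency
– Reliability
– Level of Agreement With Other Databases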

Data Quality Website

Missing Links!
– Central Format Library
– Metadata Database
– Standardization: Bad standards are better than no standards at all!

Data Quality As A Science
– Data Quality Algebra
– Data Quality Axioms

Acknowledgment
– Mr. Mark Smith (MCHP Associate Director, Repository)
– Dr. Lisa Lix (Associate Professor at University of Saskatchewan)
CONTACT INFORMATION
Mahmoud Azimaee
Institute for Clinical Evaluative Sciences
Work Phone: (647) 480-4055 (Ex. 3618)
E-mail: mahmoud.azimaee@ices.on.ca
Web: www.dastneveshteha.com
