Empirical Estimates Of Reliability

Upload by : Farrah Jaffe

Concept of Reliability
Reliability is the consistency or precision of a measure
Weight example
Reliability varies along a continuum; measures are reliable to a greater or lesser extent
Not an all-or-nothing quality
Newsom, Psy 521/621 Univariate Quantitative Methods

Concept of Reliability
The opposite of consistency and precision is variability due to random measurement error
Reliability is lack of random measurement error
Random error is unexplained variation that is not systematic
If variability is random, there will be some overestimates and some underestimates
On average, the estimate is accurate

Concept of Reliability
[Figure: target diagram of validity and reliability; source link truncated in the original: asurement/validity and reliability.html]

Theoretical Foundations
Classical Test Theory (CTT)
Observed score = true score + error
$$X_o = X_t + X_e$$
Note: many texts use $X = T + E$

Theoretical Foundations
Reliability is the proportion of the observed score variance, $s_o^2$, that is due to the true score variance, $s_t^2$
The smaller the error variance, $s_e^2$, the greater the proportion that is due to true score variance and the higher the reliability
If the proportion is 1.0, there is no error variance: perfect reliability
If the proportion is 0.0, it is all error variance: no reliability and all noise

Theoretical Foundations
Reliability = True / (True + Error)
$$R_{xx} = \frac{s_t^2}{s_t^2 + s_e^2} = \frac{s_t^2}{s_o^2}$$
Note: your text uses $R_{xx}$ as the symbol for reliability, but most texts use $\rho_{xx}$ (rho) or $r_{xx}$
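The variance ratio above can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names, the normal distributions, and the chosen standard deviations are illustrative assumptions. It computes reliability directly from a true and an error variance, then simulates observed = true + error to show that the ratio of sample variances lands near the same value.

```python
import random
import statistics


def reliability(true_var, error_var):
    """Proportion of observed-score variance that is true-score variance."""
    return true_var / (true_var + error_var)


def simulate_reliability(n=20000, true_sd=2.0, error_sd=1.0, seed=1):
    """Simulate X_o = X_t + X_e and return var(X_t) / var(X_o).

    Normal distributions and a fixed seed are assumptions made only
    for this illustration.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(50.0, true_sd) for _ in range(n)]
    observed = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    return statistics.pvariance(true_scores) / statistics.pvariance(observed)


# True variance 4 and error variance 1 give a reliability of 4 / 5.
print(reliability(4.0, 1.0))
# The simulated ratio should be close to the same value.
print(round(simulate_reliability(), 2))
```

With a true-score variance of 4 and an error variance of 1, four fifths of the observed variance is true-score variance.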

Attenuation
Measurement error attenuates correlations
Imagine if a score was only random error
If observed scores are a function of true scores and measurement error, the degree of error will cloud estimation of the relationship between two variables
Example: child’s age and reading ability

Attenuation
Remember that measurement error will increase the variance of the observed score, so the denominator in the correlation coefficient will be larger
This makes the estimate of the correlation smaller in magnitude
$$r_{xy} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}} = \frac{c_{xy}}{s_x s_y}$$

Attenuation
$$r_{x_o y_o} = r_{x_t y_t} \sqrt{R_{xx} R_{yy}}$$
$r_{x_o y_o}$ is the correlation estimated from the data (between observed scores), $r_{x_t y_t}$ is the correlation between the true scores (if we could know them), and $R_{xx}$ and $R_{yy}$ are the reliabilities of the two measures

Attenuation
Example 1: say the reliability for my guess at the age is .6, the reliability of the measurement of reading ability is .5, and the true score correlation is .4
$$r_{x_o y_o} = r_{x_t y_t} \sqrt{R_{xx} R_{yy}} = .4 \sqrt{(.6)(.5)} = .4 \sqrt{.3} = .4(.548) = .22$$
When the true score correlation is .4, the estimated correlation is about .22, a substantial underestimate: almost half the value!

Attenuation
Example 2: say the reliability for my guess at the age is .9, the reliability of the measurement of reading ability is .9, and the true score correlation is .4
$$r_{x_o y_o} = r_{x_t y_t} \sqrt{R_{xx} R_{yy}} = .4 \sqrt{(.9)(.9)} = .4 \sqrt{.81} = .4(.9) = .36$$
When the true score correlation is .4, the estimated correlation is .36, which is not nearly as bad
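The two attenuation examples can be reproduced with a few lines of code. This is a small sketch; the function name `attenuated_r` is an illustrative choice rather than anything from the slides.

```python
import math


def attenuated_r(r_true, rxx, ryy):
    """Observed correlation implied by a true-score correlation
    and the reliabilities of the two measures."""
    return r_true * math.sqrt(rxx * ryy)


# Example 1: reliabilities of .6 and .5, true-score correlation of .4
example_1 = attenuated_r(0.4, 0.6, 0.5)
# Example 2: reliabilities of .9 and .9, true-score correlation of .4
example_2 = attenuated_r(0.4, 0.9, 0.9)

print(round(example_1, 3))  # about .219
print(round(example_2, 3))  # about .36
```

Trying other reliability values shows how quickly the observed correlation shrinks as either measure becomes noisier.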

Means
Remember that random measurement error sometimes leads to overestimates and sometimes leads to underestimates
On average the estimate will be accurate
[Figure: distribution of weight measurements (vertical axis f), showing variability around the average weight]

Means
Comparing means
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
If the $X_1$ and $X_2$ observed scores have larger variances ($s_1^2$ and $s_2^2$) than their true score counterparts, then the denominator will be larger and the t will be smaller, so it is less likely to be significant

Means
Comparing means
$$d_{x_o} = \frac{\bar{X}_{o1} - \bar{X}_{o2}}{\sqrt{\frac{s_{o1}^2 + s_{o2}^2}{2}}}$$
The same effect is seen in the estimate of the effect size, which gives the magnitude of the group difference (where the o1 and o2 subscripts indicate observed values for groups one and two)
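A short sketch can make the shrinkage of the effect size concrete. It assumes, for illustration only, that both groups share the same reliability and that random error leaves the group means unchanged, so that the observed variance is the true variance divided by $R_{xx}$ (from $R_{xx} = s_t^2 / s_o^2$). The group means, variances, and function names are hypothetical.

```python
import math


def cohens_d(mean_1, mean_2, var_1, var_2):
    """Standardized mean difference using the average of the two variances."""
    return (mean_1 - mean_2) / math.sqrt((var_1 + var_2) / 2.0)


def observed_variance(true_var, rxx):
    """From R_xx = s_t^2 / s_o^2, the observed variance is s_t^2 / R_xx."""
    return true_var / rxx


# Hypothetical groups: means of 10 and 8, true-score variance of 4 in each.
d_true = cohens_d(10.0, 8.0, 4.0, 4.0)

# Assumed reliability of .64 inflates each observed variance to 6.25.
v_obs = observed_variance(4.0, 0.64)
d_observed = cohens_d(10.0, 8.0, v_obs, v_obs)

print(d_true, d_observed)
```

Under these assumptions the observed d equals the true d multiplied by the square root of the reliability.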

Estimating Reliability
Test-retest reliability
Repeat the test two or more times to see how similar the measurements are
Calculate the correlation between the measurement occasions
The problem is that the attribute may have changed in the interval between the measurement occasions
A short interval between measurements is needed, but not one so short that recall contaminates the second measurement
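As a sketch of the test-retest calculation, the correlation between two occasions can be computed with the Pearson formula from the attenuation slides. The scores below are made-up values for six hypothetical respondents, and `pearson_r` is an illustrative helper, not a library function.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical scores for the same six people on two occasions.
time_1 = [12, 15, 9, 20, 17, 11]
time_2 = [13, 14, 10, 19, 18, 10]

test_retest = pearson_r(time_1, time_2)
print(round(test_retest, 3))
```

The closer this correlation is to 1, the more consistent the measure is across occasions, provided the attribute itself has not changed.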

Estimating Reliability
Parallel tests
Two tests are parallel if their true scores are the same and they have the same standard deviation
This is a theoretical notion, because it is not possible to know with absolute certainty that two tests are exactly parallel

Estimating Reliability
Alternative forms reliability
If we could create two parallel or alternative forms of a measure, we could estimate reliability of the measure without repeated measurements
e.g., standardized tests, like the SAT and GRE, use alternative test forms

Estimating Reliability
Split-half reliability
Can develop a larger test and correlate the two halves
The problem is how best to split up the test
e.g., what if the first half and second half differ?

Estimating Reliability
Domain sampling theory (model)
What if we considered a set of items from a test to be from a larger pool (domain, population) of items from the same test?
We could think of every item as a small parallel test, a testlet or subtest

Estimating Reliability
Domain sampling theory (model)
If we view each item as a good representation of the true score and as a randomly selected item from a domain or population of possible items, then we can relax the assumption that each test is strictly parallel
Instead we only need to think of them as on average equally representing the domain

Estimating Reliability
Internal reliability
The domain sampling idea allows us to use the correlations among items to gauge the reliability of a measure
This is the basis of internal reliability, such as the type of reliability assessed by Cronbach’s alpha

Cronbach’s Alpha
Preliminary steps
- Generate descriptive statistics, including means, standard deviations (and/or variances, skewness and kurtosis)
- Obtain frequency tables and histograms
- Check for errors in entry, coding, etc.
- Variables do not need to be normally distributed, but when they are highly skewed or kurtotic or respondents have not used the full range of values, you may want to consider the wording of that item.
- Check correlations to confirm scoring direction is correct and potentially eliminate items that are supposed to correlate but do not

Cronbach’s Alpha
Cronbach’s alpha (Cronbach, 1951) is an estimate of internal reliability (sometimes called the “consistency coefficient”)
Conceptually based on the proportion of true score to total observed score variance
$$R_{xx} = \frac{s_t^2}{s_t^2 + s_e^2} = \frac{s_t^2}{s_o^2}$$

Cronbach’s Alpha
If we can estimate the proportion of the observed score variance that is due to measurement error, then we can estimate reliability
$$1 - \text{proportion error} = 1 - \frac{s_e^2}{s_o^2}$$
Cronbach’s alpha (α) raw score form is:
$$\alpha = \frac{k}{k - 1} \left( 1 - \frac{\sum s_i^2}{s_X^2} \right)$$
k is the number of items, $s_i^2$ is the variance for each item, and $s_X^2$ is the variance for the composite scale score (as a sum of the items)
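The raw score form of alpha translates directly into code. The sketch below uses made-up responses from five hypothetical respondents on three items; the function name and the data layout (one row per respondent) are assumptions for illustration.

```python
import statistics


def cronbach_alpha(rows):
    """Raw-score Cronbach's alpha.

    rows: one list of item scores per respondent (no missing values).
    """
    k = len(rows[0])
    # Variance of each item across respondents.
    item_vars = [statistics.variance([row[j] for row in rows]) for j in range(k)]
    # Variance of the composite (sum of the items) across respondents.
    total_var = statistics.variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)


# Hypothetical responses: 5 respondents by 3 items.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Because alpha depends only on the ratio of the summed item variances to the composite variance, using n or n - 1 in the variance denominator gives the same result as long as the choice is consistent.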

Cronbach’s Alpha
The domain sampling model conceptualizes the items (testlets or subtests) as retests, so that the average correlation between these subtests is a measure of reliability
Cronbach’s alpha in the standardized form is:
$$\alpha = \frac{k \bar{r}_{ii'}}{1 + (k - 1) \bar{r}_{ii'}}$$
$\bar{r}_{ii'}$ is the average correlation among all pairs of items, and k is the number of items
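The standardized form can be sketched by averaging the correlations over all item pairs and plugging the result into the formula above. The responses are made-up values for five hypothetical respondents, and both helper names are illustrative.

```python
import itertools
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


def standardized_alpha(rows):
    """Standardized alpha from the average inter-item correlation."""
    k = len(rows[0])
    columns = [[row[j] for row in rows] for j in range(k)]
    pairs = list(itertools.combinations(range(k), 2))
    r_bar = sum(pearson_r(columns[i], columns[j]) for i, j in pairs) / len(pairs)
    return (k * r_bar) / (1.0 + (k - 1) * r_bar)


# Hypothetical responses: 5 respondents by 3 items.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]

std_alpha = standardized_alpha(responses)
print(round(std_alpha, 3))
```

For these made-up data the item variances are unequal, so the standardized value differs somewhat from a raw score alpha computed on the same responses.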

Cronbach’s Alpha
The standardized coefficient alpha is the alpha for the set of items after they have been standardized (converted to z-scores) and will be equal to or higher than the raw score version
Raw score alpha assumes the variances of the items are equal, and if they are not, the raw score estimate will be smaller than the standardized estimate
The two are usually similar, but when items are on very different scales (e.g., some 5-point and some 9-point scales), the difference may be larger

Cronbach’s Alpha
Composite scores calculated by the sum or mean tend to weight items with larger variances more heavily
Standardizing items before computing the composite will equally weight them, because the variances are all equal to 1
In most applications, researchers do not bother to standardize items, sometimes because the original metric is lost (e.g., the average of items on a 7-point scale is no longer between 1 and 7, but is a z-score value instead)

Cronbach’s Alpha
What is an acceptable alpha? Exceeding .70 is widely mentioned as a cutoff for acceptable reliability, but what is “acceptable” or “good” depends heavily on the consequences of using a measure with a certain level of reliability.
Many scales with an alpha of .70 can be improved, however, and this value has been grossly overapplied and overstated.

Cronbach’s Alpha
The .70 criterion is commonly attributed to Nunnally (1978), a highly regarded psychometrician, but using .70 as a standard was clearly not his intention:

“what a satisfactory level of reliability is depends on how a measure is being used. In the early stages of research . . . one saves time and energy by working with instruments that have only modest reliability, for which purpose reliabilities of .70 or higher will suffice. . . . In contrast to the standards in basic research, in many applied settings a reliability of .80 is not nearly high enough. In basic research, the concern is with the size of correlations and with the differences in means for different experimental treatments, for which purposes a reliability of .80 for the different measures is adequate. In many applied problems, a great deal hinges on the exact score made by a person on a test. . . . In such instances it is frightening to think that any measurement error is permitted. Even with a reliability of .90, the standard error of measurement is almost one-third as large as the standard deviation of the test scores. In those applied settings where important decisions are made with respect to specific test scores, a reliability of .90 is the minimum that should be tolerated, and a reliability of .95 should be considered the desirable standard.” (pp. 245-246)

Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.
Quote also given by: Lance, C. E., Butts, M. M., & Michels, L. C. (2006). The sources of four commonly reported cutoff criteria: What did they really say? Organizational Research Methods, 9(2), 202-220.
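Nunnally's remark about the standard error of measurement can be checked with the usual classical test theory expression, SEM = SD × sqrt(1 − reliability). That formula is not shown on the slides and is brought in here as standard background; the function name is illustrative.

```python
import math


def standard_error_of_measurement(sd, rxx):
    """Classical test theory SEM: the test SD times sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - rxx)


# Express the SEM as a fraction of the test standard deviation (sd = 1).
for rxx in (0.70, 0.80, 0.90, 0.95):
    print(rxx, round(standard_error_of_measurement(1.0, rxx), 3))
```

At a reliability of .90 the SEM is about .32 of a standard deviation, which matches the "almost one-third" in the quotation.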

Kuder-Richardson 20 (KR20)
The KR20 (Kuder & Richardson, 1937) is a special case of Cronbach’s alpha when the items are binary (e.g., yes/no or correct/incorrect)
It is equivalent to the raw score form of Cronbach’s alpha, so computation of α for a set of binary items will give the same result as the KR20
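The equivalence can be illustrated with a small sketch. The KR20 formula itself is not written out on the slide; the version used here is the standard one, k/(k − 1) × (1 − Σpq / s²), where p is the proportion scoring 1 on an item and q = 1 − p. Since pq is the variance of a binary item with n in the denominator, the composite variance is computed the same way. The data and function names are made up for illustration.

```python
import statistics


def kr20(rows):
    """KR20 for binary (0/1) items; rows holds one list per respondent."""
    n = len(rows)
    k = len(rows[0])
    # Proportion of respondents scoring 1 on each item.
    p = [sum(row[j] for row in rows) / n for j in range(k)]
    sum_pq = sum(pj * (1.0 - pj) for pj in p)
    total_var = statistics.pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1.0 - sum_pq / total_var)


def raw_alpha(rows):
    """Raw-score alpha using the same (population) variance denominator."""
    k = len(rows[0])
    item_vars = [statistics.pvariance([row[j] for row in rows]) for j in range(k)]
    total_var = statistics.pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)


# Hypothetical binary responses: 6 respondents by 3 items.
binary_responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]

print(round(kr20(binary_responses), 4), round(raw_alpha(binary_responses), 4))
```

Both functions print the same value, as the slide states for binary items.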

Cronbach’s Alpha: Some Properties
- Cronbach’s alpha is an estimate of internal reliability or consistency and does not necessarily indicate stability over time
- Alpha is a lower bound estimate of reliability, and actual reliability may be higher
- Alpha is equal to the average of the reliability estimates from all possible split halves
- Alpha assumes unidimensionality: if the measure really assesses more than one hypothetical construct (or factor), the estimate may be incorrect (lower than for each factor)

Cronbach’s Alpha: Some Properties
- A more heterogeneous group will have a higher alpha than a more homogeneous group, all other things equal
- Speeded tests may inflate alpha (Lord & Novick, 1968), related to the homogeneity phenomenon above
- Test length affects alpha: longer tests are more reliable
  Consider a single-item test vs. a multiple-item test
  Think about domain sampling: a larger sample of items should be a better estimate of the population of items

Cronbach’s Alpha: Some Properties
Spearman-Brown prophecy formula
$$R_{xx\text{-revised}} = \frac{n R_{xx\text{-original}}}{1 + (n - 1) R_{xx\text{-original}}}$$
n is the factor by which the test length is increased
If a 10-item test is increased to 20 items (with the same average inter-item correlation), n = 2, because the length is increased by a factor of 2

Cronbach’s Alpha: Some Properties
Spearman-Brown prophecy formula
If a 10-item test is increased to 20 items (with the same average inter-item correlation), n = 2, because the length is increased by a factor of 2. Assume the original reliability $R_{xx\text{-original}}$ is .6.
$$R_{xx\text{-revised}} = \frac{n R_{xx\text{-original}}}{1 + (n - 1) R_{xx\text{-original}}} = \frac{2(.6)}{1 + (2 - 1)(.6)} = \frac{1.2}{1.6} = .75$$
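The prophecy formula is simple to script, which makes it easy to try other lengthening factors. This is a minimal sketch; the function name `spearman_brown` is an illustrative choice.

```python
def spearman_brown(r_original, n):
    """Predicted reliability when test length changes by a factor of n."""
    return (n * r_original) / (1.0 + (n - 1.0) * r_original)


# Doubling a test with reliability .6, as in the example above.
doubled = spearman_brown(0.6, 2)
print(round(doubled, 2))

# A factor below 1 models shortening the test, e.g. halving it.
halved = spearman_brown(0.6, 0.5)
print(round(halved, 2))
```

The same function covers shortening a test, since n can be a fraction such as .5.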

Cronbach’s Alpha: Some Properties
Spearman-Brown prophecy formula (using average inter-item correlation)
$$R_{XX} = \frac{k \bar{r}_{ii'}}{1 + (k - 1) \bar{r}_{ii'}}$$
k is the number of items, and $\bar{r}_{ii'}$ is the average inter-item correlation

Cronbach’s Alpha: Some Properties
Spearman-Brown prophecy formula (using average inter-item correlation)
$$R_{XX} = \frac{k \bar{r}_{ii'}}{1 + (k - 1) \bar{r}_{ii'}} = \frac{5(.4)}{1 + (5 - 1)(.4)} = \frac{2}{2.6} = .77$$

Cronbach’s Alpha: Some Properties
Spearman-Brown prophecy formula (using average inter-item correlation)
$$R_{XX} = \frac{k \bar{r}_{ii'}}{1 + (k - 1) \bar{r}_{ii'}} = \frac{20(.4)}{1 + (20 - 1)(.4)} = \frac{8}{8.6} = .93$$
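The effect of test length can be tabulated by holding the average inter-item correlation fixed and varying the number of items. The sketch below reuses the .4 correlation from the two worked examples; the function name and the particular item counts in the loop are illustrative.

```python
def reliability_from_items(k, r_bar):
    """Reliability of a k-item scale with average inter-item correlation r_bar."""
    return (k * r_bar) / (1.0 + (k - 1) * r_bar)


# Same average inter-item correlation (.4), increasing numbers of items.
for k in (1, 2, 5, 10, 20):
    print(k, round(reliability_from_items(k, 0.4), 2))
```

The rows for 5 and 20 items reproduce the .77 and .93 from the worked examples, and the gains from adding items flatten out as the scale gets longer.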

Cronbach’s Alpha: Some Properties
[Figure from Furr & Bacharach (2014, p. 151)]
Newsom, Spring 2018, Psy 521/621 Univariate Quantitative Methods

Cronbach’s Alpha: Some Properties
- This does not indicate that alpha is “biased” by the number of items, but it may be difficult to reach acceptable reliability with short scales even if the inter-item correlation is fairly high
- Longer scales may still have high reliability even though some items are not so good
- It is a good idea to also look at the average inter-item correlation and item-total statistics because of the sensitivity to length
