Full Transcript

Reliability 01/18/24 1 Reliability Reliability= the consistency of test scores A measure’s ability to produce consistent results Unreliable if we cannot count on something, somebody to behave consistently The extent to which a variable is being measured without error. Error-free measurement is impossible! 01/18/24 2 Sources of Consistency and Inconsistency (Why do test scores vary at all?) Lasting and general characteristics of the individual. Some people do consistently better than others because they are just good at that task. E.g. Spelling ability Lasting but specific characteristics of the individual. Some people who are generally poor might nevertheless know how to spell many of the particular words included in the test. 01/18/24 3 Sources of Consistency and Inconsistency (Why do test scores vary at all?) Temporary but general characteristics of the individual. internal distractions (e.g., fatigue)  A person who is ill or tired might do poorly. Temporary and specific characteristics of the individual.  The test might contain words like Baltimore, Seattle. A child who took the test shortly after looking at sports news might have a temporary advantage. 01/18/24 4 Sources of Consistency and Inconsistency (Why do test scores vary at all?) Testing situation. external distractions (e.g., noises)  Getting lower scores in a noisy, poorly lit class Chance factors  Luck Momentary distraction 01/18/24 5 General Model of Reliability The extent to which individual differences in test scores are attributable to “true” differences or to chance errors. Holds that every score has two components: True score that reflects the examinee’s true skills, abilities, knowledge, etc.  A combination of all factors result in consistency in measurement. Error score 01/18/24 6 General Model of Reliability 01/18/24 7 General Model of Reliability Error variance: due to random, unsystematic factors. 
Any condition that is irrelevant to the purpose of the test represents error variance.

General Model of Reliability: The Reliability Coefficient
The ratio of true score variance to the total variance of test scores:
$r_{xx} = \sigma_T^2 / \sigma_X^2 = \sigma_T^2 / (\sigma_T^2 + \sigma_e^2)$
$r_{xx} = .90$ indicates that 90% of the score variance is due to true score variance.

Reliability Estimates
Estimates of true score variance. There are 3 types we will discuss:
1. Test-retest
2. Alternate forms
3. Internal consistency (split-half, alpha reliability)

Test-retest
The tendency of a test to yield relatively similar scores for the same individual over time.
Administer a test at two different points in time to the same group of people.
Calculate the correlation between these scores (Pearson r between Time 1 score and Time 2 score).
High correlation → temporal stability.
Problem 1: practice effects.
Problem 2: some behaviors may fluctuate daily.
Problem 3: testing can be time consuming and expensive.

Alternate Forms
Make up 2 equivalent forms of a test.
Give both to 1 group of people, at one time.
Compute the correlation between Form A and Form B of the test (consistency between parallel tests).
Practice effects are no longer a problem, since the forms have different items.
Problem 1: order effects.
Problem 2: it is difficult to develop several alternate forms.

Internal Consistency
How much the individual items in a test “go together”, or whether they are all measuring the same thing.
Based on the intercorrelations of the items and the number of items.
Two types of internal consistency:
1. Split-half
2. Alpha reliability

Split-Half
Make a very large test and administer it to one group of people.
Divide the items into 2 smaller tests. The two halves must be as similar as possible.
Compute a correlation between the two halves.
Only one test administration; practice effects are minimized.
High correlation → internal consistency.
The simplest method is the odd-even split.

Alpha reliability
Coefficient alpha: represents the mean reliability coefficient one would obtain from all possible split-halves.
$r_{xx} = \dfrac{k \bar{r}_{ij}}{1 + (k - 1) \bar{r}_{ij}}$
k = number of items (k is usually > 1); $\bar{r}_{ij}$ = average intercorrelation among the items.
First, administer the test to a group of people.
Then, compute the correlations among all items and compute the average of those intercorrelations.
Lastly, use the formula to estimate reliability.
If there is more than one dimension, report an estimate of internal consistency for each homogeneous subtest or factor. Example: a hypothetical test for accountants that contained subtests for calculation skills and use of a spreadsheet.
Source: From Personality Assessment Inventory by L. C. Morey. Copyright © 1991. Published by Psychological Assessment Resources (PAR).

Alpha value        Judgment
α ≥ 0.9            Excellent (high-stakes testing)
0.7 ≤ α < 0.9      Good (low-stakes testing)
0.6 ≤ α < 0.7      Acceptable
0.5 ≤ α < 0.6      Poor
α < 0.5            Unacceptable

Split-Half vs. Alpha reliability
The difference between them is in the unit of analysis.
Split-half compares one half of the test with the other half.
Alpha reliability compares each item with every other item.

Scorer Reliability and Agreement
Scorer reliability: the amount of consistency among scorers’ judgments.
Examiner variance in clinical instruments.
Scorer variance in projective techniques.
Do not deal with factors which we actually measure.
Disturbing factors which can be controlled experimentally.
One might be interested in the stability over time, rather than in the stability of scores obtained by different psychologists.

The Interpretation of Reliability Coefficients
What is an acceptable level of reliability? .90? .70?
Depends on the purpose of measurement.
Applied work generally requires more.
For theoretical research, .70 is usually OK.

Factors Affecting Reliability
The test itself (e.g., sample of items, test length, typos, item quality, scoring difficulty)
Administration conditions (vary across sessions, distractions)
The scoring process (scorers make mistakes, scoring equipment faulty, scorers are inconsistent, scorers are biased)
Test takers (motivation, fatigue)
Test developers (poor definition of domain, biased coverage)
Test-retest interval

Take-Home Message!
To improve reliability, reduce the proportion of random error.
To improve reliability: eliminate weak items, add good ones, adjust difficulty, increase administration time, and standardize better.
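
The internal-consistency estimates described in this transcript can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original slides: the function names (pearson_r, alpha_from_mean_r, standardized_alpha, odd_even_split_half) and the example scores are assumptions made up here. It uses the coefficient alpha formula from the transcript, k times the average intercorrelation divided by 1 + (k - 1) times the average intercorrelation, and a plain Pearson r for the odd-even split.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists.

    Raises ZeroDivisionError if either list has no variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def alpha_from_mean_r(k, mean_r):
    """Alpha from k items and their average intercorrelation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


def standardized_alpha(item_scores):
    """item_scores: one list per item, holding every examinee's score."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    # Correlate each item with every other item, then average.
    pairs = list(combinations(item_scores, 2))
    mean_r = sum(pearson_r(a, b) for a, b in pairs) / len(pairs)
    # Plug the average intercorrelation into the alpha formula.
    return alpha_from_mean_r(k, mean_r)


def odd_even_split_half(item_scores):
    """Correlate each examinee's odd-item total with the even-item total."""
    odd_items = item_scores[0::2]   # items 1, 3, 5, ...
    even_items = item_scores[1::2]  # items 2, 4, 6, ...
    odd_totals = [sum(scores) for scores in zip(*odd_items)]
    even_totals = [sum(scores) for scores in zip(*even_items)]
    return pearson_r(odd_totals, even_totals)


# Hypothetical data: 4 items answered by 5 examinees (invented for illustration).
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
    [2, 1, 4, 4, 5],
]
print(round(standardized_alpha(items), 3))
print(round(odd_even_split_half(items), 3))
print(round(alpha_from_mean_r(10, 0.30), 3))  # 10 items, average r = .30 -> 0.811
```

The last line shows the effect of test length: with an average intercorrelation of only .30, ten items already give an estimate of about .81, which fits the statement that internal consistency depends on both the intercorrelations of the items and the number of items.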
