Week 4 lecture slides.pptx
PSYCHOMETRIC PROPERTIES OF TESTS: RELIABILITY
PSY61204 Psychological Tests and Measurements
Dr Michele Anne

Overview
- What is reliability
- Types of reliability: test-retest reliability, parallel forms reliability, interrater reliability, internal consistency
- Errors

What is reliability?
- Reliability is the consistency of measures or scores obtained from psychological tests.

Types of reliability
- Test-retest reliability
- Parallel forms reliability
- Interrater reliability
- Internal consistency

Test-retest reliability
- A measure of the stability of scores over time.
- Involves administering a test to a group of individuals and retesting them after an interval.
- The scores collected at the different time points are compared for consistency.
- Consistency is measured via a correlation coefficient (Pearson or Spearman, depending on normality); a worked sketch appears after the interrater section below.
- A correlation coefficient of .70 and above is conventionally considered reliable.

Activity 1
- Go to https://socrative.com (Student login). Room name: Anne4991

Factors affecting test-retest reliability
- Interval: shorter time intervals yield more consistent scores (due to memory) than longer intervals, during which significant changes can occur.
- Motivation: the test becomes boring when taken a second time; participants may answer carelessly or randomly, increasing standard errors.
- Practice effect: for a skill-based test, a second sitting may improve performance because participants have had the opportunity to practice; for a cognitive test, they may remember their previous responses.
- Variability / malleability of the variable measured: some variables are state-based and less stable over time, so differences in scores may reflect poor stability of the variable rather than of the tool.

Parallel forms reliability
- Also known as alternate forms reliability.
- A measure of the degree to which changing the form of the questionnaire changes the responses.
- Involves developing two forms of the same test and administering them at two different times (with counterbalancing).
- The two forms should be equivalent in instructions, number of items, etc., but the actual items are different.
- The correlation between scores on the two forms is then computed (the same calculation as for test-retest; see the sketch below).

Factors affecting parallel forms reliability
- Item sampling: differences in items, although measuring the same area, may yield different results.
- Temporal aspects: reliability is affected by the time interval between administrations.
- Difficulty developing the second form: it is time consuming (the second form must be as valid and reliable as the first, which itself requires testing), and for some variables it is difficult to find enough items for a second form.

Interrater reliability
- Measures the agreement between two raters or observers.
- Most often used for subjectively scored measures.
- Raters have to be trained and familiar with the tool and its scoring.
- The percentage of agreement between the raters is computed.
- The kappa coefficient corrects this observed agreement for agreement expected by chance: kappa = (observed agreement - chance agreement) / (1 - chance agreement). A kappa sketch follows the guide below.

Activity 2
- Go to https://socrative.com (Student login). Room name: Anne4991

Factors affecting interrater reliability
- Expertise of raters: if raters are unfamiliar with the subject matter, their knowledge may differ, and so will their ratings.
- Subjective perception: raters' world views, perceptions, and life experiences will affect how they rate.
- Human factors: fatigue, mood, stress, etc.

Kappa coefficient guide
- (Interpretation table shown on the slide; not reproduced in this transcript.)
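The correlation computation referenced under test-retest and parallel forms reliability can be illustrated with a minimal Python sketch. The score arrays below are hypothetical (not from the lecture), and the Shapiro-Wilk test is just one rough way to decide between Pearson and Spearman:

```python
# Minimal sketch: test-retest (or parallel forms) reliability as a correlation.
# time1 / time2 are made-up scores, not data from the lecture.
import numpy as np
from scipy import stats

time1 = np.array([12, 15, 11, 18, 14, 16, 13, 17, 10, 15])  # first administration
time2 = np.array([13, 14, 12, 17, 15, 16, 12, 18, 11, 14])  # retest after an interval

# Choose Pearson or Spearman depending on normality (Shapiro-Wilk as a rough check).
normal = all(stats.shapiro(x).pvalue > .05 for x in (time1, time2))
if normal:
    r, _ = stats.pearsonr(time1, time2)
else:
    r, _ = stats.spearmanr(time1, time2)
print(f"reliability coefficient r = {r:.2f} (>= .70 is the conventional cut-off)")
```

For parallel forms reliability, the same correlation is computed with the two forms' total scores in place of time1 and time2.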
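For interrater reliability, percent agreement and the kappa coefficient can be computed directly from the definition given above. A sketch with invented ratings from two raters:

```python
# Sketch: percent agreement and Cohen's kappa for two raters (made-up ratings).
import numpy as np

rater_a = np.array(["yes", "no", "yes", "yes", "no", "yes", "no", "no", "yes", "yes"])
rater_b = np.array(["yes", "no", "yes", "no", "no", "yes", "no", "yes", "yes", "yes"])

p_o = np.mean(rater_a == rater_b)  # observed (percent) agreement
# Chance agreement: sum over categories of P(rater A gives c) * P(rater B gives c).
labels = np.union1d(rater_a, rater_b)
p_e = sum(np.mean(rater_a == c) * np.mean(rater_b == c) for c in labels)

kappa = (p_o - p_e) / (1 - p_e)
print(f"percent agreement = {p_o:.2f}, kappa = {kappa:.2f}")
```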
Internal consistency
- Also known as split-half reliability.
- A measure of the degree to which the items measure a single construct.
- Divide the items into halves (e.g., even-numbered and odd-numbered items, OR first half and last half).
- Calculate scores for the two halves and correlate them, to show that the items are internally consistent with the overall measurement (a split-half sketch is appended at the end of this transcript).
- More items (in each half) produce better reliability.

Cronbach's alpha
- A value which helps to determine whether the items measure the same construct: if a response is high on one item, it should also be high on the other items.
- A Cronbach's alpha of .80 and above indicates good internal consistency.
- The value of Cronbach's alpha increases as the number of items increases (a calculation sketch is appended at the end of this transcript).

Cronbach's alpha guide
- (Interpretation table shown on the slide; not reproduced in this transcript.)

Factors affecting internal consistency
- Item sampling: the first-half items are not the same as the second-half items; items have to be homogeneous.
- Method of splitting: splitting into first and last halves (e.g., first 50 and last 50 items) may be affected by fatigue and poor attention on the latter half; it may also be affected by item difficulty, for scales whose items increase in difficulty.
- Subscales: each subscale may measure a different area and may contain items that do not correlate with items from other subscales.

Errors

Standard error of measurement
- Estimates how well repeated measures using the same psychological test reflect the "true" score.
- Error arises from "noise", environmental / external factors, or variability within the individual.
- The smaller the error, the better the obtained scores reflect the variable measured.
- Calculating it requires the standard deviation of the scores and the reliability coefficient from test-retest reliability:
  SEM = SD × √(1 − reliability coefficient)
- A smaller SD and a larger reliability coefficient lead to a smaller SEM (a worked sketch is appended at the end of this transcript).

Questions?
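Appended sketches. First, the split-half procedure from the internal consistency section, on a made-up respondents-by-items matrix. The Spearman-Brown step at the end is a standard correction for the halved test length; the slides do not name it, so treat it as an optional extra:

```python
# Sketch: split-half reliability (odd- vs even-numbered items) on hypothetical data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
trait = rng.normal(0, 1, size=(30, 1))           # latent score per respondent
items = trait + rng.normal(0, 1, size=(30, 10))  # 10 noisy indicators of that trait

odd_half = items[:, 0::2].sum(axis=1)   # items 1, 3, 5, ...
even_half = items[:, 1::2].sum(axis=1)  # items 2, 4, 6, ...
r_half, _ = stats.pearsonr(odd_half, even_half)

# Spearman-Brown correction: estimates the reliability of the full-length test
# from the correlation between the two half-length tests.
r_full = 2 * r_half / (1 + r_half)
print(f"half-test r = {r_half:.2f}, corrected full-test r = {r_full:.2f}")
```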
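Cronbach's alpha can likewise be computed from its textbook definition, alpha = k/(k-1) × (1 - sum of item variances / variance of the total score). A self-contained sketch with the same kind of hypothetical data:

```python
# Sketch: Cronbach's alpha from its definition (hypothetical data).
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: 2-D array, rows = respondents, columns = items."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the total score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(0)
trait = rng.normal(0, 1, size=(30, 1))
items = trait + rng.normal(0, 1, size=(30, 10))
print(f"alpha = {cronbach_alpha(items):.2f} (>= .80 suggests good internal consistency)")
```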
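Finally, the SEM formula from the Errors section with illustrative numbers. SD = 15 and r = .91 are made up (they mimic an IQ-style scale) and chosen so the arithmetic is easy to follow:

```python
# Sketch: standard error of measurement, SEM = SD * sqrt(1 - r).
import math

def standard_error_of_measurement(sd: float, reliability: float) -> float:
    return sd * math.sqrt(1 - reliability)

# Illustrative numbers: SD = 15, test-retest r = .91  ->  SEM = 15 * sqrt(.09) = 4.5
print(f"SEM = {standard_error_of_measurement(15, .91):.1f}")
```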