2024 Human Mental Abilities Lecture 4
University of Sydney
Kelly Dann

Summary
This lecture covers the key concepts of human mental abilities, including theory development, test evaluation, reliability, and validity.

Full Transcript
PSYC1002 HUMAN MENTAL ABILITIES
Lecture 4
Kelly Dann ([email protected])

Overview
▪ Theory development
   – Single factor vs primary mental abilities
   – Gf-Gc Theory
▪ Test Evaluation
   – Classical Test Theory
   – Reliability
   – Validity

What You Need to Know
▪ Describe the difference between fluid and crystallized intelligence
▪ Describe evidence that these are different types of intelligence
▪ Describe how to separate true score from error
▪ Explain how reliability is measured and describe different types of reliability
▪ Explain how validity is measured and describe different aspects of validity

Theory development
▪ Single Factor ‘g’ (Charles Spearman, 1927)
   – Positive manifold: scores on all cognitive tests correlate positively with one another
   – Good predictor of performance in real life
   – Related to mental speed? Related to working memory capacity? Related to a “good quality” brain?
▪ Primary Mental Abilities (Thurstone, 1938)
   – 7 separate areas of mental ability revealed on tests:
      o Verbal Comprehension
      o Inductive Reasoning
      o Numerical Fluency
      o Word Fluency
      o Spatial Ability
      o Memory
      o Perceptual Speed
   – Relative levels differ among individuals
   – They can be impaired in isolation after brain damage

Hierarchical models of intelligence
General Intelligence (g) sits at the top, with specific abilities below it: inductive reasoning, verbal comprehension, numerical fluency, spatial ability, etc.

Gf-Gc Theory: Raymond Cattell (1941)
Probably one of the most common theories in current use.
▪ General Fluid Intelligence (Gf)
   – the ability to grasp relations between things; to deal with novelty
   – non-verbal abilities, inductive and deductive reasoning
   – culture-free in theory, but not in practice (assessment?)
▪ General Crystallized Intelligence (Gc)
   – acquired knowledge and skills
   – requires exposure to culture and formal/informal education
   – may require some investment of fluid intelligence
Performance on a single task can (and is likely to) require both.
Gf-Gc Theory: Raymond Cattell (1941), continued
1. General Fluid Intelligence (Gf)
   First-order abilities (multiple tasks/tests are used to measure these):
   – induction
   – sequential reasoning
   – quantitative reasoning
   – temporal tracking
   – figural reasoning
2. General Crystallized Intelligence (Gc)
   First-order abilities (multiple tasks/tests are used to measure these):
   – verbal comprehension
   – cognition of semantic relations
   – general information
   – reading comprehension
   – spelling ability
   – verbal closure
   – phonetic coding
   – foreign language aptitude

Gf-Gc Theory: Cattell
Although conceptually different, Gf and Gc correlate to varying extents.
The strongest evidence that they are different constructs is that they show different developmental trends:
   – Fluid intelligence rises to young adulthood, then falls off in old age
   – Crystallized intelligence rises and plateaus, roughly speaking
[Figure: ability plotted against age (roughly 10 to 70 years), with Gc rising and then plateauing while Gf rises and then declines]

Hierarchical models of intelligence
General Intelligence (g) at the top, splitting into Fluid Intelligence (knowledge-independent) and Crystallised Intelligence (knowledge-based), which in turn split into specific abilities such as inductive reasoning, spatial ability, arithmetic fluency, and verbal comprehension.

Step 5: How do we assess our test?

Goal of Psychological Assessment
GOOD psychological assessment depends on:
   – how well we can measure the ability or trait of interest
      o Is our test accurate? → Reliability
      o Is our test measuring what we think it’s measuring? → Validity
   – whether the answer is used in an appropriate way (is our use of the test valid?) → Validity
❖ We assume that the traits or states we want to measure exist, can be quantified, and can be measured. (Not everyone agrees!)

Reliability
If a test measures a consistent trait in a person, then it should consistently produce the same answer.
   o It should not really be affected by random fluctuations.
If a test is reliable, it should be able to distinguish between people who differ on the construct.
   o Because any variation in performance is then due to true differences in that ability, not random error.

Classical Test Theory
Any observed score has two components:
   – the True Score (the real level of ability), and
   – an Error component (random variance)
X = T + E
Sources of error:
1. Test Construction: e.g., choice of items/stimuli; content of the test
2. Test Administration: e.g., variability in the examiner; variability in the examinee
3. Errors in Scoring: e.g., failure to use the ‘rubric’ consistently
4. Interpretation Subjectivity: e.g., evaluation of responses

True Score Theory / Classical Test Theory (X = T + E)
   – OBSERVED SCORE (X): the actual measurement; consists of True score plus Error
   – TRUE SCORE (T): the ideal measurement, what we strive for; is CONSTANT for an individual
   – ERROR (E): errors in measurement; is RANDOM, unrelated to the true score (the “real” score); cannot be eliminated completely
Remember: this is a model or theory, and we can never be certain what the True score is (after all, that is what we are trying to estimate with our test)!

Estimating the True Score
We do not know the True score; we try to estimate it by taking multiple measurements.

Observation       Observed (X)   True (T)   Error (E)
obs. 1            14.0           15.0       -1.0
obs. 2            16.0           15.0        1.0
obs. 3            15.0           15.0        0.0
obs. 4            12.0           15.0       -3.0
obs. 5            17.0           15.0        2.0
obs. 6            12.0           15.0       -3.0
obs. 7            16.0           15.0        1.0
obs. 8            17.0           15.0        2.0
obs. 9            19.0           15.0        4.0
obs. 10           15.0           15.0        0.0
obs. 11           11.0           15.0       -4.0
Sample Mean       14.9           15.0       -0.1
…                 …              …           …
obs. …            13.0           15.0       -2.0
Long Term Mean    15.0           15.0        0.0

The average of a sample of observations will approximate the True score. If we could repeat the measurement indefinitely, the long-term mean would equal the True score.

A reliable test minimises the variance due to random error. A reliable test consistently hits the bull’s eye.

How do we estimate reliability?
1. Test-Retest Reliability
2. Equivalent Forms Reliability
3. Cronbach’s Alpha
And there are others …

1. Test-Retest Reliability
The same group of people is measured twice on the same test. The reliability of the test can be estimated as the correlation between repeated administrations of the same test.
Some issues:
   – In theory, changes in X from time 1 to time 2 are due to measurement error.
   – We assume that the True score doesn’t change. But is that true?

Test-Retest Reliability: Issues
Example: Picture Completion (WAIS) – a set of colour pictures of common objects and settings, each of which is missing an important part that the examinee must identify.
a) Carry-over effects
   – Examinees remember their original responses (particularly when the retest interval is short), which can over-estimate reliability.
b) Change in the True Score
   – Maturation: e.g., spelling ability might improve over time; reasoning gets more sophisticated.
   – Reactivity to the test.
If the change is not systematic (i.e., different for different people) → low test-retest reliability.

Other ways of estimating reliability
2. Equivalent (alternative) Forms
   – Measure the same phenomenon using two different forms of the test. The correlation between Form 1 and Form 2 is the reliability of the test.
   – Or compare two halves of the same test (split-half correlation).
3. Internal consistency: Cronbach’s alpha (α)
   – If every possible split-half correlation were computed, their average would be Cronbach’s alpha.
   – Cronbach’s α reflects the extent to which all items measure the same thing.
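The ideas above (X = T + E, test-retest correlation, and Cronbach's α) can be sketched numerically. The following is a minimal simulation, not anything from the lecture: the sample size, error standard deviation, and helper names (`pearson`, `administer`) are all illustrative assumptions. Cronbach's α is computed with the standard item-variance formula, α = k/(k−1) · (1 − Σ item variances / variance of total score).

```python
import random
import statistics

random.seed(42)

def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score lists (one list per item).

    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    """
    k = len(items)
    total = [sum(person) for person in zip(*items)]  # each person's total score
    item_var_sum = sum(statistics.variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var_sum / statistics.variance(total))

# Classical Test Theory: every observed item score is X = T + E.
# Simulate 200 people, each taking a 10-item test twice.
n_people, n_items = 200, 10
true_scores = [random.gauss(15, 3) for _ in range(n_people)]

def administer(true_scores, n_items, error_sd=2.0):
    """Return item-level scores: result[i][p] = person p's score on item i."""
    return [[t + random.gauss(0, error_sd) for t in true_scores]
            for _ in range(n_items)]

time1 = administer(true_scores, n_items)
time2 = administer(true_scores, n_items)
total1 = [sum(person) for person in zip(*time1)]
total2 = [sum(person) for person in zip(*time2)]

# Both estimates come out high but below 1: error can never be eliminated.
print(f"Test-retest reliability: {pearson(total1, total2):.2f}")
print(f"Cronbach's alpha (time 1): {cronbach_alpha(time1):.2f}")
```

Because every item shares the same true score but gets fresh random error, the items correlate with each other and the two administrations correlate with each other, which is exactly what both reliability estimates capture.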
Systematic (non-random) Error Variance
Examples:
   – A set of scales consistently gives readings 1 kg too light.
   – An assessor consistently gives an extra (unwarranted) mark on an assignment.
You still hit the bull’s eye, but it’s the wrong eye!
   – Systematic error will not decrease the estimated reliability.
   – But it will reduce the validity of the test: we are not measuring what we think we are measuring.

Validity
Two aspects of validity:
1. Is our test measuring what we think it is measuring?
   o E.g., does it actually measure intelligence?
2. Is the test used appropriately, for its intended use?
   o E.g., if the test was developed for adults, is it being used to test children?

Content Validity: Coverage of the “Domain”
Does the test assess behaviour that is representative of the domain of behaviour we want to measure? This is typically discussed in educational and achievement testing situations, but it is also relevant in other areas.
Define the boundaries and structure of the domain:
   – Boundary: what is considered part of the domain and what is not.
   – Structure: test content reflects the structure of the domain.

Construct Validity
How well defined is the construct measured by this test?
▪ Convergent Validity
   – Is the construct related to other theoretically similar constructs/tests?
   – Expect a high correlation with similar constructs/tests (e.g., Abstract Reasoning with Verbal Reasoning or General Knowledge).
▪ Discriminant Validity
   – Is the construct independent of other, unrelated, psychological constructs?
   – Expect a low correlation with unrelated constructs (e.g., Abstract Reasoning with Personality).

Reliability and validity in a nutshell
[Figure]

Summary: How do we measure intelligence?
Step 1: Work out what it is we want to measure (definitions: theoretical and pragmatic)
Step 2: Work out what it looks like (signs, manifest variables)
Step 3: Devise tests and scores
Step 4: Work out how it is structured (e.g., single factor vs. multifactor)
Step 5: Assess whether our test is valid and reliable
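Returning to the systematic-error point above: a constant bias leaves correlation-based reliability untouched (correlations are unchanged when a constant is added to every score) but shifts every score away from the true value, which is a validity problem. A minimal sketch; the bias size, sample size, and helper names (`measure`, `pearson`) are illustrative assumptions, not from the lecture.

```python
import random
import statistics

random.seed(1)

def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

true_scores = [random.gauss(50, 10) for _ in range(300)]

def measure(bias=0.0, error_sd=4.0):
    # Classical Test Theory with an optional constant (systematic) bias:
    # X = T + bias + E
    return [t + bias + random.gauss(0, error_sd) for t in true_scores]

fair_1, fair_2 = measure(), measure()
biased_1, biased_2 = measure(bias=5.0), measure(bias=5.0)

# Test-retest reliability is essentially identical with or without the bias:
print(f"reliability, unbiased: {pearson(fair_1, fair_2):.2f}")
print(f"reliability, biased:   {pearson(biased_1, biased_2):.2f}")

# But the biased test no longer measures the true level: every score is
# about 5 points too high, so the test is reliable yet less valid.
print(f"mean error, unbiased: {statistics.mean(f - t for f, t in zip(fair_1, true_scores)):.2f}")
print(f"mean error, biased:   {statistics.mean(b - t for b, t in zip(biased_1, true_scores)):.2f}")
```

This is the "wrong bull's eye" from the slide: the darts still cluster tightly (high reliability), just around the wrong target (low validity).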