Summary

This document provides an overview of cognitive ability assessment in educational contexts. It covers different types of tests, including tests of achievement, ability, and intelligence, as well as various theoretical perspectives on intelligence. The document also includes information about the Stanford-Binet intelligence scales and the Wechsler scales.

Full Transcript

Section Eight: Assessment in Education - Part One: Cognitive Ability OL
Psychology 4360 Psychological Assessment
Section 8: Assessment in Education Contexts: Cognitive Ability
(Text material from Hogan Ch. 8, Cognitive Abilities: Individual Tests; Ch. 9, Cognitive Abilities: Group Tests.) (Ignore US data, legislation.)

You are all experts in psychological assessment from the perspective of a learner in educational contexts. Use that experience to facilitate your understanding of the concepts in this section.

A. Let me first make some distinctions among types of tests common in education. Recognize that while the terms make it seem that these types of tests are distinctly different from each other, we’ll discover that their characteristics overlap considerably. It’s best to think of these key terms as representing a continuum from ability through to achievement. Here are some definitions which emphasize the differences among them, however.

1. Tests of Achievement: measure learning accomplishment in specific conceptual or skill areas, following a program of instruction in these areas.

2. Tests of Ability or Aptitude: estimate one’s capacity for learning, usually something specific. Think “mechanical” or “clerical” aptitude as examples. A newer conceptualization includes a general Cognitive Ability, tests for which are not necessarily distinguishable from tests of intelligence (item 3).

3. Tests of Intelligence: measure a more general ability to learn from experience, solve problems, and use knowledge to adapt to new situations, thereby reflecting the capacity for goal-directed adaptive behaviour.

And here is a figure which shows them along the continuum. See Figure 11.1 Hogan p. 308.

We can distinguish Intelligence from Ability or Aptitude by its being the most general among them. We’ll start at this most general end of the continuum to consider Intelligence.

B.
Intelligence

Let’s start with a quick history:

Video: Intelligence2 https://www.youtube.com/watch?v=pmg2NEL7390

What is intelligence?

1. Intelligence has been a focus of interest and study for a very long time. For example, Galton (1883) saw intelligence as composed of one’s sensory abilities, which might include speed of mental processing, perceptual speed, and reaction time. Some more recent investigators would say that purely physiological measures like blood glucose consumption and brain size are the marks of intelligence.

Alfred Binet said this: “Intelligence is the tendency to take and maintain a definite direction, the capacity to make adaptations for the purpose of attaining a desired end and the power of autocriticism.” What cognitive factors would these involve predominantly? (purposive behavioural orientation, ability to initiate and sustain, flexibility to respond to feedback, capacity to engage in critical evaluation of one’s strategies)

Wechsler (1958) defined intelligence as: “the aggregate/global capacity to act purposefully, think rationally, deal effectively with the environment”. And, he said, don’t ignore “non-intellective” factors like? (drive, persistence, values)

Here is a definition of intelligence from American writer F. Scott Fitzgerald: “The test of a first-rate intelligence is the ability to hold two opposing ideas in mind at the same time and still retain the ability to function. One should, for example, be able to see things are hopeless yet be determined to make them otherwise.” What additional cognitive and intellective factors might be implicated in this version of intelligence?

2. Factor analytic theories

Recall from Research Methods class that factor analysis is a statistical operation which identifies the degree to which scores on different measures correlate among themselves.
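To make the idea of scores "correlating among themselves" concrete, here is a minimal sketch that computes the Pearson correlation for each pair of tests. The test names and the six sets of scores are hypothetical, made up for illustration only, and a real factor analysis goes well beyond this first step of inspecting pairwise correlations.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for six test-takers on three ability tests.
scores = {
    "vocabulary": [12, 9, 14, 7, 10, 11],
    "block_design": [11, 8, 13, 9, 10, 12],
    "digit_span": [10, 9, 12, 8, 11, 10],
}

# Correlation for every pair of tests (the raw material of factor analysis).
correlations = {
    (a, b): pearson_r(scores[a], scores[b])
    for a, b in combinations(scores, 2)
}

for (a, b), r in correlations.items():
    print(f"{a} vs {b}: r = {r:.2f}")
```

With these made-up numbers every pair of tests correlates positively, which is the kind of pattern that factor analytic theories of intelligence set out to explain.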
How consistently would you find a high score on Test X and a high score on Test Y? Indeed, we find that scores on multiple tests of ability tend to be similar to a degree. Spearman (1927) saw this as evidence of a small general intelligence factor which he named “g”. It is indeed more common for someone testing high on one test of some ability or other to test high on others. There remains much of intelligence, though, which is not “g”, according to factor analysis.

Spearman regarded “g” as a “general mental ability factor which underlies all intelligent behaviour”. He thought that there were other specific abilities, which he labelled “s”, and error, which he labelled “e”, making up the remainder of what accounts for how people vary in their intelligence, as measured by intelligence tests. The Two-Factor theory of intelligence was born. Intelligence equals g + s + e.

Others, like Thurstone, were more convinced that there was a complex of “primary mental abilities” pointing not to “g” but to a collection of independent “intelligences”.

Cattell (1941, 1971) gave us the sub-components of g: Gc, or crystallized intelligence, represents well-learned acquired skills and knowledge, and Gf designates fluid intelligence: active mental processing like analysis and problem-solving ability which doesn’t depend on language and is not the result of specific instruction but typically involves solving novel problems.

Horn (1989), a colleague of Cattell, identified a number of further distinctions among intellectual abilities.

Carroll (1997) (Hogan text pp. 260, 261) offered his “three-stratum” theory of cognitive abilities. The top stratum is g. The second stratum comprises sub-abilities and processes which include Gf, Gc, memory and learning, visual/auditory perception, retrieval, and speed. The third stratum involves sub-sub-abilities and processes. See Figure 8.3 p. 210 of the text.
An integrative attempt has led to the Cattell-Horn-Carroll (CHC) model of cognitive abilities, which characterizes the underlying construct of intelligence in the most widely used intelligence tests.

3. Information processing models

Russian neuropsychologist Aleksandr Luria initiated the development of these models. He was less concerned with what information was processed than with how it was processed. He postulated:

a. Simultaneous/parallel processing, where information is synthesized and integrated all at one time (e.g., social impressions, artistic appreciation), and

b. Successive/sequential processing, where information is analyzed logically, step by step (e.g., in scientific investigation).

C. Intelligence Testing

1. The Stanford-Binet intelligence scales (Hogan text pp. 225-227)

At the turn of the 20th century, Alfred Binet and Theodore Simon were charged with designing a measure of intelligence which would identify those children in the Paris school system who were mentally “disabled”, in order to arrange remedial education for them. The test was based on two key concepts. First, as children become older, they become more capable relative to younger children. This is the principle of Age Differentiation. Second, since Binet considered intelligence to involve attention, judgment and reasoning, he chose tasks for this test which would require these particular cognitive abilities, but his concern was the sum product of these abilities: the principle of General Mental Ability (g).

a. The first Binet-Simon test, in 1905, comprised thirty items of progressively greater difficulty. The authors determined various cut-off scores, which they thought represented the capability of children who are of average or below-average intelligence for their age.

b.
The 1908 revision introduced “age scales”: items which children of a certain age should be able to manage correctly were grouped together, age level by age level. This gave the “mental level”, which became the “mental age” (items in the group were passed by 80-90% of children at that age). This provided a measure of g.

Here’s an example of one “basket” of small tests which a child aged from 2 years to 4 years, 6 months should be able to do most of correctly on the 1970 version of the test. I tested a young girl who was almost 4 years old, and you’ll see that my client was able to do all the small tests in the baskets for ages 3 and 4, missing just one of them at the 4 years, 6 months level.

The “mental age” of someone is the age level at which the child performed most of the basket of small tests correctly. For example, was this at a typically nine-year-old level? Then, regardless of the chronological age (CA), the person’s Mental Age (MA) was nine. Stern gave us the “intelligence quotient” or IQ, which was the MA divided by the CA.

Terman, of Stanford University, revised the Binet-Simon test in 1916, and it became the Stanford-Binet. He used a larger, but still non-representative, sample for standardization, and defined I.Q. = MA / CA x 100.

Here is my client’s S-B facesheet. You’ll see on the right side that we were in the period of overlap between the old S-B model of calculating a “ratio IQ” from Mental Age and Chronological Age, and the later “deviation IQ” approach of comparing her test scores against normative sample scores for interpretation of IQ.

For practice: Converting the years to months, what is this young girl’s “ratio IQ”? (59 divided by 48 x 100 = 123) Compare this to her “deviation IQ” as derived from the normative data in 1972.

c.
The test has been revised a number of times over the years: expanding the age range with which it could be used, adding non-verbal test items and re-norming the test. In the 1960 revision, the old ratio I.Q. was dropped in favour of a “deviation I.Q.” This I.Q. was a standard score with mean 100 and standard deviation 16 (now 15). Compare the ratio IQ to her “deviation IQ” as derived from the normative data in 1972. In 1972, non-whites were finally included in the standardization sample.

d. In the latest (fifth) edition, published in 2003, we now see this organization: See Table 8.8 p. 226 in our text. You can see here the influence of the CHC model of intelligence. What G factor measures do you see?

i/ Psychometric properties

Standardization: First, there is a larger and more representative sample of persons aged two to eighty-five. This is meant to be representative of the population of the U.S. and is stratified according to gender, age, ethnicity, region, and education.

Reliability: Measures of internal consistency, test-retest reliability (five to eight days) and inter-scorer reliability are satisfactory to strong, typically above .9. The factor score reliabilities are also at least .9.

Validity: Content-related evidence for validity was generated by expert analysts and through empirical item analysis. Criterion-related evidence, first of all concurrent validity evidence, was established by comparing the Stanford-Binet 5th edition with recent Wechsler scales. Predictive validity was established by comparing the Stanford-Binet 5 with subsequent scores on the Woodcock-Johnson achievement tests and the Wechsler Individual Achievement Test (WIAT). Construct validity was established through factor analytic studies.
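The two ways of expressing IQ described for the Stanford-Binet, the older ratio IQ and the later deviation IQ, can be sketched as follows. This is a minimal illustration only: the function names are made up here, and the normative mean and standard deviation passed to `deviation_iq` are hypothetical placeholders, since real deviation IQs come from the published norm tables for the test-taker's age group.

```python
def ratio_iq(mental_age_months, chronological_age_months):
    """Ratio IQ as defined by Terman: MA / CA x 100, with both ages in months."""
    return round(mental_age_months / chronological_age_months * 100)


def deviation_iq(raw_score, norm_mean, norm_sd, iq_mean=100, iq_sd=15):
    """Deviation IQ: a standard score locating a raw score within its age norms.

    norm_mean and norm_sd describe the normative sample for the person's
    age group (hypothetical values in the usage below).
    """
    z = (raw_score - norm_mean) / norm_sd  # distance from the norm mean in SD units
    return round(iq_mean + iq_sd * z)


# The practice example from the notes: MA = 59 months, CA = 48 months.
print(ratio_iq(59, 48))  # 123

# Hypothetical norms: a raw score one SD above the age-group mean.
print(deviation_iq(60, norm_mean=50, norm_sd=10))            # SD of 15
print(deviation_iq(60, norm_mean=50, norm_sd=10, iq_sd=16))  # older SD of 16
```

The ratio IQ depends only on the two ages, whereas the deviation IQ depends on how the person's score compares with others of the same age, which is why the two values for the same child need not agree.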
ii/ Test administration

The latest Stanford-Binet is still structured using an age scale type of format, with “baskets” of tasks with different content, grouped together on the basis of research which showed them as being at the same level of difficulty, in six increasingly difficult functional levels. “Routing measures” at the beginning of the test allow you to determine a suitable “start point” (a “basal” level): not too easy, not too hard for the test-taker. One stops at the “ceiling level” when a maximum number of failures occur. Like the Wechsler tests we’ll look at next, once you have the raw scores for a person on all 10 subtests, you go to tables for different age ranges and convert the raw scores to standard scores with a mean of 10 and s.d. of 3. The latest SB5 has a mean score of 100 and a standard deviation of 15 for the IQ (again, like the Wechsler tests).

iii/ Scoring and interpretation

The scoring manual is replete with instructions for standardized administration, scoring and interpretation. The most general level of measurement is the full-scale I.Q.:

MEASURED IQ RANGE    CATEGORY
145 - 160            very gifted or highly advanced
130 - 144            gifted or very advanced
120 - 129            superior
110 - 119            high average
90 - 109             average
80 - 89              low average
70 - 79              borderline impaired or delayed
55 - 69              mildly impaired or delayed
40 - 54              moderately impaired or delayed

2. The Wechsler tests: Wechsler Adult Intelligence Scale (Hogan text pp. 215-224)

a. The design of the first Wechsler test was in part a reaction to the limitations of the Stanford-Binet. It was characterized by adequate adult norms, a point scale format, and a non-verbal performance scale. In a point scale, items tapping the same cognitive domain are grouped together in graduated degrees of difficulty.

b.
Until the latest WAIS-IV, the verbal versus the performance portions of the test were scored separately and then combined to generate a full-scale I.Q. From the WAIS-III onward, factor scores were also available for these factors: Verbal Comprehension (VCI) (estimating Cattell’s Gc), Perceptual Reasoning (PRI) (estimating Cattell’s Gf), Working Memory (WMI), and Processing Speed (PSI). The individual sub-tests are shown in Table 8.5 p. 217 of your text. The factor or index scores are generated from the subtests shown in Table 8.4 on the same page.

c. See the WAIS-IV facesheet on p. 222 of the text for an overview of how scores are organized. From this information you should be able to identify which underlying cognitive ability each of the Composite score Indexes labels: VCI is obvious. PRI only taps Visual Perceptual Reasoning (think problem-solving). WMI refers to working memory (from simple attention span to more complex attentional function where you must hold information in awareness while manipulating it in some way). PSI is the speed with which you can process visual information.

d. Psychometric properties

Standardization: The WAIS-IV has been standardized on a sample in excess of 2000 persons in the U.S.A., using a stratified sampling procedure yielding what is regarded as a representative sample of the general population of that country. Canadian norms have also been generated and are more appropriately employed in a Canadian context.

Reliability: The split-half correlation coefficients were calculated for sub-tests except the timed/speed tests, which were evaluated for reliability using the test-retest format. The overall average reliability coefficients range from .90 to .97 for factor scores and for Full Scale I.Q.
Validity: Concurrent convergent validity with other intelligence tests was established, with average validity coefficients ranging from .7 into the .9s. Here is an example which shows the correlation between the WAIS-IV and the WAIS-III factor (composite) scores and Full Scale IQ (FSIQ). (Look at r12)

Question: If we seek concurrent/discriminant evidence of the WAIS-IV validity, which WAIS factors should correlate with a measure of AD/HD? Should that correlation be direct or inverse? (answer: WMI, inverse). What do we see in this table? Between which composite scores of the Brown AD/HD test and the WAIS-IV are there the largest inverse correlations? (WMI and Effort, Memory and Total Brown scores)

Question: If we were to use the contrasting groups method of seeking validity evidence, what should we find comparing the intellectually gifted and a control group as far as their FSIQs are concerned? What do we see?

e. Test administration

f. Scoring and interpretation

See again the WAIS-IV facesheet on p. 222 of the text. The first level of analysis following scoring the sub-tests is to find the scaled score equivalent to the raw score. Using the manual tables, you locate the person’s age group, which gives you these equivalent scaled scores. The normally-distributed standard scores for any sub-test have a mean of 10 and standard deviation of 3. The summed scaled scores are combined in the patterns you see here, to generate the Full-Scale I.Q. and factor scores. These normally-distributed I.Q. and Index scores have a mean of 100 and standard deviation of 15.

Note on the facesheet that the various composite scores are given a percentile rank based on the theoretical normal distribution you’ve already studied.
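As a rough sketch of how a facesheet's percentile ranks and confidence intervals relate to the score scales just described, the following uses the theoretical normal distribution and the standard error of measurement, SEM = SD x sqrt(1 - reliability). The function names are made up for illustration, the reliability of 0.96 is an assumed value chosen from within the reported .90 to .97 range, and the interval is the simple textbook version centred on the obtained score, which a test manual may compute differently.

```python
from math import sqrt
from statistics import NormalDist


def percentile_rank(score, mean=100.0, sd=15.0):
    """Percent of the theoretical normal distribution at or below `score`."""
    return 100.0 * NormalDist(mu=mean, sigma=sd).cdf(score)


def standard_error_of_measurement(sd, reliability):
    """Textbook SEM: SD x sqrt(1 - reliability coefficient)."""
    return sd * sqrt(1.0 - reliability)


def confidence_interval(score, sd=15.0, reliability=0.96, level=0.95):
    """Symmetric interval around the obtained score (assumed reliability)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    margin = z * standard_error_of_measurement(sd, reliability)
    return (score - margin, score + margin)


# A composite score of 115 (one SD above the mean of 100).
print(round(percentile_rank(115)))

# A sub-test scaled score of 13 on the mean-10, SD-3 scale.
print(round(percentile_rank(13, mean=10, sd=3)))

# 95% interval around an obtained composite score of 110.
low, high = confidence_interval(110)
print(f"{low:.1f} to {high:.1f}")
```

Note that a scaled score of 13 and a composite score of 115 sit at the same percentile rank, because each is one standard deviation above the mean of its own scale.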
Note also the Confidence Intervals around the obtained scores, based on the tests’ reliability coefficients and standard error of measurement (SEM). Familiar territory for you, eh?! The full-scale I.Q. scores are interpreted according to the following classification:

3. Wechsler Intelligence Scale for Children (text pp. 223, 224)

For those of you who expect to work with children, especially in school contexts, you should acquaint yourselves with the latest WISC. There is a Canadian version which was published in 2014 and is normed on a representative Canadian sample of young people aged six years to sixteen years, eleven months. As with the WAIS-IV, it generates FSIQ, VCI, WMI and PSI. The PRI index is now split into two: Visual Spatial and Fluid Reasoning. It has 21 subtests (up from 13 on the WISC-IV), including other subtest, process, and index scores intended for additional clinical uses.

4. Wechsler Pre-School and Primary Scale of Intelligence - IV (WPPSI-IV)

The latest Canadian edition was published in 2012 and allows testing of children from 2 years, 6 months to 7 years, 7 months. It generates the same IQ and Index scores as the WISC-V plus five additional scale scores tapping different aspects of cognitive ability.

5. Wechsler Abbreviated Scale of Intelligence II (WASI-II)

Note that this short form of the WAIS is suitable for screening only. It includes a two sub-test version (block design and vocabulary) and a four sub-test version (block design plus vocabulary plus similarities plus matrix reasoning).

You’re going to see when we get to the Psychoeducational batteries, especially, that comprehensive tests of cognitive ability look ever so much like the Intelligence tests we’ve been considering. There is indeed a confluence among all of these tests of general cognitive ability as to what comprises this construct.
You see how the Wechsler scales and the Stanford-Binet have become more and more similar, and you’ll recognize the same components in the Woodcock-Johnson.

D. Aptitude/Ability tests

Aptitude or Ability tests can be construed in a number of ways, but they typically are meant to estimate one’s capacity for learning, usually something specific. Therefore, they should have good predictive validity. Where does our “aptitude” for something come from? (innate abilities and lifelong learning) We would distinguish aptitude from intelligence by considering aptitude to be more specific than intelligence but not as tied to specific educational content as our achievement tests.

1. Primary School level

Aptitude tests at this level may be used essentially as an Intelligence test, with subscores identifying areas of average, below-average or above-average ability on different features of cognitive ability. A widely used instrument is the Woodcock-Johnson IV Test of Cognitive Ability (text p. 325), a component of the WJ IV Psychoeducational Battery.

OH: Test of Cognitive Ability

You see the tests in the top section and the types of cognitive ability which these tests measure in the next section. You see how similar these abilities are to what we’ve seen in the intelligence tests?

A Canadian test is the CCAT 7. It tests reasoning, specifically:

Go to: https://www.nelson.com/assessment
Select Classroom from the menu. Future teachers, school counsellors and psychologists, especially: watch the Intro video on the CCAT 7 page.

2. Secondary school level

Often aptitude tests at this stage concern what might be a suitable career direction. They may be used for post-secondary admission purposes. One such test is the Scholastic Assessment Test (SAT), widely used in the U.S.A. (See text pp.
257-259) Interestingly for our consideration of ability versus achievement, the original name of the SAT was the Scholastic Aptitude Test, but in recognition of the role of learning experience in test performance, the name was changed to its current one, which was meant to signal that ability and achievement could not reasonably be separated out. The SAT contains both verbal and quantitative reasoning tests. See Table 9.6 in the text (p. 259) for the latest SAT components.

When we consider the development of the SAT, some important themes emerge which are relevant to us in Canada. Tests change to reflect changes in what is valued at a particular time in educational thought as a feature of the culture in general. For example, in the SAT verbal portion, there has been an increased emphasis on critical reading rather than simple vocabulary range; in the mathematics portion, there is an increased emphasis on reasoning versus computation.

Investigations of earlier versions of the SAT revealed that it was actually a poor predictor of college grades for students in the middle of the range of scores. Question: Why do you think this might be? (non-intellective factors, motivation, life skills, psychological/personal functioning, home and social life have more of an impact on these students’ college achievement).

SAT scores have been found to correlate with socio-economic status and urban location of schools attended, which continues to raise questions about why. Are these evidence of construct-irrelevant test bias or the effects of poorer school environments for some young people versus others? However, the SAT together with cumulative GPA has been found to have a useful degree of predictive validity regarding ultimate success in college.

3.
College level and beyond

To begin with, as you probably are aware, graduate and professional school entrance tests are ubiquitous and are used to screen out or select students from applicants.

a. Graduate Record Examination (GRE) (text pp. 264-269)

This is the test of general scholastic ability, typically used as part of a collection of applicant information. There are verbal, quantitative, and analytic reasoning components. In addition, there are achievement tests in twenty majors, including psychology.

Go to www.ets.org
Select GRE on the menu at the bottom of the page. Then choose Prepare for the GRE General Test. Then choose Free GRE General Test Preparation Materials. In the first paragraph are three choices: Overview of the Verbal Reasoning measure, Overview of the Quantitative Reasoning measure and Overview of the Analytical Writing measure. Choose one to see the types of questions included, then try the Sample Questions.

The GRE has a mean of 500 and a standard deviation of 100. Test reliabilities are high (mostly .90+). There is mixed evidence as to the predictive validity of the GRE. As you’ll see in Table 9.11 of the text (p. 268, discussed on p. 269), the highest validity correlation with first-year grad school GPA comes with the use of all three components of the GRE plus the undergraduate GPA in a multiple regression.

Researchers have identified a false-negatives problem. Question: What does this mean? See Figure 5.7 on p. 142 in your text. (Low scorers on the GRE succeed in notable numbers, as far as college GPA is concerned, contrary to prediction.)

Differential prediction of GPA from GRE scores has also been found. For example, House (1998, as cited in Kaplan and Saccuzzo, 2005) found that the GRE over-predicted GPA achievement of younger students and under-predicted GPA achievement of older students. Question: What might account for this?
Remember, we are comparing GPA outcomes for different groups of students who have the same score on the predictor, the GRE. (non-intellective/past achievement factors)

There has been an interesting change in GRE scores over the past decades. There has been, overall, a decline in the verbal scores and an increase in the quantitative and analytic scores. Question: How might this be accounted for? (More ESL students, changes in education policy and therefore focus, socio-cultural factors?)

Preparation programs for the GRE are as yet unproven in their effects. They are likely helpful to students who do not have well-developed test-taking skills and for those who would tend to score lower for other reasons.

c. Other aptitude tests

Other aptitude tests used for admissions/screening according to professions may be seen in Table 9.10 of the text, p. 265: Examples of Tests Used for Graduate and Professional School Selection.
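The false-negatives problem raised in the GRE discussion can be made concrete with a small sketch that cross-classifies a predictor against a criterion. Everything here is hypothetical: the applicant scores, the GRE cutoff of 500, the GPA cutoff of 3.0, and the category labels are invented for illustration and are not drawn from the text or from any real admissions data.

```python
from collections import Counter


def classify(predictor, criterion, predictor_cutoff, criterion_cutoff):
    """Cross-classify one person by predicted versus actual success."""
    predicted_success = predictor >= predictor_cutoff
    actual_success = criterion >= criterion_cutoff
    if predicted_success and actual_success:
        return "true positive"
    if predicted_success and not actual_success:
        return "false positive"
    if not predicted_success and actual_success:
        return "false negative"  # predicted to fail, but succeeded
    return "true negative"


# Hypothetical (GRE score, first-year graduate GPA) pairs.
applicants = [
    (640, 3.8), (590, 3.4), (560, 2.7), (480, 3.5),
    (450, 3.2), (430, 2.5), (520, 3.1), (470, 2.9),
]

outcomes = Counter(
    classify(gre, gpa, predictor_cutoff=500, criterion_cutoff=3.0)
    for gre, gpa in applicants
)

for label, count in outcomes.items():
    print(f"{label}: {count}")
```

In this made-up sample, two of the four applicants scoring below the cutoff go on to succeed; those are the false negatives, people a strict cutoff would have screened out contrary to their actual performance.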
