Psych 31 Finals Exam (PDF)
Summary
This document appears to be lecture notes on psychology, specifically test construction and interpretation of norms, along with statistical concepts related to testing.
Full Transcript
FINALS EXAM PSYCHOLOGY ASSESSMENT (Lecture)
PSYCH 31 | DR. ALVAREZ | SEM 1 | Pareñas

BASIC CONCEPTS IN TEST CONSTRUCTION & INTERPRETATION: NORMS
A raw score is meaningless without additional interpretive data.
There is a need for a uniform frame of reference.

NORMS
Norms constitute the most common way by which psychological tests may be interpreted through the use of a reference.
Norms represent the test performance of the standardization sample.
The norms are empirically established by determining what persons in a representative group actually do on the test.
Any individual's raw score is then referred to the distribution of scores obtained by the standardization sample, to discover where the person falls in the distribution.
The raw score is converted into some relative measure. These derived scores have a dual purpose:
1. To indicate the individual's relative standing in the normative sample, therefore permitting an evaluation of his or her performance in reference to other persons.
2. To provide comparable measures permitting a direct comparison of the individual's performance on different tests.
Ways that raw scores may be converted to fulfill the objectives:
1. Developmental level attained: called developmental norms
- e.g., age and grade norms
2. Relative position within a specified group: called within-group norms
- e.g., percentile and standard score

MAIN TYPES OF NORMS FOR EDUCATIONAL & PSYCHOLOGICAL TESTS
1. AGE NORMS
Type of Comparison - Individual matched to group whose performance he or she equals
Type of Group - Successive age groups
2. GRADE NORMS
Type of Comparison - Individual matched to group whose performance he or she equals
Type of Group - Successive grade groups
3. PERCENTILE NORMS
Type of Comparison - Percent of group surpassed by individual
Type of Group - Single age or grade group to which individual belongs
4. STANDARD SCORE NORMS
Type of Comparison - Number of standard deviations individual falls above or below the group average
Type of Group - Single age or grade group to which individual belongs

STATISTICAL CONCEPTS
Statistics is meant to organize and summarize quantitative (i.e., numerical) data in order to facilitate understanding.
I. Descriptive Statistics
A. Univariate Descriptive Statistics
B. Bivariate Descriptive Statistics (Correlation)
Most common: Pearson r
- Studying the relationship between variable x and variable y: x ↔ y
- Values: -1 to +1
- Absolute value: intensity of the relationship
- Algebraic sign: direction of the relationship
II. Inferential Statistics
We use descriptive statistics to infer
Theoretical tools:
- Normal curve
- Sampling distribution
- Null hypotheses (reject or accept null)
- z distribution
- t distribution
- t-tests (correlated means or independent means)
Statistical tools that help us examine the relationship of one dependent variable and one or more independent variables:
- ANOVA
- Regression

STATISTICAL METHODS (techniques of managing masses of data) do one of two things:
1. Describe a set of data [Descriptive Methods]
- they point up a characteristic of the group being discussed.
2. Provide a basis for making generalizations about a large group of individuals when only a selected portion of such a group has been observed [Inferential Procedures]
- they allow us to make inferences about large numbers of individuals when only a small sample from the larger group has been observed.

FOUR CLASSES OF MEASUREMENT SCALES
NOMINAL SCALE
- merely assigning numbers to masses to identify them (e.g., plate numbers, numbers on athletes' uniforms)
- Does not tell us about varying degrees of quality
ORDINAL SCALE
- Larger scores reflect quantity or quality, but units along the scale are unequal in size (e.g., the order of winning a race: the difference in the distance between runners 1 & 2 is not necessarily equal to the difference in the distance between runners 2 & 3)
INTERVAL SCALE
- Has equal units
- Is more precise than ordinal scales, but we do not know where true zero is on the scale (e.g., Fahrenheit thermometer: zero does not mean an absence of heat)
RATIO SCALE
- Has equal units, and zero on the scale means an absence of the quality being assessed (e.g., measures of length & weight)

FREQUENCY DISTRIBUTIONS
One of the most fundamental techniques for putting order into a disarray of data.
It is a systematic procedure for arranging individuals from least to most in relation to some quantifiable characteristic.

MEASURES OF CENTRAL TENDENCY
Because so many distributions cluster near the middle of the distribution:
- We think of this central point as representing the typical characteristic of the group.
- Measures of central tendency identify that point in the distribution around which other scores seem to group.
MEASURES OF CENTRAL TENDENCY: MODE
That scale value that occurs more frequently (than any other) in a distribution.
Bimodal: two separate points compete to be designated the mode.
MEASURES OF CENTRAL TENDENCY: MEDIAN
That point in the distribution that divides the total observations into two parts that are equal in number.
50% of the cases fall above it; 50% of the cases fall below it.
MEASURES OF CENTRAL TENDENCY: MEAN
Found simply by adding together the values of the quantities we have and then dividing this sum by the number of quantities that were so added.
It is simply the arithmetic average.

MEASURES OF VARIABILITY
Whereas measures of central tendency tell us what typical performance is for a group, measures of variability tell us how widely scores are dispersed around that central point.
Other terms: dispersion, spread, scatter
Most widely used term: Variability
MEASURES OF VARIABILITY: RANGE
Its numerical value rests upon two, and only two, scores in a distribution: the most extreme scores.
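The mode, median, and mean defined above can be checked on a small data set. The following is a minimal sketch using Python's standard `statistics` module; the score lists and the helper name `central_tendency` are invented for illustration and are not part of the lecture.

```python
from statistics import mean, median, multimode


def central_tendency(scores):
    """Return the mean, median, and mode(s) of a list of raw scores."""
    return {
        "mean": mean(scores),        # sum of the scores divided by their number
        "median": median(scores),    # point with half the cases on either side
        "modes": multimode(scores),  # most frequent value(s); two entries = bimodal
    }


# Hypothetical raw scores, invented for illustration.
unimodal = [2, 3, 3, 5, 7]
bimodal = [1, 1, 2, 3, 3]

print(central_tendency(unimodal))
print(central_tendency(bimodal))
```

`multimode` returns every value tied for the highest frequency, so a bimodal distribution shows up as a list with two entries.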
Frequency distributions are constructed primarily for two reasons:
1. They put the data into order so that a visual analysis can be made of the results of the measurements that have been made.
2. They provide a convenient structure for simple computations.

GRAPHIC REPRESENTATIONS OF DATA
- Histogram
- Frequency polygon
- Ogive (cumulative percentage data): "lazy" S shape

The range is simply computed by subtracting the smallest score from the largest and adding one point:
HS - LS + 1 (highest score minus lowest score, plus one)
It establishes the amount of dispersion in a set of scores by noting how many scale points are included from the lowest value to the highest, inclusive.

MEASURES OF VARIABILITY: STANDARD DEVIATION
The best indicator of dispersion of scores. It begins with the mean, the most stable average from sample to sample.
Distances of scores above and below this central point are then calculated.
Specifically, it is the root-mean-squared deviation.
STANDARD DEVIATION FORMULA
$SD = \sqrt{\sum (x - \bar{x})^2 / n}$

MEASURES OF VARIABILITY: VARIANCE
The mean-squared deviation
$\sigma^2 = \sum (X - \mu)^2 / N$
$\sigma^2$ = variance
$\sum (X - \mu)^2$ = the sum of $(X - \mu)^2$ for all data points
$X$ = individual data points
$\mu$ = mean of the population
$N$ = number of data points

SIGNIFICANCE OF MEASURES OF VARIABILITY
The magnitude of a given measurement or similar observation is not readily apparent when only a raw score is available.
However, when a raw score can be shown in terms of its deviation from the mean, its magnitude begins to be evident.

From raw score to standard score…
In psychological testing, there is frequent need for changing raw-measurement data to some type of standard scale.
Anytime we express a score in terms of its variation from the mean, with the standard deviation, or multiples of it, as the measure of variation, we have a standard score.
STANDARD SCORING METHODS include z scores and T scores
z scores
- Nothing more than a raw score converted to standard deviation units.
- Since standard deviations are measured from the mean, z scores begin at that point and range up and down the scale.
- A score that is one SD above the mean would have a z score of +1; if a raw score were a half SD below the mean, the z score would be -.5, and so on.
- We equate raw scores to a scale with a mean of 0 and an SD of 1.
- The T-score scale sets the mean raw score equal to a T score of 50 and equates the raw-score SD to 10 T-score points.
- Thus, a score one SD above the mean (z score of 1) would have a T score of 60, and a score one SD below the mean (z score of -1) would have a T score of 40.

PERCENTILES
Can be expressed in terms of the percentage of persons in the standardization sample who fall below a given raw score.
Can also be regarded as ranks in a group of 100, except that in ranking, the best person in the group gets a rank of one.
With percentiles, the lower the percentile, the poorer the individual's standing.
Percentages are raw scores expressed in terms of the percentage of correct items.
Percentiles are derived scores expressed in terms of percentage of persons.

ITEM ANALYSIS
Many of the statistical operations involved in dealing with tests, particularly during test construction, have to do with item analysis, wherever tests are composed of items or other small parts.
GOALS OF ITEM ANALYSIS
1. The improvement of total-score reliability or total-score validity (or both) and
2. The achievement of better item sequences and types of score distributions.
We want to be sure that all items in a test are functioning, that is, that they do something for us in the way of measurement or at least make some contribution toward that end.
Item-analysis procedures, including appropriate statistical methods, enable us to differentiate between the better and the poorer items.

RATIONAL APPROACH TO TEST DEVELOPMENT
The basic philosophies of test makers differ considerably.
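As a worked sketch of the variability and derived-score formulas earlier in this part of the notes (inclusive range, standard deviation, z score, T score, percentile), the functions below apply each definition to one small set of scores. The data and all function names are hypothetical, chosen so that the mean is 5 and the SD is 2; the percentile helper assumes the "percentage of persons who fall below a given raw score" definition.

```python
from math import sqrt


def inclusive_range(scores):
    """Range as defined in the notes: HS - LS + 1."""
    return max(scores) - min(scores) + 1


def standard_deviation(scores):
    """Root-mean-squared deviation from the mean (divides by n)."""
    m = sum(scores) / len(scores)
    return sqrt(sum((x - m) ** 2 for x in scores) / len(scores))


def z_score(raw, mean, sd):
    """Raw score expressed in standard deviation units (mean 0, SD 1)."""
    return (raw - mean) / sd


def t_score(z):
    """T scale: mean of 50, SD of 10."""
    return 50 + 10 * z


def percentile_rank(raw, scores):
    """Percentage of persons in the sample who fall below the raw score."""
    below = sum(1 for s in scores if s < raw)
    return 100 * below / len(scores)


# Hypothetical standardization sample (mean 5, SD 2).
sample = [2, 4, 4, 4, 5, 5, 7, 9]
m = sum(sample) / len(sample)
sd = standard_deviation(sample)

print(inclusive_range(sample))     # 9 - 2 + 1
print(z_score(7, m, sd))           # one SD above the mean
print(t_score(z_score(7, m, sd)))  # same score on the T scale
print(percentile_rank(7, sample))  # six of eight persons score below 7
```

A raw score of 7 is one SD above the mean here, so it maps to z = +1 and T = 60, matching the conversion described in the notes.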
On the one hand, there are those who prefer to develop tests that measure recognized, basic psychological traits or variables.
In other words, there are those concerned with construct validity.

T scores
- Take care of the inconvenience of working with a z-score scale, which has zero in the middle and gives half the scores minus values.

EMPIRICAL APPROACH TO TEST DEVELOPMENT
The empirical school of thought is commonly represented among those who face practical problems of measurement.
Their immediate concern is to make testing instruments that will help to solve pressing everyday problems.
Criterion of success must be established (predictive validity).
Both groups (rational and empirical) make use of empirical (experiential, experimental, observed, practical) steps of item analysis in test development.

COMMON ITEM STATISTICS
1. Item Difficulty and Popularity
2. Item Correlations:
a. Point biserial and biserial correlation
b. Item criterion correlations

STATISTICS ON ITEM DIFFICULTY AND POPULARITY
We want to know about the difficulty level of an item, if the test is one of ability, or about the "popularity" level of an item, if the test is designed to measure a nonaptitude trait.
In this context, popularity does not mean "social desirability"; it refers to the extent to which the individuals of a population answer the item in the keyed direction.
In both aptitude and nonaptitude tests, the basic index of difficulty and of popularity is the proportion of the individuals passing the item (answering in the keyed direction).
The proportion passing item i (p_i) is the item mean.

ITEM CORRELATIONS (ITEM VALIDITY)
More important than the problem of item difficulty are the questions concerning whether a test item discriminates individuals in line with other items in the test, whether the responses to the item predict some criterion, and whether the criterion is the total score on the test of which it is a part or some outside evaluation of individuals.
The problem is often known as that of item validity: a case of construct validity when the criterion is the total score and a case of predictive validity when the criterion is an outside measure.
Point biserial and biserial correlation
Another statistic of interest in item analysis is r_it, or the correlation of an item with the total score from the test of which it is a part.
Used for this purpose are point biserial and biserial correlation.
Item Criterion Correlations
The correlation of the item with some outside criterion is r_ic.
The criterion may be a recognized measure of a trait, when construct validity is of primary interest, or some measure of success in everyday life, when predictive validity is of primary interest.

ABILITY TESTING
TESTS OF INTELLECTUAL ABILITY
All psychological tests are designed to measure behavior.
Hence the selection of proper tests and the interpretation of test results require knowledge about human behavior.
Familiarity with relevant behavioral research is needed not only by the test constructor but also by the test user.
What can psychological research contribute to the understanding of the behavior measured by tests of cognitive abilities or "intelligence"?

THE NATURE OF INTELLIGENCE
A convenient index of intelligence is the intelligence quotient or the IQ.
It expresses intelligence as a ratio of mental age to chronological age: IQ = (MA / CA) x 100
MA is obtained by summing the number of items passed at each level.
[Suggested by German psychologist William Stern, adopted by Lewis Terman]
NOTE THAT THE IQ IS NO LONGER CALCULATED USING THIS EQUATION.
Tables are used to convert raw scores on the test to standard scores that are adjusted so the mean at each age equals 100.
IMPORTANT:
When considering the numerical value of a given IQ, one should always specify the test from which it was derived.
Different intelligence tests that yield an IQ differ in content and in other ways that affect the interpretation of their scores.

SOME CONSIDERATIONS IN USING THE SYMBOL "IQ":
1. Tested intelligence should be regarded as a descriptive rather than an explanatory concept.
An IQ is an expression of an individual's ability level at a given point in time, in relation to the available age norms.
No intelligence test can indicate the reasons for one's performance.
Intelligence tests should be used not to label individuals but to help in understanding them.
2. Intelligence is not a single, unitary ability, but a composite of several functions.
The term is commonly used to cover that combination of abilities required for survival and advancement within a particular culture.
Implications:
The specific abilities in this composite vary with time and place.
In different cultures and at different historical periods within a culture, the qualifications for successful achievement differ.
Within one individual, the composite varies from infancy to adulthood.
To base decisions on tests alone, and especially on one or two tests alone, is clearly a misuse of tests.
Decisions must be made by persons.
Tests represent one set of data utilized in making decisions; they are not themselves decision-making instruments.
At a broader level, all test results can be best understood within a contextual framework.

DEFINING INTELLIGENCE: BINET'S VIEWPOINT
Alfred Binet defined intelligence as the capacity
1. To find and maintain a definite direction or purpose,
2. To make necessary adaptations (that is, strategy adjustments) to achieve that purpose, and
3. To engage in self-criticism so that necessary adjustments in strategy can be made.
In developing tasks to measure judgment, attention, and reasoning, Binet was guided by two principles.
BINET'S PRINCIPLES OF TEST CONSTRUCTION
Principle 1: Age Differentiation
- Refers to the simple fact that one can differentiate older children from younger children by the former's greater capabilities.
Principle 2: General Mental Ability
- Refers to the total product of the various separate and distinct elements of intelligence.

SPEARMAN'S MODEL OF GENERAL MENTAL ABILITY
Binet was not alone in his conception of general mental ability. Before Binet, the idea was propounded by Francis Galton (1869) in his book Hereditary Genius.
Working independently of Binet, in Great Britain, Charles Spearman (1904, 1927) advanced this idea:
Intelligence consists of one general factor (g) plus a large number of specific factors.
Spearman's general mental ability, which he referred to as psychometric g (or simply g), was based on the well-documented phenomenon that when a set of diverse ability tests is administered to large unbiased samples of the population, almost all of the correlations are positive.
This phenomenon is called positive manifold, which according to Spearman resulted from the fact that all tests, no matter how diverse, are influenced by g.
FACTOR ANALYSIS
To support the notion of g, Spearman developed a statistical technique called factor analysis, a method for reducing a set of variables or scores to a smaller number of hypothetical variables called factors.
Through factor analysis, one can determine the common variance of all factors. This common variance represents the g factor.
Today, Spearman's g is the most established and ubiquitous predictor of occupational and educational performance.
IMPLICATIONS OF GENERAL MENTAL INTELLIGENCE (g)
1. A person's intelligence can best be represented by a single score, g, that presumably reflects the shared variance underlying performance on a diverse set of tests.
2. Differences in unique ability stemming from the specific task tend to cancel each other, and overall performance comes to depend most heavily on the general factor.
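The ideas of positive manifold and common variance can be illustrated numerically. The sketch below is an assumption-laden stand-in, not Spearman's own procedure: it correlates four invented tests, checks that every correlation is positive, and then uses the share of variance on the first principal component (found by power iteration) as a rough proxy for the common variance attributed to g. The test names, scores, and function names are all hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(tests):
    """Square matrix of correlations among all tests in the dict."""
    names = list(tests)
    return [[pearson_r(tests[a], tests[b]) for b in names] for a in names]


def first_component_share(matrix, iterations=200):
    """Share of total variance carried by the first principal component.

    Uses power iteration to approximate the largest eigenvalue; this is a
    proxy for 'common variance', not a full factor analysis.
    """
    k = len(matrix)
    v = [1.0] * k
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    av = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
    eigenvalue = sum(a * b for a, b in zip(v, av))  # Rayleigh quotient
    return eigenvalue / k  # a correlation matrix has total variance k


# Hypothetical scores of six persons on four diverse ability tests.
tests = {
    "vocabulary": [10, 12, 14, 16, 18, 20],
    "arithmetic": [11, 10, 15, 14, 19, 18],
    "spatial": [9, 13, 12, 17, 16, 21],
    "memory": [12, 11, 13, 17, 16, 19],
}

matrix = correlation_matrix(tests)
positive_manifold = all(
    matrix[i][j] > 0 for i in range(4) for j in range(4) if i != j
)
print(positive_manifold)
print(first_component_share(matrix))
```

When all tests correlate positively, most of the variance loads on a single component, which is the pattern Spearman interpreted as evidence for a general factor.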
THE gf-gc THEORY OF INTELLIGENCE
Recent theories of intelligence have suggested that human intelligence can best be conceptualized in terms of multiple intelligences rather than a single score. One such theory is called the gf-gc theory.
According to this theory, there are two basic types of intelligence: fluid (f) and crystallized (c).
Fluid intelligence can best be thought of as those abilities that allow us to reason, think, and acquire new knowledge.
Crystallized intelligence represents the knowledge and understanding that we have acquired.

INDIVIDUAL TESTS
1. STANFORD-BINET INTELLIGENCE SCALE
For use from the age of two years to the adult level
15 tests representing five major cognitive areas:
- Fluid reasoning (FR)
- Knowledge (KN)
- Quantitative reasoning (QR)
- Visual/Spatial reasoning (VS)
- Working memory (WM)
2. THE WECHSLER SCALES (original 1939)
Why was the Wechsler developed?
a. Overemphasis on speed in most tests; this handicaps the older adult
b. Routine manipulation of words received undue weight in the traditional intelligence test
c. Inapplicability of mental age norms to adults; few adults had previously been included in the standardization samples for intelligence tests.
Although both Binet and Terman considered the influence of nonintellective factors on results from intelligence tests, David Wechsler, author of the Wechsler scales, has been perhaps one of the most influential advocates of the role of nonintellective factors in these tests.
Throughout his career, Wechsler emphasized that factors other than intellectual ability are involved in intelligent behavior.

POINT AND PERFORMANCE SCALE CONCEPTS
Two of the most critical differences between the Wechsler and the original Binet scale were:
1. Wechsler's use of the point scale concept rather than an age scale;
2. Wechsler's inclusion of a performance scale.
THE POINT SCALE CONCEPT
Credits or points are assigned to each item.
An individual receives a specific amount of credit for each item passed.
The point scale offers an inherent advantage:
- It makes it easy to group items of a particular content together (Binet did not do this until the 1986 version).
By arranging items according to content and assigning a specific number of points to each item, the Wechsler yielded not only a total overall score but also scores for each content area.
THE PERFORMANCE SCALE CONCEPT
The early Binet scale was criticized for emphasizing language and verbal skills.
Wechsler included a measure of nonverbal intelligence: a performance scale.
It consisted of tasks that require the person to do something (e.g., copy symbols or point to a missing detail) rather than merely answer questions.
The performance scale attempts to overcome biases caused by language, culture, and education.

THE WECHSLER SCALES
Wechsler-Bellevue Scale (1939)
- Wechsler Adult Intelligence Scale (WAIS) (1955)
- WAIS-R (1981)
- WAIS-III (1997)
Wechsler Intelligence Scale for Children (WISC) (1949)
- Current edition: WISC-IV (2003)
Wechsler Preschool and Primary Scale of Intelligence (WPPSI) (1967)
- Current edition: WPPSI-III (2002)

WECHSLER'S DEFINITION OF INTELLIGENCE
Like Binet, Wechsler defined intelligence as the capacity to act purposefully and to adapt to the environment.
In his words, intelligence is the aggregate or global capacity of the individual to act purposefully, to think rationally and to deal effectively with his/her environment.
Intelligence comprises several specific interrelated functions or elements, and general intelligence results from the interplay of these elements.
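The point scale concept described above (points per item, items grouped by content, a total plus one score per content area) can be sketched in a few lines. The content-area names, point values, and the function name `score_point_scale` are hypothetical and do not reproduce any actual Wechsler subtest or scoring rule.

```python
from collections import defaultdict


def score_point_scale(responses):
    """Sum credited points per content area and overall.

    responses: list of (content_area, points_earned) pairs, one per item.
    Returns (total_score, {content_area: area_score}).
    """
    area_scores = defaultdict(int)
    for area, points in responses:
        area_scores[area] += points  # credit is accumulated within its content area
    return sum(area_scores.values()), dict(area_scores)


# Hypothetical item results for one test taker.
responses = [
    ("vocabulary", 2),
    ("vocabulary", 1),
    ("arithmetic", 1),
    ("picture completion", 0),
    ("picture completion", 1),
]

total, by_area = score_point_scale(responses)
print(total)
print(by_area)
```

Because every item carries its own content label, the same pass over the data yields both the overall score and the per-area scores, which is the advantage the notes attribute to the point scale.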
OTHER INDIVIDUAL TESTS OF ABILITY
Both Binet & Wechsler Scales are exceptionally good instruments for assessing intelligence in relatively normal individuals. However, both have their limitations. Standardization samples do not include individuals with sensory, physical, or language handicaps.
- Cattell Scales
- McCarthy Scales of Children's Abilities (MSCA)
- Kaufman Assessment Battery for Children (K-ABC)
DISADVANTAGES OF ALTERNATIVES
1. Weaker standardization sample
2. Less stable
3. Less documentation on validity
4. Limitations in test manual
5. Not as psychometrically sound
6. IQ scores not interchangeable with Binet or Wechsler
ADVANTAGES OF ALTERNATIVES
1. Can be used for specific populations and special purposes: Sensory limitations, Physical limitations, Language limitations, Culturally deprived people, Foreign-born individuals, Non-English-speaking people
2. Not as reliant on verbal responses.
3. Not as dependent on complex visual motor integration
4. Useful for screening, supplement, and reevaluations
5. Can be administered nonverbally.
6. Less variability due to scholastic achievement.

INDIVIDUAL TESTS | GROUP TESTS
One subject is tested at a time | Many subjects are tested at a time
Examiner records responses | Subjects record own responses
Scoring requires considerable skill | Scoring is straightforward and objective
Examiner flexibility can elicit maximum performance if permitted by standardization | There are no safeguards

ADVANTAGES (group testing)
- Large-scale or mass testing
- Eliminate need for a one-to-one relationship
- Greatly simplifying the examiner's role
- More uniform conditions than in individual testing
- Provision of better established norms
- Testing of large, representative samples in the standardization process is possible.
DISADVANTAGES (group testing)
- Less opportunity to establish rapport, obtain cooperation, and maintain the interest of client
- Any temporary condition (e.g., illness, fatigue, worry) is less readily detected
- Lack of flexibility (individual tests typically provide for the examiner to choose items on the basis of the test-taker's prior performance)

GROUP ABILITY TESTS: INTELLIGENCE, ACHIEVEMENT, AND APTITUDE
Intelligence tests: measure general ability; they also predict future performance, but predict generally and broadly.
Achievement tests: assess what a person has learned.
Aptitude tests: assess potential for learning.
ACHIEVEMENT TESTS VERSUS APTITUDE TESTS
Achievement Tests
1. Evaluate the effects of a known or controlled set of experiences.
2. Evaluate the product of a course of training.
3. Rely heavily on content validation procedures.
Aptitude Tests
1. Evaluate the effects of an unknown, uncontrolled set of experiences.
2. Evaluate the potential to profit from a course of training.
3. Rely heavily on predictive criterion validation procedures.
GROUP TESTS
1. Differential Aptitude Tests (DAT)
2. *Raven Progressive Matrices (RPM)
3. *IPAT Culture Fair Intelligence Test
4. Wonderlic Personnel Test (WPT)
5. *Purdue Non-Language Test (PNLT)
6. *Goodenough-Harris Drawing Test (G-HDT)
7. Kuhlmann-Anderson Test (KAT)
8. Henmon-Nelson Test (H-NT)
9. Cognitive Abilities Test (COGAT)
*Nonverbal Group Ability Tests

STATISTICS REFRESHER
SCALES OF MEASUREMENT
Continuous scales - theoretically possible to divide any of the values of the scale. Typically having a wide range of possible values (e.g. height or a depression scale).
Discrete scales - categorical values (e.g. male or female)
Error - the collective influence of all of the factors on a test score beyond those specifically measured by the test
Nominal Scales - involve classification or categorization based on one or more distinguishing characteristics; all things measured must be placed into mutually exclusive and exhaustive categories (e.g. apples and oranges, DSM-IV diagnoses, etc.).
Ordinal Scales - involve classifications, like nominal scales, but also allow rank ordering (e.g. Olympic medalists).
Interval Scales - contain equal intervals between numbers. Each unit on the scale is exactly equal to any other unit on the scale (e.g. IQ scores and most other psychological measures).
Ratio Scales - interval scales with a true zero point (e.g. height or reaction time).
Psychological Measurement - most psychological measures are truly ordinal but are treated as interval measures for statistical purposes.

DESCRIBING DATA
Distributions - a set of test scores arrayed for recording or study.
Raw Score - a straightforward, unmodified accounting of performance that is usually numerical.
Frequency Distribution - all scores are listed alongside the number of times each score occurred.
Frequency distributions may be in tabular form. In a simple frequency distribution, scores have not been grouped.
Grouped frequency distributions have class intervals rather than actual test scores.
Histogram - a graph with vertical lines drawn at the true limits of each test score (or class interval), forming a series of contiguous rectangles.
Bar graph - numbers indicative of frequency appear on the Y-axis, and reference to some categorization (e.g., yes/no/maybe, male/female) appears on the X-axis.
Frequency polygon - test scores or class intervals (as indicated on the X-axis) meet frequencies (as indicated on the Y-axis).

MEASURES OF CENTRAL TENDENCY
Central tendency - a statistic that indicates the average or midmost score between the extreme scores in a distribution.
Mean - sum of the observations (or test scores, in this case) divided by the number of observations.
Median - the middle score in a distribution. Particularly useful when there are outliers, or extreme scores, in a distribution.
Mode - the most frequently occurring score in a distribution. When two scores occur with the highest frequency, a distribution is said to be bimodal.

MEASURES OF VARIABILITY
Variability is an indication of the degree to which scores are scattered or dispersed in a distribution.
[Figure: Distributions A and B have the same mean score, but one distribution has greater variability in scores (scores are more spread out).]
Measures of variability are statistics that describe the amount of variation in a distribution.
Range - difference between the highest and the lowest scores.
Interquartile range - difference between the third and first quartiles of a distribution.
Semi-interquartile range - the interquartile range divided by 2.
Average deviation - the average of the absolute deviations of scores in a distribution from the mean.
Variance - the arithmetic mean of the squares of the differences between the scores in a distribution and their mean.
Standard deviation - the square root of the average squared deviations about the mean. It is the square root of the variance. Typical distance of scores from the mean.
Skewness - the nature and extent to which symmetry is absent in a distribution.
Positive skew - relatively few of the scores fall at the high end of the distribution.
Negative skew - relatively few of the scores fall at the low end of the distribution.
Kurtosis - the steepness of a distribution in its center.
Platykurtic - relatively flat.
Leptokurtic - relatively peaked.
Mesokurtic - somewhere in the middle.

THE NORMAL CURVE
The normal curve is a bell-shaped, smooth, mathematically defined curve that is highest at its center. Perfectly symmetrical.
Area Under the Normal Curve
The normal curve can be conveniently divided into areas defined by units of standard deviations.

STANDARD SCORES
A standard score is a raw score that has been converted from one scale to another scale, where the latter scale has some arbitrarily set mean and standard deviation.
Z-score - conversion of a raw score into a number indicating how many standard deviation units the raw score is below or above the mean of the distribution.
T scores - can be called a fifty plus or minus ten scale; that is, a scale with a mean set at 50 and a standard deviation set at 10.
Stanine - a standard score with a mean of 5 and a standard deviation of approximately 2. Divided into nine units.
Normalizing a distribution - involves "stretching" the skewed curve into the shape of a normal curve and creating a corresponding scale of standard scores.

CORRELATION AND INFERENCE
A coefficient of correlation (or correlation coefficient) is a number that provides us with an index of the strength of the relationship between two things.
Correlation coefficients range from -1 to +1. A correlation of 0 indicates no linear relationship between two variables.
Positive correlations indicate that as one variable increases or decreases, the other variable follows suit.
Negative correlations indicate that as one variable increases the other decreases.
Correlation between variables does not imply causation but it does aid in prediction.
Pearson r: A method of computing correlation when both variables are linearly related and continuous.
Once a correlation coefficient is obtained, it needs to be checked for statistical significance (typically a probability level below .05).
By squaring r, one is able to obtain a coefficient of determination, or the variance that the variables share with one another.
Spearman Rho: A method for computing correlation, used primarily when sample sizes are small or the variables are ordinal in nature.
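To make the Pearson r, coefficient of determination, and Spearman rho definitions above concrete, here is a minimal sketch in plain Python. The data are invented, the function names are hypothetical, and Spearman rho is computed under the common assumption that it equals Pearson r applied to ranks (with tied scores given their average rank). Significance testing is not shown.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Rank scores from 1 (lowest) upward; ties share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman rho: Pearson r computed on the ranks of the scores."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical paired scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(r)       # strength and direction of the linear relationship
print(r ** 2)  # coefficient of determination (shared variance)

# A perfectly monotonic but nonlinear relationship.
squares = [1, 4, 9, 16, 25]
print(spearman_rho(x, squares))
```

The last example shows why rho suits ordinal data: it depends only on rank order, so any strictly increasing relationship yields a rho of 1 even when Pearson r is below 1.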
Scatterplot - involves simply plotting one variable on the X (horizontal) axis and the other on the Y (vertical) axis.
[Figure: scatterplots of no correlation (left) and moderate correlation (right)]
Scatterplots of strong correlations feature points tightly clustered together in a diagonal line. For positive correlations the line goes from bottom left to top right.
Strong negative correlations form a tightly clustered diagonal line from top left to bottom right.
Outlier - an extremely atypical point (case), lying relatively far away from the other points in a scatterplot.
Restriction of range leads to weaker correlations.

META-ANALYSIS
Meta-analysis allows researchers to look at the relationship between variables across many separate studies.
Meta-analysis - a family of techniques to statistically combine information across studies to produce single estimates of the data under study.
The estimates are in the form of effect size, which is often expressed as a correlation coefficient.

PERSONALITY TESTING
Tests of mental ability were created to distinguish those with subnormal mental abilities from those with normal abilities in order to enhance the education of both groups.
However, it is not enough to know that a person is high or low in such factors as speed of calculation, memory, range of knowledge, and abstract thinking.
To make full use of information about a person's mental abilities, one must also know how that person uses those abilities.

THE STUDY OF PERSONALITY
The nonintellective aspects of human behavior, typically distinguished from mental abilities, are called personality characteristics.
Personality is the relatively stable and distinctive patterns of behavior that characterize an individual and her or his reactions to the environment.

STRUCTURED PERSONALITY TESTS
Attempt to evaluate personality traits, personality types, personality states, and other aspects of personality, such as self-concept.
Personality traits refer to relatively enduring dispositions, that is, tendencies to act, think, or feel in a certain manner in any given circumstance, that distinguish one person from another.

PERSONALITY TYPES & PERSONALITY STATES
Personality types - general descriptions of people
- For example, avoiding types, who have low social interest and low activity and cope by avoiding social situations.
Personality states - emotional reactions that vary from one situation to another.

SELF-CONCEPT
A person’s self-definition, or, according to Rogers, an organized and relatively consistent set of assumptions that a person has about himself or herself.

HISTORICAL DEVELOPMENT OF PERSONALITY TESTING
Binet and others (Terman, Spearman, Thorndike) believed that a person’s pattern of intellectual functioning might reveal information about personality factors.
However, specific tests of personality were not developed until World War I, when there was a need to distinguish people on the basis of their emotional well-being.
Psychologists used self-report questionnaires that provided a list of statements and required people to respond in some way, e.g., “True” or “False” to indicate whether the statement applied to them.

SELF-REPORT: STRUCTURED PERSONALITY TESTS
The general procedure in which the person is asked to respond to a written statement is known as the structured or objective method of personality assessment, as distinguished from the projective method.
A clear and definite stimulus is provided and the requirements for responding are evident and specific.
For example, to respond “yes” or “no” to the statement, “I am happy.”

STRATEGIES FOR STRUCTURED PERSONALITY TEST CONSTRUCTION
Like measures of mental ability, personality measures evolved through several phases.
Deductive strategies comprise the logical-content and the theoretical approach.
Empirical strategies comprise the criterion-group and the factor analysis method.
Some procedures combine two or more of these strategies.

DEDUCTIVE APPROACH TO CONSTRUCTING PERSONALITY TESTS
Deductive strategies use reason and deductive logic to determine the meaning of a test response.
The logical-content method has designers select items on the basis of simple face validity.
In the theoretical approach, test construction is guided by a particular psychological theory.

EMPIRICAL APPROACH
Empirical strategies rely on data collection and statistical analysis to determine the meaning of a test response or the nature of personality.
These strategies retain the self-report features of the deductive strategies in that persons are asked to respond to items that describe their own views, opinions, and feelings.
However, empirical strategies use experimental research to determine empirically the meaning of a test response, the major dimensions of personality, or both.
In the criterion-group approach, test designers choose items to distinguish a group of individuals with certain characteristics, the criterion group, from a control group.
The factor analytic approach uses the statistical technique of factor analysis to determine the meaning of test items.
All available structured personality tests can be classified according to whether they use one or some combination of the four strategies:
- Logical-content
- Theoretical
- Criterion-group
- Factor analytic

THE LOGICAL-CONTENT STRATEGY
The first personality test ever developed was the Woodworth Personal Data Sheet (1920), based on the logical-content strategy.
It was developed during World War I and published at the end of the war. Its purpose was to identify military recruits who would likely break down in combat.

WOODWORTH QUESTIONS:
Items were selected from lists of known symptoms of emotional disorders and from questions asked by psychiatrists in a screening interview.
The final form contained 116 items.
“Do you wet the bed at night?” “Do you usually feel in good health?” “Do you frequently daydream?” “Do you usually sleep soundly at night?”
A single score provided a global measure of functioning.

OTHER TESTS USING LOGICAL-CONTENT STRATEGY
Bell Adjustment Inventory
- Evaluated a person’s adjustment in areas such as home life, social life, and emotional functioning.
Bernreuter Personality Inventory
- Included items related to six personality traits, including introversion, confidence, and sociability.
Mooney Problem Checklist (1950)
- Lists problems that recur in clinical case history data and in written statements of problems listed by 4000 high school students (U.S.).

THE CRITERION-GROUP STRATEGY
Main idea: assume nothing about the meaning of a person’s response to a test item.
Minnesota Multiphasic Personality Inventory by Hathaway and McKinley
Minnesota Multiphasic Personality Inventory 2 (1989)
Sample statements: “I like good food.”, “I never have trouble falling asleep.”
Raw scores are converted to T scores.

MMPI & MMPI 2
Contains a validity scale that provides information on the person’s approach to testing:
“fake bad” – endorsing more items of pathological content than any person’s actual problems could justify.
“fake good” – avoiding pathological items.
Like the Woodworth, the purpose of MMPI & MMPI 2 is to assist in distinguishing normal from abnormal groups.
University of Minnesota Hospital patients (n=800) were divided into eight groups according to psychiatric diagnosis, and compared with controls (n=700) composed of relatives and visitors of the patients.
Final criterion groups: Hypochondriacs, Depressed patients, Hysterics, Psychopathic deviates, Paranoids, Psychasthenics, Schizophrenics, Hypomanics

ORIGINAL VALIDITY SCALES OF THE MMPI
Lie scale (L)
- 15 rationally derived items to evaluate a naïve attempt to present oneself in a favorable light. People who score high on L are unwilling to acknowledge minor flaws (weaknesses).
- Example: “I never lose control of myself when I drive.”
Infrequency scale (F)
- Items that are scored infrequently (less than 10%) by the normal population. High scores invalidate the profile.
- Example: “I am aware of a special presence that others cannot perceive.”
K scale
- 30 items that detect attempts to deny problems and present oneself in a favorable light.
- Individuals attempt to project an image of self-control and personal effectiveness.

CALIFORNIA PSYCHOLOGICAL INVENTORY
The CPI (1987) is a second example of a structured personality test constructed primarily by the criterion-group strategy.
For 3 of the 36 CPI scales, criterion groups (men vs. women, homosexual vs. heterosexual men) were contrasted to produce measures of personality categorized as: 1) introversion-extroversion, 2) conventional vs. unconventional, and 3) self-realization and sense of integration.
In contrast to MMPI and MMPI 2, the CPI attempts to evaluate personality in normally adjusted individuals.
20 scales, each of which is grouped into one of four classes:
- Class I: poise, self-assurance, interpersonal effectiveness
- Class II: socialization, maturity, responsibility
- Class III: achievement potential, intellectual efficiency
- Class IV: interest modes
13 scales are designed for special purposes:
- managerial potential, tough-mindedness, creativity

THE FACTOR ANALYTIC STRATEGY
Structured personality tests share one common set of assumptions:
Humans possess characteristics or traits that are static, vary from individual to individual, and can be measured.
Nowhere are these assumptions better illustrated than in the factor analytic strategy.

GUILFORD’S PIONEERING EFFORTS
J. P. Guilford determined the interrelationship (intercorrelation) of a wide variety of tests and factor analyzed the results to find the main dimensions underlying all personality tests.
Came up with the Guilford-Zimmerman Temperament Survey (1956).
10 dimensions with 30 items each.

GUILFORD-ZIMMERMAN TEMPERAMENT SURVEY DIMENSIONS
- General activity
- Restraint
- Ascendance (leadership)
- Sociability
- Emotional stability
- Objectivity
- Friendliness
- Thoughtfulness
- Personal relations
- Masculinity

CATTELL’S CONTRIBUTION
R.B. Cattell began with all adjectives applicable to human beings to determine the essence of personality.
Allport and Odbert (1936) reduced an adjective list from a dictionary to 4504 traits.
Cattell added to this list traits found in psychological literature, and reduced the list to 171 items.
College students then rated their friends on the 171 traits and the results were factor analyzed.
The 171 were reduced to 36 dimensions, called surface traits.
Subsequent investigation by factor analysis produced 16 distinct factors, which Cattell called source traits.
In subsequent factor analysis, items that correlated highly with each of the 16 source traits were included and those with low correlations were excluded.

SIXTEEN PERSONALITY FACTOR QUESTIONNAIRE (1972)
Other parallel inventories developed:
- High School Personality Questionnaire
- Children’s Personality Questionnaire
- Clinical Analysis Questionnaire (CAQ) – for use with clinical populations

THE THEORETICAL STRATEGY
Items are selected to measure the variables or constructs specified by a major theory of personality.
These questionnaires were based on Murray’s need theory:
- Edwards Personal Preference Schedule (1954)
- Personality Research Form (PRF) (1967)
- Jackson Personality Inventory (JPI) (1976)

EDWARDS PERSONAL PREFERENCE SCHEDULE (EPPS)
Based on the need system proposed by Henry A. Murray (1938).
Edwards selected 15 needs from Murray’s theory and constructed items with content validity for each.
Edwards included a consistency scale to check on the validity of EPPS results.
15 pairs of statements are repeated in identical form.

SELF-CONCEPT
The set of assumptions individuals have about themselves.
Q-sort Technique is based on Rogers’s theory of the self.
- A set of cards with self-statements is sorted into two:
- The first describes who the person really is (real self)
- The second describes what the person believes he or she should be (ideal self)
- Rogers’s theory predicts that large discrepancies between the real and ideal selves reflect poor adjustment and low self-esteem.

COMBINATION STRATEGIES
The modern trend is to use a mix of strategies for developing structured personality tests.
Indeed, most personality tests use factor analysis regardless of their main strategy.
The NEO Personality Inventories are a good example of a test of personality characteristics that relies on a combination of strategies in scale development.

POSITIVE PERSONALITY MEASUREMENT
The early history of personality measurement focused on negative characteristics such as anxiety, depression, and other manifestations of psychopathology.
Research suggests advantages in evaluating individuals’ positive characteristics to understand individual resources.
- Kobasa (1979) studied “hardiness”
- Bandura (1986) studied “self-efficacy” – strong belief in the ability to organize resources and manage situations.

POSITIVE PERSONALITY MEASUREMENT AND THE NEO PERSONALITY INVENTORY-REVISED (NEO-PI-R)
The developers of NEO-PI-R (Costa & McCrae, 1985) used both factor analysis and theory in item development and scale construction.
A multipurpose inventory for predicting interests, health and illness behavior, psychological well-being, and characteristic coping styles.
Based on a review of factor analytic studies and personality theory, the authors identified 3 broad domains: neuroticism (N), extroversion (E), and openness (O).
Each domain has 6 facets.
Neuroticism (N): anxiety, hostility, depression, self-consciousness, impulsiveness, and vulnerability.
Extroversion (E): warmth, gregariousness, assertiveness, activity, excitement seeking, and positive emotions.
Openness (O): fantasy, aesthetics, feelings (openness to feelings of self and others), actions (willingness to try new activities), ideas (intellectual curiosity), and values.

THE FIVE-FACTOR MODEL OF PERSONALITY
Research with the NEO has supported five dimensions (considered the minimum number of dimensions to describe the human personality):
1. Extroversion – the degree to which a person is sociable, leader-like, and assertive as opposed to withdrawn, quiet, and reserved.
2. Neuroticism – the degree to which a person is anxious and insecure as opposed to calm and self-confident.
3. Conscientiousness – the degree to which a person is persevering, responsible, and organized as opposed to lazy, irresponsible, and impulsive.
4. Agreeableness – the degree to which a person is warm and cooperative as opposed to unpleasant and disagreeable.
5. Openness to experience – the degree to which a person is imaginative and curious as opposed to concrete-minded and narrow in thinking.

FREQUENTLY USED MEASURES OF POSITIVE PERSONALITY TRAITS
- Rosenberg Self-Esteem Scale
- General Self-Efficacy Scale
- Ego Resiliency Scale
- Dispositional Resilience Scale
- Hope Scale
- Life Orientation Test-Revised
- Satisfaction with Life Scale
- Positive and Negative Affect Schedule
- Coping Inventory for Stressful Situations
- Core Self-Evaluations
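As a numerical aside on the MMPI notes earlier in this section, which state that raw scores are converted to T scores: a T score is a standard score rescaled to a mean of 50 and a standard deviation of 10. The sketch below assumes a simple linear conversion against a hypothetical normative mean and SD. The function name and all numbers are invented for illustration, and published inventories supply their own conversion tables.

```python
def t_score(raw, norm_mean, norm_sd):
    """Linear T score: standard (z) score rescaled to mean 50, SD 10."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw - norm_mean) / norm_sd  # SDs above or below the normative mean
    return 50 + 10 * z


# Hypothetical normative sample: mean raw score 20, SD 5 (made-up values).
NORM_MEAN, NORM_SD = 20, 5

for raw in (10, 20, 25, 30):
    print(raw, "->", t_score(raw, NORM_MEAN, NORM_SD))
```

A raw score equal to the normative mean maps to T = 50, and each standard deviation above or below the mean moves the T score by 10 points.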