Psychological Assessment: Scorer Differences

Questions and Answers

Which type of validity involves a judgment about the adequacy of the inferences drawn from test scores about an individual's standing on a variable called a construct?

  • Criterion Validity
  • Predictive Validity
  • Content Validity
  • Construct Validity (correct)

Which term describes the error where a rater's scores tend to cluster in the middle of the rating scale?

  • Central Tendency Error (correct)
  • Severity Error
  • Halo Effect
  • Leniency Error

Which validity examines the relationship between test scores and a criterion measure obtained at the same time?

  • Predictive Validity
  • Concurrent Validity (correct)
  • Construct Validity
  • Incremental Validity

Which of the following methods can be used to enhance the homogeneity of a test containing dichotomous items?

  • Eliminating items with low correlation coefficients with total test scores (correct)

Which concept refers to the validation of a test based on a different group from the original group?

  • Cross-Validation (correct)

What is the focus of factor analysis?

  • Identifying factors or specific variables (correct)

What describes a situation where the criterion measure includes irrelevant aspects of performance?

  • Criterion Contamination (correct)

In factor analysis, which method is used to test the degree to which a hypothetical model fits the actual data?

  • Confirmatory Factor Analysis (correct)

Which term refers to the extent to which a test is used in an impartial, just, and equitable way?

  • Fairness (correct)

What term describes the error where a rater inaccurately gives higher scores due to the inability to differentiate between distinct aspects of behavior?

  • Halo Effect (correct)

What measure is used to determine the level of agreement between two or more raters when the method of assessment is categorical?

  • Fleiss Kappa (correct)

Which theory posits that a person's test scores vary due to the variables in the testing situations?

  • Generalizability Theory (correct)

What is the main focus of Item Response Theory?

  • Item difficulty (correct)

Which measure should be assessed by two independent testing periods when dealing with Speed Tests?

  • Test-retest reliability (correct)

What does a true score genuinely reflect in Classical Test Theory?

  • An individual's ability level as measured by a particular test (correct)

What do tests designed to measure one factor usually exhibit?

  • A high degree of internal consistency (correct)

Which concept is used for estimating how specific sources of variation contribute to the test scores?

  • Domain Sampling Theory (correct)

What information do Criterion-Referenced Tests provide?

  • The test-taker's performance relative to a specific variable or criterion (correct)

According to Generalizability Theory, under what condition should the exact same test score be obtained?

  • When all facets in the universe are the same (correct)

Which theory focuses on the extent to which an item measures a specific trait?

  • Latent-Trait Theory (correct)

What type of error is caused by unpredictable fluctuations in measurement conditions?

  • Random Error (correct)

Which type of reliability is obtained by correlating pairs of scores from the same individuals on two different administrations of the test?

  • Test-Retest Reliability (correct)

Which type of error variance is associated with a test-taker's motivation or attention during test administration?

  • Test Administration (correct)

What effect occurs when the interval between test administrations is short, leading to inflated correlation?

  • Carryover Effects (correct)

When is it appropriate to use test-retest reliability?

  • For tests measuring a stable attribute (correct)

What is the main error type associated with parallel forms reliability?

  • Item Sampling (correct)

What does a lower correlation in a test-retest scenario with a longer interval indicate?

  • Poor reliability (correct)

Which method helps avoid carryover effects between parallel forms?

  • Counterbalancing Technique (correct)

Which type of score consistency would result from computer scoring of objective-type items?

  • High reliability (correct)

What does a test blueprint primarily ensure in psychological assessment?

  • It ensures that the test is representative of a defined body of content (correct)

What is the true score formula used to calculate?

  • True score (correct)

What is the Content Validity Ratio (CVR) formula developed by Lawshe?

  • $CVR = \frac{N_e - N/2}{N/2}$ (correct)

What should be done if the Content Validity Index (CVI) is low?

  • Remove or modify items with low CVR values (correct)

What is an indication of Zero Content Validity Ratio (CVR) in a test item?

  • Half of the experts rate the item as essential (correct)
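Lawshe's CVR formula from the questions above can be sketched numerically. This is a minimal illustration; the panel sizes and ratings are hypothetical:

```python
def content_validity_ratio(n_essential: int, n_panelists: int) -> float:
    """Lawshe's CVR = (N_e - N/2) / (N/2).

    Ranges from -1 (no panelist rates the item essential) to +1
    (all panelists rate it essential); 0 means exactly half do.
    """
    half = n_panelists / 2
    return (n_essential - half) / half

# 8 of 10 hypothetical experts rate the item essential
print(round(content_validity_ratio(8, 10), 2))  # 0.6
# Exactly half rate it essential -> zero CVR
print(content_validity_ratio(5, 10))  # 0.0
```

Items with CVR values at or below zero are the usual candidates for removal or revision when the overall Content Validity Index is low.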

Which type of validity is more logical than statistical?

  • Content validity (correct)

What type of evidence involves a validity coefficient showing a high correlation between test scores and an established test?

  • Convergent evidence (correct)

Which validity type involves comparing a test score at one time with a criterion measure obtained at the same time?

  • Concurrent validity (correct)

What does high Construct Validity indicate about a test?

  • It accurately measures the theoretical construct it is intended to measure (correct)

Which method demonstrates construct validity by showing predictable score differences across groups?

  • Method of Contrasted Groups (correct)

Why might a test item be considered poor if high scorers on an academic test tend to get it wrong and low scorers get it right?

  • It suggests the item is not accurately measuring the intended construct (correct)

What is a fixed cut score?

  • A cut score derived from expert judgment concerning minimum proficiency (correct)

Which method is used to set cut scores based on the composition of contrasting groups?

  • Known Groups Method (correct)

In a compensatory model of selection, what is assumed about high scores on one attribute?

  • They can compensate for lower scores on another attribute (correct)

Which method requires expert judges to use a well-defined and rational procedure to determine a pass mark?

  • Angoff Method (correct)

If a reliability coefficient is below 0.70, how is it interpreted?

  • May have limited applicability (correct)

Which reliability coefficient range is considered 'excellent'?

  • 0.90 and up (correct)

Which method arranges items in a histogram according to their equivalent value?

  • Item-Mapping Method (correct)

What does the utility gain of a particular test estimate?

  • The benefit of using the test (correct)

What does a multiple hurdle selection process involve?

  • Having a cut score for each predictor at multiple selection stages (correct)

What validity coefficient range is considered 'very beneficial'?

  • Above 0.35 (correct)

Which measure of central tendency is most commonly used for nominal data?

  • Mode (correct)

Which situation is a paired T-test used for?

  • Comparing the means of two related groups (correct)

Which statistical test would you use to compare means from more than two groups taken at more than three different times?

  • ANOVA Mixed Design (correct)

What does a large spread of values in a distribution indicate?

  • Large differences between individual scores (correct)

What measure divides the distribution into four equal parts?

  • Quartile (correct)

What correlation coefficient is used for ordinal data?

  • Spearman rho (correct)

Variance is equal to which of the following?

  • The average of the squared deviations about the mean (correct)

Which test would you use to compare the blood pressure of male and female graduate students?

  • Independent T-test (correct)

What type of data is most appropriately analyzed with the median?

  • Ordinal (correct)

What type of correlation is used for a true dichotomous variable and interval/ratio data?

  • Point-Biserial (correct)

Which of the following best describes Level 1 interpretation?

  • Minimal interpretation with data treated in a sampling or correlational way (correct)

What is the primary characteristic of actuarial assessment?

  • Application of empirically demonstrated statistical rules (correct)

During a psychological assessment, who prepares evaluative critiques based on technical and practical aspects of tests?

  • Test Reviewers (correct)

Which term refers to an observable action or the product of an observable action?

  • Overt Behavior (correct)

Which party in psychological assessment is responsible for controlling the distribution of tests?

  • Test Publishers (correct)

What is a trait in psychological assessment?

  • A distinguishable, relatively enduring way in which individuals vary from one another (correct)

Which level of interpretation involves descriptive generalizations and hypothetical constructs?

  • Level 2 (correct)

Which statement about mechanical prediction is correct?

  • It involves computer algorithms combined with statistical rules (correct)

What is the primary focus of Level 3 interpretation?

  • Full-scale exploration of personality, psychosocial situation, and developmental history (correct)

What is extra-test history?

  • Observations made by the examiner that are indirectly related to the test content (correct)

What does the Item-Validity Index measure?

  • The degree to which a test measures what it purports to measure (correct)

What method assesses the correlation between a dichotomous variable and a continuous variable?

  • Point-Biserial Method (correct)

In scoring models, what does the Cumulative Model indicate about high scorers?

  • They suggest a high level in the trait being measured (correct)

What term describes the principle of revalidating a test on a sample other than the original test sample?

  • Cross-validation (correct)

Which phenomenon is described by a large percentage of respondents scoring near the lower limit on a test?

  • Floor Effects (correct)

What function does the Routing Test serve in Computerized Adaptive Testing?

  • Directs the test-taker to a suitable level of items (correct)

Which aspect is crucial in psychological assessment compared to psychological testing?

  • Educated selection of tools (correct)

What is the primary focus of an aptitude test?

  • Potential for learning a specific skill (correct)

What is the primary purpose of DIF Analysis?

  • To identify items that function differently across groups (correct)

What is meant by 'Scoring Drift'?

  • Discrepancy between scoring in the anchor protocol and another protocol (correct)

Which type of psychological assessment involves evaluation without the subject being in physical proximity?

  • Remote (correct)

Which model compares test-taker responses to different scales within the same test?

  • Ipsative Scoring (correct)

What characterizes a psychological test's scoring process?

  • Reflects an evaluation of performance (correct)

What distinguishes an intelligence test from an achievement test?

  • Measurement of general potential (correct)

What concept is described as the ability of a computer to tailor the test content based on prior responses?

  • Item Branching (correct)

Which type of assessment is described as encouraging therapeutic self-discovery?

  • Therapeutic Assessment (correct)

What does the psychometrics field specifically focus on?

  • Psychological measurement (correct)

What is typically measured by a typical performance test?

  • Usual habits and behaviors (correct)

Which of the following refers to assigning a summary statement of performance, usually numerical in nature?

  • Score (correct)

What is an example of a dynamic assessment approach?

  • Sequential evaluation with intervention (correct)

When should the Shapiro-Wilk test be used?

  • When the sample size is less than 50 (correct)

What does a p-value of 0.03 in Levene's Test signify?

  • The variances are significantly different (correct)

Which test is more sensitive to departures from normality?

  • Bartlett's Test (correct)

What is one of the outcomes of the pilot work in test development?

  • Determining how best to measure a construct (correct)

Which type of item format offers more than two alternatives?

  • Polychotomous Format (correct)

What is the primary purpose of Computerized Adaptive Testing?

  • Tailoring test items based on performance (correct)

Which scale arranges items from weaker to stronger expressions of attitude?

  • Guttman Scale (correct)

How does Levene's Test determine if variances are equal?

  • By analyzing p-values (correct)

Which process involves brainstorming ideas about the kind of test a developer wants to publish?

  • Test Conceptualization (correct)

What does an item pool refer to in test construction?

  • A reservoir of potential test items (correct)

Study Notes

Psychological Assessment

Error: Scorer Differences

  • Evaluates the degree of agreement between two or more scorers on a particular measure
  • Calculated by determining the percentage of times two individuals assign the same scores to the performance of examinees
  • Variations: having two examiners test the same client using the same test and determining the closeness of their scores or ratings
  • Measures of scorer differences:
    • Fleiss Kappa: determines the level of agreement between two or more raters on a categorical scale
    • Cohen's Kappa: used for two raters only
    • Krippendorff's Alpha: used for two or more raters, correcting for chance agreement
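The level of agreement between two raters can be illustrated with a minimal sketch of Cohen's kappa, which corrects observed agreement for the agreement expected by chance. The rating labels below are hypothetical:

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters assigning categorical labels.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    proportion of agreement and p_e is the agreement expected by
    chance from each rater's marginal label frequencies.
    """
    n = len(ratings_a)
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    p_e = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / n ** 2
    return (p_o - p_e) / (1 - p_e)

rater1 = ["anxious", "anxious", "calm", "calm", "anxious", "calm"]
rater2 = ["anxious", "calm",    "calm", "calm", "anxious", "calm"]
print(round(cohens_kappa(rater1, rater2), 3))  # 0.667
```

Fleiss kappa generalizes the same chance-corrected idea to more than two raters.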

Tests Designed

  • Homogenous tests: designed to measure one factor, expected to have a high degree of internal consistency
  • Dynamic tests: measure traits, states, or abilities that are fast-changing as a function of situational and cognitive experience
  • Static tests: measure traits, states, or abilities that are relatively unchanging
  • Restriction of range or variance: when the variance of either variable in a correlational analysis is restricted, resulting in a lower correlation coefficient

Power Tests

  • Designed with a time limit long enough for test-takers to attempt all items; items typically vary in difficulty
  • Measures a test-taker's ability to complete a task accurately and efficiently

Speed Tests

  • Contain items of uniform difficulty with a time limit
  • Reliability should be based on performance from two independent testing periods using test-retest, alternate-forms, or split-half reliability

Criterion-Referenced Tests

  • Designed to provide an indication of where a test-taker stands with respect to a criterion
  • As individual differences decrease, traditional measures of reliability also decrease, regardless of individual performance stability

Classical Test Theory

  • Assumes that everyone has a "true score" on a test
  • True score reflects an individual's ability level as measured by a particular test
  • Random error affects the observed score

Domain Sampling Theory

  • Estimates the extent to which specific sources of variation contribute to test scores
  • Considers problems created by using a limited number of items to represent a large construct

Test Reliability

  • Conceived as an objective measure of how precisely a test score assesses a domain
  • Reliability is a function of the proportion of total variance attributed to true variance
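The idea that reliability is the proportion of total variance attributable to true variance can be sketched with simulated scores. The means and standard deviations below are arbitrary choices for illustration, not values from the text:

```python
import random

random.seed(0)

# Hypothetical true scores plus random measurement error (X = T + E)
true_scores = [random.gauss(100, 15) for _ in range(10_000)]
observed = [t + random.gauss(0, 5) for t in true_scores]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Reliability = true variance / total observed variance
reliability = variance(true_scores) / variance(observed)
print(round(reliability, 2))  # close to 0.90, since 15^2 / (15^2 + 5^2) = 0.9
```

Shrinking the error SD toward zero pushes the estimate toward 1.0, matching the definition of a perfectly reliable test.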

Generalizability Theory

  • Based on the idea that test scores vary due to variables in the testing situation
  • Universe: the test situation
  • Facets: number of items, amount of review, and purpose of test administration
  • Given the same conditions, the same test score should be obtained (universe score)

Decision Study

  • Examines the usefulness of test scores in helping test users make decisions

Systematic Error

  • Factors inherent in a test that prevent accurate, impartial measurement

Item Response Theory

  • The probability of a person with a certain ability level performing at a certain level on a test
  • Focuses on item difficulty
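A minimal sketch of the one-parameter (Rasch) IRT model, which expresses the probability of a correct response as a function of person ability and item difficulty. The function name and parameter values are illustrative:

```python
import math

def rasch_probability(ability: float, difficulty: float) -> float:
    """One-parameter (Rasch) IRT model: probability of a correct
    response given person ability (theta) and item difficulty (b)."""
    return 1 / (1 + math.exp(-(ability - difficulty)))

# A person whose ability matches the item's difficulty has a 50% chance
print(rasch_probability(0.0, 0.0))  # 0.5
# An easier item (b = -1) gives the same person a higher probability
print(round(rasch_probability(0.0, -1.0), 2))  # 0.73
```

Two- and three-parameter models add item discrimination and a guessing parameter to this basic curve.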

Latent-Trait Theory

  • A system of assumptions about measurement and the extent to which items measure a trait
  • Computers are used to focus on the range of item difficulty that helps assess an individual's ability level
  • If a person answers easy items correctly, the computer will move to more difficult items
  • Item attributes: difficulty, discrimination, and (in the three-parameter model) guessing

Construct Validity (Umbrella Validity)

  • Covers all types of validity
  • Logical and statistical
  • Judgment about the appropriateness of inferences drawn from test scores regarding individual standing on a variable called a construct

Criterion Validity

  • More statistical than logical
  • Judgment about the adequacy of test scores in inferring an individual's standing on a criterion measure
  • Criterion: a standard on which a judgment or decision may be made
  • Characteristics: relevant, valid, uncontaminated
  • Types of criterion validity: concurrent, predictive, and incremental validity

Factor Analysis

  • Designed to identify factors or variables that are typically attributes, characteristics, or dimensions on which people may differ
  • Developed by Charles Spearman
  • Employed as a data reduction method
  • Used to study the interrelationships among a set of variables
  • Types of factor analysis: exploratory and confirmatory (factor loadings describe how strongly each variable relates to a factor)

Cross-Validation

  • Validation of a test to a criterion based on a different group from the original group
  • Validity shrinkage: a decrease in validity after cross-validation
  • Co-validation: validation of more than one test from the same group
  • Co-norming: norming more than one test from the same group

Bias

  • Factors inherent in a test that systematically prevent accurate, impartial measurement
  • Prevention: during test development through procedures such as estimated true score transformation

Rating

  • Numerical or verbal judgment that places a person or attribute along a continuum identified by a scale
  • Rating error: intentional or unintentional misuse of the scale
  • Types of rating error: leniency, severity, central tendency, and halo effect
  • One way to overcome rating errors is to use rankings

Discriminant Evidence

  • Definition: A validity coefficient showing little relationship between test scores and/or other variables
  • Importance: Used in Psychological Assessment

Measures of Central Tendency

Mode

  • Definition: The most frequently occurring score in a distribution
  • Use: For nominal scales and discrete variables (the only measure of central tendency appropriate for nominal data)
  • Characteristics: Can be used in analyses of qualitative or verbal nature, and gives an indication of the shape of the distribution

Measures of Spread or Variability

  • Definition: Statistics that describe the amount of variation in a distribution
  • Use: Gives an idea of how well the measure of central tendency represents the data
  • Characteristics: Large spread of values means large differences between individual scores

Range

  • Definition: The difference between the highest and lowest score
  • Use: Provides a quick but gross description of the spread of scores
  • Characteristics: Can be affected by extreme scores in the distribution

Variance

  • Definition: The arithmetic mean of the squared deviations of the scores about their mean
  • Use: Measures the spread of scores around the mean
  • Characteristics: Expressed in squared score units

Standard Deviation

  • Definition: The square root of the variance
  • Use: Measures the typical distance of scores from the mean, in the original score units
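The relationship between variance and standard deviation can be checked numerically. The score list is hypothetical:

```python
import math

scores = [10, 12, 14, 16, 18]
mean = sum(scores) / len(scores)  # 14.0

# Variance: the mean of the squared deviations about the mean
var = sum((x - mean) ** 2 for x in scores) / len(scores)

# Standard deviation: the square root of the variance
sd = math.sqrt(var)

print(var, round(sd, 3))  # 8.0 2.828
```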

Measures of Location

Percentile or Percentile Rank

  • Definition: Expressed in terms of the percentage of persons in the standardization sample who fall below a given score
  • Use: Indicates the individual's relative position in the standardization sample
  • Characteristics: Not linearly transformable; percentile points are bunched together near the middle of the distribution, while the outer ends show large intervals between them
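A percentile rank, as defined above, can be computed directly. The standardization sample below is made up for illustration:

```python
def percentile_rank(score, sample):
    """Percentage of scores in the standardization sample
    that fall below the given score."""
    below = sum(s < score for s in sample)
    return 100 * below / len(sample)

sample = [55, 60, 62, 65, 70, 72, 75, 80, 85, 90]
print(percentile_rank(72, sample))  # 50.0
```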

Quartile

  • Definition: Dividing points between the four quarters in the distribution
  • Use: Specific point in the distribution

Correlation

Spearman Rho

  • Definition: Used for ordinal + ordinal data
  • Use: Measures the correlation between two variables

Biserial

  • Definition: Used for artificial dichotomous + interval/ratio data
  • Use: Measures the correlation between two variables

Point Biserial

  • Definition: Used for true dichotomous + interval/ratio data
  • Use: Measures the correlation between two variables

Phi Coefficient

  • Definition: Used for true dichotomous + true dichotomous data
  • Use: Measures the correlation between two variables

Tetrachoric

  • Definition: Used for artificial dichotomous + artificial dichotomous data
  • Use: Measures the correlation between two variables

Kendall's Coefficient of Concordance (W)

  • Definition: Used for three or more sets of ordinal/rank data
  • Use: Measures agreement among multiple rankings

Rank Biserial

  • Definition: Used for a dichotomous variable (two separate groups) + ordinal data
  • Use: Measures the correlation between two variables

T-Test Independent

  • Definition: Used for comparing the means of two separate, unrelated groups
  • Use: Two groups, one score each (e.g., blood pressure of male vs. female students)

T-Test Dependent (Paired T-Test)

  • Definition: Used for comparing the means of two related sets of scores
  • Use: One group, two scores (e.g., blood pressure before and after a lecture)

One-Way ANOVA

  • Definition: Used for comparing means across three or more independent groups
  • Use: Three or more groups, one score each

One-Way Repeated Measures ANOVA

  • Definition: Used for comparing means within one group measured on three or more occasions
  • Use: One group, measured at least three times

Two-Way ANOVA

  • Definition: Used for comparing means across groups defined by two independent variables (factors)
  • Use: Tests the main effect of each factor as well as their interaction on the dependent variable

Utility Gain

  • Definition: Estimate of the benefit of using a particular test
  • Use: In Psychological Assessment

Productivity Gains

  • Definition: An estimated increase in work output
  • Use: In Psychological Assessment

Cut Score

  • Definition: Reference point derived as a result of a judgment and used to divide a set of data into two or more classifications
  • Use: In Psychological Assessment

Relative Cut Score

  • Definition: Reference point based on norm-referenced considerations, not fixed per se
  • Use: In Psychological Assessment

Fixed Cut Scores

  • Definition: Set with reference to a judgment concerning minimum level of proficiency required
  • Use: In Psychological Assessment

Multiple Cut Scores

  • Definition: Refers to the use of two or more cut scores with reference to one predictor for the purpose of categorization
  • Use: In Psychological Assessment

Multiple Hurdle

  • Definition: Multi-stage selection process, a cut score is in place for each predictor
  • Use: In Psychological Assessment

Compensatory Model of Selection

  • Definition: Assumption that high scores on one attribute can compensate for lower scores
  • Use: In Psychological Assessment
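The contrast between a multiple-hurdle and a compensatory selection model can be sketched as follows. The applicant scores, cut scores, and weights are all hypothetical:

```python
# Hypothetical applicant with scores on two predictors
applicant = {"cognitive": 82, "integrity": 65}
cut_scores = {"cognitive": 70, "integrity": 70}

# Multiple hurdle: must clear the cut score on EVERY predictor
passes_hurdles = all(applicant[p] >= cut_scores[p] for p in cut_scores)

# Compensatory: a weighted composite, so a high score on one
# attribute can offset a lower score on another
weights = {"cognitive": 0.6, "integrity": 0.4}
composite = sum(weights[p] * applicant[p] for p in weights)
passes_compensatory = composite >= 70

print(passes_hurdles, passes_compensatory)  # False True
```

The same applicant fails the hurdle model (integrity is below its cut score) but passes the compensatory model, because the strong cognitive score lifts the composite above the overall cut.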

Angoff Method

  • Definition: Setting fixed cut scores
  • Use: In Psychological Assessment

Known Groups Method

  • Definition: Collection of data on the predictor of interest from groups known to possess and not possess a trait of interest
  • Use: In Psychological Assessment

IRT-Based Methods

  • Definition: Cut scores are typically set based on test-taker's performance across all the items on the test
  • Use: In Psychological Assessment

Item-Mapping Method

  • Definition: Arrangement of items in a histogram, with each column containing items deemed to be equivalent in value
  • Use: In Psychological Assessment

Bookmark Method

  • Definition: Expert places a "bookmark" between the two pages deemed to separate test-takers who have acquired the minimal knowledge, skills, and/or abilities from those who have not
  • Use: In Psychological Assessment

Method of Predictive Yield

  • Definition: Takes into account the number of positions to be filled, projections regarding the likelihood of offer acceptance, and the distribution of applicant scores
  • Use: In Psychological Assessment

Discriminant Analysis

  • Definition: Used to shed light on the relationship between identified variables and two naturally occurring groups
  • Use: In Psychological Assessment, used to analyze data when the criterion or dependent variable is categorical and the predictor or independent variable is interval in nature

Reliability and Validity

Reliability

  • Definition: The consistency of test scores
  • Interpretation:
    • Excellent: 0.90 and up
    • Good: 0.80-0.89
    • Adequate: 0.70-0.79
    • Limited applicability: below 0.70

Validity

  • Definition: The degree to which a test measures what it claims to measure
  • Interpretation:
    • Very beneficial: above 0.35
    • Likely to be useful: 0.21-0.35
    • Depends on circumstances: 0.11-0.20
    • Unlikely to be useful: below 0.11
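The interpretive bands above translate directly into a small lookup sketch:

```python
def interpret_reliability(r: float) -> str:
    """Map a reliability coefficient onto the bands listed above."""
    if r >= 0.90:
        return "excellent"
    if r >= 0.80:
        return "good"
    if r >= 0.70:
        return "adequate"
    return "may have limited applicability"

def interpret_validity(v: float) -> str:
    """Map a validity coefficient onto the bands listed above."""
    if v > 0.35:
        return "very beneficial"
    if v >= 0.21:
        return "likely to be useful"
    if v >= 0.11:
        return "depends on circumstances"
    return "unlikely to be useful"

print(interpret_reliability(0.84))  # good
print(interpret_validity(0.28))     # likely to be useful
```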

Item Analysis

Item-Validity Index

  • Definition: Designed to provide an indication of the degree to which a test is measuring what it purports to measure
  • Use: In Psychological Assessment

Item-Discrimination Index

  • Definition: Measures the difference between the proportion of high scorers answering a question correctly and the proportion of low scorers answering it correctly
  • Use: In Psychological Assessment
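The item-discrimination index d is simply the difference between the two proportions. The proportions below are hypothetical:

```python
def discrimination_index(p_upper: float, p_lower: float) -> float:
    """d = proportion of high scorers answering the item correctly
    minus the proportion of low scorers answering it correctly."""
    return p_upper - p_lower

# 90% of the top group but only 40% of the bottom group got the item right
print(round(discrimination_index(0.90, 0.40), 2))  # 0.5
```

A negative d flags the pathological case described in the quiz above, where low scorers outperform high scorers on an item.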

Extreme Group Method

  • Definition: Compares people who have done well with those who have done poorly
  • Use: In Psychological Assessment

Discrimination Index

  • Definition: The difference between those proportions
  • Use: In Psychological Assessment

Point-Biserial Method

  • Definition: Correlation between a dichotomous variable and continuous variable
  • Use: In Psychological Assessment
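A minimal sketch of the point-biserial correlation between item correctness (a true dichotomy coded 0/1) and total test scores; all data below are hypothetical:

```python
import math

def point_biserial(dichotomous, continuous):
    """Point-biserial correlation between a true dichotomous
    variable (0/1) and a continuous variable:
    r_pb = (M1 - M0) / SD * sqrt(p * q)
    """
    ones = [c for d, c in zip(dichotomous, continuous) if d == 1]
    zeros = [c for d, c in zip(dichotomous, continuous) if d == 0]
    n = len(continuous)
    mean = sum(continuous) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in continuous) / n)
    p = len(ones) / n            # proportion scoring 1
    m1 = sum(ones) / len(ones)   # mean criterion score of the 1s
    m0 = sum(zeros) / len(zeros) # mean criterion score of the 0s
    return (m1 - m0) / sd * math.sqrt(p * (1 - p))

# Item correctness (1/0) against hypothetical total test scores
correct = [1, 1, 1, 0, 0, 1, 0, 1]
totals = [30, 28, 25, 18, 15, 27, 20, 24]
print(round(point_biserial(correct, totals), 2))
```

A strongly positive value indicates that examinees who pass the item also tend to earn higher total scores, i.e., the item discriminates in the intended direction.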
