Norm-Referenced Tests vs Criterion-Referenced Tests

Questions and Answers

What is a characteristic of a good item in academic testing?

  • Answered correctly by high scorers and incorrectly by low scorers (correct)
  • Answered incorrectly by both high and low scorers
  • Answered correctly by both high and low scorers
  • Answered incorrectly by high scorers and correctly by low scorers
What is the purpose of item analysis in academic testing?

  • To analyze the test takers' confidence levels
  • To determine item difficulty and discrimination power (correct)
  • To assess the physical environment during testing
  • To evaluate the test's duration

What does the item-difficulty index measure?

  • The proportion of test takers who answered an item correctly (correct)
  • The proportion of test takers who answered an item incorrectly
  • The number of test takers who skipped the item
  • The time taken by test takers to answer an item

What is the purpose of the Item-Endorsement Index in other contexts?

    To indicate the proportion of test takers who endorse an item (its role in personality testing)

    What does the index of item discrimination in item analysis indicate?

    How well an item differentiates between high and low scorers

    Which statistical technique is used for selecting and rejecting items based on their difficulty value and discrimination power?

    Item Analysis

    What does the Item-Reliability Index provide an indication of?

    Internal consistency of a test

    How is the Average Difficulty of items calculated?

    By taking the average of the item difficulties

    What is the Guessing Probability for a five-option multiple-choice item?

    0.20

    What does an Item-Difficulty Index of 0.53 suggest?

    Moderate difficulty: slightly more than half of the test takers answered the item correctly

    What is the purpose of calculating the Item-Reliability Index?

    To measure the internal consistency of a test

    What is the primary difference between norm-referenced tests and criterion-referenced tests?

    Norm-referenced tests compare performance to peers, while criterion-referenced tests compare performance to an objective standard.

    What is the purpose of pilot work in test construction?

    To determine how best to measure a targeted construct.

    Who is credited with being at the forefront of developing methodologically sound scaling methods?

    L.L. Thurstone

    Which type of scale would be most appropriate if test-takers' performance as a function of grade is of critical interest?

    Grade-based scale

    In test construction, what does scaling refer to?

    Setting rules for assigning numbers in measurement

    What is the central concept behind norm-referenced instruments?

    Comparing test-takers' performance to peers in a norming group

    What is the purpose of think-aloud test administration?

    To shed light on the test taker's thought processes during test administration

    What is the role of expert panels in test development?

    To assess the fairness of test items

    What is the purpose of sensitivity review in test development?

    To examine the presence of offensive language or stereotypes in tests

    How does an expert panel contribute to modifying a test?

    By providing qualitative analyses of test items

    What is the primary function of using IRT in building and revising tests?

    Ensuring accurate measurement of test takers' abilities

    Why do researchers caution against allowing testtakers to describe a test?

    Because it is similar to allowing students to describe their instructors' problems

    What problem does the responsible test developer address in the test manual?

    The problem of guessing by test takers

    What does ITEM FAIRNESS refer to?

    The extent to which a test item is unbiased

    How can ICC and DIF be useful in identifying biased test items?

    By analyzing the graphs and functioning of each item relative to the test takers' abilities and backgrounds

    How are biased test items identified in relation to specific groups of examinees?

    By identifying items that favor one group over another when differences in group ability are controlled

    What is one of the key requirements for an item to be considered fair to different groups of test takers?

    The item should have similar ICCs for different groups

    What does Differential Item Functioning (DIF) identify in specific items?

    Biases exhibited by certain items in a statistical sense

    Study Notes

    Academic Testing

    • A good item is one that high scorers on the overall test tend to answer correctly and low scorers tend to answer incorrectly.
    • Item analysis is used to select and reject items based on their difficulty value and discrimination power.

    Item Analysis

    • Purpose: To determine if an item is too easy or too difficult, and how well it discriminates between high and low scorers.
    • Tools:
      • Index of item difficulty
      • Index of item reliability
      • Index of item validity
      • Index of item discrimination
    • Distractor analysis is used to determine if all alternatives functioned as intended.
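
A minimal sketch of the discrimination idea, not taken from the lesson: a simple item-discrimination index d computed as the difference in proportion correct between the highest- and lowest-scoring groups on the whole test. The 27% split, the function name, and the data are illustrative assumptions.

```python
# Minimal sketch (illustrative only): item-discrimination index
# d = p_upper - p_lower, the difference in the proportion answering the item
# correctly between the top- and bottom-scoring groups on the total test.

def discrimination_index(item_correct, total_scores, fraction=0.27):
    """item_correct: 0/1 flags for one item; total_scores: total test scores."""
    n = len(total_scores)
    k = max(1, int(n * fraction))                      # size of each extreme group
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_correct[i] for i in upper) / k  # proportion correct, high scorers
    p_lower = sum(item_correct[i] for i in lower) / k  # proportion correct, low scorers
    return p_upper - p_lower                           # large positive d = good item

# Toy data: high scorers answer correctly, low scorers do not -> d = 1.0
item_responses = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]
total_scores   = [48, 45, 44, 30, 42, 22, 25, 18, 47, 20]
print(discrimination_index(item_responses, total_scores))
```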

    Item Difficulty Index

    • Calculated as the proportion of test takers who answered the item correctly (p).
    • Known as the Item-Endorsement Index in personality testing, where it gives the proportion of test takers who endorse the item.
    • Average difficulty: Average p = (p1 + p2 + p3 + … + pn) / n, where n is the total number of items.
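
A minimal sketch of both calculations on made-up response data (the response matrix below is purely illustrative):

```python
# Minimal sketch (made-up data): item difficulty p = proportion of test takers
# answering the item correctly; average difficulty = mean of the item p values.

responses = [       # rows = test takers, columns = items; 1 = correct, 0 = incorrect
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
    [0, 1, 0, 1],
]

n_takers = len(responses)
n_items = len(responses[0])
p_values = [sum(row[j] for row in responses) / n_takers for j in range(n_items)]
average_p = sum(p_values) / n_items   # (p1 + p2 + ... + pn) / n

print(p_values)    # [0.75, 0.75, 0.25, 0.75]
print(average_p)   # 0.625
```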

    Item Reliability Index

    • Provides an indication of the internal consistency of a test.
    • Calculated as the product of the item-score standard deviation and the Pearson correlation between the item score and the total test score (the item-validity index instead uses the correlation with a criterion score).
    • Higher index suggests better internal consistency.
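
A minimal sketch of that product for a dichotomously scored item. The scores are invented and the correlation helper is an ordinary Pearson r; this is not the lesson's own worked example.

```python
# Minimal sketch (invented data): item-reliability index = item-score standard
# deviation * Pearson correlation between the item score and the total test score.

import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)

item_scores  = [1, 1, 0, 1, 0, 0, 1, 0]          # dichotomous item (1 = correct)
total_scores = [40, 38, 22, 35, 25, 20, 42, 18]  # total test scores

p = sum(item_scores) / len(item_scores)
s_item = math.sqrt(p * (1 - p))                  # SD of a dichotomously scored item
item_reliability_index = s_item * pearson_r(item_scores, total_scores)
print(round(item_reliability_index, 3))          # higher -> better internal consistency
```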

    Types of Tests

    • Norm-referenced test: Compares test-taker's performance to peers in a norming group.
    • Criterion-referenced test: Compares test-taker's performance to an objective standard.
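
A small sketch of the contrast, using an invented norming group and an invented cut score: the same raw score gets a relative interpretation in one case and a pass/fail interpretation in the other.

```python
# Minimal sketch (hypothetical data): two interpretations of one raw score.

norming_group = [35, 42, 28, 50, 45, 38, 31, 47, 40, 36]  # peers' scores
cut_score = 40                                             # objective standard
raw_score = 43

# Norm-referenced: where does the test taker stand relative to peers?
percentile = 100 * sum(s < raw_score for s in norming_group) / len(norming_group)

# Criterion-referenced: did the test taker meet the standard?
meets_criterion = raw_score >= cut_score

print(f"percentile rank ~ {percentile:.0f}")   # relative standing in the norming group
print(f"meets criterion: {meets_criterion}")   # pass/fail against the cut score
```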

    Test Construction

    • Scaling: Setting rules for assigning numbers in measurement.
    • Types of scales:
      • Age-based scale
      • Grade-based scale
      • Stanine scale
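
As one concrete example of a scaling rule, the sketch below maps percentile ranks onto the stanine scale. The cut points assume the conventional 4-7-12-17-20-17-12-7-4 percentage bands, which the lesson itself does not spell out.

```python
# Minimal sketch (assumed conventional stanine bands): convert a percentile
# rank to a stanine, a nine-point standard score scale.

import bisect

# Cumulative upper percentile bounds for stanines 1..8 (stanine 9 covers the rest).
STANINE_CUTS = [4, 11, 23, 40, 60, 77, 89, 96]

def percentile_to_stanine(percentile):
    return bisect.bisect_left(STANINE_CUTS, percentile) + 1

for pr in (3, 25, 50, 75, 98):
    print(pr, "->", percentile_to_stanine(pr))  # 3->1, 25->4, 50->5, 75->6, 98->9
```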

    Pilot Work

    • Preliminary research surrounding the creation of a prototype test.
    • Determines how best to measure a targeted construct.
    • A necessity for constructing tests for publication and wide distribution.

    Think-Aloud Administration

    • A qualitative research tool to shed light on test-taker's thought processes during test administration.
    • Respondents verbalize thoughts as they occur.

    Expert Panels

    • Provide qualitative analyses of test items.
    • Used to modify or revise the test.

    Sensitivity Review

    • A study of test items to ensure fairness to all prospective test-takers and absence of offensive language, stereotypes, or situations.

    Test Revision

    • A stage in new test development.
    • Used to address issues identified during item analysis.

    Item Fairness

    • Refers to the degree to which a test item is free of bias.
    • Many methods can be used to identify biased test items, including item-characteristic curves (ICC) and differential item functioning (DIF).

    Biased Test Items

    • An item that favors one particular group of examinees over another when differences in group ability are controlled.
    • ICC and DIF can be used to identify biased items.

    Item-Characteristic Curves (ICC)

    • Used to identify biased items.
    • ICC for different groups should not be significantly different for an item to be considered fair.
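
A minimal sketch of what an item-characteristic curve is, assuming a two-parameter logistic form; the lesson does not specify a model, and the parameters a and b below are invented.

```python
# Minimal sketch (assumed 2PL form): an ICC gives the probability of answering
# the item correctly as a function of ability (theta). Comparing the curve
# across groups is the fairness check described above.

import math

def icc(theta, a=1.2, b=0.0):
    """a = discrimination, b = difficulty."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

for theta in (-2, -1, 0, 1, 2):
    print(theta, round(icc(theta), 2))  # probability of a correct response rises with ability
```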

    Differential Item Functioning (DIF)

    • Specific items are identified as biased in a statistical sense if they exhibit DIF.
    • Used to identify biased items.
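
To illustrate the matched-comparison idea behind DIF screening, the sketch below groups invented test takers into total-score bands and compares each group's proportion correct on one item within the same band. Operational DIF analyses use formal statistics (e.g., Mantel-Haenszel), which this sketch does not implement.

```python
# Minimal sketch (toy data): compare item performance for two groups of test
# takers matched on total score; gaps at matched ability levels flag the item
# for DIF review.

from collections import defaultdict

# (group, total_score, item_correct) for each test taker
records = [
    ("A", 10, 1), ("A", 10, 1), ("A", 20, 1), ("A", 20, 1), ("A", 30, 1),
    ("B", 10, 0), ("B", 10, 1), ("B", 20, 0), ("B", 20, 1), ("B", 30, 1),
]

by_band = defaultdict(lambda: defaultdict(list))
for group, total, correct in records:
    by_band[total][group].append(correct)

for band in sorted(by_band):
    rates = {g: sum(v) / len(v) for g, v in by_band[band].items()}
    print(band, rates)   # similar rates within a band suggest no DIF on this item
```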

    Description

    Learn about the differences between norm-referenced tests that compare performance to peers, and criterion-referenced tests that compare performance to an objective standard. Explore how these tests are used in licensing and educational contexts, and understand the development of criterion-referenced instruments.
