Questions and Answers
What is a characteristic of a good item in academic testing?
What is the purpose of item analysis in academic testing?
What does the item-difficulty index measure?
What is the purpose of the Item-Endorsement Index in other contexts?
What does the index of item discrimination in item analysis indicate?
Which statistical technique is used for selecting and rejecting items based on their difficulty value and discrimination power?
What does the Item-Reliability Index provide an indication of?
How is the Average Difficulty of items calculated?
What is the Guessing Probability for a five-option multiple-choice item?
What does an Item-Difficulty Index of 0.53 suggest?
What is the purpose of calculating the Item-Reliability Index?
What is the primary difference between norm-referenced tests and criterion-referenced tests?
What is the purpose of pilot work in test construction?
Who is credited with being at the forefront of developing methodologically sound scaling methods?
Which type of scale would be most appropriate if test-takers' performance as a function of grade is of critical interest?
In test construction, what does scaling refer to?
What is the central concept behind norm-referenced instruments?
What is the purpose of think-aloud test administration?
What is the role of expert panels in test development?
What is the purpose of sensitivity review in test development?
How does an expert panel contribute to modifying a test?
What is the primary function of using IRT in building and revising tests?
Why do researchers caution against allowing test-takers to describe a test?
What problem does the responsible test developer address in the test manual?
What does item fairness refer to?
How can ICC and DIF be useful in identifying biased test items?
How are biased test items identified in relation to specific groups of examinees?
What is one of the key requirements for an item to be considered fair to different groups of test takers?
What does Differential Item Functioning (DIF) identify in specific items?
Study Notes
Academic Testing
- A good item is one that high scorers on the overall test tend to answer correctly and low scorers tend to answer incorrectly.
- Item analysis is used to select and reject items based on their difficulty value and discrimination power.
Item Analysis
- Purpose: To determine if an item is too easy or too difficult, and how well it discriminates between high and low scorers.
- Tools:
  - Index of item difficulty
  - Index of item reliability
  - Index of item validity
  - Index of item discrimination
- Distractor analysis is used to determine if all alternatives functioned as intended.
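One common way to quantify how well an item discriminates between high and low scorers is the index d = (U - L) / n, where U and L are the numbers of correct answers in the upper and lower scoring groups and n is the size of each group. The notes do not give this formula, so the sketch below is an assumption: it uses the conventional upper and lower 27% of test takers, and the function name and data are hypothetical.

```python
def discrimination_index(item_scores, total_scores, fraction=0.27):
    """Return d = (U - L) / n for one dichotomously scored item.

    item_scores: 1 (correct) or 0 (incorrect) for each test taker.
    total_scores: each test taker's total test score, in the same order.
    fraction: share of test takers in each extreme group (assumed 27%).
    """
    if len(item_scores) != len(total_scores):
        raise ValueError("item_scores and total_scores must have equal length")
    # Rank test takers from lowest to highest total score.
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    n = max(1, round(len(order) * fraction))  # size of each extreme group
    lower, upper = order[:n], order[-n:]
    u = sum(item_scores[i] for i in upper)  # correct answers among high scorers
    l = sum(item_scores[i] for i in lower)  # correct answers among low scorers
    return (u - l) / n


# Hypothetical data: ten test takers answering one item.
totals = [12, 15, 18, 20, 22, 25, 27, 30, 33, 35]
item = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]
d = discrimination_index(item, totals)
print(f"d = {d:.2f}")
```

A positive d means high scorers answered the item correctly more often than low scorers did. A value near zero or below flags the item for review.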
Item Difficulty Index
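- Calculated as the proportion of test takers who answered the item correctly; values range from 0 (no one correct) to 1 (everyone correct), so a higher p means an easier item.
- Also known as the Item-Endorsement Index in personality testing, where it gives the proportion of test takers who endorsed the item.
- Average difficulty across the test: Average p = (p1 + p2 + p3 + … + pn) / n, where n is the total number of items.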
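The difficulty calculations above can be sketched in a few lines. The response matrix and function names are hypothetical. The guessing probability is included because the quiz asks about it; it assumes random guessing among equally attractive options.

```python
def item_difficulty(item_scores):
    """Proportion of test takers who answered the item correctly (p)."""
    return sum(item_scores) / len(item_scores)


def average_difficulty(p_values):
    """Mean of the item-difficulty indices across all items on the test."""
    return sum(p_values) / len(p_values)


def guessing_probability(n_options):
    """Chance of answering correctly by random guessing among n_options."""
    return 1 / n_options


# Hypothetical responses: rows are test takers, columns are items (1 = correct).
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]
# zip(*responses) turns rows of test takers into columns of items.
p_values = [item_difficulty(column) for column in zip(*responses)]
print(p_values)
print(average_difficulty(p_values))
print(guessing_probability(5))
```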
Item Reliability Index
- Provides an indication of the internal consistency of a test.
- Calculated as the product of the item-score standard deviation and the Pearson correlation between the item score and the total test score.
- Higher index suggests better internal consistency.
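A minimal sketch of the calculation, assuming the usual definition of the index as the item-score standard deviation multiplied by the correlation between the item score and the total test score. The helper names and data are hypothetical, and the population standard deviation is an assumption.

```python
import math


def population_sd(values):
    """Standard deviation using N (not N - 1) in the denominator."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return cross / math.sqrt(ss_x * ss_y)


def item_reliability_index(item_scores, total_scores):
    """Item-score SD multiplied by the item-total correlation."""
    return population_sd(item_scores) * pearson_r(item_scores, total_scores)


# Hypothetical data: four test takers, one item scored 0/1.
item = [0, 0, 1, 1]
totals = [1, 2, 3, 4]
index = item_reliability_index(item, totals)
print(round(index, 3))
```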
Types of Tests
- Norm-referenced test: Compares test-taker's performance to peers in a norming group.
- Criterion-referenced test: Compares test-taker's performance to an objective standard.
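The two interpretations can be contrasted with a small sketch. The same raw score is read against a norming group in one case and against a fixed cutoff in the other. The scores, cutoff, and function names are hypothetical. Percentile rank is defined here as the percentage of the norming group scoring below the given score, which is one of several conventions.

```python
def percentile_rank(score, norm_group_scores):
    """Norm-referenced reading: percentage of the norming group scoring below `score`."""
    below = sum(1 for s in norm_group_scores if s < score)
    return 100 * below / len(norm_group_scores)


def meets_criterion(score, cutoff):
    """Criterion-referenced reading: does the score reach the fixed standard?"""
    return score >= cutoff


# Hypothetical norming group and cutoff.
norm_group = [55, 60, 62, 68, 70, 74, 78, 81, 85, 90]
score = 78
print(percentile_rank(score, norm_group))  # standing relative to peers
print(meets_criterion(score, cutoff=80))   # standing relative to the standard
```

In this example the score sits above most of the norming group yet still falls short of the standard, which is why the two kinds of tests can lead to different decisions about the same test taker.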
Test Construction
- Scaling: Setting rules for assigning numbers in measurement.
- Types of scales:
  - Age-based scale
  - Grade-based scale
  - Stanine scale
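As an illustration of a scaling rule, the sketch below assigns stanines from percentile ranks. It assumes the standard stanine bands, which cover 4, 7, 12, 17, 20, 17, 12, 7, and 4 percent of the distribution. The handling of scores that fall exactly on a boundary is an assumption, and the function name is hypothetical.

```python
from bisect import bisect_left

# Upper cumulative-percentage bounds for stanines 1 through 8;
# any percentile rank above 96 falls in stanine 9.
STANINE_CUTOFFS = [4, 11, 23, 40, 60, 77, 89, 96]


def stanine(percentile_rank):
    """Map a percentile rank (0 to 100) to a stanine from 1 to 9."""
    if not 0 <= percentile_rank <= 100:
        raise ValueError("percentile_rank must be between 0 and 100")
    # bisect_left places a rank equal to a cutoff in the lower stanine.
    return bisect_left(STANINE_CUTOFFS, percentile_rank) + 1


for rank in (3, 50, 97):
    print(rank, stanine(rank))
```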
Pilot Work
- Preliminary research surrounding the creation of a prototype test.
- Determines how best to measure a targeted construct.
- A necessity for constructing tests for publication and wide distribution.
Think-Aloud Administration
- A qualitative research tool to shed light on test-taker's thought processes during test administration.
- Respondents verbalize thoughts as they occur.
Expert Panels
- Provide qualitative analyses of test items.
- Used to modify or revise the test.
Sensitivity Review
- A study of test items to ensure fairness to all prospective test-takers and absence of offensive language, stereotypes, or situations.
Test Revision
- A stage in new test development.
- Used to address issues identified during item analysis.
Item Fairness
- Refers to the degree, if any, to which a test item is biased.
- Many methods can be used to identify biased test items, including ICC and DIF.
Biased Test Items
- An item that favors one particular group of examinees over another when differences in group ability are controlled.
- ICC and DIF can be used to identify biased items.
Item-Characteristic Curves (ICC)
- Used to identify biased items.
- For an item to be considered fair, the ICCs for different groups should not differ significantly.
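To make the comparison of curves concrete, the sketch below computes ICCs for two groups and reports the largest gap between them. The notes do not name a model, so the two-parameter logistic form is an assumption. The parameter values are hypothetical, and the gap is only a descriptive check, not a significance test.

```python
import math


def icc(theta, a, b):
    """Probability of a correct response at ability theta (2PL logistic model).

    a: discrimination parameter, b: difficulty parameter.
    """
    return 1 / (1 + math.exp(-a * (theta - b)))


def max_icc_gap(params_group_1, params_group_2, thetas):
    """Largest absolute difference between two groups' ICCs over the ability grid."""
    return max(
        abs(icc(t, *params_group_1) - icc(t, *params_group_2)) for t in thetas
    )


# Ability grid from -3.0 to 3.0 in steps of 0.1.
thetas = [x / 10 for x in range(-30, 31)]

# Hypothetical (a, b) estimates for the same item in two groups.
same = max_icc_gap((1.2, 0.0), (1.2, 0.0), thetas)     # identical curves
shifted = max_icc_gap((1.2, 0.0), (1.2, 0.8), thetas)  # harder for group 2
print(same, round(shifted, 3))
```

Identical curves give a gap of zero. A large gap at matched ability levels is the pattern that would prompt a closer look at the item.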
Differential Item Functioning (DIF)
- Specific items are identified as biased in a statistical sense if they exhibit DIF.
- Used to identify biased items.
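The notes do not say which DIF statistic is used. One widely used option is the Mantel-Haenszel common odds ratio, sketched here as an assumption. Test takers are first grouped into strata of similar total score, so that differences in group ability are controlled, and the odds of a correct answer are then compared between a reference group and a focal group within each stratum. The function name and counts are hypothetical.

```python
def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio for one item across ability strata.

    Each stratum is a tuple of counts:
    (reference correct, reference incorrect, focal correct, focal incorrect).
    """
    numerator = 0.0
    denominator = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in strata:
        n = ref_right + ref_wrong + focal_right + focal_wrong
        if n == 0:
            continue  # skip empty strata
        numerator += ref_right * focal_wrong / n
        denominator += ref_wrong * focal_right / n
    if denominator == 0:
        raise ValueError("odds ratio is undefined for these counts")
    return numerator / denominator


# Hypothetical counts for two ability strata.
no_dif = [(30, 10, 15, 5), (20, 20, 10, 10)]
with_dif = [(30, 10, 10, 10), (20, 20, 5, 15)]
print(mantel_haenszel_odds_ratio(no_dif))
print(mantel_haenszel_odds_ratio(with_dif))
```

A ratio near 1 suggests the item behaves similarly for both groups once ability is matched. A ratio well above 1 suggests the item favors the reference group.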
Description
Learn about the differences between norm-referenced tests that compare performance to peers, and criterion-referenced tests that compare performance to an objective standard. Explore how these tests are used in licensing and educational contexts, and understand the development of criterion-referenced instruments.