Questions and Answers
What is a characteristic of a good item in academic testing?
- Answered correctly by high scorers and incorrectly by low scorers (correct)
- Answered incorrectly by both high and low scorers
- Answered correctly by both high and low scorers
- Answered incorrectly by high scorers and correctly by low scorers
What is the purpose of item analysis in academic testing?
- To analyze the test takers' confidence levels
- To determine item difficulty and discrimination power (correct)
- To assess the physical environment during testing
- To evaluate the test's duration
What does the item-difficulty index measure?
- The proportion of test takers who answered an item correctly (correct)
- The proportion of test takers who answered an item incorrectly
- The number of test takers who skipped the item
- The time taken by test takers to answer an item
What is the purpose of the Item-Endorsement Index in other contexts?
What does the index of item discrimination in item analysis indicate?
Which statistical technique is used for selecting and rejecting items based on their difficulty value and discrimination power?
What does the Item-Reliability Index provide an indication of?
How is the Average Difficulty of items calculated?
What is the Guessing Probability for a five-option multiple-choice item?
What does an Item-Difficulty Index of 0.53 suggest?
What is the purpose of calculating the Item-Reliability Index?
What is the primary difference between norm-referenced tests and criterion-referenced tests?
What is the purpose of pilot work in test construction?
Who is credited with being at the forefront of developing methodologically sound scaling methods?
Which type of scale would be most appropriate if test-takers' performance as a function of grade is of critical interest?
In test construction, what does scaling refer to?
What is the central concept behind norm-referenced instruments?
What is the purpose of think-aloud test administration?
What is the role of expert panels in test development?
What is the purpose of sensitivity review in test development?
How does an expert panel contribute to modifying a test?
What is the primary function of using IRT in building and revising tests?
Why do researchers caution against allowing test takers to describe a test?
What problem does the responsible test developer address in the test manual?
What does ITEM FAIRNESS refer to?
How can ICC and DIF be useful in identifying biased test items?
How are biased test items identified in relation to specific groups of examinees?
What is one of the key requirements for an item to be considered fair to different groups of test takers?
What does Differential Item Functioning (DIF) identify in specific items?
Study Notes
Academic Testing
- A good item is one that high scorers on the overall test tend to answer correctly and low scorers tend to answer incorrectly.
- Item analysis is used to select and reject items based on their difficulty value and discrimination power.
Item Analysis
- Purpose: To determine if an item is too easy or too difficult, and how well it discriminates between high and low scorers.
- Tools:
  - Index of item difficulty
  - Index of item reliability
  - Index of item validity
  - Index of item discrimination
- Distractor analysis is used to determine if all alternatives functioned as intended.
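The notes name the index of item discrimination without giving a formula. A minimal sketch follows, assuming the commonly taught form d = (U - L) / n, where U and L are the numbers of test takers in the upper and lower scoring groups who answered the item correctly and n is the size of each group. The function name, the 27% group fraction, and the sample data are illustrative assumptions, not taken from the notes.

```python
def discrimination_index(item_scores, total_scores, fraction=0.27):
    """Return d = (U - L) / n for one item.

    item_scores[i] is 1 if test taker i answered the item correctly, else 0.
    total_scores[i] is test taker i's score on the whole test.
    fraction is the assumed share of test takers placed in each extreme group.
    """
    n_takers = len(total_scores)
    group_size = max(1, round(n_takers * fraction))
    # Rank test takers from lowest to highest total score.
    order = sorted(range(n_takers), key=lambda i: total_scores[i])
    lower_group = order[:group_size]
    upper_group = order[-group_size:]
    upper_correct = sum(item_scores[i] for i in upper_group)
    lower_correct = sum(item_scores[i] for i in lower_group)
    return (upper_correct - lower_correct) / group_size


# Hypothetical data: 10 test takers, listed from highest to lowest total score.
totals = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
good_item = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]      # high scorers get it right
reversed_item = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]  # low scorers get it right

d_good = discrimination_index(good_item, totals)
d_reversed = discrimination_index(reversed_item, totals)
```

Under this form, values near +1 indicate an item that separates high from low scorers well, while negative values flag an item that low scorers answer correctly more often than high scorers.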
Item Difficulty Index
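- Calculated as the proportion of test takers who answered the item correctly.
- Called the item-endorsement index in contexts such as personality testing, where responses are not scored as correct or incorrect.
- Formula for the average difficulty of a test's items: Average p = (p1 + p2 + p3 + … + pn) / n, where n is the total number of items.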
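The two calculations in this section can be sketched in a few lines of Python. The function names and the small score matrix are hypothetical and serve only to illustrate the arithmetic.

```python
def item_difficulty(responses):
    """Proportion of test takers who answered one item correctly.

    responses is a list of 1 (correct) and 0 (incorrect) values.
    """
    return sum(responses) / len(responses)


def average_difficulty(score_matrix):
    """Return each item's p value and the mean p across items.

    score_matrix[i][j] is test taker i's score (1 or 0) on item j.
    """
    n_items = len(score_matrix[0])
    p_values = [
        item_difficulty([row[j] for row in score_matrix])
        for j in range(n_items)
    ]
    return p_values, sum(p_values) / n_items


# Hypothetical data: 4 test takers (rows) by 3 items (columns).
scores = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]
p_values, avg_p = average_difficulty(scores)
```

Here the first two items are answered correctly by three of four test takers (p = 0.75) and the third by one of four (p = 0.25), so a higher p means an easier item.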
Item Reliability Index
- Provides an indication of the internal consistency of a test.
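- Calculated as the product of the item-score standard deviation and the Pearson correlation between the item score and the total test score. (A criterion score is used for the item-validity index, not the item-reliability index.)
- A higher index suggests greater internal consistency.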
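As a rough illustration, the index can be computed as the item-score standard deviation multiplied by the Pearson correlation between item scores and total test scores. The sketch below assumes a population standard deviation (dividing by N) and uses hypothetical function names and data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers.

    Assumes neither list is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def item_reliability_index(item_scores, total_scores):
    """Item-score standard deviation times the item-total correlation."""
    n = len(item_scores)
    mean = sum(item_scores) / n
    # Population standard deviation of the item scores (an assumption).
    s = math.sqrt(sum((v - mean) ** 2 for v in item_scores) / n)
    return s * pearson_r(item_scores, total_scores)


# Hypothetical data: one item's scores and total test scores for 4 test takers.
item = [1, 1, 0, 0]
totals = [9, 7, 4, 2]
index = item_reliability_index(item, totals)
```

For a dichotomous item the standard deviation cannot exceed 0.5, so the index is bounded by 0.5 in absolute value and is largest when the item is of middling difficulty and correlates strongly with the total score.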
Types of Tests
- Norm-referenced test: Compares a test taker's performance with that of peers in a norming group.
- Criterion-referenced test: Compares a test taker's performance with an objective standard.
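The contrast between the two kinds of test can be illustrated by interpreting the same raw score in two ways. This is a minimal sketch: the percentile-rank definition (percentage of the norming group scoring below the given score), the cutoff value, and the data are all assumptions for illustration.

```python
def percentile_rank(score, norm_scores):
    """Norm-referenced view: percentage of the norming group scoring below score."""
    below = sum(1 for s in norm_scores if s < score)
    return 100 * below / len(norm_scores)


def meets_criterion(score, cutoff):
    """Criterion-referenced view: does the score reach a fixed standard?"""
    return score >= cutoff


# Hypothetical norming group and a hypothetical mastery cutoff of 80.
norm_group = [40, 55, 60, 65, 70, 75, 80, 85, 90, 95]
raw_score = 78

rank = percentile_rank(raw_score, norm_group)
passed = meets_criterion(raw_score, cutoff=80)
```

In this example the same raw score sits above more than half of the norming group yet falls short of the fixed standard, which is why the two interpretations can lead to different conclusions.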
Test Construction
- Scaling: Setting rules for assigning numbers in measurement.
- Types of scales:
  - Age-based scale
  - Grade-based scale
  - Stanine scale
Pilot Work
- Preliminary research surrounding the creation of a prototype test.
- Determines how best to measure a targeted construct.
- A necessity for constructing tests for publication and wide distribution.
Think-Aloud Administration
- A qualitative research tool to shed light on test-taker's thought processes during test administration.
- Respondents verbalize thoughts as they occur.
Expert Panels
- Provide qualitative analyses of test items.
- Used to modify or revise the test.
Sensitivity Review
- A study of test items to ensure fairness to all prospective test-takers and absence of offensive language, stereotypes, or situations.
Test Revision
- A stage in new test development.
- Used to address issues identified during item analysis.
Item Fairness
- Refers to the degree, if any, to which a test item is biased.
- Many methods can be used to identify biased test items, including ICC and DIF.
Biased Test Items
- An item that favors one particular group of examinees over another when differences in group ability are controlled.
- ICC and DIF can be used to identify biased items.
Item-Characteristic Curves (ICC)
- Used to identify biased items.
- For an item to be considered fair, the ICCs for different groups should not differ significantly.
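To make the idea of comparing curves concrete, the sketch below assumes a two-parameter logistic (2PL) form for the ICC, which the notes do not specify. The parameter values are hypothetical, and the maximum gap between curves is only a descriptive measure, not a significance test.

```python
import math


def icc(theta, a, b):
    """Probability of a correct response at ability theta under an assumed 2PL model.

    a is the discrimination parameter, b is the difficulty parameter.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def max_icc_gap(params_group1, params_group2, thetas):
    """Largest absolute difference between two groups' curves over a grid of abilities."""
    return max(
        abs(icc(t, *params_group1) - icc(t, *params_group2))
        for t in thetas
    )


# Ability grid from -3.0 to 3.0 in steps of 0.1.
thetas = [x / 10 for x in range(-30, 31)]

# Hypothetical (a, b) estimates for the same item in two groups.
gap_same = max_icc_gap((1.2, 0.0), (1.2, 0.0), thetas)     # identical curves
gap_shifted = max_icc_gap((1.2, 0.0), (1.2, 0.8), thetas)  # item harder for group 2
```

When the curves coincide, test takers of equal ability have the same chance of answering correctly regardless of group. A shifted curve means one group needs more ability for the same chance, which is the pattern a bias review would look into.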
Differential Item Functioning (DIF)
- Specific items are identified as biased in a statistical sense if they exhibit DIF.
- Used to identify biased items.
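The notes do not name a particular DIF statistic. One widely used option is the Mantel-Haenszel common odds ratio, sketched here as an assumption rather than as the method the notes have in mind. Test takers are matched on total score, and within each score level the odds of a correct answer are compared between a reference group and a focal group. The group labels, helper names, and counts are hypothetical.

```python
from collections import defaultdict


def mantel_haenszel_odds_ratio(records):
    """Common odds ratio across total-score strata.

    records is an iterable of (group, total_score, item_correct) tuples,
    where group is "ref" or "focal" and item_correct is 1 or 0.
    Assumes at least one stratum has ref-incorrect and focal-correct cases
    (otherwise the denominator is zero).
    """
    # Per stratum and group: [number correct, number incorrect].
    strata = defaultdict(lambda: {"ref": [0, 0], "focal": [0, 0]})
    for group, total, correct in records:
        strata[total][group][0 if correct else 1] += 1

    numerator = denominator = 0.0
    for cells in strata.values():
        ref_right, ref_wrong = cells["ref"]
        focal_right, focal_wrong = cells["focal"]
        n = ref_right + ref_wrong + focal_right + focal_wrong
        numerator += ref_right * focal_wrong / n
        denominator += ref_wrong * focal_right / n
    return numerator / denominator


def make_records(group, total, n_correct, n_incorrect):
    """Build hypothetical records for one group at one total-score level."""
    return [(group, total, 1)] * n_correct + [(group, total, 0)] * n_incorrect


# Hypothetical counts at two matched total-score levels (5 and 8).
records = (
    make_records("ref", 5, 8, 2) + make_records("focal", 5, 4, 6)
    + make_records("ref", 8, 9, 1) + make_records("focal", 8, 6, 4)
)
odds_ratio = mantel_haenszel_odds_ratio(records)
```

An odds ratio near 1 suggests the item behaves similarly for matched test takers in both groups. Values well above or below 1 suggest the item favors one group, although a formal significance test would still be needed before calling the item biased.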