Norm-Referenced vs Criterion-Referenced Test Quiz

Questions and Answers

What is a characteristic of a good item in academic testing?

Answered correctly by high scorers and incorrectly by low scorers (correct)
Answered incorrectly by both high and low scorers
Answered correctly by both high and low scorers
Answered incorrectly by high scorers and correctly by low scorers

What is the purpose of item analysis in academic testing?

To analyze the test takers' confidence levels
To determine item difficulty and discrimination power (correct)
To assess the physical environment during testing
To evaluate the test's duration

What does the item-difficulty index measure?

The proportion of test takers who answered an item correctly (correct)
The proportion of test takers who answered an item incorrectly
The number of test takers who skipped the item
The time taken by test takers to answer an item

What is the purpose of the Item-Endorsement Index in other contexts?

To indicate the proportion of test takers who endorsed (agreed with) an item, as in personality testing

What does the index of item discrimination in item analysis indicate?

How well an item differentiates between high and low scorers

Which statistical technique is used for selecting and rejecting items based on their difficulty value and discrimination power?

Item Analysis

What does the Item-Reliability Index provide an indication of?

Internal consistency of a test

How is the Average Difficulty of items calculated?

By taking the average of item difficulties

What is the Guessing Probability for a five-option multiple-choice item?

0.20

What does an Item-Difficulty Index of 0.53 suggest?

About 53% of test takers answered the item correctly, which indicates moderate difficulty

What is the purpose of calculating the Item-Reliability Index?

To measure internal consistency of a test

What is the primary difference between norm-referenced tests and criterion-referenced tests?

Norm-referenced tests compare performance to peers, while criterion-referenced tests compare performance to an objective standard.

What is the purpose of pilot work in test construction?

To determine how best to measure a targeted construct.

Who is credited with being at the forefront of developing methodologically sound scaling methods?

L.L. Thurstone

Which type of scale would be most appropriate if test-takers' performance as a function of grade is of critical interest?

Grade-based scale

In test construction, what does scaling refer to?

Setting rules for assigning numbers in measurement

What is the central concept behind norm-referenced instruments?

Comparing test-takers' performance to peers in a norming group

What is the purpose of think-aloud test administration?

To shed light on the testtaker's thought processes during test administration

What is the role of expert panels in test development?

To assess the fairness of test items

What is the purpose of sensitivity review in test development?

To examine the presence of offensive language or stereotypes in tests

How does an expert panel contribute to modifying a test?

By providing qualitative analyses of test items

What is the primary function of using IRT in building and revising tests?

Ensuring accurate measurement of testtaker abilities

Why do researchers caution against allowing testtakers to describe a test?

Because it is similar to allowing students to describe their instructors' problems

What problem does the responsible test developer address in the test manual?

The problem of guessing by test takers

What does ITEM FAIRNESS refer to?

The extent to which a test item is unbiased

How can ICC and DIF be useful in identifying biased test items?

By analyzing the graphs and functioning of each item relative to the test takers' abilities and backgrounds

How are biased test items identified in relation to specific groups of examinees?

By analyzing group abilities when differences are controlled

What is one of the key requirements for an item to be considered fair to different groups of test takers?

The item should have similar ICC for different groups

What does Differential Item Functioning (DIF) identify in specific items?

Biases exhibited by certain items in a statistical sense

Study Notes

Academic Testing

A good item is one that high scorers on the overall test tend to answer correctly and low scorers tend to answer incorrectly.
Item analysis is used to select and reject items based on their difficulty value and discrimination power.

Item Analysis

Purpose: To determine if an item is too easy or too difficult, and how well it discriminates between high and low scorers.
Tools:
- Index of item difficulty
- Index of item reliability
- Index of item validity
- Index of item discrimination
Distractor analysis is used to determine if all alternatives functioned as intended.
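The index of item discrimination listed above can be illustrated with a short sketch. The notes do not give a formula, so this assumes the commonly taught form d = (U - L) / n, where U and L are the numbers of test takers in the upper and lower scoring groups who answered the item correctly and n is the size of each group. The function name and the 27% group fraction are illustrative assumptions, not something stated in the notes.

```python
def discrimination_index(item_correct, total_scores, fraction=0.27):
    """Return d = (U - L) / n for one item.

    item_correct: list of 0/1 values, one per test taker, for this item.
    total_scores: list of total test scores, in the same order.
    fraction: share of test takers placed in each extreme group (assumed 27%).
    """
    n = max(1, round(len(total_scores) * fraction))
    # Rank test takers from lowest to highest total score.
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    lower, upper = order[:n], order[-n:]
    u = sum(item_correct[i] for i in upper)  # correct answers in the upper group
    l = sum(item_correct[i] for i in lower)  # correct answers in the lower group
    return (u - l) / n


totals = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
good_item = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]  # high scorers get it right
bad_item = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]   # low scorers get it right

d_good = discrimination_index(good_item, totals)
d_bad = discrimination_index(bad_item, totals)
```

Under these assumptions d ranges from -1 to +1. A value near +1 matches the description of a good item, while a negative value flags an item that low scorers answer correctly more often than high scorers.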

Item Difficulty Index

Calculated by the proportion of test takers who answered the item correctly.
Also known as Item-Endorsement Index in personality testing.
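Average item difficulty for a test: Average p = (p1 + p2 + p3 + ... + pn) / n, where p1 through pn are the item-difficulty indices and n is the total number of items.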
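As a minimal sketch of the calculation described above, the following computes each item's difficulty index as the proportion of correct answers and then averages those values across items. The data layout (rows are test takers, columns are items, 1 means correct) and the function names are assumptions made for illustration. The guessing probability for a multiple-choice item is included as 1 divided by the number of options.

```python
def item_difficulty(responses):
    """Proportion of test takers who answered one item correctly (0/1 scores)."""
    return sum(responses) / len(responses)


def average_difficulty(score_matrix):
    """Return (average p, list of per-item p) for a matrix of 0/1 scores.

    score_matrix: one row per test taker, one column per item.
    """
    n_items = len(score_matrix[0])
    ps = [item_difficulty([row[j] for row in score_matrix]) for j in range(n_items)]
    return sum(ps) / n_items, ps


def guessing_probability(n_options):
    """Chance of a correct answer by random guessing on a multiple-choice item."""
    return 1 / n_options


scores = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]
avg_p, per_item_p = average_difficulty(scores)
chance_five_options = guessing_probability(5)
```

With this toy data the first two items are answered correctly by three of four test takers and the third by one of four, so the third item is the hardest.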

Item Reliability Index

Provides an indication of the internal consistency of a test.
Calculated as the product of the item-score standard deviation and the Pearson correlation between the item score and the total test score.
Higher index suggests better internal consistency.
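One common way to compute this index multiplies the item-score standard deviation by the Pearson correlation between the item score and the total test score. The sketch below follows that formulation as an assumption; the function names are illustrative, and whether the total score should include the item itself is a design choice this example does not settle.

```python
from statistics import mean, pstdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def item_reliability_index(item_scores, total_scores):
    """Item-score standard deviation times the item-total correlation."""
    return pstdev(item_scores) * pearson_r(item_scores, total_scores)


item = [0, 0, 1, 1]    # 0/1 scores on one item
totals = [1, 2, 3, 4]  # total test scores for the same four test takers

r_item_total = pearson_r(item, totals)
index = item_reliability_index(item, totals)
```

The function raises a ZeroDivisionError if every test taker has the same item score or the same total score, since the correlation is undefined in that case.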

Types of Tests

Norm-referenced test: Compares test-taker's performance to peers in a norming group.
Criterion-referenced test: Compares test-taker's performance to an objective standard.
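The contrast between the two interpretations can be sketched in a few lines. In this illustration, which uses made-up scores and a hypothetical cutoff, the norm-referenced view expresses a score relative to a norming group (as a z-score and a percentile rank), while the criterion-referenced view only asks whether the score meets a fixed standard.

```python
from statistics import mean, pstdev


def norm_referenced(score, norming_scores):
    """Return (z-score, percentile rank) of a score relative to a norming group."""
    z = (score - mean(norming_scores)) / pstdev(norming_scores)
    # Percentile rank here is the percentage of the norming group scoring below.
    percentile = 100 * sum(s < score for s in norming_scores) / len(norming_scores)
    return z, percentile


def criterion_referenced(score, cutoff):
    """Return True if the score meets the fixed standard, regardless of peers."""
    return score >= cutoff


norming_group = [50, 60, 70, 80, 90]  # illustrative norming-group scores
z, percentile = norm_referenced(80, norming_group)
passed = criterion_referenced(80, cutoff=85)  # hypothetical mastery cutoff
```

The same raw score of 80 sits above the norming-group mean yet falls short of the cutoff, which shows why the two kinds of test can lead to different conclusions about one test taker.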

Test Construction

Scaling: Setting rules for assigning numbers in measurement.
Types of scales:
- Age-based scale
- Grade-based scale
- Stanine scale

Pilot Work

Preliminary research surrounding the creation of a prototype test.
Determines how best to measure a targeted construct.
A necessity for constructing tests for publication and wide distribution.

Think-Aloud Administration

A qualitative research tool to shed light on test-taker's thought processes during test administration.
Respondents verbalize thoughts as they occur.

Expert Panels

Provide qualitative analyses of test items.
Used to modify or revise the test.

Sensitivity Review

A study of test items to ensure fairness to all prospective test-takers and absence of offensive language, stereotypes, or situations.

Test Revision

A stage in new test development.
Used to address issues identified during item analysis.

Item Fairness

Refers to the degree, if any, to which a test item is biased.
Many methods can be used to identify biased test items, including ICC and DIF.

Biased Test Items

An item that favors one particular group of examinees over another when differences in group ability are controlled.
ICC and DIF can be used to identify biased items.

Item-Characteristic Curves (ICC)

Used to identify biased items.
For an item to be considered fair, its ICCs for different groups should not differ significantly.

Differential Item Functioning (DIF)

Specific items are identified as biased in a statistical sense if they exhibit DIF.
Used to identify biased items.
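A rough sketch of the idea behind a DIF screen: match test takers on ability (here, by total-score band), then compare the proportion answering the item correctly across groups within each band. This is a simplified illustration under assumed data and names, not a formal DIF procedure and not a fitted ICC; operational analyses use dedicated statistical methods.

```python
from collections import defaultdict


def proportion_correct_by_stratum(records):
    """Map (band, group) to the proportion answering the item correctly.

    records: iterable of (group, ability_band, item_correct) with 0/1 scores.
    """
    counts = defaultdict(lambda: [0, 0])  # (band, group) -> [correct, total]
    for group, band, correct in records:
        counts[(band, group)][0] += correct
        counts[(band, group)][1] += 1
    return {key: c / n for key, (c, n) in counts.items()}


def dif_gaps(records, reference, focal):
    """Per-band difference in proportion correct: reference minus focal group."""
    props = proportion_correct_by_stratum(records)
    bands = {band for band, _ in props}
    return {
        band: props[(band, reference)] - props[(band, focal)]
        for band in bands
        if (band, reference) in props and (band, focal) in props
    }


# Illustrative data: two groups, two ability bands, one item.
records = [
    ("A", "low", 1), ("A", "low", 0), ("B", "low", 0), ("B", "low", 0),
    ("A", "high", 1), ("A", "high", 1), ("B", "high", 1), ("B", "high", 0),
]
gaps = dif_gaps(records, reference="A", focal="B")
```

Gaps near zero in every band are consistent with a fair item. A gap that persists across bands, as in this toy data, suggests the item favors one group even when ability is held roughly constant, and would warrant closer review.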
