Questions and Answers
Which of the following is NOT a suggested guideline for writing effective test items?
- Define clearly what you wish to measure to maintain focus.
- Be aware of the reading level of those taking the scale and the reading level of the items.
- Generate a small pool of items to ensure each one is highly refined. (correct)
- Consider using questions that mix positive and negative wording.
Which of the following is an advantage of using the dichotomous format in psychological testing?
- It discourages memorization and rote learning.
- It is simple and often requires absolute judgment. (correct)
- It allows for nuanced responses, capturing the complexity of human traits.
- It accurately reflects the true complexity of most situations.
In a multiple-choice test, what are the incorrect answer options commonly referred to as?
- Keys
- Stems
- Anchors
- Distractors (correct)
What is a primary risk associated with using an 'unfocused stem' in a test question?
What is the MOST important consideration when deciding whether to guess on a multiple-choice question?
Which type of measurement is the Likert format particularly well-suited for?
What is a potential drawback of using a 10-point category scale to rate the abilities of a group?
What is the primary purpose of Q-sorts?
Item analysis is used to evaluate test items. What does item difficulty specifically assess?
In item analysis, what does a difficulty index (DV) of 0.80 indicate?
What does it mean if an item has a negative discrimination index?
What is the theoretical range for the index of discrimination?
According to the criteria for selection and rejection of test items, which items should be selected?
What does an item characteristic curve graph?
What is a key advantage of item response theory over classical test theory?
What is the purpose of linking uncommon measures in testing?
What initial step is required when creating items for criterion-referenced tests?
What is a primary limitation of item analysis?
How can the relationship between the examiner and test taker affect test scores?
What did research by Feldman and Sullivan (1960) find regarding rapport and test scores on the WISC?
What consideration is related to the race of the tester when administering psychological tests?
What is stereotype threat?
According to the information, what BEST describes how stereotype threat can negatively impact performance?
In the context of psychological testing, why is the language of the test taker important?
What does telling the test taker the test is nondiagnostic do?
What do tests require?
Data can be impacted by what an experimenter expects to find. What is this called?
If your teacher looks at your answers during a test and smiles and nods, what impact do you think this has?
Which of the following is an advantage of Computer-Assisted Test Administration?
What are some problems that can influence test performance?
Motivation and anxiety can influence test performance. What are some components of this?
Which of the following is the MOST accurate description of the 'extreme group method' of assessing discriminability?
To increase the accuracy of tests and also reduce the volume of responses, what can be applied?
Which of the following is a problem with using a dichotomous format?
Which of the following is NOT a noted problem with using a negative stem?
When can expectancies occur?
Which of the following is NOT considered one of the advantages of computer-assisted test administration?
Which of the following is a component of the state of the subject influencing test performance?
Statistical methods to create appropriate comparisons can be…
Flashcards
Dichotomous Format
A format with two answer choices for each question.
Polytomous Format
A format with more than two options, like multiple-choice questions.
Distractors
Incorrect options in a multiple-choice question.
Likert Format
Category Format
Adjective Checklist
Q-Sorts
Item Analysis
Item Difficulty
Discriminability
Extreme Group Method
Point Biserial Method
Item Characteristic Curve
Item Response Theory (IRT)
Criterion-Referenced Test
Examiner Relationship
Stereotype Threat
Expectancy Effects
Reinforcing Responses
Computer-Assisted Testing
Subject variables
Study Notes
Chapter 4: Writing and Evaluating Test Items, and Test Administration
- This chapter covers writing and evaluating test items, and administering tests.
- After completing the chapter, the user should be able to describe the two types of item formats and understand whether or not to guess on multiple-choice questions.
Item Writing - Guidelines and Diversity
- Six guidelines for writing test items (DeVellis, 2016) include defining what to measure, generating a pool of items, and avoiding exceptionally long items.
- In addition, it is important to be aware of participants' reading level, avoid items that convey two or more ideas, and consider questions that mix positive and negative wording.
- It is also important to be aware of diversity issues.
Item Formats
- The Dichotomous Format offers two choices for each question, such as yes/no or true/false.
- Examples of the Dichotomous Format can be found on educational tests, as well as personality tests.
- Advantages of the Dichotomous Format are that it is simple and often requires absolute judgment.
- Disadvantages of the Dichotomous Format include that many situations are not truly dichotomous, and that it can promote memorization rather than understanding.
- The Polytomous Format is similar to the Dichotomous Format but has more than two options.
- The most common example of the Polytomous Format is the multiple-choice test, where there is one right answer and several wrong answers.
- Incorrect answers on the Polytomous Format are referred to as distractors.
Problems in items include
- Unfocused stems that do not include the necessary information.
- Negative stems that include negative terms like "not" or "except".
- Window dressing: stems with information that is irrelevant to the question or concept.
- Unequal option length: the correct answer and distractors vary in length.
- Negative options: options should exclude negatives such as "not".
- Clues: items sometimes contain clues that indicate the answer; avoid vague terms like "may," "can," and "might."
- Heterogeneous options: the correct option and the distractors are not all in the same general category.
Guessing
- On a limited-item test, a certain number of responses can be answered correctly without knowledge.
- There is a formula that can correct for the number of guessed items.
- The advantage of guessing depends on whether incorrect answers are penalized or simply earn no credit.
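The notes mention a correction formula without stating it. A minimal sketch, assuming the commonly used form: corrected score = R - W / (n - 1), where R is the number of right answers, W the number of wrong answers (omitted items are not counted), and n the number of options per item. The function name and the example numbers are illustrative only.

```python
def corrected_score(right: int, wrong: int, options_per_item: int) -> float:
    """Correction for guessing, assuming the form R - W / (n - 1)."""
    if options_per_item < 2:
        raise ValueError("an item needs at least two options")
    return right - wrong / (options_per_item - 1)


# Hypothetical test taker: four-option items, 60 right, 24 wrong, the rest omitted.
print(corrected_score(60, 24, 4))  # 60 - 24/3 = 52.0

# Pure guessing on 100 four-option items is expected to give 25 right, 75 wrong,
# which this correction reduces to zero.
print(corrected_score(25, 75, 4))  # 0.0
```

Under this correction, blind guessing gains nothing on average, whereas on a test with no penalty a guess can only help.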
Item scaling
- Likert Format is pronounced "Lick-ert," not "Like-ert."
- The Likert Format offers a continuum of responses for measuring attitudes on topics.
- Likert scales lend themselves to factor analysis, so groups of related items can be identified.
- Category format is similar to the Likert format but has a greater number of choices.
- Visual analogue scales are also used.
Checklists and Q-Sorts
- Adjective checklists are lists of terms from which an individual selects those most characteristic of themselves.
- Q-Sorts are lists of adjectives sorted into nine piles of increasing similarity to a target person.
- Checklists have fallen out of favor; forced-choice and Likert formats are more popular.
- Very important advice is to avoid "all of the above" or "none of the above" options.
Item Analysis and Difficulty
- Item analysis refers to a general set of methods used to evaluate test items.
- Item difficulty is an important evaluation component that asks what percentage of people chose the correct answer.
- The number of answer options is a factor that determines a reasonable difficulty level.
- A difficulty of .30 to .70 is optimal for differentiating between individuals.
- Difficulty Index (DV) = Number of students with the correct answer / Total number of students
- A low difficulty index means an item is difficult, while a high difficulty index means an item is easy.
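The difficulty index formula above can be sketched in a few lines. This is an illustrative example only: the function names and the response data are hypothetical, and the .30 to .70 band is simply the range cited in these notes.

```python
from typing import Sequence


def difficulty_index(item_correct: Sequence[bool]) -> float:
    """Number of students with the correct answer / total number of students."""
    if not item_correct:
        raise ValueError("need at least one response")
    return sum(item_correct) / len(item_correct)


def in_optimal_range(dv: float, low: float = 0.30, high: float = 0.70) -> bool:
    """True when the index falls in the .30 to .70 band cited in the notes."""
    return low <= dv <= high


# Hypothetical item: 8 of 10 students answered correctly.
responses = [True] * 8 + [False] * 2
dv = difficulty_index(responses)
print(dv, in_optimal_range(dv))  # 0.8 False
```

An index of 0.80 means 80% of students answered correctly, so the item is fairly easy and falls above the band considered best for differentiating between individuals.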
Discriminability
- Discriminability determines whether people who have done well on a particular item have also done well on the entire test.
- Types of discrimination methods include the Extreme Group and Point Biserial methods.
- Items with a positive discrimination index should be selected, and items with a negative or zero discrimination index should be rejected.
- At the end of the item analysis report, test items are listed according to their degrees of difficulty and discrimination.
- The discrimination index is a measurement that helps determine the ability of a test item to distinguish between students in the upper and lower groups.
- Discrimination index = (number of students with the correct answer in the upper group / total number of students in the upper group) - (number of students with the correct answer in the lower group / total number of students in the lower group)
- In theory, the index of discrimination ranges from -1.0 to 1.0.
- Types of discrimination index include zero (no discrimination), positive discrimination, and negative discrimination.
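The discrimination index formula and the selection rule above can be sketched as follows. This is a minimal illustration: the function names and the response data are hypothetical, and the upper and lower groups are assumed to have been formed already, since the notes do not say how large each group should be.

```python
from typing import Sequence


def proportion_correct(group: Sequence[bool]) -> float:
    """Share of a group that answered the item correctly."""
    if not group:
        raise ValueError("group must not be empty")
    return sum(group) / len(group)


def discrimination_index(upper: Sequence[bool], lower: Sequence[bool]) -> float:
    """Proportion correct in the upper group minus proportion correct in the lower group."""
    return proportion_correct(upper) - proportion_correct(lower)


def keep_item(index: float) -> bool:
    """Selection rule from the notes: keep positive, reject zero or negative."""
    return index > 0


# Hypothetical item answered correctly by 9 of 10 high scorers and 4 of 10 low scorers.
good = discrimination_index([True] * 9 + [False], [True] * 4 + [False] * 6)
# Hypothetical item that low scorers get right more often than high scorers.
bad = discrimination_index([True] * 3 + [False] * 7, [True] * 6 + [False] * 4)
print(round(good, 2), keep_item(good))
print(round(bad, 2), keep_item(bad))
```

The second item has a negative index, meaning low scorers outperformed high scorers on it, which is why such items are rejected.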
Pictures of Item Characteristics and Item Response Theory
- An item characteristic curve graphs the proportion of test takers getting an item right against the total test score.
- The total score is often grouped into categories rather than plotted as individual data points.
- The curves can visually indicate weak or strong items.
- Item response theory (IRT) assesses test quality by examining whether each item is answered right or wrong.
- An advantage of item response theory is that it looks not at the number of items answered correctly but at the level of difficulty of the items answered.
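The data behind an item characteristic curve can be sketched as below. This is a minimal illustration, assuming total scores are grouped into fixed-width categories; the function name, the category width, and the data are hypothetical. It computes only the points to plot (score category, proportion correct) and leaves the plotting out.

```python
from collections import defaultdict
from typing import Dict, Sequence


def item_characteristic_points(
    total_scores: Sequence[int],
    item_correct: Sequence[bool],
    bin_width: int,
) -> Dict[int, float]:
    """Map the start of each total-score category to the proportion who got the item right."""
    counts = defaultdict(lambda: [0, 0])  # category start -> [number right, number in category]
    for score, correct in zip(total_scores, item_correct):
        start = (score // bin_width) * bin_width
        counts[start][0] += int(correct)
        counts[start][1] += 1
    return {start: right / n for start, (right, n) in sorted(counts.items())}


# Hypothetical data: eight test takers' total scores and whether each got this item right.
scores = [3, 4, 12, 14, 22, 25, 27, 28]
correct = [False, False, True, False, True, True, True, False]
print(item_characteristic_points(scores, correct, 10))
# {0: 0.0, 10: 0.5, 20: 0.75}
```

A proportion that rises steadily across score categories, as in this example, suggests the item separates stronger from weaker test takers; a flat or falling pattern flags a weak item.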
Other Information
- Linking uncommon measures involves determining how to compare tests that do not share the same items.
- An example is the SAT, which uses different items but has the same scoring.
- Statistical methods are used to create comparisons.
- Criterion-referenced tests assess learning against stated objectives, and building them follows a few steps.
- The first step is to specify the objectives for the assessment and what the learning program attempts to achieve.
- A limitation of item analysis is that it can tell us about the quality of a test but does not help with understanding the material.
Test Administration
- By the end of this part of the chapter, the user should be able to discuss how the relationship between the examiner and the test taker can affect test scores.
- In addition, the user should be able to explain how an expectancy effect might affect a test score and outline the advantages of computerized testing.
The Relationship Between Examiner and Test Taker
- The researcher needs to understand how the behavior of administrators can affect answers through cues.
- Stronger rapport is associated with higher scores in some cases.
Tester Race and Stereotypes
- Research is inconclusive on the extent to which test takers are affected by being tested by an examiner of a different race versus the same race.
- Tests should be given according to procedures that standardize administration.
- Stereotype threat is anxiety related to the pressure to disconfirm negative stereotypes.
- This stereotype pressure can affect the performance of test takers.
- An experiment has shown that telling test takers a test is nondiagnostic can change how they perform.
Other Factors
- Language and cultural differences can also put some people at a disadvantage when testing.
- Standardized administration is important for test validity.
- It should also be noted that standardized test administrations show no standard for demonstrations.
Expectancy Effects
- Expectancy effects, also called "Rosenthal effects," occur when data are affected by what an administrator expects or believes.
- Expectancy effects can be unintentional or unconscious.
- There are varied opinions on how reactions given to test subjects can affect future answers.
Computer Administrations
- Computer-based testing offers several benefits for test administration, including high standardization, tailored sequential administration, precise timing of responses, and freeing up human testers.
- Internet-based tests have been explored recently.
- It is also important to consider subject variables when administering a test.
- The impact depends on the setting of the study and on whether illness, personal issues, or personal background affect performance.