Questions and Answers
What does the 'Known Groups Method' involve in determining cutoff scores?
Which reliability coefficient value denotes 'excellent' reliability?
In IRT-Based Methods, how are cut scores typically set?
What is the purpose of Discriminant Analysis as mentioned in the document?
Which method requires expert judges to discuss the issues involved in determining a pass mark?
What level of difficulty corresponds to an item difficulty range of 0.0 to 0.19?
Which p-value range signifies strong evidence against the null hypothesis?
What Cronbach's alpha value corresponds to 'Good' internal consistency?
Which item discrimination range is considered 'Fair'?
Which interrater reliability coefficient range is regarded as 'Substantial' according to Landis & Koch (1977)?
What measure of central tendency is used when there is an unknown or undetermined score?
Which interrater reliability coefficient range is classified as 'Fair to Good' according to Fleiss (1981)?
What type of evidence against the null hypothesis does a p-value greater than 0.10 provide?
Which measure gives an indication of the shape of the distribution as well as a measure of central tendency?
Which measure of spread is calculated as the difference between the highest and lowest scores?
What does the variance measure in a distribution?
How is the semi-quartile range calculated?
Which of the following is true about percentiles?
Which measure divides a distribution into four equal parts?
What is Pearson's R used to measure?
Which measure is NOT commonly used for nominal scales or discrete variables?
Which test should be used to assess normality for a sample size of 60?
Which statistical test is appropriate for comparing more than two groups on an ordinal scale?
What is the major sensitivity difference between Bartlett's Test and Levene's Test?
What does a P-value greater than 0.05 in Levene's Test indicate?
Which of the following is an example of a true dichotomy?
During which stage of test development is the construct determined?
What does computerized adaptive testing primarily depend on?
Which test measures dependent means on an ordinal scale?
What characterizes the primary role of an item pool in test construction?
Which stage involves item revising, formatting, and setting scoring rules?
Which method is used to compare individuals who have performed well against those who have not on a test?
What is defined by the proportion of test takers who answered an item correctly in personality testing?
Which index measures the internal consistency of a test?
What is the optimal average item difficulty proportion for a test?
Which of the following indicates a 'Very Good Item' based on the Point-Biserial Method?
Which statistical procedure is used to evaluate test items?
What type of items should be avoided during item writing?
How are items arranged in an Omnibus Spiral Format?
Which index measures the degree to which a test measures what it purports to measure?
Phantom factors may emerge as a risk of using what in psychological assessment?
Which statistical test would you use for measuring correlation between two variables when both are measured on a nominal scale?
Which test is appropriate for examining the difference between the means of multiple dependent variables across two or more independent groups?
To predict the unknown value of variable X using the known value of variable Y, which test should be used?
Which non-parametric test is equivalent to a paired t-test?
Which test would you use to control for an additional variable that may be influencing the relationship between your independent and dependent variable?
Which test should be used to analyze the focus level of a group of reviewers measured in the morning, afternoon, and night sessions of review?
Which test involves artificial dichotomous variables for both the independent and dependent variables?
Which statistical test should be used to test the difference between groups when you have nominal data involving two groups with two or more categories?
Which test would you use to measure the difference in blood pressure of a group before and after a lecture?
Which test should be used when comparing blood pressure measurements of young adults, middle-aged adults, and old adults during breakfast, lunch, and dinner?
What is the primary purpose of cross-validation in test revision?
Which type of scoring discrepancy occurs when there is a difference between scoring in an anchor protocol and another protocol?
What does DIF Analysis aim to detect during test development?
Which aspect of computerized adaptive testing reduces the likelihood of testtakers having low or high extreme scores?
In the context of inferential statistics, what is the main purpose?
What is the role of an anchor protocol in test scoring?
Which term describes the inevitable decrease in item validities after cross-validation?
What is indicated by the term 'Equal Intervals' in measurement scales?
What is the primary function of an item-mapping method in test construction?
Ipsative scoring is used to compare what aspects within a test?
Which measure of central tendency is most useful when analyzing a skewed distribution?
What distinguishes a ratio scale from an interval scale?
In the context of psychological assessment, what does 'error' primarily refer to?
Which level of measurement is appropriate for categorizing observations without any quantitative distinctions?
Which central tendency measure is appropriate for nominal data?
What is the goal of measures of central tendency in a distribution?
Which post-hoc test is used to determine the minimum difference between treatment means necessary for significance in ANOVA?
Why might the range be an unreliable measure of variability?
What statistical measure provides a quick but gross description of the spread of scores?
Which level of measurement allows for rank ordering on some characteristic but does not have equal intervals?
In a positively skewed distribution, where do most of the scores fall?
Which type of distribution has its mean equal to its median and mode?
What kurtosis describes a distribution with a relatively flat peak?
What type of standard score scale is set at a mean of 50 with a standard deviation of 10?
Which condition describes a distribution with high kurtosis?
What does a Z-score indicate in a distribution?
Which of the following distributions would most likely represent an easy exam?
When a distribution has the mean < median < mode, what is typically observed?
Which type of item format requires test takers to supply or create the correct answer?
Which scale of measurement is characterized by having true zero points?
What distinguishes a good distractor in a multiple-choice item?
What is the significant characteristic of the ratio scale of measurement?
Which item type involves respondents ranking objects based on a criterion?
What type of measurement involves categorization without quantitative distinctions?
What is a characteristic of completion items in constructed-response formats?
Which comparative scale of measurement involves allocating a constant sum of units among a set of items?
Which of the following best describes an ineffective distractor?
Which characteristic describes the interval scale of measurement?
Study Notes
BLEPP
Source
- Cohen & Swerdlik (2018)
- Kaplan & Saccuzzo (2018)
- Groth-Marnat & Wright (2016)
- Psych Pearls
Item Writing Guidelines
- Define what to measure
- Generate item pool
- Avoid long items
- Keep reading difficulty appropriate
- Avoid double-barreled items
- Consider positive and negative worded items
Item Difficulty
- Defined by the number of people who get a particular item correct
- Item-Difficulty Index: proportion of total test-takers who answered the item correctly
- Optimal average item difficulty is approximately 50% with items ranging from 30% to 80%
Item Difficulty Ranges
- 0.0-0.19: Very difficult
- 0.20-0.39: Difficult
- 0.40-0.60: Average/moderately difficult
- 0.61-0.79: Easy
- 0.80-1.0: Very easy
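The index and the ranges above can be sketched in a few lines of Python. This is a minimal illustration, not a standard routine: the function names and the sample responses are hypothetical, and values that fall between the listed bands (for example 0.605) are assigned to a neighbouring band by assumption.

```python
def item_difficulty(responses):
    """Proportion of test-takers who answered the item correctly.

    `responses` holds one item's scores: 1 = correct, 0 = incorrect.
    """
    return sum(responses) / len(responses)


def classify_difficulty(p):
    """Map a difficulty index to the labels in the ranges above."""
    if p < 0.20:
        return "Very difficult"
    if p < 0.40:
        return "Difficult"
    if p <= 0.60:
        return "Average/moderately difficult"
    if p < 0.80:
        return "Easy"
    return "Very easy"


# Hypothetical data: 10 test-takers, 7 of whom answered correctly.
scores = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]
p = item_difficulty(scores)
print(p, classify_difficulty(p))  # 0.7 Easy
```

Note that a higher index means an easier item, since more test-takers answered it correctly.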
Item Reliability Index
- Provides an indication of the internal consistency of a test
- The higher the item-reliability index, the greater the test's internal consistency
Item-Validity Index
- Designed to provide an indication of the degree to which a test measures what it purports to measure
- The higher the item-validity index, the greater the test's criterion-related validity
Item-Discrimination Index
- Measure of item discrimination
- Difference between proportion of high scorers answering an item correctly and proportion of low scorers answering the item correctly
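As a rough sketch of that difference, assuming the upper and lower scoring groups have already been formed (the group data and the function name below are hypothetical):

```python
def discrimination_index(upper_correct, lower_correct):
    """d = proportion correct in the upper group minus proportion correct in the lower group.

    Each argument is a list of 1/0 scores on one item for one group.
    """
    p_upper = sum(upper_correct) / len(upper_correct)
    p_lower = sum(lower_correct) / len(lower_correct)
    return p_upper - p_lower


upper = [1, 1, 1, 1, 0]  # high scorers: 4 of 5 answered correctly
lower = [1, 0, 0, 0, 0]  # low scorers: 1 of 5 answered correctly
d = discrimination_index(upper, lower)
print(round(d, 2))  # 0.6
```

A positive value means high scorers answered the item correctly more often than low scorers; a negative value would flag an item that works against the rest of the test.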
Extreme Group Method
- Compares people who have done well with those who have done poorly
- Point-Biserial Method: correlation between a dichotomous variable and a continuous variable
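The point-biserial correlation is numerically the Pearson correlation computed with one variable coded 0/1, so it can be sketched with a plain Pearson function. The item and total scores below are made up for illustration, and the helper name is hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Dichotomous variable: item answered correctly (1) or not (0).
item = [1, 1, 1, 0, 0, 0]
# Continuous variable: total test score for the same six test-takers.
total = [9, 8, 7, 5, 4, 3]

r_pb = pearson_r(item, total)
print(round(r_pb, 3))
```

In this toy data, everyone who passed the item also has a higher total score, so the coefficient comes out strongly positive.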
Item-Characteristic Curve
- Graphic representation of item difficulty and discrimination
Guessing
- A problem that has eluded any universally accepted solution
- Item analyses taken under speed conditions yield misleading or uninterpretable results
Effective Distractors
- An effective distractor is chosen more often by low-scoring than by high-scoring test-takers, which enhances the consistency of test results; a distractor chosen equally by both groups does not discriminate
- Good distractors are chosen frequently by low scorers
Ineffective Distractors
- May hurt the reliability of the test because they are time-consuming to read and can limit the number of good items
Types of Items
- Matching Item
- Binary Choice
- Constructed-Response Format: Completion Item, Short-Answer, Essay
Primary Scales of Measurement
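- Nominal: involves classification or categorization; the appropriate measure of central tendency is the mode
- Ordinal: involves rank ordering; the appropriate measure of central tendency is the median
- Interval: contains equal intervals, has no absolute zero point
- Ratio: has a true zero point, easiest to manipulate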
Comparative Scales of Measurement
- Paired Comparison: produces ordinal data by presenting pairs of two stimuli
- Rank Order: respondents are presented with several items simultaneously and asked to rank them in order of priority
- Constant Sum: respondents are asked to allocate a constant sum of units among a set of stimulus objects
- Q-Sort Technique: sort objects based on similarity with respect to some criterion
Non-Comparative Scales of Measurement
- Continuous Rating: rate objects by placing a mark at the appropriate position on a continuous line
- Itemized Rating: having numbers or brief descriptions associated with each category
- Likert Scale: indicate attitudes by responding to a series of statements that range from very positive to very negative
Ipsative Scoring
- Compares a test-taker's score on one scale within a test to their score on another scale within that same test (the two scales may measure unrelated constructs)
Test Revision
- Characterize each item according to its strength and weaknesses
- Large item pool is advantageous in test revision as some items are removed and replaced by items from the pool
- Administer the revised test under standardized conditions to a second appropriate sample of examinees
- Cross-validation involves revalidating a test on a sample of test-takers other than those on whom test performance was originally found to be a valid predictor of some criterion
- Validity shrinkage is the decrease in item validities that inevitably occurs after cross-validation
Computerized Adaptive Testing
- An interactive, computer-administered test-taking process where items presented to the test-taker are based on their performance on previous items
- Reduces floor and ceiling effects
- Floor effects occur when there is a lower limit on a survey or questionnaire and a large percentage of respondents score near this lower limit
- Ceiling effects occur when there is an upper limit on a survey or questionnaire and a large percentage of respondents score near this upper limit
- Item branching is the ability of the computer to tailor the content and order of presentation of items based on responses to previous items
- Routing test is a subtest used to direct or route the test-taker to a suitable level of items
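Item branching can be pictured with a deliberately simplified up-and-down rule: move to a harder level after a correct answer and to an easier level after an incorrect one. This is only a toy sketch under that assumption (operational adaptive tests typically select items with IRT-based estimates), and the level range and function name are hypothetical.

```python
def next_level(level, correct, lowest=1, highest=5):
    """Pick the next difficulty level from the response to the current item."""
    step = 1 if correct else -1
    # Clamp so the test never leaves the available range of item levels.
    return max(lowest, min(highest, level + step))


level = 3  # start in the middle of a hypothetical 1-5 difficulty range
path = [level]
for answered_correctly in [True, True, False, True]:
    level = next_level(level, answered_correctly)
    path.append(level)

print(path)  # [3, 4, 5, 4, 5]
```

Because the items keep tracking the test-taker's performance, fewer people end up facing only items that are far too easy or far too hard, which is the intuition behind the reduced floor and ceiling effects.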
Statistics
- Measurement is the act of assigning numbers or symbols to characteristics of things according to rules
- Descriptive statistics provide a concise description of a collection of quantitative information
- Inferential statistics make inferences from observations of a small group of people (sample) to a larger group of individuals (population)
- Magnitude refers to the property of "moreness"
- Equal intervals refer to the difference between two points at any place on the scale having the same meaning as the difference between two other points that differ by the same amount
Symmetrical Distribution
- The right side of the graph is a mirror image of the left side
- Has only one mode and it is in the center of the distribution
- Mean = median = mode
Skewness
- Refers to the nature and extent to which symmetry is absent
- Positive skewness occurs when relatively few scores fall at the high end of the distribution
- Mean > median > mode
- Negative skewness occurs when relatively few scores fall at the low end of the distribution
- Mean < median < mode
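The ordering of the three averages can be checked on a small positively skewed set of scores with Python's standard `statistics` module. The scores are hypothetical, and the ordering is a typical pattern rather than a guarantee for every data set.

```python
import statistics

# Hypothetical scores: most are low, with a few high scores forming the tail.
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 10]

mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

print(mean, median, mode)  # 4.1 3.0 2
```

The few high scores pull the mean upward, the median moves less, and the mode stays where most scores cluster, giving mean > median > mode for this positively skewed set.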
Kurtosis
- Refers to the steepness of a distribution in its center
- Platykurtic distributions are relatively flat
- Leptokurtic distributions are relatively peaked
- Mesokurtic distributions are somewhere in the middle
Standard Scores
- Raw scores that have been converted from one scale to another scale
- A z-score results from converting a raw score into a number indicating how many standard deviation units the raw score is below or above the mean of the distribution
- T-scores are a scale with a mean set at 50 and a standard deviation set at 10
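Both conversions are one-line formulas. A minimal sketch, assuming the mean and standard deviation of the distribution are already known (the raw score, mean, and standard deviation below are hypothetical):

```python
def z_score(raw, mean, sd):
    """Number of standard deviation units the raw score lies above (+) or below (-) the mean."""
    return (raw - mean) / sd


def t_score(z):
    """Rescale a z-score to a scale with mean 50 and standard deviation 10."""
    return 50 + 10 * z


z = z_score(raw=115, mean=100, sd=15)
t = t_score(z)
print(z, t)  # 1.0 60.0
```

A raw score one standard deviation above the mean therefore has z = 1 and T = 60, while a score exactly at the mean has z = 0 and T = 50.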
Error
- Refers to the collective influence of all the factors on a test score or measurement beyond those specifically measured by the test or measurement
- Degree to which the test score/measurement may be wrong, considering other factors
Scales of Measurement
- Nominal scales involve classification or categorization based on one or more distinguishing characteristics
- Ordinal scales involve rank ordering on some characteristic
- Interval scales contain equal intervals but have no absolute zero point
- Ratio scales have a true zero point
Distribution
- Defined as a set of test scores arrayed for recording or study
- Raw scores are a straightforward, unmodified accounting of performance that is usually numerical
- Frequency distribution lists all scores alongside the number of times each score occurred
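A frequency distribution can be built from raw scores with `collections.Counter` from the Python standard library. The scores here are hypothetical.

```python
from collections import Counter

# Hypothetical raw scores from six test-takers.
raw_scores = [85, 90, 85, 70, 90, 85]

# Count how many times each score occurred.
freq = Counter(raw_scores)

# List every score alongside its frequency, highest score first.
for score in sorted(freq, reverse=True):
    print(score, freq[score])
```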
Post-Hoc Tests
- Used in ANOVA to determine which mean differences are significantly different
- Tukey's HSD test allows the computation of a single value that determines the minimum difference between treatment means that is necessary for significance
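That single value is commonly computed as HSD = q × sqrt(MS_within / n), where q comes from a Studentized range table. The sketch below assumes equal group sizes; the q value, MS_within, and n are hypothetical placeholders, not values from any real table or study.

```python
from math import sqrt


def tukey_hsd(q, ms_within, n_per_group):
    """Minimum difference between two treatment means needed for significance.

    q           -- Studentized range statistic looked up from a table (assumed given)
    ms_within   -- within-treatments mean square from the ANOVA
    n_per_group -- number of scores in each treatment (equal sizes assumed)
    """
    return q * sqrt(ms_within / n_per_group)


hsd = tukey_hsd(q=3.5, ms_within=4.0, n_per_group=16)
print(hsd)  # 1.75
```

Any pair of treatment means that differ by more than this value would be judged significantly different.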
Measures of Central Tendency
- Statistics that indicate the average or midmost score between the extreme scores in a distribution
- Goal is to identify the score that is most typical or representative of the entire group
- Mean is the average of all the raw scores
- Median is the middle score of the distribution
- Mode is the most frequently occurring score in the distribution
Variability
- An indication of how scores in a distribution are scattered or dispersed
- Measures of variability describe the amount of variation in a distribution
- Range is equal to the difference between the highest and the lowest score
- Quartile is a dividing point between the four quarters in the distribution
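The range and the quartiles can be computed with the standard `statistics` module. The scores are hypothetical, and `statistics.quantiles` uses its default "exclusive" method here; other quartile conventions can give slightly different cut points.

```python
import statistics

# Hypothetical scores, already sorted for readability.
scores = [2, 4, 4, 5, 7, 8, 9, 10]

# Range: difference between the highest and the lowest score.
score_range = max(scores) - min(scores)

# Three cut points that divide the distribution into four quarters.
q1, q2, q3 = statistics.quantiles(scores, n=4)

print(score_range)  # 8
print(q1, q2, q3)
```

The range depends only on the two most extreme scores, which is why a single outlier can change it sharply, whereas the quartiles describe where the bulk of the scores lie.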
Description
This quiz assesses understanding of item difficulty and p-value in psychological assessment, covering concepts such as variance, true differences, and error. It is based on sources from Cohen & Swerdlik, Kaplan & Saccuzzo, and Groth-Marnat & Wright.