Questions and Answers
What is face validity primarily concerned with?
Which type of validity evaluates the relationship of scores obtained on the test to scores on other tests?
What type of validity concerns whether a test predicts later performance on a related criterion?
Which of the following best describes construct validity?
What does diagnostic utility refer to in the context of test utility?
In evaluating criterion validity, which measure relates to an existing similar measure?
How does treatment utility of a test specifically differ from other utilities?
What is a key consideration of test utility in comparing tests?
What type of reliability is assessed by correlating pairs of scores from the same individuals taken at two different times?
What formula is typically used to assess split-half reliability?
Which method of assessing reliability involves using different versions of a test that measure the same construct?
What is the focus of inter-scorer reliability?
What is the purpose of Coefficient Alpha in reliability assessment?
What does reliability refer to in psychometric properties?
Which type of validity assesses the relevance of items in a test based on appearance?
What characterizes test-retest reliability with regard to the time gap between administrations?
Which of the following is a method used to measure inter-item consistency?
What does the validation process primarily involve?
Which type of validity involves evaluating the relationship of test scores to other measures?
What is the main factor considered in parallel forms reliability?
Local validation studies are crucial when a test user plans to:
What is the primary focus of predictive validity?
What aspect of validity does content validity pertain to?
Which statement best characterizes validity in the context of testing?
What does concurrent validity measure?
Which of the following is an example of predictive validity?
What is the validity coefficient?
Which correlation method is most appropriate for ordinal data?
Construct validity involves which of the following?
Which of the following constructs might be evaluated using construct validity?
Which type of evidence is used to support construct validity?
Which statement is true regarding the Pearson correlation coefficient?
What is the primary purpose of age-equivalent scores?
Which type of norm indicates the average test performance for a specific school grade?
What do national norms aim to represent?
Which type of norm provides information on local populations' test performances?
What does fixed reference group scoring rely on for test score calculations?
Criterion-referenced testing focuses on which aspect of evaluation?
What is the main characteristic of national anchor norms?
Which of the following describes developmental norms?
Study Notes
Psychometric Properties
- The psychometric properties are the technical qualities of a test or other assessment tool, and they determine its usefulness
- The three primary psychometric properties for evaluating a test are reliability, validity, and utility.
- Reliability refers to the consistency of measurement; validity refers to what a test measures; and utility refers to the usefulness or practical value of a test.
Validity
- Validity is a judgment or estimate of how well a test measures what it purports to measure in a particular context
- Validity is a judgment based on evidence about the appropriateness of inferences drawn from test scores
- No test is “universally valid” for all time, for all uses, with all types of test taker populations
- Validation is the process of gathering and evaluating evidence about validity
- Local validation studies are necessary when a test is altered in some way or if it is used with a population that differs from the population on which the test was standardized.
Types of Validity
- Face validity: a judgment concerning how relevant the items appear to be
- Content validity: a measure of validity based on an evaluation of the subjects, topics, or content covered by the items in the test
- Criterion validity: a measure of validity obtained by evaluating the relationship of scores obtained on the test to scores on other tests or measures. There are two types of criterion validity:
  - Predictive validity: Does the test predict later performance on a related criterion?
  - Concurrent validity: Does the test relate to an existing similar measure?
- Construct validity: a measure of validity based on whether the test measures the theoretical constructs it was intended to measure. There are two major types of construct validity:
  - Convergent validity: the degree to which a test “converges” on other tests that should be measuring the same thing
  - Divergent validity (also called discriminant validity): the degree to which a test “diverges” from other tests that should be measuring different things
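Criterion and construct evidence of the kinds listed above are usually summarized as correlations between sets of scores. The following is a minimal sketch, assuming interval-level scores and using the Pearson correlation; the score lists and variable names are hypothetical and exist only for illustration. A high correlation with an established measure of the same thing would count as concurrent or convergent evidence, while a low correlation with a measure of something different would count as divergent evidence.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for six test takers (illustrative values only).
new_test = [12, 15, 9, 20, 17, 11]           # the test being validated
similar_measure = [30, 34, 25, 41, 38, 27]   # established measure of the same thing
different_measure = [4, 5, 5, 6, 4, 6]       # measure of something unrelated

# Correlation with the established measure (the validity coefficient).
validity_coefficient = pearson_r(new_test, similar_measure)

# Correlation with the unrelated measure; expected to be much weaker.
divergent_r = pearson_r(new_test, different_measure)

print(f"validity coefficient: {validity_coefficient:.2f}")
print(f"correlation with unrelated measure: {divergent_r:.2f}")
```

For ordinal (rank-order) data, a rank-based coefficient such as Spearman's rho would typically replace the Pearson correlation used in this sketch.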
Test Utility
- Test utility refers to the practical value of using a test to aid in decision making
- The use of a test for decision making is considered to be of practical value when it helps with identifying and classifying individuals
- Comparing a test’s results with those of other tests indicates its relative usefulness
- The treatment utility of a test refers to whether its results lead to better intervention outcomes
- The diagnostic utility of a test refers to how useful it is for classification purposes.
Reliability
- The consistency of measurement is also known as reliability
- There are different types of reliability, including test-retest reliability, parallel forms reliability, internal consistency, and inter-scorer reliability
Types of Reliability
- Test-Retest Reliability: an estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the same test
- Parallel-Forms (Alternate-Forms) Reliability: uses two equivalent versions (“forms”) of a test, built from questions that measure the same construct. Both forms are given to the same sample of people within a short period of time, and an estimate of reliability is calculated by correlating the scores on the two forms.
- Internal Consistency: the degree to which all the items on a test are measuring the same thing. Estimates include:
  - Split-half reliability: obtained by correlating two pairs of scores obtained from equivalent halves of a single test administered once, typically adjusted with the Spearman-Brown formula
  - Inter-item consistency: degree of correlation among all the items on a scale
  - Coefficient Alpha: the mean of all possible split-half correlations
  - McDonald’s (1978) omega: an alternative to coefficient alpha for use when the test loadings are unequal
- Inter-scorer Reliability: The degree of agreement or consistency between two or more scorers with regard to a particular measure.
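The reliability estimates above can be sketched in a few lines. This is a hedged illustration rather than a definitive procedure: the data are hypothetical, the split-half estimate assumes an odd-even item split stepped up with the Spearman-Brown formula, and coefficient alpha is computed with the standard variance-based formula rather than by averaging split-half correlations.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def variance(values):
    """Sample variance (n - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

def split_half_reliability(item_scores):
    """Correlate odd-item and even-item half scores, then apply Spearman-Brown."""
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(odd_half, even_half)
    return 2 * r_half / (1 + r_half)

def coefficient_alpha(item_scores):
    """Coefficient alpha from item variances and total-score variance."""
    k = len(item_scores[0])
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 5 test takers (rows) by 4 items (columns), each scored 1-5.
scores = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]

# Hypothetical total scores for the same people on two administrations.
time_1 = [18, 9, 18, 6, 13]
time_2 = [17, 10, 19, 7, 12]

test_retest = pearson_r(time_1, time_2)
split_half = split_half_reliability(scores)
alpha = coefficient_alpha(scores)

print(f"test-retest: {test_retest:.2f}")
print(f"split-half (Spearman-Brown): {split_half:.2f}")
print(f"coefficient alpha: {alpha:.2f}")
```

The same `pearson_r` helper would serve for parallel-forms reliability (scores on form A against scores on form B) and, for two scorers rating the same responses, as one simple index of inter-scorer agreement.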
Test Norms
- Norms are used to compare an individual's test score to the performance of a larger group
- The normative sample is a group of test takers who have taken the test before and whose scores are used to establish the norm
- The types of norm groups include:
  - Age norms: the average performance of different samples of testtakers at different ages
  - Grade norms: the average performance of testtakers in a given school grade
  - Developmental norms: the average test performance of testtakers at different stages of life
  - National norms: norms derived from a sample representative of the national population
  - National anchor norms: norms that anchor scores on one test to equivalent scores on another test
  - Subgroup norms: norms segmented by any of the criteria used in selecting subjects for the sample
  - Local norms: provide normative information with respect to the local population’s performance on some test
Fixed Reference Group Scoring Systems
- The distribution of scores obtained on the test from one group of testtakers is used as the basis for the calculation of test scores for future administrations of the test
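One way to picture a fixed reference group system is a scale whose mean and standard deviation are frozen from the original group and then reused for later test takers. The sketch below is a simplified illustration under loud assumptions: the reference scores are hypothetical, the 500/100 reporting scale is chosen only as an example, and operational systems typically also equate different test forms, which this sketch does not attempt.

```python
from statistics import mean, pstdev

# Hypothetical raw scores from the fixed reference group (illustrative only).
reference_group = [40, 50, 60, 70, 80]

# These two values are computed once and then held fixed for future administrations.
REF_MEAN = mean(reference_group)
REF_SD = pstdev(reference_group)

def scaled_score(raw, scale_mean=500, scale_sd=100):
    """Place a raw score on a scale defined by the fixed reference group.

    scale_mean and scale_sd are assumed example values, not taken from the notes.
    """
    z = (raw - REF_MEAN) / REF_SD
    return scale_mean + scale_sd * z

# A later test taker is scored against the original group, not their own cohort.
print(round(scaled_score(60)))  # raw score equal to the reference mean
print(round(scaled_score(75)))  # raw score above the reference mean
```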
Criterion-Referenced vs Norm-Referenced Evaluation
- Norm-referenced evaluations compare an individual's score to the performance of a larger group, as in a standardized test.
- Criterion-referenced evaluation compares an individual’s score to a pre-determined standard or benchmark
- Criterion-referenced evaluations are often used in educational and training settings to assess specific skills or knowledge deficits
- Norm-referenced tests are used to make relative comparisons among test takers, such as in college admission tests or assessments for graduate school.
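The contrast between the two approaches can be made concrete with a single score interpreted both ways. This is a minimal sketch with hypothetical data: the norm sample, the cutoff of 75, and the definition of percentile rank as the percentage of the norm sample scoring below the given score are all assumptions made for illustration.

```python
def percentile_rank(score, norm_sample):
    """Norm-referenced: percentage of the norm sample scoring below this score."""
    below = sum(1 for s in norm_sample if s < score)
    return 100 * below / len(norm_sample)

def meets_criterion(score, cutoff):
    """Criterion-referenced: does the score reach the pre-determined standard?"""
    return score >= cutoff

# Hypothetical normative sample and mastery cutoff (illustrative values only).
norm_sample = [55, 60, 62, 65, 68, 70, 72, 75, 80, 90]
cutoff = 75

score = 72
rank = percentile_rank(score, norm_sample)
passed = meets_criterion(score, cutoff)

print(f"percentile rank: {rank}")       # position relative to the group
print(f"meets criterion: {passed}")     # position relative to the standard
```

With these values the same score sits above most of the norm sample yet still falls short of the cutoff, which is the distinction the two kinds of evaluation are meant to capture.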
Description
Explore the fundamental concepts of psychometric properties, focusing on reliability, validity, and utility as key evaluation criteria for tests. Understand the importance of validity and the necessity for local validation studies depending on the population and context. This quiz will deepen your knowledge of effective assessment tools.