History of Norm-Referenced Testing

Questions and Answers

What was the major 1916 revision of the intelligence test called?

Stanford-Binet Intelligence Scale (correct)
Rorschach Personality Test
Minnesota Multiphasic Personality Inventory
Wechsler Adult Intelligence Scale

Which type of data can have a meaningful zero and allows for all arithmetic operations?

Interval Data
Ordinal Data
Ratio Data (correct)
Nominal Data

What was a significant factor that led to the emergence of new fields of psychology in the 1980s?

Development of personality tests
Decrease in psychological assessments
Criticism of projective tests
Expansion of statistical sciences (correct)

What type of data includes values that cannot be fractions and is considered a subtype of continuous data?

Discrete Data (correct)

Which principle is NOT a basis for distinguishing between four major variable types?

Variance (correct)

What is a characteristic of norm-referenced tests?

Scores are compared to a reference sample. (correct)

Which type of data cannot accurately determine the distance between categories?

Ordinal Data (correct)

What does not apply to interval data?

Can apply all arithmetic operations (correct)

Who is known as the father of psychometrics?

Sir Francis Galton (correct)

During which period did personality tests begin to be implemented widely?

1920s-1930s (correct)

Which development occurred in the Zhou Dynasty related to testing?

Testing expanded to include assessments of conduct and manners. (correct)

What was the significant achievement of the Binet-Simon Scale?

Created a standardized sample of 50 children. (correct)

Which of the following figures is considered a key name in German Psychophysics?

Gustav Fechner (correct)

What aspect did Charles Darwin emphasize in 'The Origin of Species'?

The role of individual differences in survival. (correct)

What was one of Galton's contributions to psychological measurement?

Development of psychometry. (correct)

How did the Binet-Simon Scale evolve over time?

It increased in the number of items and standard sample size. (correct)

Which type of variable is unsuitable for applying mathematical operations?

Nominal Variable (correct)

What does additivity assume in measurement?

Unit sizes remain constant while counting (correct)

What is the relative frequency of a value calculated from?

Frequency of the value divided by the total observations (correct)

Which measure of central tendency represents the exact middle point of a dataset?

Median (correct)

How does high dispersion affect measures of central tendency?

They may not accurately describe data distribution (correct)

In frequency tables, what do the rows typically represent?

Categories of the nominal or interval variable (correct)

Which statement about the mean is accurate?

It is calculated by summing all values and dividing by the sample size. (correct)

What does a higher range indicate in a dataset?

Greater dispersion among scores (correct)

What is the primary purpose of reporting your norm sample information?

To provide evidence of representativeness (correct)

What does a z-score of 0 indicate?

The score is exactly at the mean (correct)

How is a T-score calculated from a z-score?

Multiply the z-score by 10 and add 50 (correct)

What range can a z-score typically fall within?

-3.00 to 3.00 (correct)

What does a T-score of 60 indicate?

The score is high compared to the standardization group (correct)

How is a stanine score calculated from a z-score?

Multiply the z-score by 2, add 5 and round (correct)

What does a standard deviation represent in a norm sample?

The degree of variation in the scores (correct)

What is the purpose of standardized scores?

To make scores interpretable without statistical knowledge (correct)

What is the main purpose of including distractors in a multiple-choice test?

To reduce the probability of guessing the correct answer (correct)

What is considered optimal for reliability in multiple-choice tests according to psychometric theory?

Four response options with three distractors (correct)

What happens to a test taker's score when a correction for guesswork is applied?

Some correct answers may be deducted to account for chance (correct)

Which of the following is a disadvantage of multiple-choice tests?

Writing good items requires significant skill and time (correct)

How does the number of good distractors impact the reliability of a test item?

Two good distractors are typically sufficient for reliability (correct)

When is it statistically beneficial to guess on a chance-corrected test?

When two response options remain after eliminating others (correct)

What is the purpose of rating-scale items in assessments?

To gauge attitudes on a relative scale (correct)

What can negatively affect the reliability of a multiple-choice test?

Bad distractors that are easily identifiable (correct)

Study Notes

Norm-referenced Tests

Scores are benchmarked against a reference sample rather than an absolute standard.
Test takers should belong to the reference population for accurate comparison.

Historical Context of Measurement

Testing practices trace back 4000 years to China for talent selection.
Confucius emphasized that individuals differ even though people are similar by nature.
Mencius asserted that differences among individuals can be measured.
During the Xia Dynasty, skill competitions were used to select officers; later, in the Zhou Dynasty, testing expanded to include assessments of conduct and manners.

Darwinian Roots

Charles Darwin, in 1859, highlighted individual differences in animal species.
Sir Francis Galton suggested measurable individual traits related to survival fitness.
Galton pioneered mental difference measurement and introduced the term psychometry, foundational to psychometrics.
Although most of Galton's measures were flawed, he is recognized as the father of psychometrics, significantly influencing statistical applications in psychological measurement.

German Psychophysics

Focused on identifying thresholds in the perception of physical stimuli.
Key figures: Herbart, Weber, Fechner, and Wundt, the last of whom is known as the father of psychology.
Their work influenced the development of behavioral tests and psychometrics.

Early Intelligence Testing

1905: The Binet-Simon Scale became the first standardized intelligence test with a sample of 50 children.
By 1908, the sample size grew to 200, and the item count doubled.
The 1916 revision led to the Stanford-Binet Intelligence Scale, with a sample size of 1,000.

Personality Tests (1920s-1940s)

Personality assessments emerged during World War I to evaluate stable traits.
These early tests faced significant criticism, which contributed to the development of projective tests such as the Rorschach Test after the war.

Resurgence in Testing (1980s)

Growth in fields such as neuropsychology, health psychology, and forensic psychology revitalized test development.
Advances in statistical sciences furthered measurement theories.

Definitions of Measurement Types

Ratio Data: Continuous; zero is meaningful; allows all arithmetic operations (e.g., personal income, age).
Interval Data: Continuous; zero is not meaningful; allows only addition and subtraction (e.g., Fahrenheit temperature).
Discrete Data: Whole numbers only; values cannot be fractions (e.g., number of children).
Ordinal Data: Categorical; ranks categories; zero is not meaningful (e.g., Likert scales).
Nominal Data: Categorical; no ranking; zero is not meaningful (e.g., ethnicity, marital status).

Units of Analysis

Units of measurement must be defined to ensure accurate comparisons; psychological measurements assume equal distances between units.
Additivity assumes that the size of the measurement unit remains constant while counting.

Frequency Tables

Univariate tables have one row per category of the variable, with columns for frequency and percentage.
Crosstabulations place one variable in the rows and another in the columns; the cells can present various statistics.
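
As a minimal sketch of a univariate frequency table, the Python below counts each category and divides by the total number of observations to get the relative frequency. The function name `frequency_table` and the marital-status responses are illustrative assumptions, not part of the lesson, and the input is assumed to be non-empty.

```python
from collections import Counter


def frequency_table(values):
    """Return one row per category: (category, frequency, relative frequency, percentage)."""
    counts = Counter(values)
    total = len(values)  # total observations; assumed to be greater than zero
    rows = []
    for category in sorted(counts):
        freq = counts[category]
        rel = freq / total  # relative frequency = frequency / total observations
        rows.append((category, freq, rel, 100 * rel))
    return rows


# Hypothetical marital-status responses (illustrative data only).
responses = ["single", "married", "married", "divorced", "single", "married"]
for category, freq, rel, pct in frequency_table(responses):
    print(f"{category:<10}{freq:>3}{rel:>8.2f}{pct:>8.1f}%")
```

Each printed row corresponds to one category, with the frequency, relative frequency, and percentage in the columns.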

Measures of Central Tendency

Common measures: Mean (average), Median (middle value), Mode (most frequent value).
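
The three measures can be computed with Python's standard `statistics` module. This is a small sketch; the helper `central_tendency` and the score list are hypothetical.

```python
import statistics


def central_tendency(scores):
    """Return the mean, median, and mode of a non-empty list of scores."""
    return {
        "mean": statistics.mean(scores),      # sum of all values / sample size
        "median": statistics.median(scores),  # middle value of the sorted data
        "mode": statistics.mode(scores),      # most frequent value
    }


# Hypothetical test scores (illustrative data only).
scores = [2, 3, 3, 5, 7, 8, 14]
summary = central_tendency(scores)
print(summary)
```

In this made-up sample the single high score of 14 pulls the mean (6) above the median (5), which illustrates why the measures can disagree.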

Dispersion

Dispersion describes how much values vary; when dispersion is high, scores differ widely and measures of central tendency may not describe the distribution accurately.
Range is the simplest dispersion measure.
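
A short sketch of why dispersion matters: the two hypothetical samples below share the same mean but have very different ranges and standard deviations. The helper `score_range` and both data sets are assumptions made for illustration.

```python
import statistics


def score_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)


# Two hypothetical samples with the same mean but different spread.
tight = [48, 49, 50, 51, 52]
spread = [10, 30, 50, 70, 90]

for name, sample in (("tight", tight), ("spread", spread)):
    print(
        name,
        "mean:", statistics.mean(sample),
        "range:", score_range(sample),
        "sd:", round(statistics.stdev(sample), 2),  # sample standard deviation
    )
```

The mean alone would suggest the two samples are alike; the range and standard deviation show that they are not.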

Multiple-choice Testing

Each item comprises a statement with one correct response and several distractors; items typically feature four or five options.
Good distractors are crucial to test reliability; research suggests that four response options (one correct answer and three distractors) optimize reliability.

Correction for Chance

Scores can be adjusted for guessing; corrected scores account for likely guessed responses.
On a chance-corrected test, it is best to avoid guessing unless only two options remain.
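
The notes do not state which correction formula is used. The sketch below assumes the conventional one, in which the corrected score is the number right minus the number wrong divided by (number of options - 1), with omitted items neither rewarded nor penalized. The function name and the example numbers are hypothetical.

```python
def corrected_score(num_right, num_wrong, num_options):
    """Chance-corrected score, assuming the conventional formula R - W / (k - 1).

    Omitted items are not counted as wrong, so they carry no penalty.
    """
    if num_options < 2:
        raise ValueError("an item needs at least two response options")
    return num_right - num_wrong / (num_options - 1)


# Hypothetical test taker: 30 right, 9 wrong, 1 omitted, on four-option items.
print(corrected_score(num_right=30, num_wrong=9, num_options=4))
```

Under this assumed formula, every three wrong answers on four-option items cancel one correct answer.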

Rating-Scale Items

Use ordinal response options, often on a 7-point scale, to gauge attitudes on a relative scale; they work best with a large sample size.

Reporting Norm Samples

Report characteristics of the norm group, including mean scores and standard deviation.
Ensure representativeness of the sample.

Standard Scores

A z-score expresses how far a score lies from the mean in standard deviation units; z-scores typically fall between -3.00 and 3.00.
A z-score of zero is average; positive indicates above average while negative indicates below average.
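
As a brief sketch, a z-score is the raw score minus the norm-sample mean, divided by the norm-sample standard deviation. The function name and the norm values (mean 50, standard deviation 10) are assumed for illustration only.

```python
def z_score(raw, norm_mean, norm_sd):
    """Number of standard deviations a raw score lies above or below the norm mean."""
    if norm_sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - norm_mean) / norm_sd


# Hypothetical norm sample with mean 50 and standard deviation 10.
for raw in (42, 50, 65):
    print(raw, z_score(raw, norm_mean=50, norm_sd=10))
```

A raw score equal to the norm mean yields zero, a higher score yields a positive z-score, and a lower score yields a negative one.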

Standardized Scores

T-scores transform z-scores for easier comprehension; calculated by multiplying z-score by 10 and adding 50.
Stanines convert z-scores into a 1-9 scale; calculated by multiplying the z-score by 2, adding 5, and rounding.
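
Both transformations can be sketched in a few lines of Python. Two choices below are assumptions rather than part of the lesson: halves are rounded up (Python's built-in `round` rounds halves to the nearest even number), and results are clamped so extreme z-scores stay within the 1-9 stanine scale.

```python
import math


def t_score(z):
    """T-score: multiply the z-score by 10 and add 50."""
    return 10 * z + 50


def stanine(z):
    """Stanine: multiply the z-score by 2, add 5, and round to a whole number.

    Assumptions: halves round up, and the result is clamped to the 1-9 scale.
    """
    rounded = math.floor(2 * z + 5 + 0.5)
    return max(1, min(9, rounded))


for z in (-3.0, 0.0, 1.0, 3.0):
    print(z, t_score(z), stanine(z))
```

A z-score of 1.0 gives a T-score of 60, which is above the standardization group's average of 50.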

IQ Test Standardized Scores

IQ tests standardize scores by multiplying z-scores by 15 and adding 100 for interpretation.
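
The same pattern applies to IQ-style scores. In this sketch the function name `iq_score` is hypothetical; the constants 15 and 100 come from the formula above.

```python
def iq_score(z):
    """IQ-style standardized score: multiply the z-score by 15 and add 100."""
    return 15 * z + 100


for z in (-2.0, 0.0, 1.0):
    print(z, iq_score(z))
```

A z-score of 0 maps to 100, and each standard deviation above or below the mean adds or subtracts 15 points.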
