Test Scores: Norms and Standardization

Questions and Answers

What key concepts facilitate the interpretation of test/scale scores?

  • Objectivity and standardization
  • Norms and reliability (correct)
  • Sensitivity and specificity
  • Validity and practicality

Norms provide a comparative frame of reference to make sense of an individual's test score.

True (A)

What does reliability indicate about test scores?

consistency or repeatability

What does a norm-referenced test primarily aim to do?

Classify examinees from low to high based on their scores. (C)

Raw scores obtained from psychological tests are meaningful in isolation.

False (B)

Norms indicate an examinee's standing relative to the performance of others from the same ______, gender, etc.

age

Match the following test types with their primary purpose:

  • Norm-Referenced Tests = Classify examinees relative to each other
  • Criterion-Referenced Tests = Compare accomplishments to a performance standard

What is the main purpose of criterion-referenced tests?

To compare an individual's performance against a set standard or criterion. (B)

Criterion-referenced tests select content based on its relevance to the curriculum.

True (A)

In the context of criterion-referenced tests, what is being assessed?

mastery or non-mastery of specific behavior

Which of the following is considered a measure of central tendency?

Mean (B)

Histograms represent frequency tables visually using line graphs.

False (B)

In a histogram, bars represent the interval of the frequency table, while the ______ of each bar indicates the frequency.

height

What is the first step in constructing a histogram?

Creating a frequency table. (B)

In frequency polygons, data points are plotted at the top of each interval.

False (B)

What two elements must a frequency table include for the construction of frequency polygons?

interval midpoints and frequencies

What do measures of relative standing primarily provide?

Information about where a particular score falls in relation to others. (A)

A Z-score represents an observation's distance from the median.

False (B)

A Z-score measures the distance from the mean in ______ units.

standard deviation

What is the formula for calculating a Z-score?

$z = \frac{X - \bar{X}}{SD}$ (C)

In a Z-score distribution, the mean is always 1 and the standard deviation is always 0.

False (B)

If a Z-score is zero, where does it fall relative to the mean?

on the mean

What percentage of subjects fall within one standard deviation of the mean in a normal distribution?

Approximately 68% (D)

In a normal distribution, the mean, mode, and median are always different.

False (B)

Data that do not form a normal distribution and have most scores on the high end result in a distribution that is ______ skewed.

negatively

What is the characteristic of the normal curve when data are skewed?

Skewed data do not possess the characteristics of the normal curve (B)

With a normal distribution and a data point two standard deviations above the mean, a Z-score can be used to determine percentile.

True (A)

On what kind of scale does Stanine place test scores?

9-point standard scale

What are T scores in Psychological Testing?

They are normally distributed scores with a mean of 50 and a standard deviation of 10. (A)

Z-scores can be expressed with decimals, while a Stanine is a whole number from 1 to 9.

True (A)

The percentile rank tells you the ______ of scores in a reference group that fall below a particular raw score.

percentage

Which of the following describes age norms?

Facilitates same-aged comparisons. (A)

Local norms are derived for nationally representative groups.

False (B)

Define what reliability means.

consistency in measurement

According to Classical Test Theory, what does 'X' represent in the equation X = T + e?

The observed score (A)

According to Classical Test Theory, the error 'e' can be 0.

False (B)

According to Classical Test Theory, in the equation X = T + e, T is the ______ score.

true

Which of the following is an example of systematic measurement error?

A test consistently measuring something other than the trait it is intended to. (A)

Item selection is not considered a source of measurement error.

False (B)

Does random error affect the average?

no

Match each reliability type with its corresponding procedure.

  • Test-retest reliability = Administer the identical test twice to the same group
  • Alternate-forms reliability = Two different forms of the same test
  • Split-half reliability = Correlate the pairs of scores from equivalent halves in a test

Flashcards

What are norms?

Distribution of scores by a well-established group.

What are raw scores?

Basic information provided by a psychological test.

What are Norm-Referenced Tests?

Test that classifies examinees from low to high performance.

What are Criterion-Referenced Tests?

Test that compares accomplishments to a performance standard.

What is a Z-score?

Measure of an observation's distance from the mean in standard deviation units.

What is Stanine?

Scaling test scores on 9-point standard scale.

What are T scores?

Normally distributed scores with mean of 50, SD of 10.

What is Percentile Rank?

Percentage of scores in a reference group below a raw score.

What are Age Norms?

Compares examinee's performance to others of the same age.

What are Grade Norms?

Describes performance level for each separate grade.

What are Local Norms?

Derived for representative local examinees.

What are Subgroup Norms?

Scores obtained from an identified subgroup.

What is Reliability?

Consistency in measurement.

What is the Classical Test Theory?

Observed score = true score + error.

What is Item Selection Error?

Which items? In which wording? Fair to examinees?

What is Test Administration Error?

Uncomfortable room, dim lighting, noise causes this.

What is Test Scoring Error?

Judgment required in scoring non-multiple choice tests.

What is a Random Error?

Any factor that randomly affects measurement across the sample.

What is Systematic Error?

Test consistently measures something other than the intended trait.

What is the Reliability Coefficient?

Ratio of true score variance to total variance of test scores.

What is the Correlation Coefficient?

Expresses the degree of linear relationship between two sets of scores.

What is Test-Retest Reliability?

Administer identical test twice, correlate the results.

What is Alternate-Forms Reliability?

Two different forms of test are given; scores are correlated.

What is Split-Half Reliability?

Correlate pairs of scores from equivalent halves of a test.

What is Coefficient Alpha?

Average of all possible split-half coefficients.

What is Inter-Rater Reliability?

Tests scored by two examiners; scores are correlated.

What are Norms?

Distribution of scores on a particular test by a well established group of people

What is a Frequency Table?

A table of intervals (for example, blood pressure readings) with the number of individuals falling within each interval.

What are Histograms?

A bar graph displaying intervals of a frequency table with the height indicating frequency.

What are Frequency Polygons?

Visually represent the data that includes interval midpoints and frequencies.

Study Notes

  • Two key concepts are needed to interpret test/scale scores: norms and reliability
  • Norms help make sense of an individual's score using comparative frames of reference
  • Reliability indicates whether test scores are consistent or repeatable

Part I: Norms

  • Norms represent the distribution of scores on a test from a well-established group
  • Norms indicate an examinee's standing relative to others of the same age, gender, etc.
  • A representative, large, and heterogeneous sample is selected to develop the norms

Norms and Test Standardization

  • Raw scores
  • Criterion-referenced tests
  • Essential statistical concepts
  • Raw score transformations
  • Selection of a norm group

Raw Scores

  • Raw scores are the basic level of information from a psychological test
  • Raw scores are meaningless in isolation
  • Raw scores become meaningful when compared with others' scores
  • Raw scores become meaningful when related to norms obtained from a representative sample
  • Norms help determine if an obtained score is low, average, or high
  • All norms statistically summarize a large body of scores

Norm vs. Criterion-Referenced Tests

  • Norm-Referenced Tests classify examinees from low to high
  • Norm-Referenced Tests use a representative sample of individuals
  • Items are chosen to provide maximal discrimination
  • Example: IQ tests, which determine if a test taker is more intelligent than others
  • Criterion-Referenced Tests compare accomplishments to a performance standard (%)
  • Content is selected based on relevance in the curriculum
  • Criterion-Referenced Tests identify an examinee's mastery or non-mastery of specific behavior
  • Example: a driving test

Essential Statistical Concepts

  • Measures of Central Tendency: mean, median, mode
  • Measures of Variation: standard deviation (SD), variance
  • Raw data are difficult to display, so they need to be summarized and organized into a meaningful pattern
  • Frequency tables, histograms, and frequency polygons visually represent the data

Frequency Tables & Histograms

  • A frequency table is needed to build a histogram
  • A frequency table specifies a small number of equal-sized intervals, counting how many scores fall within each interval
  • Histograms display the developed frequency tables
  • Histograms are bar graphs: bars represent the interval of the frequency table, the height of each bar indicates the frequency
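The frequency-table step described above can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table`, its parameters, and the quiz scores are invented here, and the row labels assume whole-number scores.

```python
def frequency_table(scores, low, width, n_intervals):
    """Count how many scores fall in each equal-sized interval.

    Interval i covers low + i*width up to (but not including) low + (i+1)*width.
    Returns rows of (interval_start, interval_end, frequency).
    """
    counts = [0] * n_intervals
    for s in scores:
        i = int((s - low) // width)
        if 0 <= i < n_intervals:  # scores outside the table are ignored
            counts[i] += 1
    return [(low + i * width, low + (i + 1) * width, c)
            for i, c in enumerate(counts)]


# Hypothetical quiz scores, invented for illustration only.
scores = [3, 7, 8, 12, 13, 13, 14, 16, 18, 19]
table = frequency_table(scores, low=0, width=5, n_intervals=4)

# A text histogram: one bar per interval, bar length = frequency.
for start, end, count in table:
    print(f"{start:>2}-{end - 1:<2} | {'#' * count} ({count})")
```

Each printed row corresponds to one histogram bar; the bar's length plays the role of the bar height in a drawn histogram.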

Constructing Frequency Polygons

  • A frequency table that includes interval midpoints and frequencies is needed
  • Dots are placed above each interval midpoint at the height of the class frequency
  • Two dots are put on the horizontal axis, one before the first point and one after the last point
  • These two new points are not in the table so the frequency of each is 0
  • Dots are connected with straight lines
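As a rough sketch of the construction steps above, the points of a frequency polygon can be computed from a frequency table. The function name `polygon_points` and the table values are hypothetical, and the sketch assumes equal-sized, contiguous intervals.

```python
def polygon_points(table):
    """Return the (x, y) points of a frequency polygon.

    `table` holds rows of (interval_start, interval_end, frequency).
    """
    width = table[0][1] - table[0][0]
    # One dot above each interval midpoint, at the height of its frequency.
    points = [((start + end) / 2, freq) for start, end, freq in table]
    # Two extra dots on the horizontal axis (frequency 0), one interval
    # before the first midpoint and one interval after the last.
    before = (points[0][0] - width, 0)
    after = (points[-1][0] + width, 0)
    return [before] + points + [after]


# Hypothetical frequency table, invented for illustration only.
table = [(10, 15, 1), (15, 20, 2), (20, 25, 4), (25, 30, 3)]
points = polygon_points(table)
print(points)
```

Connecting consecutive points with straight lines, in order, gives the polygon.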

Measures of Relative Standing

  • Used to provide information about where a score falls in relation to other scores in a distribution
  • Commonly used measures include: Z-Scores, Normal Distribution, Stanine, T Scores

Z-Scores

  • A measure of an observation's distance from the mean, measured in standard deviation units
  • Z = (X - X̄) / SD
  • Z-scores are advantageous when the grades of two students in different classes need to be compared
  • To find a z-score, subtract the mean and divide by the standard deviation
  • If a z-score is zero, it's on the mean
  • If a z-score is positive, it's above the mean
  • If a z-score is negative, it's below the mean
  • A z-score of 1 is 1 SD above the mean
  • A z-score of -2 is 2 SDs below the mean
  • The mean of all z-scores for a sample is 0, and the SD is 1
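The points above can be illustrated with a short sketch that compares two students from different classes. The class scores are invented, and using the population standard deviation (`statistics.pstdev`) is an assumption made here; the notes do not say which SD formula is intended.

```python
from statistics import mean, pstdev


def z_scores(scores):
    """Convert raw scores to z-scores: subtract the mean, divide by the SD."""
    m = mean(scores)
    sd = pstdev(scores)  # assumption: population SD
    return [(x - m) / sd for x in scores]


# Hypothetical grades in two classes, invented for illustration only.
class_a = [60, 70, 80, 90, 100]
class_b = [40, 45, 50, 55, 60]

z_a = z_scores(class_a)
z_b = z_scores(class_b)

# A raw 90 in class A versus a raw 60 in class B.
student_a = z_a[class_a.index(90)]
student_b = z_b[class_b.index(60)]
print(f"Class A student: z = {student_a:.2f}")
print(f"Class B student: z = {student_b:.2f}")
```

Although 90 is the larger raw score, the class B student has the higher z-score in this made-up data, because 60 lies further above that class's mean in SD units.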

Normal Distribution

  • The scores of 30 students on a quiz are measured and graphed
  • As more scores are plotted, a pattern emerges
  • As scores become larger or smaller, there are fewer people with that measurement
  • Most measurements tend to fall in the middle, with fewer approaching the high and low extremes
  • Smoothing the lines creates a bell-shaped curve
  • This bell-shaped curve is known as the "Bell Curve" or the "Normal Curve"
  • For example, midterm scores for 51 students in research skills produced such a curve
  • Mean, mode, and median will fall on the same point
  • Normal distributions are a family of distributions that have the same general shape
  • Normal distributions are symmetric: the left side is a mirror image of the right
  • Scores are more concentrated in the middle than in the tails
  • Normal distributions differ in how spread out they are, but the area under each curve is the same
  • If data fits a normal distribution, about 68% of subjects will fall within one standard deviation from the mean
  • 95% will fall within two standard deviations
  • Over 99% will fall within three standard deviations
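The percentages above can be checked with Python's standard library. This is a small sketch using `statistics.NormalDist`; the helper names `share_within` and `percentile_for_z` are made up for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


def percentile_for_z(z):
    """Percentage of a normal distribution falling below a given z-score."""
    return 100 * standard_normal.cdf(z)


for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(k):.1%}")

print(f"z = 2 is at roughly the {percentile_for_z(2):.1f}th percentile")
```

The second helper shows why, for normally distributed data, a z-score is enough to look up a percentile.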

Standard Deviation

  • The mean and standard deviation describe a set of scores
  • Scores grouped closely together have a smaller standard deviation than scores spread farther apart
  • If raw scores are normally distributed (like the midterm score data, with a mean of 17 and an SD of 2.24), the data have the properties predicted by the normal curve
  • If the distribution of scores is normal, a subject's standardized score can be determined from their raw score together with the mean and SD
  • Standardized scores are useful when comparing a student's performance across different tests, or when comparing students
  • The number of points that one standard deviation equals varies from distribution to distribution
  • Skew refers to the tail of the distribution
  • In a negative skew, the tail is on the negative (left) side of the graph
  • In a positive skew, the tail is on the positive (right) side of the graph

Using the Normal Distribution

  • Skewed data do not possess the characteristics of the normal curve
  • In skewed data the mean, mode, and median do not fall on the same score
  • The mode is represented by the highest point, mean is toward the side with the tail, and the median falls between the mode and mean
  • A student's standing (for example, Fatima's score) can be located on the normal distribution in order to compare results
  • Six categories cover 99% of students: very weak, weak, below average, above average, good, very good
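The ordering of mode, median, and mean described for skewed data can be checked on a small data set. The scores below are invented for illustration: most are on the high end with a tail to the left (a negative skew), and the second list mirrors that pattern to give a positive skew.

```python
from statistics import mean, median, mode

# Hypothetical scores, invented for illustration only.
negatively_skewed = [2, 6, 8, 9, 9, 10, 10, 10]   # tail on the low (left) side
positively_skewed = [0, 0, 0, 1, 1, 2, 4, 8]      # tail on the high (right) side

for label, data in (("negative skew", negatively_skewed),
                    ("positive skew", positively_skewed)):
    print(f"{label}: mode={mode(data)}, median={median(data)}, mean={mean(data)}")
```

In both lists the mean sits on the side of the tail and the median falls between the mode and the mean, so the three values no longer coincide as they do in a normal distribution.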

Stanine

  • Stanine (STAndard NINE) scales test scores on a 9-point standard scale
  • Stanines are based on the normal distribution: the bell curve is sliced into 9 pieces
  • The mean of Stanines is 5 and the SD is 2
  • Stanines assign a number to a member of a group, relative to all members in that group
  • Whole positive numbers are used
  • 9 ratings: bottom 4%, next bottom 7%, next bottom 12%, next bottom 17%, middle 20%, next top 17%, next top 12%, next top 7%, top 4%
  • A person with a score of 9 is in the top 4% of scorers, while a person with a score of 1 is in the bottom 4%
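One way to picture the nine slices is as a lookup from percentile rank to stanine. The sketch below uses the cumulative totals of the percentages listed above (4, 11, 23, 40, 60, 77, 89, 96); the function name and the handling of scores that fall exactly on a boundary are assumptions made for this example.

```python
from bisect import bisect_left

# Cumulative upper bounds (in percent) for stanines 1 through 8.
# Anything above 96 falls in stanine 9.
CUTOFFS = [4, 11, 23, 40, 60, 77, 89, 96]


def stanine_from_percentile(percentile_rank):
    """Map a percentile rank (0-100) to a stanine (1-9).

    Assumption: a rank exactly on a cutoff goes to the lower stanine.
    """
    return bisect_left(CUTOFFS, percentile_rank) + 1


for pr in (2, 30, 50, 90, 99):
    print(f"percentile rank {pr:>2} -> stanine {stanine_from_percentile(pr)}")
```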

Z-Scores vs Stanine

  • Z-scores can be expressed with decimals, while stanines are always positive whole numbers from 1 to 9
  • Two scores within the same stanine can be farther apart than two scores in adjacent stanines, which limits the precision of stanines

T Scores

  • T scores are normally distributed with a mean of 50 and an SD of 10
  • If the distribution of scores is normal, T scores can be found in three steps:
  • Convert each score to a Z score
  • Multiply the Z score by 10
  • Add the result to 50 (for a negative Z, this means subtracting from 50)

Converting Z-Scores

  • Once there is a set of z-scores, converting to any other scale can be done
  • New score = (Z-score × SD of the new scale) + mean of the new scale
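The conversion rule can be written as a one-line function, with the T score (mean 50, SD 10) as a special case. The function names `rescale` and `t_score` are invented for this sketch.

```python
def rescale(z, new_mean, new_sd):
    """Convert a z-score to a scale with the given mean and SD."""
    return z * new_sd + new_mean


def t_score(z):
    """T scores use a mean of 50 and an SD of 10."""
    return rescale(z, new_mean=50, new_sd=10)


for z in (-2, 0, 1.5):
    print(f"z = {z:>4} -> T = {t_score(z)}")
```

Because a negative z multiplied by the SD is negative, adding it to the mean automatically handles the "add or subtract" step described for T scores.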

Percentile Ranks

  • Percentile Rank tells the percentage of scores in a reference group that fall below a particular raw score
  • A score at the 93rd percentile means that 93 percent of the scores in the reference group fall below that score
  • Percentile ranks have ordinal measurement properties and can be calculated using SPSS
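As an illustration of the definition above, a percentile rank can be computed by counting the reference scores that fall below a given raw score. The function name and the reference scores are hypothetical, and other conventions exist (for example, counting tied scores as half below); this sketch follows the "percentage below" wording only.

```python
def percentile_rank(score, reference_scores):
    """Percentage of reference scores that fall below `score`."""
    below = sum(1 for s in reference_scores if s < score)
    return 100 * below / len(reference_scores)


# Hypothetical reference group, invented for illustration only.
reference = [55, 60, 62, 65, 70, 71, 75, 80, 85, 90]

print(percentile_rank(80, reference))  # 7 of the 10 scores are below 80
```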

Types of Standard Scores

  • Mean and SD are used to determine the type of standard scores

Selecting a Norm Group

  • Age and grade norms
  • Local and subgroup norms

Age and Grade Norms

  • Age norms facilitate same-aged comparisons
  • Examinee performance is compared with standardization subjects of the same age
  • The age span can vary from a few months to a decade
  • Grade norms describe the level of performance for each grade, useful for school settings

Local and Subgroup Norms

  • Local norms are derived from representative local examinees as opposed to a national sample (e.g., state norms vs. national norms)
  • Subgroup norms consist of scores from an identified subgroup (e.g., African American examinees, females)

Part II: Reliability

  • Reliability is consistency in measurement
  • A measure has high reliability if it produces similar results under consistent conditions
  • Continuum ranges from minimal consistency to near perfect
  • Reliability indicates accuracy which is related to measurement error
  • Classical test theory of measurement

Classical Test Theory

  • Factors that contribute to consistency: factors related to the attribute being measured
  • Factors that contribute to inconsistency: factors related to the individual or the test (but NOT the attribute being measured)
  • X = T + e (X is the observed score, T is the true score, e is the error)
  • e = X - T
  • e cannot be 0 and T value is unknown

Sources of Measurement Error

  • Item Selection: Items selected in tests might not be fair to all examinees
  • Test Administration: uncomfortable room temperature, dim lighting, noise, examinees' motivation, anxiety, concentration, and fatigue
  • Test Scoring: non-multiple choice scoring requires judgment, especially with essays
  • Systematic Measurement Error

Types of Error

  • All of the errors discussed so far are unsystematic measurement errors (random error)
  • Systematic error: a test consistently measures something other than the trait intended
  • X = T + es + eu where es is the systematic error and eu is the unsystematic error
  • Random error is caused by any factor that randomly affects measurement of the variable across the sample
  • Random error does not have any consistent effects across the entire sample.
  • Random error pushes scores up or down randomly, so across a distribution its effects tend to cancel out and it does not affect the average
  • Systematic error affects the average (bias)
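The different effects of random and systematic error on the average can be sketched with made-up numbers. All values below are invented, true scores are never known in practice, and the random errors here are constructed to sum to exactly zero; in real data they only tend to cancel out.

```python
from statistics import mean

# X = T + es + eu, with hypothetical values for illustration only.
true_scores = [10, 12, 14, 16, 18]     # T (unknown in practice)
random_error = [2, -1, 0, 1, -2]       # eu: differs per person, sums to zero here
systematic_error = 3                   # es: the same shift for every person

observed_random_only = [t + eu for t, eu in zip(true_scores, random_error)]
observed_with_bias = [t + systematic_error + eu
                      for t, eu in zip(true_scores, random_error)]

print("mean of true scores:        ", mean(true_scores))
print("mean with random error only:", mean(observed_random_only))
print("mean with systematic error: ", mean(observed_with_bias))
```

Individual observed scores change in both cases, but only the systematic error moves the average, which is what the notes call bias.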
