AI MCQ Study Content2 - Standardized Testing PDF

Standardized Testing

Standardized tests have become the most commonly hated assessments in the country. By the end of this lesson, the goal is to help you understand why they are in fact needed, what they are good for, how to interpret them, and how teachers can use their results to help students.

EXPLAIN

What is Standardized about Standardized Testing?

Standardization indicates that the exam is administered, scored, and interpreted the same way no matter when or where it is given. Specifically, each student takes the examination under the same “standard” conditions. Standardization DOES NOT refer to the specific items on the test, but to the conditions under which the test is given. ALL students taking the test, at ALL locations, must be given the same amount of time, the same instructions, and the same assistance; otherwise the test is not standardized.

Think About It

Scenario 1
Sally is taking a standardized test. She asks Ms. Jones, “What does this word mean?” Ms. Jones replies, “Remember when we read that story in class about the dog? Now what might it mean?” How might Ms. Jones’ response impact the standardization of the test?

Scenario 2
Sally is taking a standardized test. Ms. Jones is walking around the room and says to Sally, “Hmmm. Sally, I’d check that answer if I were you.” How might Ms. Jones’ response impact the standardization of the test?

Breaking the standard conditions of the testing process is cheating, just as if we took an eraser and changed the answers.

Types of Standardized Tests

There are two types of standardized tests: Achievement and Aptitude. The two types are often discussed as if they were interchangeable, but they have very different purposes.

Achievement Tests are designed to measure a student’s mastery. What has the student learned in the past and what can the student perform in the present?
Examples: Ohio Achievement Tests (OATs), Ohio Graduation Test (OGT), Iowa Test of Basic Skills (ITBS)
Achievement tests ask about content areas such as math and reading in order to understand a student’s skills in these areas.

Aptitude Tests are designed to measure how much promise a student has for the future. We are using test results to predict the future statistically.
Examples: Scholastic Aptitude Test (SAT), ACT, PSAT
Aptitude tests ask about content areas such as math and reading only because that content is universally known and people who answer the items correctly generally have more of the predicted trait.

Module 9: Standardized Testing

Think About It
The SAT (Scholastic Aptitude Test) includes Math, Writing, and Critical Reading sections. What do you suppose the SAT is attempting to predict?

Reporting Standardized Test Scores

Standardized test scores may be reported as Criterion-Referenced or Norm-Referenced. The two lists below illustrate the differences between these two methods of reporting test results.

Methods of Reporting Criterion-Referenced Test Scores
Relatively easy to understand, and used less frequently for standardized test score reporting.
Number and Percent Correct – 9/10 = 90%
Performance Speed – Measures the time it takes to complete a task.
Performance Quality – Measures the level of performance excellence.
Performance Precision – Measures the degree of accuracy with which students complete a task.

Methods of Reporting Norm-Referenced Test Scores
Raw Scores – Reported but not very useful, since the goal of Norm-Referenced scoring is to compare students with each other.
Percentile Ranks, Development/Growth Scales, and Standardized Scores – Statistically derived and based on the Normal Distribution of scores.

Standard Deviation is the average distance an individual score is away from the mean. It allows us to see how high or low a given score is compared to the other scores in the Normal Distribution.

Properties of the Normal Distribution
1. Symmetrical (left & right are mirror images).
2. Mean, median, & mode are all in the same place (center).
3. The % of people in each Standard Deviation is known precisely. Approximately 68% of the population falls between ±1 SD; approximately 95% falls between ±2 SD; more than 99% falls between ±3 SD; less than 1% falls outside ±3 SD.

Percentile Ranks

A score in the 85th percentile
Interpretation: 85% of the individuals who also took the examination scored lower than this person.
It DOES NOT mean the person correctly answered 85% of the items.

Percentile ranks are frequently reported on standardized norm-referenced tests. They range from 1-99.

Notice that Percentile Ranks are not an equal distance apart. Scores in the tails of the distribution are further apart than scores in the middle.

Check Your Understanding
1. Why don’t percentile ranks range from 1-100? Why do they only go up to the 99th percentile?
2. Look at the Stanford 10 test. Percentile Ranks are found in the 4th column of scores on the top. In that column you will see PR-S. The score under PR indicates the student’s percentile rank. What was this student’s percentile rank in Total Reading? What does this mean?

Grade Level Equivalents

Sally is in the 9th grade. She receives standardized test scores in grade level equivalents. Her score is reported as 7.8. What does this mean?

The 7 tells us the grade level (grade 7) and the .8 tells us the month (0=September & 9=June). It is assumed a typical student will gain one unit of knowledge per month and 1 unit over the summer (July & August=10).

Tests that provide Grade Level Equivalent scores are given to multiple grade levels. So Sally scored the same on the 9th grade assessment as an average 7th grader would in their 8th month of school (May). This does not mean that Sally needs to be learning 7th grade material.
We do not know how Sally would perform on a 7th grade test which tests 7th grade content. All we can say is that Sally likely needs remediation and is performing worse than her 9th grade peers in 9th grade content.

Check Your Understanding
A 5th grade student scores 7.8 on a 5th grade math assessment. The parent approaches you and states that she would like her child moved up to 7th grade for math instruction. What should you tell this parent?

Standardized Scores

The most common presentations of norm-referenced scores in standardized tests are transformations of the raw score, in units relative to the mean and standard deviation of the normal distribution.

z-scores are used as a baseline for every transformation.

Calculating z-scores
A z-score is the distance of a raw score from the mean, measured in standard deviations: z = (Raw Score - Mean) / SD.
Suppose your students take a standardized math test with a mean of 10 and standard deviation of 2. Raw scores for each of your students are: Joe=6; Sue=10; and Jane=15. What are their z-scores?

Advantages: z-scores are standardized based on the normal distribution of scores, so they can be compared from test to test and student to student.
Disadvantages: z-scores are small numbers, and parents may not know how to interpret them. If a student gets a z-score of “0,” a parent may think this means the student correctly answered 0 of the items, even though it really means the student performed at the 50th percentile (scored better than 50% of the students taking the test). What might parents think about a negative z-score?

t-scores are larger than z-scores and use the z-score as the baseline for transformation. t-scores have a mean of 50 and an SD of 10.

Calculating t-scores
What are the t-scores for Joe, Sue and Jane from the last example?
zJoe = -2; zSue = 0; zJane = 2.5

More Standardized Scores

Additional Standard Scores you may have heard of are the SAT, GRE, & IQ.
SAT & GRE tests have a Mean of 500 and SD of 100.
IQ tests have a Mean of 100 and SD of 15 (or 16).
The calculation concept is the same: Transformed Score = Mean + SD(z)

Comparing Standardized Scores

Check Your Understanding
1. Is an SAT score of 575 above, below or at the mean?
2. What IQ score is equivalent to a t-score of 40?
3. If you had a z-score of +2.5, what would be the equivalent GRE score?

Stanines

Stanines (standard nine) were created on a 9-pt scale with a Mean of 5 and SD of 2. Each Stanine is ½ of a Standard Deviation wide, except for Stanines 1 & 9.

Check Your Understanding
Look at the Stanford 10 test. Stanines are found in the 4th column of scores on the top. In that column you will see PR-S. The score under S indicates the student’s Stanine. What was this student’s Stanine score in Total Math? What does this mean? What is this student’s Stanine score in Reading Vocabulary? What does this mean?

Normal Curve Equivalents (NCE)

Government officials invented NCEs, which are similar to Percentile Ranks, but NCEs are equal interval and Percentile Ranks are not.
NCE Mean = 50, SD = 21.06, Scale from 1-99

Check Your Understanding
Look at the Stanford 10 test. NCEs are found in the 5th column of scores on the top. What was this student’s NCE and Percentile Rank for Reading Vocabulary? What was this student’s NCE and Percentile Rank for Environment?

Remember, all standard scores in one way or another simply make use of the Mean and Standard Deviation to explain performance compared to other students taking the test at the same time.

What makes up a test score?

No test measures ability perfectly. Instead, all test scores are made up of two components: Ability and Standard Error (or noise), which makes scores a bit “fuzzy.”
Test Score = Ability + Standard Error (SEM)
SEM = The average amount of measurement error across students in the norm group.
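Looking back at the standardized scores covered so far, every one of them applies the same rule, Transformed Score = Mean + SD(z), to a z-score. The short Python sketch below works through the Joe, Sue, and Jane example under stated assumptions: the function names are illustrative only, t-scores use the conventional mean of 50 and SD of 10, IQ uses an SD of 15 rather than 16, and the percentile calculation assumes scores follow a perfectly normal distribution. It is meant as a study aid, not as the scoring procedure of any actual test publisher.

```python
from statistics import NormalDist


def z_score(raw, mean, sd):
    """Distance of a raw score from the mean, in standard deviations."""
    return (raw - mean) / sd


def transformed_score(z, mean, sd):
    """Transformed Score = Mean + SD(z)."""
    return mean + sd * z


def percentile_rank(z):
    """Percent of a normal population scoring below this z-score."""
    return NormalDist().cdf(z) * 100


# Example from the text: test mean of 10, standard deviation of 2.
raw_scores = {"Joe": 6, "Sue": 10, "Jane": 15}
z_scores = {name: z_score(raw, mean=10, sd=2) for name, raw in raw_scores.items()}
# Joe: z = -2.0, Sue: z = 0.0, Jane: z = 2.5 (the values given in the text)

# (mean, SD) for each scale; the IQ SD of 15 is one of the two values in the text.
scales = {"t-score": (50, 10), "SAT/GRE": (500, 100), "IQ": (100, 15)}

for name, z in z_scores.items():
    converted = {
        scale: transformed_score(z, mean, sd) for scale, (mean, sd) in scales.items()
    }
    print(name, "z =", z, converted, "percentile ~", round(percentile_rank(z), 1))
```

Because every scale is built from the same z-score, a score on one scale can be converted to another by first converting back to z and then applying the new mean and SD.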
Confidence Intervals

Standard Error is both added to and subtracted from the student’s score to identify the range within which the “true” ability score reasonably lies.

EXAMPLE: Connor scores a 72 on a standardized Math assessment. The standard error is calculated to be 4.5. Within what range does Connor’s “true” score probably lie?
Confidence Interval = 72 ± 4.5 = 67.5 to 76.5

What does a Confidence Interval of 67.5 to 76.5 mean? If we were to repeatedly test the student under the same conditions, 68% of the student’s scores would fall between 67.5 and 76.5 (within 1 SEM: 72 ± 4.5).

We could be more confident by using 2 SEM instead of 1. Thus, we would say that about 95% of the time the student’s score would fall between 63 and 81 (within 2 SEM: 72 ± 9) if repeatedly tested under the same conditions. While we become more likely to capture the student’s score, the score range widens considerably. For this reason, using 1 SEM is customary in testing.

Using Confidence Intervals

Comparing Multiple Students: In the figure below, the student’s reported score is where the red line is and the box around it is the “error band.” You see that Brian’s score is the highest, then Ann, Judy, and James. However, IF the error bands overlap, the students’ scores are considered statistically the same. IF the error bands do not overlap, the students’ scores are considered statistically different.

Check Your Understanding
List a pair of students who have statistically similar scores.
List a pair of students who have statistically different scores.

Comparing One Student in Multiple Subjects: The figure below shows one student’s Percentile Rank scores with their error bands in Math, Reading, Science, and Social Studies. Similarly, if the error bands overlap, the student is considered statistically similar in the content areas, and if the error bands do not overlap, the student is statistically different in the content areas.
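The confidence interval arithmetic and the overlap rule described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are made up for this example, and Students A, B, and C with their scores are hypothetical and are not the students shown in the module’s figures. Only Connor’s score of 72 and SEM of 4.5 come from the text.

```python
def confidence_interval(score, sem, n_sem=1):
    """Return the (low, high) error band: score plus and minus n_sem * SEM."""
    margin = n_sem * sem
    return (score - margin, score + margin)


def bands_overlap(band_a, band_b):
    """True when two error bands share at least one point."""
    return band_a[0] <= band_b[1] and band_b[0] <= band_a[1]


# Connor's example from the text: score of 72, SEM of 4.5.
print(confidence_interval(72, 4.5))           # (67.5, 76.5), within 1 SEM
print(confidence_interval(72, 4.5, n_sem=2))  # (63.0, 81.0), within 2 SEM

# Hypothetical students (not from the module), all with the same SEM of 4.5.
student_a = confidence_interval(72, 4.5)  # (67.5, 76.5)
student_b = confidence_interval(78, 4.5)  # (73.5, 82.5)
student_c = confidence_interval(85, 4.5)  # (80.5, 89.5)

print(bands_overlap(student_a, student_b))  # True: statistically the same
print(bands_overlap(student_a, student_c))  # False: statistically different
```

Note that Student B’s band overlaps both A’s and C’s even though A and C do not overlap each other, so the comparison has to be made one pair at a time.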
Check Your Understanding
For questions 1 & 2 use the figure above.
1. Is the student stronger in Reading or Social Studies? How do you know?
2. Is the student stronger in Math or Reading? How do you know?
Look at the Stanford 10 test. National Grade Percentile Bands are found in the box furthest to the right on the top.
3. Is this student stronger in Total Reading or Total Mathematics? How do you know?
4. Is this student stronger in Language or Total Mathematics? How do you know?

Educators Need To…

Assist parents and students in interpreting standardized test scores.
Use the data to assist with decision making in the classroom.
   o Revise instruction for the entire class.
   o Develop specific intervention strategies for individual students.
Express that these are single scores on single exams that do not represent the “totality” of a person.

Check Your Understanding
Look at the Stanford 10 test. Based on these test results, where is an area you might recommend remediation for this particular student? Why?
Look at the Grade Card at the end of this Module. It is for the same student in the same school year. Have your thoughts about your recommendation changed now? Why or why not?

Society shouldn’t make judgments based solely on a single measure, be it a standardized test or a course grade. Multiple assessments should be used to make informed decisions about people, especially when the stakes are high.

Test Sample 2: Use Throughout Module
