Measures for Variability PDF
Summary
This document provides a lecture or presentation on measures of variability in psychological statistics. It covers topics such as the range, variance, and standard deviation, explaining their purpose and how to calculate them. It also discusses the use of variability within inferential statistics.
Full Transcript
Measures for Variability
Psychological Statistics

Lesson Objectives
○ Discuss the purpose of variability in psychological research.
○ Compute and interpret the different measures of variability.

OVERVIEW
Although measures of central tendency, such as the mean and median, are handy ways to summarize large sets of data, these measures do not tell the whole story. Specifically, not everyone is average.
○ Many people may perform near average, but others demonstrate performance that is far above (or below) average. In simple terms, people are different.

A measure of variability provides an objective description of the differences between the scores in a distribution by measuring the degree to which the scores are spread out or are clustered together. The measures covered here are:
○ Standard Deviation
○ Variance
○ Range

The term variability has much the same meaning in statistics as it has in everyday language; to say that things are variable means that they are not all the same. In statistics, our goal is to measure the amount of variability for a particular set of scores, a distribution.
○ In simple terms, if the scores in a distribution are all the same, then there is no variability.

Variability provides a quantitative measure of the differences between scores in a distribution and describes the degree to which the scores are spread out or clustered together. Specifically, it tells whether the scores are clustered close together or are spread out over a large distance. Usually, variability is defined in terms of distance.
○ It tells how much distance to expect between one score and another, or how much distance to expect between an individual score and the mean.

[Figure: panels labeled "Less Variability" and "More Variability"]

Variability also measures how well an individual score (or group of scores) represents the entire distribution.
○ This aspect of variability is very important for inferential statistics, in which relatively small samples are used to answer questions about populations.

Measures of Variability
○ Range
○ Variance
○ Standard Deviation

The Range
The range is the distance covered by the scores in a distribution, from the smallest score to the largest score. When the scores are measurements of a continuous variable,
○ the range can be defined as the difference between the upper real limit (URL) for the largest score (Xmax) and the lower real limit (LRL) for the smallest score (Xmin): range = URL for Xmax – LRL for Xmin.

If the scores have values from 1 to 5, for example, the range is 5.5 – 0.5 = 5 points. When the scores are whole numbers, the range is also a measure of the number of measurement categories.
○ If every individual is classified as either 1, 2, 3, 4, or 5, then there are five measurement categories and the range is 5 points.

Defining the range as the number of measurement categories also works for discrete variables that are measured with numerical scores.
○ For example, if you are measuring the number of children in a family and the data produce values from 0 to 4, then there are five measurement categories (0, 1, 2, 3, and 4) and the range is 5 points.

Using either definition, the range is probably the most obvious way to describe how spread out the scores are: simply find the distance between the maximum and the minimum scores. The problem with using the range as a measure of variability is that it is completely determined by the two extreme values and ignores the other scores in the distribution.
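
The range computations above can be sketched in Python. This is a minimal illustration, not part of the lecture: the function names and the `unit` parameter are assumptions, `range_real_limits` assumes scores measured to the nearest whole unit so that each real limit lies half a unit beyond the extreme score, and `range_simple` is the plain Xmax – Xmin difference, a commonly used alternative that the slides do not spell out.

```python
def range_real_limits(scores, unit=1.0):
    """Range as URL for Xmax minus LRL for Xmin.

    Assumes scores are measured to the nearest `unit`, so each real
    limit lies half a unit beyond the extreme score.
    """
    upper_real_limit = max(scores) + unit / 2
    lower_real_limit = min(scores) - unit / 2
    return upper_real_limit - lower_real_limit


def range_simple(scores):
    """Range as the plain difference Xmax - Xmin (alternative definition)."""
    return max(scores) - min(scores)


scores = [1, 2, 3, 4, 5]
print(range_real_limits(scores))  # 5.5 - 0.5 = 5.0
print(range_simple(scores))       # 5 - 1 = 4

# Changing only the largest score changes the range completely,
# even though the other four scores are untouched.
print(range_real_limits([1, 2, 3, 4, 50]))  # 50.5 - 0.5 = 50.0
```

The last call illustrates why the range depends entirely on the two extreme values.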
Because the range does not consider all of the scores in the distribution, it often does not give an accurate description of the variability for the entire distribution. For this reason, the range is considered to be a crude and unreliable measure of variability.
○ Therefore, in most situations, it does not matter which definition you use to determine the range.

STANDARD DEVIATION AND VARIANCE FOR A POPULATION
The standard deviation is the most commonly used and the most important measure of variability. Standard deviation uses the mean of the distribution as a reference point and measures variability by considering the distance between each score and the mean.
○ The standard deviation provides a measure of the standard, or average, distance from the mean, and describes whether the scores are clustered closely around the mean or are widely scattered.

The fundamental definition of the standard deviation is the same for both samples and populations, but the calculations differ slightly.

Deviation is distance from the mean:
deviation score = X – μ

Notice that there are two parts to a deviation score: the sign (+ or –) and the number. The sign tells the direction from the mean, that is, whether the score is located above (+) or below (–) the mean. The number gives the actual distance from the mean.
○ For example, a deviation score of –6 corresponds to a score that is below the mean by a distance of 6 points.

The sum of squared deviations (SS)
SS, or sum of squares, is the sum of the squared deviation scores.
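
Deviation scores and SS can be sketched in Python for a small set of scores treated as a complete population. The data values and function names below are illustrative assumptions. The raw-score version, SS = ΣX^2 – (ΣX)^2 / N, is the standard algebraic equivalent of the sum of squared deviations and is included only as a cross-check. Population variance is taken here as SS / N, the mean of the squared deviations, and the standard deviation as its square root.

```python
import math


def deviations(scores):
    """Deviation score for each X: X minus the population mean."""
    mu = sum(scores) / len(scores)
    return [x - mu for x in scores]


def ss_definitional(scores):
    """SS as the sum of the squared deviations from the mean."""
    return sum(d ** 2 for d in deviations(scores))


def ss_computational(scores):
    """SS from the raw scores: sum(X^2) - (sum(X))^2 / N."""
    n = len(scores)
    return sum(x ** 2 for x in scores) - sum(scores) ** 2 / n


population = [1, 0, 6, 1]  # hypothetical scores; their mean is 2
devs = deviations(population)
print(devs)       # [-1.0, -2.0, 4.0, -1.0]
print(sum(devs))  # 0.0 (signed deviations cancel, which is why they are squared)

ss = ss_definitional(population)
print(ss, ss_computational(population))  # 22.0 22.0

variance = ss / len(population)  # mean of the squared deviations
sd = math.sqrt(variance)         # standard deviation = square root of variance
print(variance)  # 5.5
```

Because the signed deviations always sum to zero, they cannot serve as a measure of variability on their own; squaring them before adding is what makes SS useful.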
The first of the formulas for SS, SS = Σ(X – μ)^2, is called the definitional formula because the symbols in the formula literally define the process of adding up the squared deviations.

STANDARD DEVIATION AND VARIANCE FOR SAMPLES
The goal of inferential statistics is to use the limited information from samples to draw general conclusions about populations. The basic assumption of this process is that samples should be representative of the populations from which they come. This assumption poses a special problem for variability because samples consistently tend to be less variable than their populations.

The fact that a sample tends to be less variable than its population means that sample variability gives a biased estimate of population variability. This bias is in the direction of underestimating the population value rather than being right on the mark.

The sum of squares for a sample is found in three steps:
1. Find the deviation from the mean for each score: deviation = X – M
2. Square each deviation: squared deviation = (X – M)^2
3. Add the squared deviations: SS = Σ(X – M)^2

The sample variance and standard deviation are then computed from SS, dividing by n – 1 rather than n:
sample variance = s^2 = SS / (n – 1)
sample standard deviation = s = √(SS / (n – 1))

Remember that the formulas for sample variance and standard deviation were constructed so that the sample variability would provide a good estimate of population variability. For this reason, the sample variance is often called estimated population variance, and the sample standard deviation is called estimated population standard deviation.
○ When you have only a sample to work with, the variance and standard deviation for the sample provide the best possible estimates of the population variability.

Degrees of Freedom (df)
For a sample of n scores, the degrees of freedom, or df, for the sample variance are defined as df = n – 1.
○ The degrees of freedom determine the number of scores in the sample that are independent and free to vary. The n – 1 degrees of freedom for a sample is the same n – 1 that is used in the formulas for sample variance and standard deviation.

Because a sample tends to be less variable than its population, the formula for sample variance divides by n – 1 instead of n. The result of the adjustment is that sample variance provides a much more accurate representation of the population variance.
○ Specifically, dividing by n – 1 produces a sample variance that provides an unbiased estimate of the corresponding population variance.

STANDARD DEVIATION AND DESCRIPTIVE STATISTICS
Standard Deviation as a Descriptive Statistic
Standard deviation is primarily a descriptive measure; it describes how variable, or how spread out, the scores are in a distribution. Behavioral scientists must deal with the variability that comes from studying people and animals.
○ People are not all the same; they have different attitudes, opinions, talents, IQs, and personalities.

Standard deviation describes variability by measuring distance from the mean. In any distribution, some individuals are close to the mean, and others are relatively far from the mean.
○ Standard deviation provides a measure of the typical, or standard, distance from the mean.

Describing an entire distribution
When you are given the mean and the standard deviation of a distribution, you should be able to visualize the entire set of data.
○ For example, consider a sample with a mean of M = 36 and a standard deviation of s = 4. As a rule of thumb, roughly 70% of the scores in a distribution are located within a distance of one standard deviation from the mean, and almost all of the scores (roughly 95%) are within two standard deviations of the mean.
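
The sample calculation and the rule of thumb above can be sketched together in Python. This is a minimal illustration under stated assumptions: the three scores are hypothetical and were chosen only so that M = 36 and s = 4 come out exactly, and the helper names are not from the lecture. The variance divides SS by df = n – 1, as described for samples.

```python
import math


def sample_variance(scores):
    """Sample variance: SS divided by df = n - 1."""
    n = len(scores)
    m = sum(scores) / n
    ss = sum((x - m) ** 2 for x in scores)
    return ss / (n - 1)


def sample_sd(scores):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(scores))


def sd_band(mean, sd, k):
    """Interval from k standard deviations below the mean to k above it."""
    return (mean - k * sd, mean + k * sd)


sample = [32, 36, 40]  # hypothetical scores with M = 36
print(sample_variance(sample))  # SS = 32, df = 2, so 16.0
print(sample_sd(sample))        # 4.0

# Rule-of-thumb intervals for M = 36 and s = 4
print(sd_band(36, 4, 1))  # (32, 40): roughly 70% of scores
print(sd_band(36, 4, 2))  # (28, 44): roughly 95% of scores
```

The percentages are the lecture's rule of thumb, not something the code computes; the functions only locate the interval boundaries.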
Describing the location of individual scores
The scale of measurement helps to complete the picture of the entire distribution and relate each individual score to the rest of the group.
○ In this example (M = 36, s = 4), you should realize that a score of X = 34 is located near the center of the distribution, only slightly below the mean.
○ On the other hand, a score of X = 45 is an extremely high score, located far out in the right-hand tail of the distribution.

The general point of this discussion is that the mean and standard deviation are not simply abstract concepts or mathematical equations. Instead, these two values should be concrete and meaningful, especially in the context of a set of scores.
○ The mean and standard deviation are central concepts for most of the statistics that are presented in the following chapters.

VARIANCE AND INFERENTIAL STATISTICS
In very general terms, the goal of inferential statistics is to detect meaningful and significant patterns in research results. The basic question is whether the patterns observed in the sample data reflect corresponding patterns that exist in the population, or are simply random fluctuations that occur by chance. Variability plays an important role in the inferential process because the variability in the data influences how easy it is to see patterns.

In general, low variability means that existing patterns can be seen clearly, whereas high variability tends to obscure any patterns that might exist. In the context of inferential statistics, the variance that exists in a set of sample data is often classified as error variance.
○ This term is used to indicate that the sample variance represents unexplained and uncontrolled differences between scores.
As the error variance increases, it becomes more difficult to see any systematic differences or patterns that might exist in the data.

Summary
The purpose of a measure of variability is to describe the degree to which the scores in a distribution are spread out or clustered together.
○ The range is the distance covered by the set of scores, from the smallest score to the largest score.
○ The variance is the mean of the squared deviations (for a sample, SS is divided by n – 1 rather than n).
○ The standard deviation is the square root of the variance and provides a measure of the standard distance from the mean.

Lesson Objectives
○ Discuss the purpose of variability in psychological research.
○ Compute and interpret the different measures of variability.