Full Transcript

VARIANCE & STANDARD DEVIATION

Variance

The term variance refers to a statistical measurement of the spread between numbers in a data set. More specifically, variance measures how far each number in the set is from the mean (average), and thus from every other number in the set. Variance is often denoted by the symbol σ².

In statistics, variance measures variability from the average or mean. It is calculated by taking the difference between each number in the data set and the mean, squaring the differences to make them positive, and finally dividing the sum of the squares by the number of values in the data set.

The larger the variance, the more scattered the data is around its mean; the smaller the variance, the less scattered it is. Variance is therefore called a measure of the spread of data about the mean. A variance of zero indicates that all the values are identical. Note that variance is always non-negative.

Variance is calculated by using the following formula (population variance, for N values with mean μ):

σ² = Σ (X - μ)² / N

- In statistics, variance is used to understand how the individual numbers in a data set relate to each other, instead of using broader mathematical methods such as organising the numbers of the data set into quartiles.
- Variance treats all deviations from the mean the same, regardless of their direction. Because the deviations are squared, they cannot cancel out and sum to zero, which would give the false appearance of no variability at all in the data set.
- One of the disadvantages of variance is that it gives added weight to extreme values, i.e. the numbers that are far from the mean. Squaring these numbers can skew the result.
- Another disadvantage of variance is that it can sometimes involve complex calculations.

Example

Find the variance of the numbers 3, 8, 6, 10, 12, 9, 11, 10, 12, 7.

Step 1: Compute the mean of the 10 values given.
Mean = (3 + 8 + 6 + 10 + 12 + 9 + 11 + 10 + 12 + 7) / 10 = 88 / 10 = 8.8

Step 2: Make a table with three columns: one for the X values, the second for the deviations and the third for the squared deviations. Because the data is not described as a sample, we use the formula for population variance, so the mean is denoted by μ.

Value X    X - μ    (X - μ)²
3          -5.8     33.64
8          -0.8      0.64
6          -2.8      7.84
10          1.2      1.44
12          3.2     10.24
9           0.2      0.04
11          2.2      4.84
10          1.2      1.44
12          3.2     10.24
7          -1.8      3.24
Total       0       73.6

Step 3: Divide the sum of the squared deviations by the number of values.

σ² = Σ (X - μ)² / N = 73.6 / 10 = 7.36

Standard Deviation

Standard deviation is defined as the degree of dispersion of the data points about their mean. It measures the spread of a data distribution, that is, the typical distance between each data point and the mean.

Definition

Standard deviation is a measure used in statistics to understand how the data points in a set are spread out from the mean value. It indicates the extent of the data's variation and shows how far individual data points deviate from the average.

Standard Deviation Formula

Standard deviation is used to measure how spread out statistical data is. The formula we use depends on whether the data is being considered a population of its own, or a sample representing a larger population. There are two standard deviation formulas:

- Population Standard Deviation Formula: σ = √( Σ (X - μ)² / N ). If the data is being considered a population on its own, we divide by the number of data points, N.
- Sample Standard Deviation Formula: s = √( Σ (x - x̅)² / (n - 1) ). If the data is a sample from a larger population, we divide by one fewer than the number of data points in the sample, n - 1.

Steps for calculating the standard deviation

There are six main steps for finding the standard deviation, illustrated here with the data set 46, 69, 32, 60, 52, 41.

Step 1: Find the mean
To find the mean, add up all the scores, then divide the total by the number of scores.
Mean (x̅) = (46 + 69 + 32 + 60 + 52 + 41) / 6 = 300 / 6 = 50

Step 2: Find each score's deviation from the mean
Subtract the mean from each score to get the deviations from the mean. Since x̅ = 50, we take away 50 from each score.

Score    Deviation from the mean
46       46 - 50 = -4
69       69 - 50 = 19
32       32 - 50 = -18
60       60 - 50 = 10
52       52 - 50 = 2
41       41 - 50 = -9

Step 3: Square each deviation from the mean
Multiply each deviation from the mean by itself. This will result in non-negative numbers.

Squared deviations from the mean
(-4)² = -4 × -4 = 16
(19)² = 19 × 19 = 361
(-18)² = -18 × -18 = 324
(10)² = 10 × 10 = 100
(2)² = 2 × 2 = 4
(-9)² = -9 × -9 = 81

Step 4: Find the sum of squares
Add up all of the squared deviations. This is called the sum of squares.

Sum of squares = 16 + 361 + 324 + 100 + 4 + 81 = 886

Step 5: Find the variance
Divide the sum of squares by n - 1 (for a sample) or N (for a population); the result is the variance. Since we are treating the 6 scores as a sample, we use n - 1, where n = 6.

Variance = 886 / (6 - 1) = 886 / 5 = 177.2

Step 6: Find the square root of the variance
To find the standard deviation, we take the square root of the variance.

Standard deviation = √177.2 ≈ 13.31

Knowing that SD ≈ 13.31, we can say that the scores typically deviate from the mean by about 13.31 points.

Mean Deviation and Standard Deviation

Mean deviation and standard deviation are both measures of dispersion that are widely used to describe the spread of a data set. The basic differences between them:

Mean Deviation                                 Standard Deviation
Uses any central point (mean, median, mode)    Uses only the mean
Uses absolute values of the deviations         Uses squares of the deviations
Less frequently used                           Widely used, and used to find various other measures
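
The two worked examples above can be checked with a short script. This is a minimal sketch in plain Python, not part of the original notes: the function names (sum_of_squares, variance, standard_deviation, mean_deviation) and the sample flag are illustrative choices, and mean_deviation is assumed here to be taken about the mean. The standard library's statistics module is used only as a cross-check.

```python
import math
import statistics


def sum_of_squares(values):
    """Sum of squared deviations from the mean (Steps 1 to 4)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def variance(values, sample=False):
    """Divide by N for a population, or by n - 1 for a sample (Step 5)."""
    divisor = len(values) - 1 if sample else len(values)
    return sum_of_squares(values) / divisor


def standard_deviation(values, sample=False):
    """Square root of the variance (Step 6)."""
    return math.sqrt(variance(values, sample=sample))


def mean_deviation(values):
    """Average absolute deviation, taken about the mean (an assumption)."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


# Data from the variance example, treated as a population.
population = [3, 8, 6, 10, 12, 9, 11, 10, 12, 7]
# Data from the standard deviation example, treated as a sample.
scores = [46, 69, 32, 60, 52, 41]

print(round(variance(population), 2))                     # 7.36
print(round(variance(scores, sample=True), 1))            # 177.2
print(round(standard_deviation(scores, sample=True), 2))  # 13.31
print(round(mean_deviation(scores), 2))                   # 10.33

# Cross-check against the standard library.
print(round(statistics.pvariance(population), 2))         # 7.36
print(round(statistics.stdev(scores), 2))                 # 13.31
```

For the six scores, the mean deviation (about 10.33) is smaller than the standard deviation (about 13.31): squaring gives the larger deviations, such as 19 and -18, more weight than taking absolute values does.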
