Questions and Answers
Which measure of central tendency is most affected by extreme values (outliers)?
- Interquartile Range
- Mode
- Mean (correct)
- Median
The median is always equal to the mean in a symmetric distribution.
True (A)
What is the primary use of the mode as a measure of central tendency?
Nominal data
Data are considered to be right-skewed if the mean is ______ than the median.
Which of the following is LEAST affected by extreme values?
The range provides information about the spread of a dataset and is calculated by subtracting the smallest value from the largest value.
What is the interquartile range (IQR) used for in statistical analysis?
The interquartile range is the difference between the ______ quartile and the first quartile.
Which of the following describes the relationship between variance and standard deviation?
The variance can be a negative value.
Why is standard deviation a commonly used measure of variation?
Standard deviation is the square root of the ______.
What does the coefficient of variation (CV) measure?
A higher coefficient of variation indicates lower variability in the data.
What is the purpose of calculating the coefficient of variation?
The coefficient of variation is calculated by dividing the standard deviation by the ______ and multiplying by 100.
Match the measure to its characteristic:
In a dataset of professor salaries, the mean is $170,571, and the median is $155,000. What does this suggest about the distribution?
When the mean is less than the median, the data is left-skewed.
How is data skewed if the mean is smaller than the median?
If data is skewed towards the left, the ______ is typically less than the median.
Which type of data level is the mode most useful for?
The median can only be used with interval and ratio data.
For what types of data are mean, median, and mode appropriate to use?
The ______ only uses the center values in its calculation.
Match each measure with its corresponding formula.
The following data represent a sample: 22, 13, 10, 16, 23, 13, 11, 13. What is the mean?
The 1st quartile is the same as the 50th percentile.
What percentage of data is greater than the 1st quartile?
The second quartile is also known as the ______.
Given a data set of 19 values, what position would the 60th percentile be found?
If the given position for the 75th percentile is 7, the 75th percentile is the 7th element in the dataset.
Why does the interquartile range ignore outliers?
The measure that eliminates some outlier problems is the ______.
Match the type of variability with each condition.
A new advertising campaign has a follow-up survey, and from 150 individuals contacted, 45 of them could recognize the new advertising slogan. Which of the following is the proportion of the recognition?
The coefficient of variation can be negative.
How should ethical considerations be presented alongside numerical descriptive measures?
A measure of central location that is applied to a sample rather than a population is known as a ______.
Flashcards
What is the 'Mean'?
The arithmetic average of a data set.
What is the 'Median'?
The center value that divides data into two halves when arranged numerically.
What is the 'Mode'?
The value in a data set that occurs most frequently.
What is right-skewed data?
Data are right skewed if the mean is larger than the median.
What is left-skewed data?
Data are left skewed if the mean is smaller than the median.
What are 'Quartiles'?
Values that divide a data array into four equal-sized groups.
What is a percentile?
The pth percentile is the value located at index point i = (p/100)(n) in an ordered array of n values.
What is the 'Range'?
The simplest measure of variation: the difference between the largest and smallest observations.
What is Interquartile Range?
A measure of variation that eliminates some outlier problems by using only the difference between the third and first quartiles.
What is the 'Variance'?
The average of the squared distances of the data values from the mean.
What is the 'Standard Deviation'?
The positive square root of the variance; has the same units as original data.
What is the Coefficient of Variation?
A ratio of standard deviation to the mean, expressed as a percentage.
What is a parameter?
A summary measure computed to describe a characteristic of the population.
What is a statistic?
A summary measure computed to describe a characteristic of the sample.
Study Notes
- Chapter 3 focuses on describing data using numerical measures.
- The focus is on computing measures of center and variability, and on using numerical measures to describe data effectively.
Summary Measures
- Data can be described numerically through measures of center and location, other measures of location, and measures of variation.
Measures of Center and Location
- Measures of center and location include the mean, median, and mode, providing an overview of where the data is centered.
Mean (Arithmetic Average)
- The mean can be thought of as the balance point or center of mass of the data.
- To calculate the mean, sum the values and divide by the number of values.
- The population mean (μ) involves summing all values in the population (Σxᵢ) and dividing by the total number of data values (N).
- The sample mean (x̄) sums all values in the sample (Σxᵢ) and divides by the number of values in the sample (n).
- Extreme values (outliers) affect the mean.
- The mean is generally used for interval/ratio data.
- The mean is the most common measure of central tendency.
- The formula for the population mean is μ = (Σx) / N, where N is the number of data values.
- The average occupancy rate is found to be 15.125 rooms per week
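A minimal Python sketch of the sample mean formula, assuming the eight sample values from the quiz question (22, 13, 10, 16, 23, 13, 11, 13), which also average to 15.125. The function name `mean` and the value 130 are illustrative assumptions; 130 is a made-up outlier added only to show how strongly one extreme value pulls the mean.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)


sample = [22, 13, 10, 16, 23, 13, 11, 13]
sample_mean = mean(sample)              # 121 / 8

# Hypothetical: swap the last value (13) for an extreme value (130).
with_outlier = sample[:-1] + [130]
outlier_mean = mean(with_outlier)       # 238 / 8

print(sample_mean, outlier_mean)
```

A single changed value nearly doubles the mean here, which is why the mean is described as sensitive to extreme values.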
Sample Mean Calculation Steps on a Sharp Calculator
- Clear the memory before each question by pressing [2nd F], [ALPHA], [0], then [0].
- Set the calculator to 4 decimal places by pressing [SET UP], [0], [0], then [4].
- Use [MODE] [1] [0] to display "STAT 0."
- Enter data using the [DATA] key {ENT}.
- To get the mean, press [ALPHA] [x-bar] {on the 4 key}, then the = key.
- For a set of professor salaries, the mean salary is $170,571.
- In solved problem 3.3, the mean is found to be 14.63.
Median
- Used for ordinal and interval/ratio data
- The median is the center value that divides sorted data into two halves.
- If the number of data points is odd, the median is the middle number; if even, it is the average of the two middle numbers.
- The median is not affected by extreme values.
- For a skewed distribution, the median is a better measure of center than the mean.
- 50% of professors earn less than $155,000 (median salary).
- When an extreme value is included in the ordered array of professor salaries, the median remains at $155,000.
Alternative Calculation Methods
- Method 1: Calculate i = (n+1)/2 to find the median index point.
- If i is an integer, it corresponds to the median's position.
- If i is not an integer, the median is the average of the values in the integer positions below and above i.
- Method 2: Calculate i = (1/2)n as the Median Index Point.
- If i is an integer, the median is the average of the values in positions i and i+1.
- If i is not an integer, round up to the next integer to find the median position.
- Example: for the ordered array 4, 4, 5, 5, 9, 11, 12, 14, 16, 19, 22, 23 with n = 12, Method 2 gives i = 12/2 = 6, which is an integer.
- The median is therefore the average of the 6th and 7th values: (11+12)/2 = 11.5.
- In problem 3.3, the median is found to be 14.35.
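The two index-point methods can be sketched in Python as follows. This is an illustration under stated assumptions, not the textbook's own code: the function names `median_method_1` and `median_method_2` are hypothetical, and positions are converted from the notes' 1-based counting to Python's 0-based indexing.

```python
import math


def median_method_1(values):
    """Method 1: index point i = (n + 1) / 2."""
    data = sorted(values)
    i = (len(data) + 1) / 2
    if i == int(i):
        # i is an integer: the value in position i (1-based) is the median.
        return data[int(i) - 1]
    # i is not an integer: average the positions just below and above i.
    lower = math.floor(i)
    return (data[lower - 1] + data[lower]) / 2


def median_method_2(values):
    """Method 2: index point i = n / 2."""
    data = sorted(values)
    i = len(data) / 2
    if i == int(i):
        # i is an integer: average the values in positions i and i + 1.
        i = int(i)
        return (data[i - 1] + data[i]) / 2
    # i is not an integer: round up to the next position.
    return data[math.ceil(i) - 1]


array = [4, 4, 5, 5, 9, 11, 12, 14, 16, 19, 22, 23]
print(median_method_1(array), median_method_2(array))
```

Both methods select the same positions, so they agree for odd and even n alike.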
Mode
- The mode as a measure of central tendency is the value in a data set that occurs most frequently.
- The mode can be used with either numerical or nominal (categorical) data; it is the only one of the three measures of center that works for nominal data.
- Datasets may lack a mode, or exhibit multiple modes if several values share the highest frequency.
- Extreme values do not affect the mode.
- In Example 3-6, the mode is found to be 2 groups of size 4.
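A short Python sketch of finding the mode, including the no-mode and multiple-mode cases. The helper name `modes` is hypothetical, and treating a data set in which no value repeats as having no mode is an assumed convention; the numeric data are the sample values from the quiz question, and the colour list is made up.

```python
from collections import Counter


def modes(values):
    """Return every value that shares the highest frequency (empty if nothing repeats)."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []                      # assumed convention: no repeats means no mode
    return sorted(v for v, c in counts.items() if c == top)


numeric = [22, 13, 10, 16, 23, 13, 11, 13]
nominal = ["red", "blue", "red", "green"]   # hypothetical categorical data

print(modes(numeric))          # one mode
print(modes(nominal))          # works for nominal data too
print(modes([1, 1, 2, 2, 3]))  # two modes
print(modes([1, 2, 3]))        # no mode
```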
Skewed Data
- In right-skewed data, the mean is larger than the median.
- In left-skewed data, the mean is smaller than the median.
Shape of a Distribution
- The shape of a distribution describes how the data are distributed.
- Distributions can be symmetric or skewed.
- In a left-skewed distribution, the mean is less than the median
- In a symmetric distribution, the mean equals the median.
- In a right-skewed distribution, the median is less than the mean.
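As a rough illustration, the mean-versus-median comparison can be written as a small Python function. The name `skew_direction` is hypothetical, and the comparison is only the rule of thumb stated in these notes: equal mean and median is consistent with symmetry but does not prove it.

```python
def skew_direction(mean, median):
    """Classify shape using only the mean-versus-median rule of thumb."""
    if mean > median:
        return "right-skewed"
    if mean < median:
        return "left-skewed"
    return "symmetric"


# Professor salaries from the notes: mean $170,571, median $155,000.
print(skew_direction(170_571, 155_000))
```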
Descriptive Measures
- Figure 3-6 summarizes the descriptive measures.
- The mean is used for ratio/interval data and is sensitive to extreme values.
- The median is used for ratio/interval and ordinal data, and it does not use all of the data values in its calculation.
- The mode is used for ratio/interval, ordinal, and nominal data; however, it may not reflect the center.
Other Measures of Location
- Percentiles and quartiles are the two other measures of data location.
Percentiles
- Compute the index point i = (p/100) × n for the pth percentile of n ordered values.
- If i is not an integer, round up to the next integer; the value in that position is the pth percentile.
- If i is an integer, the pth percentile is the mean of the values in positions i and i+1.
- Example: for the 60th percentile of 19 ordered values, i = (60/100)(19) = 11.4, so the 60th percentile is the 12th value.
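A minimal Python sketch of the index-point rule. When i is an integer it averages the values in positions i and i+1, which is how the worked answers for problem 3.7 handle that case. The function name `percentile` is hypothetical, p is assumed to be a whole number strictly between 0 and 100, and the data are made-up runs of consecutive integers so that position k holds the value k.

```python
def percentile(values, p):
    """pth percentile by the index-point rule (p a whole number, 0 < p < 100)."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    data = sorted(values)
    # i = (p / 100) * n, kept in integer arithmetic to avoid float rounding.
    whole, remainder = divmod(p * len(data), 100)
    if remainder:
        # i is not an integer: round up to position whole + 1 (1-based),
        # which is index `whole` in 0-based indexing.
        return data[whole]
    # i is an integer: average the values in positions i and i + 1.
    return (data[whole - 1] + data[whole]) / 2


nineteen = list(range(1, 20))      # hypothetical data: 1, 2, ..., 19
p60 = percentile(nineteen, 60)     # i = 11.4, so the 12th value

twenty = list(range(1, 21))        # hypothetical data: 1, 2, ..., 20
p25 = percentile(twenty, 25)       # i = 5, so the average of the 5th and 6th values

print(p60, p25)
```

Other textbooks and software packages use different interpolation rules, so results from a library function may differ slightly from this rule.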
Quartiles
- Quartiles split the ranked data into 4 equal groups.
- The second quartile is the 50th percentile, that is, the median.
- The first quartile is the 25th percentile, and the third quartile is the 75th percentile.
- A tutorial question requires arranging the values into an ordered array and finding the mean (280.54), median (293), and mode (325).
- It also requires identifying a measure of central tendency and calculating the 1st and 3rd quartiles (271 and 317).
- Problem 3.7 asks for the following:
- Determine the median score.
- Determine the 25th and 75th percentiles.
- Determine the 60th percentile.
Answers Given
- Median: 71.5. i = (20+1)/2 = 10.5 is not an integer. Median is the average of the 10th and 11th values; (70+73)/2=71.5.
- 25th percentile: i = (25/100)(20) = 5. Because i is an integer, the 25th percentile is the average of the 5th and 6th values: (59+65)/2 = 62.
- 75th percentile: i = (75/100)(20) = 15. Because i is an integer, the 75th percentile is the average of the 15th and 16th values: (81+82)/2 = 81.5.
- 60th percentile: i = (60/100)(20) = 12. Because i is an integer, the 60th percentile is the average of the 12th and 13th values: (73+78)/2 = 75.5.
Proportions
- π = the proportion of population having some characteristic
- Formula for population proportion π = (number of occurrences in the population) / (population size)
- Sample proportion (p) provides an estimate of π:
- Formula for sample proportion p = (number of occurrences in the sample) / (sample size)
- A proportion is a special case of the arithmetic average: score each occurrence as 1 and each non-occurrence as 0.
- The proportion p is then the arithmetic average of these scores.
Question
- In a telephone follow-up survey of a new advertising campaign, 45 out of 150 individuals contacted could recall the new advertising slogan associated with the product.
- Compute the proportion of people who could recall the new advertising slogan.
- Is this value the population parameter π or the sample statistic p?
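A brief Python sketch of the proportion as an average of 0/1 scores, using the survey counts from the question (45 of 150). The variable names are illustrative. Because the 150 people contacted are a subset of everyone exposed to the campaign, the result is most naturally read as the sample statistic p.

```python
recalled, contacted = 45, 150

# Score each person: 1 if they recalled the slogan, 0 if they did not.
scores = [1] * recalled + [0] * (contacted - recalled)

# The proportion is the arithmetic average of those scores.
p = sum(scores) / len(scores)

print(p)
```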
Variation
- A set of data exhibits variation if the data values are not all the same.
- Measures of variation are measures of spread or variability of the data values.
Measures of Variation
- Measures of Variation include
- Range
- Variance
- Standard Deviation
- Coefficient of Variation
Range
- The range is the simplest measure of variation.
- It is the difference between the largest and smallest observations.
- Formula: Range = Xmaximum − Xminimum. Example: 14 − 1 = 13.
- It ignores the way in which the data are distributed.
- It is highly sensitive to outliers.
- It is a weak measure of variation because it uses only two values.
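A small Python sketch of the range and its sensitivity to a single extreme value. The data are hypothetical, chosen only so that the minimum is 1 and the maximum is 14 as in the example; the name `value_range` and the added value 60 are also assumptions.

```python
def value_range(values):
    """Range = largest observation minus smallest observation."""
    return max(values) - min(values)


data = [1, 3, 5, 8, 10, 12, 14]          # hypothetical values, min 1 and max 14
spread = value_range(data)                # 14 - 1

# Adding one made-up extreme value changes the range dramatically.
stretched = value_range(data + [60])      # 60 - 1

print(spread, stretched)
```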
Interquartile Range
- The interquartile range eliminates some outlier problems because the high- and low-valued observations are excluded.
- Interquartile range = 3rd quartile − 1st quartile.
- Equivalently, it is the difference between the 75th and 25th percentiles.
- Example: interquartile range = 57 − 30 = 27.
- Problem 3-1 asks for the IQR only.
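A Python sketch contrasting the interquartile range with the range when one extreme value is present. It assumes the index-point percentile rule from these notes (round up a non-integer index, average positions i and i+1 for an integer index); the function names and the data, the integers 1 to 20 with 200 as a made-up outlier, are illustrative.

```python
def percentile(values, p):
    """pth percentile by the index-point rule (p a whole number, 0 < p < 100)."""
    data = sorted(values)
    whole, remainder = divmod(p * len(data), 100)
    if remainder:
        return data[whole]                      # non-integer index: round up
    return (data[whole - 1] + data[whole]) / 2  # integer index: average i and i + 1


def iqr(values):
    """Interquartile range = 75th percentile minus 25th percentile."""
    return percentile(values, 75) - percentile(values, 25)


data = list(range(1, 21))            # hypothetical: 1, 2, ..., 20
with_outlier = data[:-1] + [200]     # replace the largest value with an extreme one

iqr_clean = iqr(data)
iqr_outlier = iqr(with_outlier)
range_clean = max(data) - min(data)
range_outlier = max(with_outlier) - min(with_outlier)

print(iqr_clean, iqr_outlier, range_clean, range_outlier)
```

The IQR is unchanged by the extreme value because only the middle 50% of the ordered data enters the calculation, while the range grows roughly tenfold.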
Variance
- The variance is the average of the squared distances of the data values from the mean.
- Sample variance: s² = Σ(xᵢ − x̄)² / (n − 1), where n = sample size and s² = sample variance.
- Population variance: σ² = Σ(xᵢ − μ)² / N, where N = population size and σ² (sigma squared) = population variance.
- For the Fleetwood Mobile Home data, the mean is 25 and the deviations x − 25 are:
- 15: 15 − 25 = −10
- 25: 25 − 25 = 0
- 35: 35 − 25 = 10
- 20: 20 − 25 = −5
- 30: 30 − 25 = 5
Sharp calculator steps for these data
- With the calculator in STAT 0 mode, enter the data: 15 [DATA] 25 [DATA] 35 [DATA] 20 [DATA] 30 [DATA].
- Press [ALPHA] [4] to get the mean: 25.
- Press [ALPHA] [6] to get the standard deviation: 7.07.
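A minimal Python sketch of both variance formulas applied to the Fleetwood Mobile Home values (15, 25, 35, 20, 30). The function names are hypothetical. Note that the calculator figure of 7.07 corresponds to dividing by N (the population formula); dividing by n − 1 gives a larger value.

```python
import math


def population_variance(values):
    """Average squared distance from the mean, dividing by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared distances from the mean, dividing by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)


data = [15, 25, 35, 20, 30]

pop_var = population_variance(data)     # 250 / 5
samp_var = sample_variance(data)        # 250 / 4
pop_sd = math.sqrt(pop_var)             # about 7.07
samp_sd = math.sqrt(samp_var)           # about 7.91

print(pop_var, samp_var, round(pop_sd, 2), round(samp_sd, 2))
```

The standard library functions `statistics.pvariance` and `statistics.variance` implement the same two formulas and can be used to cross-check the results.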
Problems
- Data are given for problems 3-25 part (c), 3-26 parts (b) and (c), and 3-29.
- The number of typographical errors on pages of a book: compute the standard deviation for these sample data.
- The number of times a population of business execs has been to the previous month: compute the variance and standard deviation; then recompute them assuming that the data represent a sample instead of a population, and discuss the difference between the computed values.
Answer
- 3-25(c): 3.1251
- 3-26(b): variance = 3.1389, standard deviation = 1.7717
- 3-26(c): variance = 3.7667, standard deviation = 1.9408
- 3-29: range = 28, variance = 92.75, standard deviation = 9.6307, Q1 = 23.5, Q3 = 40.5, IQR = 17
Comparing Standard Deviations
- Data sets can have the same mean but different standard deviations.
Coefficient of Variation
- The coefficient of variation is the ratio of the standard deviation to the mean expressed as a percentage.
- The coefficient of variation is used to measure the relative variation in the data.
- In the Lumber example, the coefficient of variation shows that the data have quite large relative variability.
Formula
- Population CV = (σ / μ) × 100%
- Sample CV = (s / x̄) × 100%
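A short Python sketch of the coefficient of variation, standard deviation divided by the mean and multiplied by 100, using the standard library `statistics` module. The function name `coefficient_of_variation` and its `population` flag are hypothetical; the data are the Fleetwood Mobile Home values (15, 25, 35, 20, 30) used in the variance example.

```python
import statistics


def coefficient_of_variation(values, population=False):
    """CV as a percentage: (standard deviation / mean) * 100."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    # pstdev divides by N (population); stdev divides by n - 1 (sample).
    sd = statistics.pstdev(values) if population else statistics.stdev(values)
    return sd / mean * 100


data = [15, 25, 35, 20, 30]

sample_cv = coefficient_of_variation(data)
population_cv = coefficient_of_variation(data, population=True)

print(round(sample_cv, 2), round(population_cv, 2))
```

Because the CV is unit-free, it can be used to compare relative variability across data sets measured in different units or with very different means.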