final_03112024_data.csv
Document Details
Uploaded by momogamain
Full Transcript
QuestionType,QuestionText,AnswerA,AnswerB,AnswerC,AnswerD,CorrectAnswers,CorrectAnswerText,CorrectAnswerInfo multiple\_choice,"1. A teacher has the following test scores from a small class: \[72, 75, 78, 85, 88, 90, 92, 95, 98\]. Which visualization would be most useful to display the individual data points effectively?",Histogram,Box Plot,Dot Plot,Pie Chart,C,Dot Plot,"For small datasets, a dot plot is most effective because it displays each individual data point clearly. Histograms and box plots are better suited for larger datasets, while pie charts are for categorical data." multiple\_choice,"2. In the dataset \[45, 47, 50, 52, 55, 58, 60\], what is the median value?",50,52,55,58,B,52,"The median is the middle value when data is ordered. Since there are 7 data points, the 4th value is the median, which is 52." multiple\_choice,"3. A dataset has Q1 = 25 and Q3 = 75. Using the 1.5×IQR rule, what is the upper bound for detecting outliers?",100,112.5,125,150,D,150,IQR = Q3 - Q1 = 75 - 25 = 50. Upper bound = Q3 + 1.5×IQR = 75 + 1.5×50 = 75 + 75 = 150. multiple\_choice,"4. Considering the same dataset as above, what is the lower bound for detecting outliers?",0,-50,-75,-112.5,B,-50,"Lower bound = Q1 - 1.5×IQR = 25 - 1.5×50 = 25 - 75 = -50. The rule gives -50 even when the data cannot be negative (e.g., test scores); in that case no value can fall below the bound, so no low outliers are flagged." multiple\_choice,"5. In a box plot, if the median is closer to Q1 than Q3, what does this indicate about the data distribution?",Symmetrical distribution,Left-skewed distribution,Right-skewed distribution,Uniform distribution,C,Right-skewed distribution,"If the median is closer to Q1, it means the lower 50% of data is more concentrated, and the upper 50% is more spread out, indicating a right-skewed distribution." multiple\_choice,"6. A company recorded the daily sales for a week: \[200, 220, 210, 205, 500, 215, 210\]. 
Which measure of central tendency is most appropriate to represent typical daily sales?",Mean,Median,Mode,Range,B,Median,The median is less affected by the outlier (500) and better represents the typical daily sales in this case. multiple\_choice,"7. For the dataset \[5, 7, 7, 8, 9, 10, 12\], which value represents the mode?",5,7,8,10,B,7,"The mode is the most frequently occurring value. Here, 7 appears twice, more than any other number." multiple\_choice,8. A dataset has a mean of 50 and a standard deviation of 5. Approximately what percentage of data falls within one standard deviation of the mean in a normal distribution?,50%,68%,95%,99.7%,B,68%,"In a normal distribution, approximately 68% of data falls within ±1 standard deviation of the mean." multiple\_choice,9. When is it more appropriate to use the interquartile range (IQR) over the standard deviation to measure data spread?,When the data is normally distributed,When the data has outliers,When dealing with categorical data,When the mean and median are equal,B,When the data has outliers,"IQR is less affected by outliers and skewed data, making it more appropriate when data contains outliers." multiple\_choice,"10. In finance, why might an analyst prefer using box plots over histograms when comparing the returns of multiple stocks?",Box plots show individual data points,Box plots require less data,Box plots make it easier to compare medians and IQRs across datasets,Box plots are better for categorical data,C,Box plots make it easier to compare medians and IQRs across datasets,"Box plots succinctly display medians, IQRs, and outliers, facilitating direct comparison of data spread and central tendency across multiple datasets." multiple\_choice,"11. A researcher collects the following data on the number of hours students study per week: \[2, 5, 5, 7, 10, 10, 10, 12, 15\]. 
Which visualization would best display the frequency distribution of study hours?",Pie Chart,Histogram,Scatter Plot,Line Graph,B,Histogram,"A histogram groups data into bins and displays the frequency of data within each bin, effectively showing the distribution of study hours." multiple\_choice,"12. If a dataset has a minimum value of 20, Q1 = 30, median = 40, Q3 = 60, and a maximum value of 100, how would you describe the skewness based on the box plot?",Symmetrical,Left-skewed,Right-skewed,Uniform,C,Right-skewed,The larger distance between Q3 and the maximum compared to Q1 and the minimum suggests a right-skewed distribution. multiple\_choice,"13. In the context of the box plot, what does the length of the box represent?",The range of the data,The interquartile range (IQR),The standard deviation,The mean,B,The interquartile range (IQR),"The length of the box represents the IQR, which is the range between Q1 and Q3, covering the middle 50% of the data." multiple\_choice,"14. A set of data has an IQR of 20. Using the 1.5×IQR rule, any data point below which value would be considered an outlier if Q1 is 40?",10,20,30,40,A,10,Lower bound = Q1 - 1.5×IQR = 40 - 1.5×20 = 40 - 30 = 10. Any data point below 10 is an outlier. multiple\_choice,15. Why might a dot plot be preferred over a histogram when analyzing a dataset of 15 numerical values?,Dot plots are better for large datasets,Dot plots display individual data points clearly,Dot plots group data into bins,Dot plots can represent categorical data,B,Dot plots display individual data points clearly,"Dot plots are ideal for small datasets as they show each individual data point, making it easier to observe exact values and frequencies." multiple\_choice,"1. In a side-by-side box plot comparing the returns of two investments, Investment A has a wider box (higher IQR) than Investment B. 
What does this tell you about the volatility of Investment A compared to Investment B?",Investment A has lower volatility than Investment B.,Investment A has higher volatility than Investment B.,Investment A and Investment B have the same level of volatility.,Investment A has a higher median return than Investment B.,B,Investment A has higher volatility than Investment B.,"A wider box indicates a higher IQR, meaning the returns are more spread out and thus more volatile. Investment A has greater variability in its returns compared to Investment B." multiple\_choice,"1. If a dataset's box plot has a median line that is closer to the lower quartile (Q1), what does this indicate about the data distribution?",The data is symmetrically distributed.,The data is skewed to the left.,The data is skewed to the right.,The data has no skew.,C,The data is skewed to the right.,"If the median line is closer to Q1, it means that the lower half of the data is more tightly packed than the upper half, indicating that the data is skewed to the right." multiple\_choice,1. What does the median of a dataset tell you?,The most frequently occurring value,The average of all values,The middle value when the data is sorted,The range of the data,C,The middle value when the data is sorted,"The median is the middle value of a sorted dataset, which divides the data into two equal halves." multiple\_choice,"2. In a box plot, if the median is closer to Q3, what does this indicate about the skewness of the data?",The data is right-skewed,The data is left-skewed,The data is normally distributed,The data has no skew,B,The data is left-skewed,"When the median is closer to Q3, the data is left-skewed, meaning the tail is longer on the left side." multiple\_choice,"3. 
If a box plot shows several outliers far above the upper whisker, what can you infer about the dataset?",The dataset has a strong left skew,The dataset has a strong right skew,The dataset is uniformly distributed,The dataset has no significant variation,B,The dataset has a strong right skew,"Outliers far above the upper whisker suggest that there are extreme high values, indicating a right skew." multiple\_choice,4. A dataset has Q1 = 10 and Q3 = 30. What is the IQR?,10,20,30,40,B,20,"The IQR is calculated as Q3 - Q1, which is 30 - 10 = 20." multiple\_choice,5. What does a small IQR imply about a dataset?,The data is highly volatile,The data is tightly packed and has low variability,The median is skewed,The data has many outliers,B,The data is tightly packed and has low variability,"A small IQR indicates that the middle fifty percent of the data is closely packed, showing low variability." multiple\_choice,"6. In scientific experiments, why might a box plot be used to compare multiple treatment groups?",To visualize the average values only,"To show the spread, median, and identify outliers in each group",To display the exact values for every observation,To calculate the mean and standard deviation,B,"To show the spread, median, and identify outliers in each group","Box plots are useful for comparing the spread, median, and outliers between treatment groups." multiple\_choice,"7. In a weather dataset, the IQR of daily temperatures in July is found to be very wide. What does this suggest?",July temperatures are very consistent,July has a large variation in temperatures,There are no extreme temperature values,The data is normally distributed,B,July has a large variation in temperatures,"A wide IQR indicates a large spread of temperatures, suggesting significant daily variation." multiple\_choice,8. 
What aspect of a box plot can indicate whether a dataset has outliers?,The width of the box,The length of the whiskers,Dots or points outside the whiskers,The position of the median,C,Dots or points outside the whiskers,Outliers are shown as individual dots or points outside the whiskers on a box plot. multiple\_choice,"9. In data science, why is it important to identify outliers in your data?",They have no impact on statistical analysis,They can skew the results and affect model performance,They always improve model accuracy,They simplify the analysis,B,They can skew the results and affect model performance,"Outliers can significantly affect the mean and skew results, leading to inaccurate models or insights." multiple\_choice,"10. If you are comparing the test scores of two classes and notice that one class has a box plot with a much wider IQR than the other, what does this tell you?",The class with the wider IQR has more consistent scores,The class with the wider IQR has more varied scores,Both classes have the same spread of scores,The class with the wider IQR has a higher median score,B,The class with the wider IQR has more varied scores,A wider IQR indicates more variability in the scores of that class. multiple\_choice,"11. In a biological study, a box plot of plant heights shows that the upper whisker is much longer than the lower whisker. What does this suggest about the data?",The data is left-skewed,The data is right-skewed,The data is uniformly distributed,The data has no skew,B,The data is right-skewed,"A longer upper whisker indicates a right-skewed distribution, with some plants being much taller than the median." multiple\_choice,12. 
What does it mean if a box plot has no outliers?,The data has no variation,All data points fall within 1.5 times the IQR from Q1 and Q3,The data is perfectly symmetrical,The mean and median are equal,B,All data points fall within 1.5 times the IQR from Q1 and Q3,"If there are no outliers, it means all data points fall within 1.5 times the IQR from Q1 and Q3." multiple\_choice,"13. When analyzing sales data, a box plot reveals that the sales values have many outliers on the high end. What could be a possible explanation?",Sales are very consistent,There were some exceptionally high sales figures,All sales values are below the median,The sales data is normally distributed,B,There were some exceptionally high sales figures,Outliers on the high end suggest that there were some very high sales figures compared to the rest of the data. multiple\_choice,14. Which statement is true about a dataset if the median and mean are significantly different?,The data is uniformly distributed,The data likely has skewness,The data is perfectly symmetrical,The data has no outliers,B,The data likely has skewness,"If the median and mean differ significantly, it indicates the presence of skewness in the data." multiple\_choice,15. Why might you choose to use a box plot over a histogram in data analysis?,Box plots show the distribution of individual data points,Box plots are better at displaying categorical data,"Box plots summarize the data using quartiles and outliers, making them suitable for comparing groups",Box plots are used to calculate averages,C,"Box plots summarize the data using quartiles and outliers, making them suitable for comparing groups","Box plots provide a concise summary of the data, showing the median, quartiles, and outliers, which is useful for comparing distributions between groups." multiple\_choice,"A company has a dataset of employee ages: \[23, 24, 25, 26, 27, 50\]. The mean age is 29.2. 
Which measure of central tendency would best represent the typical employee age?",Mean,Median,Mode,Range,B,Median,The median is more representative of the typical age since the mean is skewed by the outlier (age 50). multiple\_choice,"If a box plot shows the median closer to Q1 and a longer whisker extending toward Q3, what does this suggest about the data distribution?",The data is left-skewed,The data is right-skewed,The data is normally distributed,The data is bimodal,B,The data is right-skewed,A longer whisker on the upper side indicates a right-skewed distribution. multiple\_choice,"When comparing two datasets, you notice that one has a much larger standard deviation than the other. What does this imply?",The data points in both datasets are similarly distributed,The dataset with a larger standard deviation has more variability,Both datasets have the same mean,The dataset with a larger standard deviation has fewer outliers,B,The dataset with a larger standard deviation has more variability,A larger standard deviation indicates more variability in the data. multiple\_choice,Which of the following is true about the IQR as a measure of spread?,It is affected by extreme outliers,It measures the spread of the entire dataset,It is the difference between the maximum and minimum values,It focuses on the spread of the middle 50% of the data,D,It focuses on the spread of the middle 50% of the data,The IQR measures the spread of the middle 50% of the data and is not affected by extreme outliers. multiple\_choice,"You are analyzing monthly expenses for a year, and the IQR is \$500. What does this imply about the middle 50% of the monthly expenses?","The expenses vary by more than \$1,000",The expenses are tightly clustered,The middle 50% of expenses have a range of \$500,There are no outliers in the expenses,C,The middle 50% of expenses have a range of \$500,An IQR of \$500 indicates that the middle 50% of the monthly expenses have a spread of \$500. 
multiple\_choice,"In a box plot, what does it mean if the median is closer to Q3 than to Q1?",The data is right-skewed,The data is left-skewed,The data is symmetrically distributed,The data has no outliers,B,The data is left-skewed,"If the median is closer to Q3, the data is left-skewed, meaning there are more higher values." multiple\_choice,A dataset has a standard deviation of 0. What does this indicate about the data?,All data points are the same,The data points are widely spread out,The data has multiple outliers,The mean and median are different,A,All data points are the same,A standard deviation of 0 indicates that all data points are identical. multiple\_choice,Why might the range not be the best measure of spread in a dataset with outliers?,The range does not consider all data points,The range is not affected by extreme values,The range can be heavily influenced by outliers,The range only applies to normally distributed data,C,The range can be heavily influenced by outliers,"The range can be misleading in the presence of outliers, as it only considers the maximum and minimum values." multiple\_choice,What does it imply if a scatter plot shows no discernible pattern between two variables?,The variables have a strong linear relationship,The variables have a weak or no relationship,The variables are highly correlated,The variables have the same mean,B,The variables have a weak or no relationship,No pattern in a scatter plot indicates that the variables are likely unrelated or have a weak relationship. multiple\_choice,You are given a dataset with a mean of 75 and a median of 90. What can you infer about the distribution of the data?,The data is symmetrically distributed,The data is left-skewed,The data is right-skewed,The data has no skew,B,The data is left-skewed,"If the mean is less than the median, the data is left-skewed, with more values on the higher end." 
multiple\_choice,When would it be most appropriate to use the range as a measure of spread?,When the data has many outliers,When the data is uniformly distributed,When you need a quick measure of the total spread,When the data has a strong skew,C,When you need a quick measure of the total spread,The range provides a simple and quick measure of the total spread but can be misleading if there are outliers. multiple\_choice,Which scenario would most likely produce a right-skewed distribution?,Household incomes in a wealthy area,Heights of adult males,Weights of newborn babies,Exam scores in a class where most students did well,A,Household incomes in a wealthy area,Household incomes tend to be right-skewed because a few very high incomes pull the distribution to the right. multiple\_choice,"In a dataset with a mean of 100 and a standard deviation of 10, which data point would be considered an outlier using the rule of thumb that considers values more than 3 standard deviations from the mean?",110,130,70,120,B,130,130 is 3 standard deviations (10 × 3 = 30) above the mean (100 + 30 = 130). multiple\_choice,"If a dataset has an IQR of 25, what is the significance of a data point that lies 50 units above Q3?",It is within the range of the IQR,It is likely an outlier,It is the maximum value,It is the median value,B,It is likely an outlier,"A data point 50 units above Q3 is twice the IQR, which makes it a strong candidate for an outlier." multiple\_choice,"A data analyst finds that the median house price in a city is \$350,000, but the mean is \$500,000. What does this suggest about the distribution of house prices?",The distribution is left-skewed,The distribution is right-skewed,The distribution is uniform,The distribution is normal,B,The distribution is right-skewed,"A mean higher than the median indicates a right-skewed distribution, likely due to a few very high house prices." 
multiple\_choice,Why would you use a scatter plot when analyzing the relationship between two variables?,To summarize the spread of one variable,To identify the median and quartiles,To visually identify relationships or patterns,To calculate the range of both variables,C,To visually identify relationships or patterns,"A scatter plot helps visually identify potential relationships, patterns, or correlations between two variables." multiple\_choice,A dataset has a mean of 70 and a median of 80. What does this imply about the skewness of the data?,The data is left-skewed,The data is right-skewed,The data is uniformly distributed,The data has no skew,A,The data is left-skewed,"If the mean is less than the median, the data is left-skewed." multiple\_choice,Which of the following statements is true about a dataset that is perfectly symmetrical?,"The mean, median, and mode are all the same",The mean is greater than the median,The standard deviation is always zero,The data must have no outliers,A,"The mean, median, and mode are all the same","In a perfectly symmetrical distribution, the mean, median, and mode are all equal." multiple\_choice,A box plot of sales data shows that the lower whisker is much longer than the upper whisker. What does this suggest about the sales data?,The sales data is right-skewed,The sales data is left-skewed,The sales data is uniformly distributed,The sales data has no outliers,B,The sales data is left-skewed,"A longer lower whisker suggests that the data is left-skewed, with more values on the higher end." multiple\_choice,When would the IQR be preferred over the standard deviation as a measure of spread?,When the data is normally distributed,When the data has no outliers,When the data has significant outliers or is skewed,When comparing categorical data,C,When the data has significant outliers or is skewed,"The IQR is preferred when there are outliers or skewness, as it is not affected by extreme values." 
multiple\_choice,"If a dataset's whiskers in a box plot are of equal length, what does this suggest about the distribution of the data?",The data is normally distributed,The data is skewed,The data is bimodal,The data is heavily skewed to the left,A,The data is normally distributed,"Equal whisker lengths suggest that the data is approximately symmetrical, which is consistent with (though not proof of) a normal distribution." multiple\_choice,"When analyzing a right-skewed distribution, which of the following measures will be most affected by the skew?",Median,Mode,Mean,IQR,C,Mean,The mean is more affected by extreme values on the right and will be greater than the median in a right-skewed distribution. multiple\_choice,You are analyzing two datasets with the same mean but different standard deviations. What does this tell you about the datasets?,The datasets have the same variability,One dataset has more variability than the other,The datasets have the same shape,Both datasets are skewed in the same way,B,One dataset has more variability than the other,"Different standard deviations mean the datasets have different amounts of variability, even if the means are the same." multiple\_choice,"A data scientist is analyzing the heights of trees in a forest. Most trees are between 5 and 10 meters, but there are a few that are over 20 meters tall. Which measure of central tendency should they report?",Mean,Median,Mode,Range,B,Median,The median is a better measure because it is not affected by the few extremely tall trees. multiple\_choice,"If a dataset has an IQR of 30 and Q1 is 40, what is the upper boundary for identifying outliers?",85,115,70,55,B,115,"The upper boundary is Q3 + 1.5 × IQR. First, calculate Q3: Q3 = Q1 + IQR = 40 + 30 = 70. Then, 70 + 1.5 × 30 = 70 + 45 = 115." multiple\_choice,A distribution of exam scores is left-skewed. 
Which statement about the mean and median is most likely true?,The mean is greater than the median,The mean is less than the median,The mean is equal to the median,The mean and mode are equal,B,The mean is less than the median,"In a left-skewed distribution, the mean is pulled to the left (lower values) and is less than the median." multiple\_choice,What does a high standard deviation in a dataset imply about the spread of values?,The values are closely packed around the mean,The values are widely spread out around the mean,There are no outliers in the data,The data is perfectly symmetrical,B,The values are widely spread out around the mean,A high standard deviation indicates that data points are spread out widely from the mean. multiple\_choice,"When analyzing a dataset of monthly sales, you find several extreme values. What should you do first before making any decisions about these outliers?",Immediately remove the outliers from the dataset,Investigate the outliers to understand why they occurred,Assume the outliers are errors,Replace the outliers with the mean value,B,Investigate the outliers to understand why they occurred,You should investigate outliers to determine if they are data errors or meaningful events. multiple\_choice,A dataset of house prices is highly variable. Which of the following measures is most appropriate for understanding the overall spread of house prices?,Mean,Range,Median,Standard Deviation,D,Standard Deviation,"Standard deviation is useful for understanding the overall spread, especially when the data has high variability." multiple\_choice,"If you are using the IQR method to detect outliers and you have Q1 = 30 and Q3 = 80, what is the lower boundary for identifying outliers?",10,5,-45,0,C,-45,"The lower boundary is Q1 - 1.5 × IQR. First, calculate IQR: 80 - 30 = 50. Then, 30 - 1.5 × 50 = 30 - 75 = -45." 
multiple\_choice,What does a box plot reveal about a dataset?,The average value of the dataset,"The distribution and spread, including the presence of outliers",The exact frequency of each data point,The correlation between two variables,B,"The distribution and spread, including the presence of outliers","A box plot shows the distribution, spread, and identifies outliers in the data." multiple\_choice,A company's annual revenue data has an IQR of \$10 million and several outliers on the high end. Which measure of central tendency would be most appropriate to report?,Mean,Median,Mode,Range,B,Median,The median is more appropriate in the presence of high outliers because it is not affected by extreme values. multiple\_choice,What does a positive skew in a dataset indicate about the data distribution?,The data is uniformly distributed,The tail is on the left side of the distribution,The data has a long tail on the right side,The mean and median are equal,C,The data has a long tail on the right side,"A positive skew means the data has a long tail on the right side, with the mean being greater than the median." multiple\_choice,You calculate the range of a dataset to be 45. What does this tell you about the data?,The middle 50% of the data is spread across 45 units,The difference between the maximum and minimum values is 45,The standard deviation is 45,The mean is 45,B,The difference between the maximum and minimum values is 45,The range is the difference between the maximum and minimum values. multiple\_choice,A data analyst uses a scatter plot and notices a strong positive correlation between advertising budget and sales. What should be the next step in their analysis?,Assume a causal relationship between budget and sales,Test for causation using additional statistical methods,Reduce the advertising budget to test the relationship,Ignore the correlation,B,Test for causation using additional statistical methods,Correlation does not imply causation. 
Further analysis is needed to establish a causal relationship. multiple\_choice,"In a dataset with a symmetric distribution, which measures of central tendency and spread are most appropriate to use?",Median and IQR,Mean and Standard Deviation,Mode and Range,Median and Range,B,Mean and Standard Deviation,"For a symmetric distribution, the mean and standard deviation are appropriate measures of central tendency and spread." multiple\_choice,When would you use the median over the mean to describe a dataset?,When the data is normally distributed,When the data has extreme outliers,When there are no outliers,When the data has a small range,B,When the data has extreme outliers,"The median is used when there are extreme outliers because it is not affected by them, unlike the mean." multiple\_choice,"If you want to determine the consistency of test scores, which measure should you use?",Range,Mode,Standard Deviation,Median,C,Standard Deviation,Standard deviation measures the consistency or variability of data around the mean. multiple\_choice,Which of the following scenarios would be best suited for using the IQR as a measure of spread?,A dataset with no outliers and a normal distribution,A dataset with several extreme values,A dataset where all values are similar,A categorical dataset,B,A dataset with several extreme values,"The IQR is best used when the dataset has extreme values, as it focuses on the middle 50% of the data." multiple\_choice,What can be inferred if a scatter plot shows no clear pattern between two variables?,There is a strong correlation between the variables,There is no relationship between the variables,The variables are dependent on each other,The mean of both variables is equal,B,There is no relationship between the variables,No clear pattern in a scatter plot indicates there is no relationship or correlation between the two variables. multiple\_choice,"A dataset of test scores is heavily skewed to the right, with a few very high scores. 
Which measure of central tendency is most appropriate to describe the average performance of the class?",Mean,Median,Mode,Range,B,Median,The median is less affected by extreme values or skewness and better represents the central tendency of the data. multiple\_choice,"You have a dataset with the following five numbers: \[10, 12, 14, 18, 100\]. Which value would most likely be considered an outlier using the IQR method?",10,12,18,100,D,100,The value 100 is far from the rest of the data. The IQR method would likely flag this as an outlier. multiple\_choice,"In a dataset where most values are clustered around a central point but there are a few extreme outliers, which measure of spread should you use?",Range,Standard Deviation,Interquartile Range (IQR),Mean Absolute Deviation,C,Interquartile Range (IQR),The IQR is not affected by outliers and gives a better representation of spread when the data has extreme values. multiple\_choice,"A real estate analyst is comparing house prices in two neighborhoods. Neighborhood A has a median price of \$200,000 and an IQR of \$50,000, while Neighborhood B has a median price of \$300,000 and an IQR of \$100,000. What can you infer about the variability in house prices?",House prices in Neighborhood A are more spread out.,House prices in Neighborhood B are more spread out.,House prices have the same variability in both neighborhoods.,House prices are skewed in both neighborhoods.,B,House prices in Neighborhood B are more spread out.,"Neighborhood B has a larger IQR, indicating more variability in house prices." 
multiple\_choice,"When using a box plot to compare the performance of three investment portfolios, what would a longer box in one portfolio indicate compared to the others?",The portfolio has higher average returns.,The portfolio has a wider spread in returns.,The portfolio has fewer outliers.,The portfolio has a skewed distribution.,B,The portfolio has a wider spread in returns.,"A longer box indicates a wider spread, meaning more variability in returns for that portfolio." multiple\_choice,"A set of data has a mean of 50 and a standard deviation of 5. If a data point is 70, how many standard deviations away from the mean is it?",2,3,4,5,C,4,The data point is (70 - 50) / 5 = 4 standard deviations from the mean. multiple\_choice,Why would you choose the median over the mean to describe a dataset of employee salaries at a company?,Because the median considers every salary equally.,Because the median is less affected by extremely high or low salaries.,Because the median is the arithmetic average.,Because the median shows the total sum of all salaries.,B,Because the median is less affected by extremely high or low salaries.,"The median is less influenced by outliers, making it a better measure when salaries have extreme values." multiple\_choice,"If the whiskers of a box plot are very unequal in length, what does this indicate about the data distribution?",The data is normally distributed.,The data has no outliers.,The data is skewed.,The data is uniformly distributed.,C,The data is skewed.,"Unequal whisker lengths indicate that the data is skewed, either to the right or left." multiple\_choice,"In a financial report, a company's daily stock returns are analyzed. Most returns are between -1% and +1%, but there are a few days with returns of -10% and +15%. Which measure of spread would best summarize the variability?",Range,Standard Deviation,IQR,Mean,C,IQR,"The IQR would provide a more robust measure of variability, as it is less affected by the extreme returns." 
multiple\_choice,"A scatter plot shows a clear upward trend between years of experience and salary. However, there are a few data points where salaries are much lower than expected given the experience. What should you do next?",Ignore the low salaries.,Investigate these outliers to understand if there are special circumstances.,Assume there is no relationship between experience and salary.,Replace these salaries with the mean value.,B,Investigate these outliers to understand if there are special circumstances.,Investigating outliers helps understand if they are due to errors or have meaningful explanations.
multiple\_choice,A dataset is normally distributed with a mean of 100 and a standard deviation of 15. What percentage of data falls within one standard deviation of the mean?,50%,68%,95%,99.7%,B,68%,"In a normal distribution, approximately 68% of the data lies within one standard deviation of the mean."
multiple\_choice,"If you have a dataset with extreme outliers, what effect do these outliers have on the mean compared to the median?",The mean is more affected than the median.,The mean is less affected than the median.,Both the mean and median are equally affected.,The outliers have no effect on either the mean or the median.,A,The mean is more affected than the median.,"The mean is more sensitive to extreme values, while the median remains relatively stable."
multiple\_choice,You are analyzing income data for a large city and notice a right-skewed distribution. What does this imply about the mean and median?,The mean is less than the median.,The mean is equal to the median.,The mean is greater than the median.,The mean and median cannot be compared.,C,The mean is greater than the median.,"In a right-skewed distribution, the mean is pulled in the direction of the skew, making it larger than the median."
multiple\_choice,"When analyzing a dataset, you find that the IQR is 20 and the mean is 100. 
If a value is 200, is this an outlier based on the IQR method?",Yes,No,Cannot determine without more information,It depends on the range,C,Cannot determine without more information,You need Q1 and Q3 to calculate the exact outlier boundaries using the IQR method.
multiple\_choice,A box plot of monthly sales shows several outliers at the high end. What might this suggest about the company's sales strategy or performance?,Consistent and predictable sales,A few months had significantly higher sales than usual,Sales are declining overall,No unusual sales activity,B,A few months had significantly higher sales than usual,"High outliers suggest some months had unusually high sales, which could be due to special promotions or market trends."
multiple\_choice,"You are comparing two datasets using box plots. If one box plot has a much larger IQR than the other, what does this imply?",The data points in the dataset with the larger IQR are more concentrated.,The data points in the dataset with the larger IQR are more spread out.,The medians of both datasets are equal.,Both datasets have the same variability.,B,The data points in the dataset with the larger IQR are more spread out.,A larger IQR indicates that the middle 50% of the data points are more spread out.
multiple\_choice,What does it mean if a dataset has a negative skew?,Most data points are on the higher end with a few low outliers.,Most data points are on the lower end with a few high outliers.,The data is perfectly symmetrical.,The mean and median are equal.,A,Most data points are on the higher end with a few low outliers.,"A negative skew indicates a long tail on the lower end, with most data points being higher."
multiple\_choice,"A data analyst uses the IQR method to identify outliers. If the lower boundary is -5 and the upper boundary is 20, which of the following values is an outlier?",0,15,25,10,C,25,The value 25 is above the upper boundary of 20 and would be considered an outlier.
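
Once the IQR boundaries are known, as in the question above (lower boundary -5, upper boundary 20), flagging outliers is a simple range check. The sketch below assumes a hypothetical helper named `find_outliers` and treats values exactly on a boundary as non-outliers, which is a common convention and not something the quiz states.

```python
def find_outliers(values, lower, upper):
    """Return the values that fall strictly outside [lower, upper].

    `find_outliers` is a hypothetical helper; treating boundary values
    as inliers is an assumption.
    """
    return [v for v in values if v < lower or v > upper]

candidates = [0, 15, 25, 10]   # answer options from the quiz question
flagged = find_outliers(candidates, lower=-5, upper=20)
print(flagged)  # [25]
```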
multiple\_choice,Why might you choose a scatter plot over a box plot when analyzing a dataset with two continuous variables?,To compare the spread of a single variable,To identify relationships or correlations between two variables,To display the median and quartiles,To identify outliers within one variable,B,To identify relationships or correlations between two variables,A scatter plot shows how two continuous variables are related and can reveal trends or correlations.
multiple\_choice,"When examining a box plot, what does a line in the middle of the box represent?",The mean of the dataset,The mode of the dataset,The median of the dataset,The IQR of the dataset,C,The median of the dataset,"The line in the middle of the box represents the median, which divides the dataset into two equal parts."