new_questions.csv
Uploaded by momogamain
question,answers,correct\_answer,answer\_info "A data analyst is reviewing the annual sales figures for a company. Most sales values range between \$50,000 and \$150,000, but there are a few entries showing sales over \$500,000. Which visualization would best help the analyst identify these high sales outliers?","{'A': 'Histogram', 'B': 'Box Plot', 'C': 'Stem Plot', 'D': 'Scatter Plot'}",B,"A Box Plot is ideal for identifying outliers as it clearly shows values outside the whiskers. Histograms and stem plots are less effective for spotting outliers, and scatter plots are more suited for showing relationships between two variables." "You are analyzing the loan data for a bank, which includes interest rates and loan amounts (notionals). The interest rates range from 3% to 12%, and loan amounts range from \$5,000 to \$500,000. You notice that most loans have interest rates between 4% and 8%, but there are a few loans with rates above 10%. What should be your first step to understand the distribution and identify any outliers in the interest rates?","{'A': 'Calculate the mean and standard deviation of the interest rates.', 'B': 'Create a box plot of the interest rates.', 'C': 'Plot a scatter plot of interest rates against loan amounts.', 'D': 'Generate a stem-and-leaf plot for the interest rates.'}",B,"Creating a box plot is the best first step as it visually summarizes the distribution of interest rates, highlights the median, quartiles, and easily identifies outliers. While calculating mean and standard deviation or using a stem plot can provide additional insights, a box plot offers a clear and immediate visual representation of the data's spread and any anomalies." "A quality control manager is monitoring the diameter of bolts produced by a machine. She collects a sample of 100 bolts and notices that most diameters are between 5.0 mm and 5.5 mm, but a few measurements are outside this range. She wants to assess the variability and identify any defective bolts. 
What should she do first?","{'A': 'Calculate the range of the diameters.', 'B': 'Create a histogram of the diameters.', 'C': 'Create a box plot of the diameters.', 'D': 'Plot a scatter plot of diameter vs. production time.'}",C,"Creating a box plot will allow the manager to visualize the distribution of bolt diameters, understand the spread using the IQR, and easily identify any outliers (defective bolts) that fall outside the whiskers." You have a dataset of monthly electricity consumption (in kWh) for 12 households. The values range from 300 to 1500 kWh. You want to understand the central tendency and variability without being affected by extreme values. Which measure and visualization should you use?,"{'A': 'Mean and Histogram', 'B': 'Median and Box Plot', 'C': 'Mode and Stem Plot', 'D': 'Mean and Box Plot'}",B,"Using the Median provides a central tendency measure that is not skewed by outliers. A Box Plot visually summarizes the distribution and highlights any extreme values, making it the best choice for understanding variability without the influence of outliers." A researcher is comparing the test scores of two different teaching methods. Each method has 30 students. Which visualization would best allow the researcher to compare the distributions and identify any differences or outliers between the two groups?,"{'A': 'Side-by-Side Box Plots', 'B': 'Single Histogram', 'C': 'Scatter Plot', 'D': 'Stem Plot'}",A,"Side-by-Side Box Plots allow for an easy comparison of the distributions, medians, and outliers between the two teaching methods. Histograms would require separate plots and are less effective for direct comparison. Scatter plots are for relationships between variables, and stem plots are better for smaller datasets." "In analyzing the daily stock prices of a company, you observe that most prices lie between \$50 and \$150, but there are a few days where the price drops below \$30 or rises above \$200. 
What is the most appropriate method to visualize these price variations and identify outliers?","{'A': 'Histogram', 'B': 'Box Plot', 'C': 'Line Chart', 'D': 'Bar Chart'}",B,"A Box Plot effectively summarizes the distribution of stock prices and highlights outliers as points beyond the whiskers. While histograms show distribution, they are less effective in pinpointing specific outliers. Line and bar charts are not ideal for this purpose." A data scientist is working with a small dataset of 10 employee ages in a company. She wants to display each individual age while also understanding the overall distribution. Which visualization should she choose?,"{'A': 'Histogram', 'B': 'Box Plot', 'C': 'Stem-and-Leaf Plot', 'D': 'Scatter Plot'}",C,"A Stem-and-Leaf Plot is perfect for small datasets as it displays individual data points while also showing the distribution. Histograms and box plots are better for larger datasets, and scatter plots are for relationships between variables." "You have a dataset of the weights of 200 packages shipped by a company. The weights range from 1 kg to 50 kg, with most packages between 5 kg and 20 kg. To understand the spread and identify any unusually heavy or light packages, which visualization should you use?","{'A': 'Scatter Plot', 'B': 'Box Plot', 'C': 'Pie Chart', 'D': 'Stem Plot'}",B,"A Box Plot is ideal for visualizing the spread of the data and identifying outliers in the weights of the packages. Scatter plots are for relationships between variables, pie charts are not suitable, and stem plots are better for smaller datasets." A teacher wants to compare the test scores of her class against the school average. She has her class's 25 test scores and the school's overall 300 test scores. 
Which visualization would best help her compare her class's performance to the school's distribution?,"{'A': 'Overlayed Histograms', 'B': 'Side-by-Side Box Plots', 'C': 'Scatter Plot', 'D': 'Stem Plot for each group'}",B,"Side-by-Side Box Plots allow for a direct comparison of the two distributions, showing medians, quartiles, and any outliers for both the class and the school. Overlayed histograms can be cluttered with different dataset sizes, scatter plots are for relationships, and stem plots are not suitable for this comparison." An environmental scientist is studying the annual rainfall in two different regions over the past 50 years. She wants to compare the variability and central tendency of rainfall between the two regions. Which visualization should she use?,"{'A': 'Histogram for each region', 'B': 'Scatter Plot comparing both regions', 'C': 'Side-by-Side Box Plots for both regions', 'D': 'Pie Charts for both regions'}",C,"Side-by-Side Box Plots are ideal for comparing the central tendency and variability between two groups, as well as identifying any outliers in each region. Histograms would require separate plots and are less effective for direct comparison. Scatter plots are for relationships between variables, and pie charts are not suitable for this analysis." "A finance analyst is examining the quarterly returns of 100 different stocks. Most returns fall between -5% and +5%, but a few stocks have returns exceeding +20% or dropping below -15%. Which visualization should the analyst use to summarize the distribution and identify outliers?","{'A': 'Histogram', 'B': 'Box Plot', 'C': 'Line Chart', 'D': 'Bar Chart'}",B,"A Box Plot provides a summary of the distribution, highlighting the median, quartiles, and outliers effectively. Histograms show distribution but are less effective at identifying specific outliers. Line and bar charts are not suitable for this analysis." "A health researcher is analyzing the blood pressure readings of 60 patients. 
The readings range from 80 mmHg to 200 mmHg, with most values between 90 mmHg and 140 mmHg. She wants to determine if there are any extreme cases of high or low blood pressure. What should she do first?","{'A': 'Calculate the mean and standard deviation of the readings.', 'B': 'Create a box plot of the blood pressure readings.', 'C': 'Plot a scatter plot of blood pressure against age.', 'D': 'Generate a stem-and-leaf plot for the readings.'}",B,"Creating a Box Plot will allow the researcher to visualize the distribution of blood pressure readings, easily identifying any outliers (extreme cases) beyond the whiskers." "A project leader is assessing the time taken by team members to complete various tasks. The completion times vary widely, and some tasks took significantly longer than others. To understand the overall distribution and identify any tasks that took unusually long, which visualization should she use?","{'A': 'Histogram', 'B': 'Box Plot', 'C': 'Scatter Plot', 'D': 'Stem Plot'}",B,"A Box Plot will provide a summary of the completion times, showing the median, quartiles, and any outliers, making it easy to identify tasks that took unusually long." "A university researcher is analyzing the distribution of GPA scores among students in different majors. With GPAs ranging from 2.0 to 4.0, she wants to compare the central tendency and variability across five majors. Which visualization should she use?","{'A': 'Side-by-Side Box Plots', 'B': 'Multiple Scatter Plots', 'C': 'Five Separate Histograms', 'D': 'Pie Charts for each major'}",A,"Side-by-Side Box Plots are ideal for comparing the central tendency and variability of GPA scores across multiple groups (majors), allowing the researcher to easily identify differences and outliers." "A data analyst is reviewing the annual salaries of employees in different departments of a company. Most salaries range between \$40,000 and \$100,000, but there are a few salaries exceeding \$200,000. 
She wants to understand the spread and identify any exceptionally high salaries. What should she do first?","{'A': 'Calculate the mean salary for each department.', 'B':""Create a box plot for each department's salaries."", 'C': 'Plot a scatter plot of salary vs. years of experience.', 'D': 'Generate separate stem plots for each department.'}",B,"Creating a Box Plot for each department's salaries will allow the analyst to visualize the distribution, understand the spread using the IQR, and easily identify any outliers (exceptionally high salaries) across departments." "An economist is studying the income distribution of households in a city. She has data on annual incomes ranging from \$20,000 to \$1,000,000. To understand the spread and identify any exceptionally high or low incomes, which measure and visualization should she prioritize?","{'A': 'Mean income and Histogram', 'B': 'Median income and Box Plot', 'C': 'Mode income and Stem Plot', 'D': 'Range and Scatter Plot'}",B,"Using the Median income provides a central tendency measure unaffected by extreme values. A Box Plot effectively visualizes the spread and highlights outliers, making it ideal for understanding income distribution." A sports analyst is evaluating the performance scores of players from two different teams. Each team has 20 players. She wants to compare the distribution and variability of scores between the two teams. Which visualization is most appropriate?,"{'A': 'Side-by-Side Box Plots', 'B': 'Two Separate Scatter Plots', 'C': 'Combined Histogram', 'D': 'Dual Line Charts'}",A,"Side-by-Side Box Plots allow the analyst to compare the distribution and variability of scores between the two teams, highlighting differences in medians, spreads, and any outliers effectively." A data analyst is comparing the heights of plants grown under three different lighting conditions. Each group has 25 plants. She wants to compare the central tendency and variability of plant heights across the three groups. 
Which visualization should she use?,"{'A': 'Three Separate Histograms', 'B': 'Side-by-Side Box Plots', 'C': 'Scatter Plot', 'D': 'Stem Plot for each group'}",B,"Side-by-Side Box Plots allow the analyst to compare the distributions, central tendencies, and variability of plant heights across the three lighting conditions, making it easy to identify differences and outliers." "A psychologist is studying the reaction times of individuals under different levels of caffeine intake. She collects reaction times for 60 individuals across three caffeine levels: none, moderate, and high. She wants to visualize the distribution and identify any outliers in reaction times for each caffeine level. What should she use?","{'A': 'Grouped Scatter Plot', 'B': 'Side-by-Side Box Plots', 'C': 'Three Separate Stem Plots', 'D': 'Pie Charts for each group'}",B,"Side-by-Side Box Plots will effectively show the distribution, central tendency, and outliers for reaction times across the three different caffeine levels, allowing for easy comparison." "A retailer wants to analyze the distribution of transaction amounts to identify typical purchase sizes and any unusually large transactions. They have a dataset of 5,000 transactions ranging from \$1 to \$10,000. What should they use first to get a summary of the distribution and spot outliers?","{'A': 'Histogram', 'B': 'Box Plot', 'C': 'Scatter Plot', 'D': 'Stem Plot'}",B,"A Box Plot will provide a quick summary of the distribution, including the median, quartiles, and outliers. While histograms show distribution shape, box plots are more effective for identifying specific outliers in large datasets." A software developer is analyzing the number of bugs reported in different modules of an application. She has data for 200 bugs across 5 modules. She wants to compare the number of bugs in each module and identify any modules with unusually high bug counts. 
Which visualization should she use?,"{'A': 'Side-by-Side Box Plots', 'B': 'Grouped Scatter Plots', 'C': 'Multiple Histograms', 'D': 'Line Charts'}",A,Side-by-Side Box Plots will allow the developer to compare the distribution of bug counts across different modules and easily identify any modules with unusually high bug counts as outliers. "A business analyst is evaluating the time customers spend on a website before making a purchase. The dataset includes 1,000 observations with times ranging from 10 seconds to 2 hours. To understand the typical customer behavior and spot any unusually long or short sessions, which visualization should be used?","{'A': 'Box Plot', 'B': 'Histogram', 'C': 'Scatter Plot', 'D': 'Stem Plot'}",A,"A Box Plot is ideal for summarizing the central tendency and variability of the data while highlighting any outliers, such as unusually long or short session times." A project manager is analyzing the completion times of tasks across different teams. She has data from 10 teams with 50 tasks each. She wants to compare the variability and identify which teams have tasks that are consistently completed quickly or slowly. Which visualization should she use?,"{'A': 'Stacked Bar Chart', 'B': 'Side-by-Side Box Plots for each team', 'C': 'Histogram for each team', 'D': 'Scatter Plot comparing teams'}",B,"Side-by-Side Box Plots for each team allow the manager to compare the distribution of task completion times, identify variability, and spot any outliers or consistent patterns across different teams." "An economist is studying the distribution of household expenses in a city. She has data on monthly expenditures across various categories for 1,000 households. 
To identify typical spending ranges and any households with unusually high or low expenses, which visualization should she use?","{'A': 'Box Plot', 'B': 'Scatter Plot', 'C': 'Pie Chart', 'D': 'Stem Plot'}",A,"A Box Plot will allow the economist to visualize the distribution of household expenses, showing the median, quartiles, and any outliers effectively." A sports analyst is evaluating the performance scores of players from two different teams. Each team has 20 players. She wants to compare the distribution and variability of scores between the two teams. Which visualization is most appropriate?,"{'A': 'Side-by-Side Box Plots', 'B': 'Two Separate Scatter Plots', 'C': 'Combined Histogram', 'D': 'Dual Line Charts'}",A,"Side-by-Side Box Plots allow the analyst to compare the distribution and variability of scores between the two teams, highlighting differences in medians, spreads, and any outliers effectively." "A data analyst is reviewing the annual salaries of employees in different departments of a company. Most salaries range between \$40,000 and \$100,000, but there are a few salaries exceeding \$200,000. She wants to understand the spread and identify any exceptionally high salaries. What should she do first?","{'A': 'Calculate the mean salary for each department.', 'B':""Create a box plot for each department's salaries."", 'C': 'Plot a scatter plot of salary vs. years of experience.', 'D': 'Generate separate stem plots for each department.'}",B,"Creating a Box Plot for each department's salaries will allow the analyst to visualize the distribution, understand the spread using the IQR, and easily identify any outliers (exceptionally high salaries) across departments." "A researcher is comparing the recovery times of patients undergoing two different therapies. Therapy A has 50 patients with recovery times ranging from 10 to 30 days, while Therapy B has 50 patients with recovery times ranging from 15 to 45 days. 
She wants to compare the central tendency and variability of recovery times between the two therapies. Which visualization should she use?","{'A': 'Two Separate Histograms', 'B': 'Side-by-Side Box Plots', 'C': 'Scatter Plot', 'D': 'Stem Plot for each therapy'}",B,"Side-by-Side Box Plots allow the researcher to compare the central tendency and variability of recovery times between the two therapies, as well as identify any outliers in each group." "A data analyst is assessing the distribution of customer ages in a retail store. The ages range from 18 to 80, with most customers between 25 and 60 years old. She wants to identify any unusually young or old customers. Which measure and visualization should she use?","{'A': 'Mean age and Histogram', 'B': 'Median age and Box Plot', 'C': 'Mode age and Stem Plot', 'D': 'Range and Scatter Plot'}",B,"Using the Median age provides a central tendency measure unaffected by extreme values. A Box Plot effectively visualizes the spread and highlights outliers, making it ideal for identifying unusually young or old customers." "A project manager is analyzing the time taken by team members to complete various tasks. The completion times are as follows: [30, 45, 50, 60, 60, 70, 80, 90, 100, 120]. She wants to identify any tasks that took unusually long. What is the upper boundary for outliers using the IQR method?","{'A': 'Q3 + 1.5 × IQR', 'B': 'Mean + 2 × SD', 'C': 'Median + 1.5 × IQR', 'D': 'Maximum value in the dataset'}",A,"First, calculate Q1 (50) and Q3 (90) as the medians of the lower and upper halves. IQR = Q3 - Q1 = 40. Upper boundary = Q3 + 1.5 × IQR = 90 + (1.5 × 40) = 90 + 60 = 150. Any value above 150 is considered an outlier. In this dataset, 120 is not an outlier." "A researcher has the following dataset of exam scores: [55, 60, 65, 70, 75, 80, 85, 90, 95, 150]. Using the IQR method, which score is identified as an outlier?","{'A': '55', 'B': '90', 'C': '150', 'D': '65'}",C,"First, calculate Q1 (65) and Q3 (90). IQR = Q3 - Q1 = 25. 
Upper boundary = Q3 + 1.5 × IQR = 90 + 37.5 = 127.5. Any value above 127.5 is an outlier. Therefore, 150 is an outlier." "A data analyst is comparing the variability of two different datasets. Dataset A has an IQR of 20, and Dataset B has an IQR of 50. Which dataset has greater variability in the middle 50% of its data?","{'A': 'Dataset A', 'B': 'Dataset B', 'C': 'Both have the same variability', 'D': 'Cannot determine without additional information'}",B,"Dataset B has an IQR of 50, which is greater than Dataset A's IQR of 20. Therefore, Dataset B has greater variability in the middle 50% of its data." "A teacher has recorded the time (in minutes) it takes her 25 students to complete a particular assignment. The times range from 10 minutes to 120 minutes, with most students completing it between 20 and 60 minutes. She wants to identify any students who took an unusually long time to finish. After calculating Q1 = 20, Q3 = 60, and IQR = 40, what is the upper boundary for identifying outliers?","{'A': '60 + (1.5 × 40) = 120', 'B': '60 + (1.5 × 40) = 100', 'C': '20 + (1.5 × 40) = 80', 'D': '20 + (1.5 × 40) = 60'}",A,"Upper boundary = Q3 + 1.5 × IQR = 60 + (1.5 × 40) = 60 + 60 = 120. Any time above 120 minutes would be considered an outlier. In this dataset, 120 minutes is the maximum and not an outlier." "A data analyst is examining the ages of participants in a survey. The ages are: [22, 25, 28, 30, 32, 35, 38, 40, 42, 45, 50, 55]. She wants to determine if age 55 is an outlier using the IQR method. First, she calculates Q1 = 25, Q3 = 42, and IQR = 17. What is the upper boundary for outliers?","{'A': '42 + (1.5 × 17) = 67.5', 'B': '25 + (1.5 × 17) = 50.5', 'C': '25 - (1.5 × 17) = -0.5', 'D': '42 - (1.5 × 17) = 16.5'}",A,"Upper boundary = Q3 + 1.5 × IQR = 42 + (1.5 × 17) = 42 + 25.5 = 67.5. Since 55 < 67.5, age 55 is not an outlier." A data analyst adjusts the multiplier in the IQR method from 1.5 to 3 to identify outliers in a dataset. 
What is the effect of this change?,"{'A': 'More data points will be classified as outliers.', 'B': 'Fewer data points will be classified as outliers.', 'C': 'It has no effect on the number of outliers.', 'D': 'It only affects the lower boundary, not the upper.'}",B,"Increasing the multiplier from 1.5 to 3 makes the boundaries wider, resulting in fewer data points being classified as outliers." "You have two datasets: Dataset X is symmetric with no outliers, and Dataset Y is skewed with several outliers. Which measure of central tendency and spread is more appropriate for Dataset Y?","{'A': 'Mean and Standard Deviation', 'B': 'Median and IQR', 'C': 'Mode and Range', 'D': 'Mean and IQR'}",B,"For skewed data with outliers, the Median and IQR are more appropriate as they are less affected by extreme values compared to the Mean and Standard Deviation." A data analyst is working with a highly skewed dataset. Which measure of spread should she prefer to understand the variability of the data?,"{'A': 'Range', 'B': 'Standard Deviation', 'C': 'Interquartile Range (IQR)', 'D': 'Variance'}",C,The Interquartile Range (IQR) is preferred for skewed datasets as it measures the spread of the middle 50% of the data and is not affected by extreme outliers. "A data analyst is reviewing the annual sales figures for a company. Most sales values range between \$50,000 and \$150,000, but there are a few entries showing sales over \$500,000. Which visualization would best help the analyst identify these high sales outliers?","{'A': 'Histogram', 'B': 'Box Plot', 'C': 'Stem Plot', 'D': 'Scatter Plot'}",B,"A Box Plot is ideal for identifying outliers as it clearly shows values outside the whiskers. Histograms and stem plots are less effective for spotting outliers, and scatter plots are more suited for showing relationships between two variables." You have a dataset of monthly electricity consumption (in kWh) for 12 households. The values range from 300 to 1500 kWh. 
You want to understand the central tendency and variability without being affected by extreme values. Which measure and visualization should you use?,"{'A': 'Mean and Histogram', 'B': 'Median and Box Plot', 'C': 'Mode and Stem Plot', 'D': 'Mean and Box Plot'}",B,"Using the Median provides a central tendency measure that is not skewed by outliers. A Box Plot visually summarizes the distribution and highlights any extreme values, making it the best choice for understanding variability without the influence of outliers." A researcher is comparing the test scores of two different teaching methods. Each method has 30 students. Which visualization would best allow the researcher to compare the distributions and identify any differences or outliers between the two groups?,"{'A': 'Side-by-Side Box Plots', 'B': 'Single Histogram', 'C': 'Scatter Plot', 'D': 'Stem Plot'}",A,"Side-by-Side Box Plots allow for an easy comparison of the distributions, medians, and outliers between the two teaching methods. Histograms would require separate plots and are less effective for direct comparison. Scatter plots are for relationships between variables, and stem plots are better for smaller datasets." "In analyzing the daily stock prices of a company, you observe that most prices lie between \$50 and \$150, but there are a few days where the price drops below \$30 or rises above \$200. What is the most appropriate method to visualize these price variations and identify outliers?","{'A': 'Histogram', 'B': 'Box Plot', 'C': 'Line Chart', 'D': 'Bar Chart'}",B,"A Box Plot effectively summarizes the distribution of stock prices and highlights outliers as points beyond the whiskers. While histograms show distribution, they are less effective at pinpointing specific outliers. Line and bar charts are not ideal for this purpose." A data scientist is working with a small dataset of 10 employee ages in a company. 
She wants to display each individual age while also understanding the overall distribution. Which visualization should she choose?,"{'A': 'Histogram', 'B': 'Box Plot', 'C': 'Stem-and-Leaf Plot', 'D': 'Scatter Plot'}",C,"A Stem-and-Leaf Plot is perfect for small datasets as it displays individual data points while also showing the distribution. Histograms and box plots are better for larger datasets, and scatter plots are for relationships between variables." "You are analyzing the loan data for a bank, which includes interest rates and loan amounts (notionals). The interest rates range from 3% to 12%, and loan amounts range from \$5,000 to \$500,000. You notice that most loans have interest rates between 4% and 8%, but there are a few loans with rates above 10%. What should be your first step to understand the distribution and identify any outliers in the interest rates?","{'A': 'Calculate the mean and standard deviation of the interest rates.', 'B': 'Create a box plot of the interest rates.', 'C': 'Plot a scatter plot of interest rates against loan amounts.', 'D': 'Generate a stem-and-leaf plot for the interest rates.'}",B,"Creating a box plot is the best first step as it visually summarizes the distribution of interest rates, highlights the median, quartiles, and easily identifies outliers. While calculating mean and standard deviation or using a stem plot can provide additional insights, a box plot offers a clear and immediate visual representation of the data's spread and any anomalies." "A marketing analyst is examining customer purchase amounts from an online store. Most purchases range between \$20 and \$200, but there are a few transactions exceeding \$1,000. Which visualization should the analyst use first to identify these high-value transactions?","{'A': 'Histogram', 'B': 'Box Plot', 'C': 'Scatter Plot', 'D': 'Pie Chart'}",B,"A Box Plot is ideal for identifying outliers as it clearly displays values outside the whiskers. 
Histograms provide distribution but are less effective for spotting specific outliers. Scatter plots are for relationships between variables, and pie charts are not suitable for this purpose." "A project manager is analyzing the time taken by team members to complete various tasks. The completion times vary widely, and some tasks took significantly longer than others. To understand the overall distribution and identify any tasks that took unusually long, which visualization should she use?","{'A': 'Histogram', 'B': 'Box Plot', 'C': 'Scatter Plot', 'D': 'Stem Plot'}",B,"A Box Plot will provide a summary of the completion times, showing the median, quartiles, and any outliers, making it easy to identify tasks that took unusually long." "A health researcher is analyzing the blood pressure readings of 60 patients. The readings range from 80 mmHg to 200 mmHg, with most values between 90 mmHg and 140 mmHg. She wants to determine if there are any extreme cases of high or low blood pressure. What should she do first?","{'A': 'Calculate the mean and standard deviation', 'B': 'Create a Box Plot of the blood pressure readings', 'C': 'Plot a Scatter Plot against age', 'D': 'Generate a Pie Chart of the readings'}",B,"Creating a Box Plot will allow the researcher to visualize the distribution of blood pressure readings, easily identifying any outliers (extreme cases) beyond the whiskers." An HR manager is analyzing the salaries of employees in different departments. She wants to compare the salary distributions and identify any departments with unusually high or low salaries. Which visualization should she use?,"{'A': 'Side-by-Side Box Plots', 'B': 'Grouped Scatter Plots', 'C': 'Multiple Histograms', 'D': 'Line Charts'}",A,"Side-by-Side Box Plots are perfect for comparing salary distributions across different departments, highlighting medians, spreads, and any outliers in each department." "A quality control manager is monitoring the diameter of bolts produced by a machine. 
She collects a sample of 100 bolts and notices that most diameters are between 5.0 mm and 5.5 mm, but a few measurements are outside this range. Which statistical measure and visualization should she use to assess the variability and identify any defective bolts?","{'A': 'Mean and Histogram', 'B': 'Median and Stem Plot', 'C': 'IQR and Box Plot', 'D': 'Mode and Scatter Plot'}",C,Using the IQR with a Box Plot allows the manager to assess the variability within the middle 50% of the data and easily identify any outliers (defective bolts) that fall outside the whiskers. A software developer is analyzing the number of bugs reported in different modules of an application. She has data for 200 bugs across 5 modules. She wants to compare the distribution of bug counts across modules and identify any modules with unusually high bug counts. Which visualization should she use?,"{'A': 'Side-by-Side Box Plots', 'B': 'Grouped Scatter Plots', 'C': 'Multiple Histograms', 'D': 'Stacked Bar Charts'}",A,Side-by-Side Box Plots will allow the developer to compare the distribution of bug counts across different modules and easily identify any modules with unusually high bug counts as outliers. A data analyst is comparing the lifespans of light bulbs from two different manufacturers. She has data for 40 bulbs from each manufacturer. She wants to compare the central tendency and variability of lifespans between the two groups and identify any outliers. Which visualization should she use?,"{'A': 'Two Separate Histograms', 'B': 'Side-by-Side Box Plots', 'C': 'Scatter Plot', 'D': 'Dual Line Charts'}",B,"Side-by-Side Box Plots are ideal for comparing the central tendency and variability of lifespans between the two manufacturers, as well as identifying any outliers in each group." "An economist is studying the distribution of household expenses in a city. She has data on monthly expenditures across various categories for 1,000 households. 
To identify typical spending ranges and any households with unusually high or low expenses, which visualization should she use?","{'A': 'Box Plot', 'B': 'Scatter Plot', 'C': 'Pie Chart', 'D': 'Stem Plot'}",A,"A Box Plot will allow the economist to visualize the distribution of household expenses, showing the median, quartiles, and any outliers effectively." "A business analyst wants to analyze the time customers spend on a website before making a purchase. The dataset includes 1,000 observations with times ranging from 10 seconds to 2 hours. To understand the typical customer behavior and identify any unusually long or short sessions, which visualization should be used?","{'A': 'Box Plot', 'B': 'Scatter Plot', 'C': 'Pie Chart', 'D': 'Histogram'}",A,"A Box Plot is ideal for summarizing the central tendency and variability of the data while highlighting any outliers, such as unusually long or short session times." "A university researcher is analyzing the distribution of GPA scores among students in different majors. She has data on GPA for 500 students across five majors. To compare the central tendency and variability of GPAs across majors and identify any outliers, which visualization should she use?","{'A': 'Side-by-Side Box Plots', 'B': 'Grouped Scatter Plots', 'C': 'Multiple Histograms', 'D': 'Pie Charts for each major'}",A,"Side-by-Side Box Plots are ideal for comparing the central tendency and variability of GPA scores across multiple groups (majors), allowing the researcher to easily identify differences and outliers." "A data analyst is evaluating the time taken by team members to complete various tasks. The completion times vary widely, and some tasks took significantly longer than others. 
To understand the overall distribution and identify any tasks that took unusually long, which visualization should she use?","{'A': 'Histogram', 'B': 'Box Plot', 'C': 'Scatter Plot', 'D': 'Stem Plot'}",B,"A Box Plot will provide a summary of the completion times, showing the median, quartiles, and any outliers, making it easy to identify tasks that took unusually long."
A sales manager wants to compare the sales performance of different regions. Each region has sales figures for 50 products. She wants to see the distribution of sales and identify any regions with exceptionally high or low sales. Which visualization should she use?,"{'A': 'Side-by-Side Box Plots', 'B': 'Grouped Scatter Plots', 'C': 'Multiple Histograms', 'D': 'Stacked Bar Charts'}",A,"Side-by-Side Box Plots allow the manager to compare the distribution of sales across different regions, easily identifying regions with unusually high or low sales through outliers."
"A psychologist is studying the reaction times of individuals under different levels of caffeine intake. She collects reaction times for 60 individuals across three caffeine levels: none, moderate, and high. She wants to visualize the distribution and identify any outliers in reaction times for each caffeine level. What should she use?","{'A': 'Grouped Scatter Plot', 'B': 'Side-by-Side Box Plots', 'C': 'Three Separate Stem Plots', 'D': 'Three Separate Histograms'}",B,"Side-by-Side Box Plots will effectively show the distribution, central tendency, and outliers for reaction times across the three different caffeine levels, allowing for easy comparison."
"A finance analyst is examining the quarterly returns of 100 different stocks. Most returns fall between -5% and +5%, but a few stocks have returns exceeding +20% or dropping below -15%.
Which visualization should the analyst use to summarize the distribution and identify outliers?","{'A': 'Histogram', 'B': 'Box Plot', 'C': 'Line Chart', 'D': 'Bar Chart'}",B,"A Box Plot provides a summary of the distribution, highlighting the median, quartiles, and outliers effectively. Histograms show distribution but are less effective at identifying specific outliers. Line and bar charts are not suitable for this analysis."
"An environmental scientist is studying the annual rainfall in two different regions over the past 50 years. She wants to compare rainfall patterns, understand variability, and identify any anomalies in each region. Which visualization should she use?","{'A': 'Side-by-Side Box Plots for both regions', 'B': 'Scatter Plot comparing both regions', 'C': 'Two Separate Histograms', 'D': 'Pie Charts for each region'}",A,"Side-by-Side Box Plots are ideal for comparing the central tendency and variability between two groups, as well as identifying any outliers or anomalies in each region."
"A data analyst is reviewing the lifespans of electronic devices from two different manufacturers. She has data for 100 devices from each manufacturer. She wants to compare the distributions, identify any differences in variability, and spot any outliers. Which visualization should she use?","{'A': 'Two Separate Histograms', 'B': 'Side-by-Side Box Plots', 'C': 'Scatter Plot', 'D': 'Dual Line Charts'}",B,"Side-by-Side Box Plots allow the analyst to compare the distributions, medians, quartiles, and outliers between the two manufacturers effectively."
"A teacher is analyzing the completion times of assignments for her class of 25 students. The times range from 10 minutes to 120 minutes, with most students completing the assignment between 20 and 60 minutes. She wants to identify any students who took an unusually long time to finish.
Which visualization should she use?","{'A': 'Box Plot', 'B': 'Histogram', 'C': 'Scatter Plot', 'D': 'Pie Chart'}",A,A Box Plot will allow the teacher to visualize the distribution of completion times and easily identify any outliers who took significantly longer than the rest.
"A data analyst is evaluating the time customers spend on a website before making a purchase. The dataset includes 1,000 observations with times ranging from 10 seconds to 2 hours. To understand the typical customer behavior and identify any unusually long or short sessions, which visualization should be used?","{'A': 'Box Plot', 'B': 'Scatter Plot', 'C': 'Pie Chart', 'D': 'Histogram'}",A,"A Box Plot is ideal for summarizing the central tendency and variability of the data while highlighting any outliers, such as unusually long or short session times."
A researcher is analyzing the number of hours students study each week and their corresponding exam scores. She has data for 100 students. She wants to determine if there is a relationship between study hours and exam performance. Which visualization should she use?,"{'A': 'Box Plot', 'B': 'Scatter Plot', 'C': 'Histogram', 'D': 'Stem Plot'}",B,"A Scatter Plot is ideal for visualizing the relationship between two quantitative variables, allowing the researcher to see if there's a correlation between study hours and exam scores."
"A business analyst is comparing the distribution of sales revenue between two different product lines. Each product line has 200 sales transactions. She wants to compare the central tendency, variability, and identify any outliers in each product line. Which visualization should she use?","{'A': 'Two Separate Histograms', 'B': 'Side-by-Side Box Plots', 'C': 'Scatter Plot', 'D': 'Dual Line Charts'}",B,"Side-by-Side Box Plots allow the analyst to compare the distributions, medians, quartiles, and outliers between the two product lines effectively."
"An economist is studying the income distribution of households in a city.
She has data on annual incomes ranging from \$20,000 to \$1,000,000. To understand the spread and identify any exceptionally high or low incomes, which measure and visualization should she prioritize?","{'A': 'Mean income and Histogram', 'B': 'Median income and Box Plot', 'C': 'Mode income and Stem Plot', 'D': 'Range and Scatter Plot'}",B,"Using the Median income provides a central tendency measure unaffected by extreme values. A Box Plot effectively visualizes the spread and highlights outliers, making it ideal for understanding income distribution."
A healthcare analyst is examining the recovery times (in days) of patients undergoing three different treatment plans. She has data for 90 patients across the three plans. She wants to compare the distributions and identify any outliers in recovery times for each treatment plan. Which visualization should she use?,"{'A': 'Side-by-Side Box Plots', 'B': 'Grouped Scatter Plots', 'C': 'Three Separate Stem Plots', 'D': 'Three Separate Histograms'}",A,"Side-by-Side Box Plots will allow the analyst to compare the distributions of recovery times across the three treatment plans, highlighting medians, spreads, and any outliers effectively."
"A data analyst is comparing the lifespans of electronic devices from two different manufacturers. She has data for 100 devices from each manufacturer. She wants to compare the distributions, identify any differences in variability, and spot any outliers. Which visualization should she use?","{'A': 'Two Separate Histograms', 'B': 'Side-by-Side Box Plots', 'C': 'Scatter Plot', 'D': 'Dual Line Charts'}",B,"Side-by-Side Box Plots allow the analyst to compare the distributions, medians, quartiles, and outliers between the two manufacturers effectively."
A teacher wants to compare the test scores of her class against the school average. She has her class's 25 test scores and the school's overall 300 test scores.
Which visualization would best help her compare her class's performance to the school's distribution?,"{'A': 'Overlaid Histograms', 'B': 'Side-by-Side Box Plots', 'C': 'Scatter Plot', 'D': 'Stem Plot for each group'}",B,"Side-by-Side Box Plots allow for a direct comparison of the two distributions, showing medians, quartiles, and any outliers for both the class and the school. Overlaid histograms can be cluttered with different dataset sizes, scatter plots are for relationships, and stem plots are not suitable for this comparison."
A sports analyst is evaluating the performance scores of players from two different teams. Each team has 20 players. She wants to compare the distribution and variability of scores between the two teams. Which visualization is most appropriate?,"{'A': 'Side-by-Side Box Plots', 'B': 'Two Separate Scatter Plots', 'C': 'Combined Histogram', 'D': 'Dual Line Charts'}",A,"Side-by-Side Box Plots allow for an easy comparison of the distributions, medians, and outliers between the two teams. Histograms would require separate plots and are less effective for direct comparison. Scatter plots are for relationships between variables, and line charts are not suitable for this purpose."
"A data analyst is reviewing the lifespans of electronic devices from two different manufacturers. She has data for 100 devices from each manufacturer. She wants to compare the distributions, identify any differences in variability, and spot any outliers. Which visualization should she use?","{'A': 'Two Separate Histograms', 'B': 'Side-by-Side Box Plots', 'C': 'Scatter Plot', 'D': 'Dual Line Charts'}",B,"Side-by-Side Box Plots allow the analyst to compare the distributions, medians, quartiles, and outliers between the two manufacturers effectively."
"A psychologist is studying the reaction times of individuals under different levels of caffeine intake. She collects reaction times for 60 individuals across three caffeine levels: none, moderate, and high.
She wants to visualize the distribution and identify any outliers in reaction times for each caffeine level. What should she use?","{'A': 'Grouped Scatter Plot', 'B': 'Side-by-Side Box Plots', 'C': 'Three Separate Stem Plots', 'D': 'Three Separate Histograms'}",B,"Side-by-Side Box Plots will effectively show the distribution, central tendency, and outliers for reaction times across the three different caffeine levels, allowing for easy comparison."
"A retailer wants to analyze the distribution of transaction amounts to identify typical purchase sizes and any unusually large transactions. They have a dataset of 5,000 transactions ranging from \$1 to \$10,000. What should they use first to get a summary of the distribution and spot outliers?","{'A': 'Histogram', 'B': 'Box Plot', 'C': 'Scatter Plot', 'D': 'Stem Plot'}",B,"A Box Plot will provide a quick summary of the distribution, including the median, quartiles, and outliers. While histograms show distribution shape, box plots are more effective for identifying specific outliers in large datasets."
"A university researcher is analyzing the distribution of GPA scores among students in different majors. She has data on GPA for 500 students across five majors. To compare the central tendency and variability of GPAs across majors and identify any outliers, which visualization should she use?","{'A': 'Side-by-Side Box Plots', 'B': 'Grouped Scatter Plots', 'C': 'Multiple Histograms', 'D': 'Pie Charts for each major'}",A,"Side-by-Side Box Plots are ideal for comparing the central tendency and variability of GPA scores across multiple groups (majors), allowing the researcher to easily identify differences and outliers."
"A quality control manager is monitoring the diameter of bolts produced by a machine. She collects a sample of 100 bolts and notices that most diameters are between 5.0 mm and 5.5 mm, but a few measurements are outside this range.
Which statistical measure and visualization should she use to assess the variability and identify any defective bolts?","{'A': 'Mean and Histogram', 'B': 'Median and Stem Plot', 'C': 'IQR and Box Plot', 'D': 'Mode and Scatter Plot'}",C,Using the IQR with a Box Plot allows the manager to assess the variability within the middle 50% of the data and easily identify any outliers (defective bolts) that fall outside the whiskers.
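
The "outside the whiskers" idea in the answer above can be made concrete with a small sketch. It assumes the common 1.5 × IQR convention for whisker limits, which the question bank itself does not specify, and uses made-up bolt diameters; the helper names `tukey_fences` and `find_outliers` are hypothetical. Plotting libraries may compute quartiles slightly differently, so the flagged points can vary at the margins.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Return (q1, q3, lower_fence, upper_fence) under the k * IQR rule.

    k=1.5 is an assumed convention, not something stated in the questions.
    """
    # quantiles() sorts the data; n=4 yields the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # spread of the middle 50% of the data
    return q1, q3, q1 - k * iqr, q3 + k * iqr


def find_outliers(values, k=1.5):
    """Return the values that fall outside the fences, in input order."""
    _q1, _q3, low, high = tukey_fences(values, k)
    return [v for v in values if v < low or v > high]


# Hypothetical bolt diameters in mm, invented purely for illustration.
diameters = [5.0, 5.1, 5.1, 5.2, 5.2, 5.2, 5.3, 5.3, 5.4, 5.5, 6.4, 4.1]

print(find_outliers(diameters))  # prints [6.4, 4.1]
```

With this sample the quartiles are 5.1 and 5.325, so the IQR is 0.225 and the fences sit at roughly 4.76 mm and 5.66 mm; only the 4.1 mm and 6.4 mm bolts are flagged.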