Midterm Exam Prep

Questions and Answers

Students have 90 minutes to complete the midterm exam within a window of what length?

  • 48-hour
  • 72-hour
  • 12-hour
  • 24-hour (correct)

The midterm exam is designed to be completed without the use of notes or a calculator.

False (B)

What is the best way to start preparing for the cumulative midterm exam?

Create an overview page outlining the highlights of each class meeting to refresh memory.

Instead of reviewing homework problems in sequential order, it may be more helpful to approach the review in a more ______ way.

random

Match the following statistical concepts with their descriptions:

  • Alpha level = The probability of rejecting the null hypothesis when it is true.
  • P-value = The probability of obtaining results as extreme as the observed results of a statistical hypothesis test, assuming that the null hypothesis is correct.
  • Confidence Interval = A range of values likely to contain a population parameter.
  • Critical Value = A point on the test distribution that is compared to the test statistic to determine whether to reject the null hypothesis.

Which of the following best describes the Central Limit Theorem?

The distribution of sample means approaches a normal distribution as the sample size increases. (D)

A larger confidence level will result in a narrower confidence interval, assuming all other factors are constant.

False (B)

If a hypothesis test results in a p-value less than the alpha level, what decision should be made regarding the null hypothesis?

The null hypothesis should be rejected.

__________ statistics involve using sample data to make inferences or generalizations about a population.

Inferential

Match each measure of center with its primary characteristic:

  • Mean = The average of all values in a dataset.
  • Median = The middle value when a dataset is ordered.
  • Mode = The most frequently occurring value in a dataset.

In a normal distribution, approximately what percentage of the data falls within one standard deviation of the mean?

68% (A)

The standard deviation is a measure of the center of a dataset.

False (B)

Define the term 'population parameter'.

A numerical value that describes a characteristic of a population.

The __________ is the range of values within which a population parameter is likely to fall, with a specified level of confidence.

Confidence Interval

Match the following terms related to hypothesis testing:

  • Null Hypothesis = A statement of no effect or no difference.
  • Alternate Hypothesis = A statement that contradicts the null hypothesis.
  • Rejection Region = The set of values for which the null hypothesis is rejected.

Which of the following is NOT a measure of variability?

Median (C)

A z-score represents the number of standard deviations a data point is from the median.

False (B)

What does a 'skewed' distribution indicate about the data?

It indicates that the data is not symmetrical.

The __________ is the probability of observing a test statistic as extreme as, or more extreme than, the one computed, assuming the null hypothesis is true.

p-value

Match the following data types with their corresponding measurement levels:

  • Categorical Data = Nominal or Ordinal
  • Numerical Data = Interval or Ratio

Which level of measurement allows for meaningful ratio comparisons?

Ratio (B)

Ordinal data has equal intervals between values.

False (B)

What is the difference between a sample statistic and a population parameter?

A sample statistic describes a characteristic of a sample, while a population parameter describes a characteristic of a population.

A __________ distribution is a probability distribution that describes the probabilities of all possible values for a discrete random variable.

Probability Mass

Match each term with its definition:

  • Skewness = A measure of the asymmetry of a probability distribution.
  • Variance = A measure of how spread out a set of data is (the square of the standard deviation).
  • Standard Deviation = A measure of the amount of variation or dispersion of a set of values.

If a dataset has a positive skew, which of the following is generally true?

The mean is greater than the median. (A)

The range is a resistant measure of variability.

False (B)

Define 'standard error of sample means'.

The standard deviation of the sampling distribution of the sample means.

The __________ distribution is used when the population standard deviation is unknown, and the sample size is small.

T

Match the following probability concepts:

  • Theoretical Probability = Probability based on reasoning or calculation.
  • Experimental Probability = Probability based on observations from experiments.

Which of the following is true about the relationship between the sample size and the standard error of the mean?

As the sample size increases, the standard error of the mean decreases. (A)

A relative frequency histogram displays the number of observations in each category.

False (B)

What is the purpose of calculating a confidence interval?

To estimate a population parameter with a certain level of confidence.

A numerical value that describes a population is known as a __________.

parameter

Match the following descriptive statistic symbols to their names:

  • $\mu$ = Population Mean
  • $\sigma$ = Population Standard Deviation
  • $\bar{x}$ = Sample Mean
  • $s$ = Sample Standard Deviation

Which of the following conditions must be met to calculate a confidence interval using a z-score for population means?

The sampling distribution is normal and the population standard deviation is known. (C)

The T-distribution is wider than the normal distribution.

True (A)

What factors affect the width of a confidence interval?

Sample Size, Confidence Level, and Standard Deviation

The boundaries of the __________ are determined by the significance level.

Rejection Region

Flashcards

Alpha Level

The probability threshold below which the Null Hypothesis is rejected.

Central Limit Theorem

The means of sufficiently large samples are approximately normally distributed, regardless of the population's distribution.

Confidence Interval

A range of values likely to contain the true population parameter.

Confidence Level

The long-run proportion of confidence intervals, built the same way from repeated samples, that contain the true parameter.

Test Statistics Criteria

Formulas used to determine if sample data supports rejecting the null hypothesis.

Critical Value

The value beyond which the test statistic leads to rejecting the null hypothesis.

Cumulative Distribution Function (listed in the course as Cumulative Density Function)

Gives the probability that a random variable will be found at a value less than or equal to a specific value.

Data Types

Categorical or Numerical.

Levels of Measurement

Nominal, Ordinal, Interval, Ratio.

Descriptive Statistics

Summarizing and presenting data (mean, median, mode, standard deviation).

Frequency Distribution

Shows the number of observations for each category or value.

Frequency Table

A table that displays the frequencies of different categories or values.

Hypothesis Testing

A method for testing a claim about a population using sample data.

Hypothesized Mean/Proportion

The assumed value for the population parameter in the null hypothesis.

Inferential Statistics

Making inferences about a population based on sample data.

Likelihood of Test statistic

The probability of obtaining the observed test statistic, or more extreme, if the null hypothesis were true.

Margins of Error

How much the sample statistic might differ from the population parameter.

Mean of Sample Means

This is the average of all possible sample means, which equals the population mean.

Measures of Center

A number describing the 'center' of a data set.

Normal Probability Distribution

A symmetrical, bell-shaped distribution.

Null and Alternate Hypotheses

Statements about the population parameter being tested.

P-value

The probability of observing a test statistic as extreme as, or more extreme than, the one computed if the null hypothesis is true.

Point Estimate

A single value estimate of a population parameter.

Population Mean

The true average value in the entire population.

Population Parameter

A value that describes a population (e.g., population mean, population standard deviation).

Population Standard Deviation

Measures the spread of data in the entire population.

Probability

Chance or likelihood of an event occurring.

Probability Distribution Function

A function that assigns probabilities to outcomes of a random variable.

Probability Mass Function

A function that gives the probability that a discrete random variable is exactly equal to some value.

Probability of Random Variables

Likelihood associated with possible values.

Random Variable

A variable whose value is a numerical outcome of a random phenomenon.

Range

The difference between the highest and lowest values in a data set.

Rejection Region

The area of the distribution where the null hypothesis is rejected.

Relative Frequency Histogram

A histogram using relative frequencies (proportions) for each bin.

Sample Mean

The average value calculated from a sample.

Sample Standard Deviation

Measures the spread of data in a sample.

Sample Statistic

A value calculated from a sample to estimate population parameters.

Sampling Distribution

The distribution of a statistic across multiple samples.

Significance Level

The probability of rejecting the null hypothesis when it is true (Type I error).

Skewness

A measure of the asymmetry of a distribution.

Study Notes

  • The midterm exam will be available on Canvas on Thursday, March 27th, at 4:30 pm EST.
  • Students have 90 minutes to complete the exam within a 24-hour period.
  • The exam closes on Friday, March 28th, at 4:30 pm EST.
  • Notes and calculators are permitted during the exam.
  • The midterm is cumulative, covering all topics from the beginning of the semester.
  • Reviewing class meeting highlights and re-attempting homework and quiz problems are recommended for preparation.
  • Class slides should be reviewed, focusing on sections needing quick access during the test.
  • All formulas and tables required for the test will be provided.
  • Access the test via the "Quizzes" tab on Canvas, selecting "Midterm Exam".

Key Topics for the Midterm:

  • Alpha level: The probability of rejecting the null hypothesis when it is true, that is, the probability of making a Type I error.
  • Central Limit Theorem: States that the distribution of sample means approximates a normal distribution as the sample size gets larger, regardless of the population's distribution.
  • Confidence Interval: A range of values likely to contain a population parameter with a certain level of confidence.
  • Confidence Level: The long-run proportion of confidence intervals, built the same way from repeated samples, that contain the true population parameter.
  • Criteria for calculating test statistics for means and proportions: Conditions that must be met to ensure the validity of the test statistic.
  • Critical value: A point on the test distribution that is compared to the test statistic to determine whether to reject the null hypothesis.
  • Cumulative Distribution Function (listed in the course as Cumulative Density Function): A function giving the probability that a random variable is less than or equal to a certain value.
  • Data types and levels of measurements: Includes nominal, ordinal, interval, and ratio scales, and qualitative vs. quantitative data.
  • Descriptive statistics: Methods for summarizing and describing the main features of a data set.
  • Frequency Distribution: Shows the number of occurrences of each value in a data set.
  • Frequency Table: A table that lists each category of data and the number of occurrences for each category.
  • Hypothesis Testing: A method for testing a claim or hypothesis about a population parameter using sample data.
  • Hypothesized mean or proportion: The value assumed for the population mean or proportion in the null hypothesis.
  • Inferential statistics: Methods for drawing conclusions about a population based on sample data.
  • Likelihood of test statistics: The probability of obtaining a test statistic as extreme as, or more extreme than, the one actually observed, assuming the null hypothesis is true.
  • Margins of Error: The range of values above and below the sample statistic in a confidence interval.
  • Mean of sample means: The average of the means of all possible samples of a given size, which equals the population mean.
  • Measures of centers (mode, median, mean) as represented on different graphs: Different measures of central tendency and their representation in graphs.
  • Normal Probability Distribution: A symmetric, bell-shaped distribution with specific properties.
  • Null and Alternate hypotheses: The null hypothesis is a statement of no effect or no difference, while the alternate hypothesis is a statement that contradicts the null hypothesis.
  • P-value: The probability of obtaining a test statistic as extreme as, or more extreme than, the one actually observed, assuming the null hypothesis is true.
  • Point estimate: A single value that is used to estimate a population parameter.
  • Population mean: The average value of a variable in the entire population.
  • Population parameter: A numerical value that describes a characteristic of a population.
  • Population standard deviation: A measure of the spread or variability of data in the entire population.
  • Probability (Experimental and Theoretical): Experimental probability is based on actual experiments, while theoretical probability is based on mathematical calculations.
  • Probability Distribution Function: A function that describes the probability of a continuous random variable falling within a particular range of values.
  • Probability Mass Function: A function that gives the probability that a discrete random variable is exactly equal to some value.
  • Probability of random variables: The likelihood of different outcomes for a random variable.
  • Random variable: A variable whose value is a numerical outcome of a random phenomenon.
  • Range: The difference between the largest and smallest values in a data set.
  • Rejection Region: The set of values for the test statistic that leads to rejection of the null hypothesis.
  • Relative frequency histogram: Displays the proportion of occurrences for each class interval in a data set.
  • Sample mean: The average value of a variable in a sample.
  • Sample standard deviation: A measure of the spread or variability of data in a sample.
  • Sample statistic: A numerical value that describes a characteristic of a sample.
  • Sampling Distribution: The distribution of a statistic (e.g., sample mean) from multiple samples of the same size taken from the same population.
  • Significance level: The probability of rejecting the null hypothesis when it is true (the probability of a Type I error).
  • Skewness: A measure of the asymmetry of a distribution.
  • Standard deviation: A measure of the spread or variability of data around the mean.
  • Standard Error of sample means: The standard deviation of the sampling distribution of sample means.
  • Standard Normal Probability Distribution: A normal distribution with a mean of 0 and a standard deviation of 1.
  • T-distribution: A probability distribution that is used to estimate population parameters when the sample size is small or when the population standard deviation is unknown.
  • T-scores: A type of standard score that tells how many standard deviations away from the mean a particular score is.
  • Variance: A measure of the spread or variability of data; the square of the standard deviation.
  • z-scores: A measure of how many standard deviations an element is from the mean.
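
Several of the definitions above (z-scores, the standard normal distribution, critical values, and the p-value decision rule) can be checked numerically. The following is a minimal sketch using Python's standard `statistics.NormalDist`; the helper names `z_score` and `reject_null` and the example inputs (a score of 85, a mean of 70, a standard deviation of 10, a p-value of 0.03) are hypothetical and are not taken from the course materials.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)


def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


def reject_null(p_value: float, alpha_level: float) -> bool:
    """Decision rule: reject the null hypothesis when p-value < alpha."""
    return p_value < alpha_level


# Share of a normal distribution within one standard deviation of the mean.
within_one_sd = std_normal.cdf(1.0) - std_normal.cdf(-1.0)

# Two-sided critical value for a 95% confidence level:
# look up the cumulative probability CL + alpha/2 = 0.975.
confidence_level = 0.95
alpha = 1.0 - confidence_level
z_star = std_normal.inv_cdf(confidence_level + alpha / 2)

print(round(within_one_sd, 4))     # 0.6827, the "68%" rule
print(round(z_star, 2))            # 1.96
print(z_score(85.0, 70.0, 10.0))   # 1.5 (hypothetical inputs)
print(reject_null(0.03, 0.05))     # True (hypothetical inputs)
```

On the exam itself the formulas and tables are provided, so a script like this is only a way to double-check table lookups while studying.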

Assignment 1

  • Question 1 asks to categorize the data types and measurement levels of statements.

    • Statement A: Anxiety scale score of 16 is twice as anxious as a score of 8.
      • Data type: Quantitative or Numerical
      • Measurement level: Ratio
    • Statement B: Participants identify as omnivore, vegetarian, vegan, or fruitarian.
      • Data type: Qualitative or Categorical
      • Measurement level: Nominal
    • Statement C: The difference between 6 and 8 is equivalent to the difference between 13 and 15.
      • Data type: Quantitative or Numerical
      • Measurement level: Interval
    • Statement D: Students earn a failing, passing, or distinction grade.
      • Data type: Qualitative or Categorical
      • Measurement level: Ordinal
  • Question 2 involves a graphical representation of midterm scores.

    • The dataset has four modes around 50, 62, 73, and 78.
    • The median falls around the 13th data point which is approximately 72.
    • The mean should be close to the median because the dataset is roughly symmetrical with most data points in the middle.
  • Question 3 deals with descriptive measures on a 20-point quiz.

    • Correcting an 18 to 16 changes the mode from 18 to no mode.
    • The median remains between the 5th and 6th data point, so it stays at 12.
    • The mean changes as it considers the value of every data point.
    • The range does not change as the maximum and minimum scores are the same.
    • Both variance and standard deviation will likely change because the mean changes.
  • Question 4 addresses the research process.

    • It starts with an observation of interest.
    • Ensuring no bias requires the collection of data.
    • A researchable question is formulated based on the unbiased observation.
    • A theory is provided to explain the observation.
    • A testable hypothesis is made based on the theory.
    • The type of data (Qualitative or Quantitative) to collect is determined.
    • A decision on how to collect the data needs to be made.
    • A brief description of how to analyze the data is provided.
    • The analysis of the data should inform your theory.
  • Students were asked to download R and RStudio.
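
The Question 3 reasoning (which measures change when one score of 18 is corrected to 16) can be illustrated with a short sketch. The ten scores below are hypothetical, chosen only so that the mode is 18 and the median is 12 as in the notes; the assignment's actual data is not reproduced here. The `summarize` helper is likewise an assumption for illustration.

```python
from statistics import mean, median, multimode, stdev, variance


def summarize(scores):
    """Return the descriptive measures discussed in Question 3."""
    distinct = set(scores)
    modes = multimode(scores)
    # Treat the data as having no mode when every distinct value is tied.
    no_mode = len(distinct) > 1 and len(modes) == len(distinct)
    return {
        "mode": [] if no_mode else modes,
        "median": median(scores),
        "mean": mean(scores),
        "range": max(scores) - min(scores),
        "variance": variance(scores),  # sample variance (n - 1 denominator)
        "sd": stdev(scores),
    }


# HYPOTHETICAL quiz scores on a 20-point quiz (not the assignment's data).
recorded = [4, 7, 9, 10, 11, 13, 15, 17, 18, 18]
corrected = [4, 7, 9, 10, 11, 13, 15, 17, 16, 18]  # one 18 corrected to 16

before = summarize(recorded)
after = summarize(corrected)
for name in before:
    print(f"{name}: {before[name]} -> {after[name]}")
```

With these made-up scores the mode goes from 18 to none, the median and range stay the same, and the mean, variance, and standard deviation all change, which matches the pattern described above.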

Discussion: Harking, Sharking, and Tharking

  • Hollenbeck and Wright argue that "Tharking can promote the effectiveness and efficiency of both scientific inquiry and cumulative knowledge creation" and claim that the practice of Tharking is "beneficial to scientific progress and, in many cases, ethically required".

Assignment 2 Responses

  • Question 1 focuses on measures of center (mode, median, mean) as "typical" scores.
    • The mean is defined in terms of distances of data points from the center, where the total distance from the mean to points below is equal to the total distance to points above.
    • The mean uses all dataset measurements in its calculation.
    • The mode is always included in the dataset.
  • Question 2 requires indicating which measure of center can be used with a specific data type.
    • Nominal data type can use Mode: Yes, Median: No, Mean: No.
    • Ordinal data type can use Mode: Yes, Median: Sometimes, Mean: No.
    • Discrete data type can use Mode: Yes, Median: Sometimes, Mean: Sometimes.
    • Continuous data type can use Mode: Yes, Median: Yes, Mean: Yes.
    • Interval data type can use Mode: Yes, Median: Yes, Mean: Yes.
    • Ratio data type can use Mode: Yes, Median: Yes, Mean: Yes.
  • Question 3 compares the mode, median, and mean of different frequency distributions.
    • A normal frequency distribution has the mode, median, and mean close to each other.
    • Outliers in a right-skewed frequency distribution will pull the mean to the right.
    • Outliers in a left-skewed frequency distribution will pull the mean to the left.
  • Question 4 relates frequency distributions to probability.
    • A person who attended the car show is not very likely to be 70 years old because only 3 people aged 70 and above attended.
    • A 30-year-old is very likely to have attended because most of the attendees are aged 30 to 34 years.
    • The probability that an attendee at the car show is aged 15 to 19 years is 10/172 = 0.058.
  • Question 5 involves flipping three fair coins.
    • The sample space is Ω = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}.
    • The theoretical probability of getting at least two heads is 1/2 = 0.5.
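
The two probabilities in Questions 4 and 5 can be reproduced with a few lines of Python. This is a minimal sketch: the variable names are illustrative, and the counts (10 attendees aged 15 to 19 out of 172) are the ones quoted in the notes above.

```python
from fractions import Fraction
from itertools import product

# Question 5: enumerate the sample space for three fair coin flips.
sample_space = ["".join(flips) for flips in product("HT", repeat=3)]

# Event of interest: at least two heads.
favorable = [outcome for outcome in sample_space if outcome.count("H") >= 2]
p_at_least_two_heads = Fraction(len(favorable), len(sample_space))

# Question 4: relative frequency used as a probability (10 of 172 attendees).
p_age_15_to_19 = 10 / 172

print(sample_space)                    # 8 equally likely outcomes
print(favorable)                       # ['HHH', 'HHT', 'HTH', 'THH']
print(p_at_least_two_heads)            # 1/2
print(round(p_age_15_to_19, 3))        # 0.058
```

Counting outcomes works here only because the coins are fair, so all eight outcomes are equally likely.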

Assignment 4 Responses

  • Variables measured in the survey are gender, age, systolic blood pressure, and diastolic blood pressure.
    • Gender is qualitative and nominal.
    • Age is quantitative and ratio.
    • Systolic and diastolic blood pressure are quantitative and ratio.
    • The data is multivariate.
  • Histograms of systolic blood pressure data for males and females were constructed.
    • The histograms suggest that more males are at risk of a stroke or a heart attack, because they have higher systolic blood pressure (>120) than females.
  • Three sampling distributions were created.
    • Sample Size, n=5
      • Mean: 114.782
      • Standard deviation: 5.962651
    • Sample Size, n=50
      • Mean: 114.4064
      • Standard deviation: 1.80838
    • Sample Size, n=100
      • Mean: 114.2009
      • Standard deviation: 1.339379
  • The sampling distribution is used to calculate the confidence interval at the 98% confidence level.
    • The point estimate is chosen randomly from the 100 means of the samples of 50 data points ($\bar{x}$ = 115.38).
    • The Standard Error is 1.80838.
    • The Critical Value is 2.33.
  • A concluding statement interprets the confidence interval (111.17 to 119.60) with regard to the population mean.
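
The confidence interval above follows from point estimate ± critical value × standard error. The sketch below recomputes it from the three reported numbers and also multiplies each sampling distribution's standard deviation by √n to show that all three imply a similar population spread. Variable names are illustrative; the computed upper bound agrees with the reported 119.60 to within about 0.01, presumably a rounding difference.

```python
from math import sqrt

# Values reported in the Assignment 4 notes.
point_estimate = 115.38   # sample mean drawn from the n = 50 sampling distribution
standard_error = 1.80838  # standard deviation of that sampling distribution
critical_value = 2.33     # z* for a 98% confidence level

margin_of_error = critical_value * standard_error
lower = point_estimate - margin_of_error
upper = point_estimate + margin_of_error
print(f"98% CI: {lower:.2f} to {upper:.2f}")  # 98% CI: 111.17 to 119.59

# Standard error shrinks roughly like 1 / sqrt(n), so SE * sqrt(n)
# should be about the same for every sample size.
sampling_sds = {5: 5.962651, 50: 1.80838, 100: 1.339379}
implied_sigma = {n: sd * sqrt(n) for n, sd in sampling_sds.items()}
for n, sigma in implied_sigma.items():
    print(f"n={n}: implied population SD is about {sigma:.2f}")
```

The implied population standard deviations all land near 13, which is the pattern the standard error formula predicts as the sample size grows.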

Assignment 5 Responses

  • Question 1: Identify the sample statistic.

    • Identify the sample statistic used to create a confidence interval for the population parameter in each situation.
    • Mean: The process is measured in minutes, so it is a continuous variable (manufacturing process).
    • Mean: Weight is measured in lbs, so it is a continuous variable (oat grain distributor).
    • Proportion: 16-liter bottles are countable (discrete) (bottling factory).
  • Question 2: Criteria for the use of z-scores in confidence interval calculations

    • Calculations - Population Mean
      • The sampling distribution is normal.
      • The SE, $\sigma_{\bar{x}}$, is known. If it is unknown, we may consider using $s_{\bar{x}}$.
      • Using $s_{\bar{x}}$ to estimate $\sigma_{\bar{x}}$ adds uncertainty, so we use a different critical value, a t-score, $t^*$, from a t-distribution table.
    • Calculations - Population Proportion
      • The sampling distribution is normal.
      • $n\hat{p} \ge 10$ (number of successes ≥ 10)
      • $n\hat{q} \ge 10$ (number of failures ≥ 10)
      • The SE, $\sigma_{\hat{p}}$, is known. Since $\sigma$ is unknown, use $s/\sqrt{n}$, as $s$ is an unbiased estimator.
  • Question 3 - Calculating the Point Estimate

    • Cumulative Probability
      • $\hat{p} = 0.68$
      • $z^*\left(CL + \frac{\alpha}{2}\right) = z^*\left(0.95 + \frac{0.05}{2}\right) = z^*(0.95 + 0.025) = z^*(0.975) = 1.96$
    • What is the standard error?
      • Two criteria for using z-scores for population proportions (no points lost on assignment for not checking assumptions):
      • The sampling distribution is normal.
      • $n\hat{p} \ge 10$ (number of successes ≥ 10)
      • $n\hat{q} \ge 10$ (number of failures ≥ 10)
      • We know $\sigma_{\hat{p}}$.
      • $\sigma_{\hat{p}} = \frac{\sigma}{\sqrt{n}} \rightarrow \frac{s}{\sqrt{n}} = \sqrt{\frac{\hat{p} \cdot \hat{q}}{n}}$
      • Since $\sigma$ is unknown, use $s/\sqrt{n}$, as $s$ is an unbiased estimator.
    • At the 95% confidence level, the confidence interval is likely to contain the true proportion of all students who take Intermediate Statistics to fulfill a degree program requirement.

  • Question 4 - Confidence Interval, Calculating the Point Estimate

    • Calculate
      • The two criteria to use a z-score are met (no points lost on assignment for not checking assumptions).
      • The sampling distribution is normal.
      • $n\hat{p} = (1011)(0.64) \approx 647 \ge 10$ (number of successes ≥ 10)
      • $n\hat{q} = (1011)(0.36) \approx 364 \ge 10$ (number of failures ≥ 10)
      • We know $\sigma_{\hat{p}}$.
      • $\sigma_{\hat{p}} = \frac{\sigma}{\sqrt{n}} \rightarrow \frac{s}{\sqrt{n}} = \sqrt{\frac{\hat{p} \cdot \hat{q}}{n}}$
      • Since $\sigma$ is unknown, use $s/\sqrt{n}$, as $s$ is an unbiased estimator.
      • The confidence interval is likely to contain the true proportion of voters who are likely to vote for the independent candidate.
        
  • Question 5 - Calculating the Point Estimate

    • The sampling distribution will be normally distributed for a sample size of 40 (≥ 30); however, the SE, $\sigma_{\bar{x}}$, is unknown.
    • The use of $s_{\bar{x}}$ to estimate $\sigma_{\bar{x}}$ adds more uncertainty, so we use a $t^*$ (from a t-distribution table).
    • Point estimate: $\bar{x} = 0.145$
    • At the 80% confidence level, the confidence interval of 0.144 to 0.146 moles is likely to contain the mean amount of copper precipitated from the solution over a half-hour.
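
The Question 4 checks for a population proportion can be sketched in code. The sample size (1011) and point estimate (0.64) come from the notes above. The notes do not state the confidence level used for Question 4, so the 95% level and its critical value of 1.96 below are an assumption borrowed from Question 3, and the resulting interval is illustrative only.

```python
from math import sqrt

# Values given in the Question 4 notes.
n = 1011
p_hat = 0.64
q_hat = 1 - p_hat

# Criteria for using a z-score: at least 10 successes and 10 failures.
successes = n * p_hat
failures = n * q_hat
criteria_met = successes >= 10 and failures >= 10

# Standard error of the sample proportion: sqrt(p_hat * q_hat / n).
standard_error = sqrt(p_hat * q_hat / n)

# ASSUMPTION: a 95% confidence level (z* = 1.96); the notes do not state
# which level Question 4 actually used.
z_star = 1.96
margin_of_error = z_star * standard_error
interval = (p_hat - margin_of_error, p_hat + margin_of_error)

print(round(successes), round(failures), criteria_met)  # 647 364 True
print(round(standard_error, 4))                         # 0.0151
print(tuple(round(bound, 3) for bound in interval))
```

Swapping in a different `z_star` is the only change needed to recompute the interval at another confidence level.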
