STA301 - Statistics and Probability Concepts
48 Questions

Questions and Answers

What is the definition of an unbiased estimator?

  • An estimator that has the smallest variance among all possible estimators.
  • An estimator that always produces the exact true value of the parameter being estimated.
  • An estimator that is always consistent and efficient.
  • An estimator whose expected value is equal to the true value of the parameter being estimated. (correct)

What is the purpose of an experimental design?

  • To control for all possible confounding variables.
  • To ensure that the results of the experiment are generalizable to the entire population.
  • To collect data in a way that provides a basis for objective inference about the problem under study. (correct)
  • To collect data in a way that ensures the results are statistically significant.

Which variable is typically represented by 'Y' in regression analysis?

  • Confounding Variable
  • Control Variable
  • Independent Variable
  • Dependent Variable (correct)

What is the expected value of the sampling distribution given in the content?

    2/3 (D)

    What is the variance of the sampling distribution given in the content?

    1/3 (A)

    What is the purpose of calculating the mean and variance of a sampling distribution?

    All of the above. (D)

    Which of the following is NOT a characteristic of a good experimental design?

    Bias in the selection of subjects (D)

    In regression analysis, what is the relationship between the independent and dependent variables?

    The independent variable influences the dependent variable. (B)

    What is the technical term for the difference between a sample statistic and the corresponding population parameter?

    Sampling error (D)

    Which of the following is NOT a property of a hypergeometric distribution?

    The trials are independent. (C)

    What is the formula for calculating sampling error?

    $\bar{X} - \mu$ (C)

    Which of the following scenarios accurately describes the use of a hypergeometric distribution?

    A researcher draws 5 cards from a deck without replacement to determine the probability of getting all hearts. (A)

    What is the primary advantage of using the median as a measure of central tendency?

    It is not influenced by extreme values in the dataset. (D)

    What condition indicates that a statistic is a biased estimator?

    The expected value of the statistic is not equal to the true parameter. (D)

    What does the term "five-number summary" refer to?

    The minimum, first quartile (Q1), median, third quartile (Q3), and maximum of a dataset. (C)

    What is a key disadvantage of using the median as a measure of central tendency?

    It does not take into account all data points in the dataset. (C)

    What is the formula for the mathematical expectation (E) of a discrete random variable X?

    E(X) = ∑ xi f(xi) (C)

    Which of the following is a property of the expected value of a random variable?

    Both A and B (D)

    What is the purpose of a statistical test?

    To test a hypothesis about a population parameter. (D)

    What is the general rule for determining if a sample is considered small or large?

    A sample is considered small if n is less than or equal to 30 and large otherwise. (C)

    What does the Least Significant Difference (LSD) test determine?

    The significance of the difference between two sample means. (A)

    What is the formula for the combined or pooled proportion of two samples, where p1 is the proportion of the first sample, n1 is the size of the first sample, and p2 and n2 are the proportion and size of the second sample, respectively?

    (n1p1 + n2p2) / (n1 + n2) (C)

    Given a class interval of approximately 2.96 and a range of 14.8, what is the approximate number of classes?

    5 (B)
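
The arithmetic behind this answer can be checked with a short sketch. It uses only the two figures quoted in the question; the variable names are illustrative.

```python
# Values quoted in the question above.
data_range = 14.8
class_interval = 2.96

# Number of classes = range / class interval, rounded to a whole number.
num_classes = round(data_range / class_interval)

print(num_classes)  # 5
```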

    What does the coefficient of variation (C.V) measure?

    The spread of a distribution relative to the mean. (A)
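
As a rough illustration of "spread relative to the mean", the sketch below computes C.V. = (standard deviation / mean) * 100 for two hypothetical data sets that have the same spread but different means. The data and the choice of the population standard deviation (`pstdev`) are assumptions made for this example, not part of the lesson.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """C.V. = (standard deviation / mean) * 100.

    Assumption: the population standard deviation is used here.
    """
    return pstdev(values) / mean(values) * 100


# Hypothetical data: identical spread, different means.
a = [8, 10, 12]
b = [98, 100, 102]

cv_a = coefficient_of_variation(a)
cv_b = coefficient_of_variation(b)

# The same absolute spread is relatively larger around a small mean.
print(round(cv_a, 2), round(cv_b, 2))
```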

    What is the significance of the method of maximum likelihood in estimation?

    It offers the most probable estimate based on observed data. (C)

    In a standard normal distribution, what is the value of the lower quartile?

    -0.6745 (B)

    In the context of an ANOVA, what does SST represent?

    Sum of squares total (B)

    What is the expected value of 2X if the expected value of X is 0.7?

    1.4 (D)

    Which of the following statements is true regarding the probability distribution of a statistic?

    The probability distribution of a statistic is called the sampling distribution. (D)

    What is the inter-quartile range in a standard normal distribution?

    1.349 (D)

    In a hypothesis test, why is the critical region defined as z > 1.645 for α = 0.10?

    It represents the rejection region for a one-tailed test with a significance level of 0.10. (C)

    Suppose a department claims that the average value exceeds Rs. 2500. What would be the null and alternative hypotheses to test this claim at a 0.05 level of significance?

    H0: μ ≤ 2500, H1: μ > 2500 (B)

    In the provided ANOVA table, what is the degrees of freedom for the 'Error' source?

    8 (B)

    Which of the following scenarios represents mutually exclusive events?

    Getting heads and tails when tossing a coin. (D)

    What is the value of β2 for a normal distribution?

    3 (D)

    In the given ANOVA table, what specific term is represented by the 'MS' value?

    Mean Square (D)

    Given the formula for calculating the test statistic z, which of the following values is NOT required to perform the hypothesis test?

    μ (B)

    Which of these events is NOT an example of a partition in the context of probability?

    The numbers on a standard die: 1, 2, 3, 4, 5, or 6. (B)

    According to Bayes' theorem, what does P(Ai/B) represent?

    The probability of event Ai occurring given that event B has already occurred. (B)

    In the statement "The 95% confidence interval for the population mean is 1.3 to 4.7", what does "95% confidence" mean?

    We are 95% confident that the true population mean lies within the interval 1.3 to 4.7: in repeated sampling, 95% of intervals built this way would contain it. (C)

    What is the formula used to calculate the chi-square goodness of fit test statistic?

    $\chi^2 = \sum_{i=1}^{n} \frac{(o_i - e_i)^2}{e_i}$ (D)

    What is the approximate value of the F-statistic at a 0.05 significance level with 7 degrees of freedom in the numerator and 10 degrees of freedom in the denominator?

    4.075 (D)

    In a z-test statistic for a proportion, if the sample proportion (X/n) is greater than the hypothesized proportion (p0), which of the following is used in the calculation of the z-statistic?

    $X + \frac{1}{2} - np_0$ (B)

    The formula $\sigma_{\hat{p}_1 - \hat{p}_2}$ calculates:

    The standard deviation of the difference between two sample proportions (B)

    What is the value of q1 in the context of calculating the standard deviation $\sigma_{\hat{p}_1 - \hat{p}_2}$ when p1 is 0.3?

    0.7 (B)

    What is the null hypothesis in the scenario involving the mayoral candidate?

    p1 - p2 ≥ 0.10 (C)

    Which of the following is the correct formula for the standard deviation of the difference between two sample proportions in the context of the mayoral candidate scenario?

    $\sqrt{\frac{p_1q_1}{n_1} + \frac{p_2q_2}{n_2}}$ (C)

    What is the alternative hypothesis in the mayoral candidate scenario?

    p1 - p2 < 0.10 (D)

    Flashcards

    Unbiased Estimator

    An estimator that is correct on average across many samples.

    Experimental Design

    A plan to collect data objectively for inference.

    Independent Variable

    The variable manipulated in an experiment, often X in regression.

    Dependent Variable

    The outcome measured in an experiment, represented by Y.

    Sampling Distribution

    The probability distribution of a statistic over all possible samples of the same size.

    Mean of Sampling Distribution

    The average value of the sample statistics across all samples.

    Variance of Sampling Distribution

    A measure of how much the sample statistics vary from the mean.
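
To make the mean and variance of a sampling distribution concrete, here is a minimal sketch that enumerates every sample of size 2 drawn with replacement from a small hypothetical population. The population values are invented for illustration and are not the ones used in the lesson. It checks that the mean of the sample means equals μ and that their variance equals σ²/n.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population (not from the lesson), sampled with replacement.
population = [2, 4, 6]
n = 2  # sample size

# Enumerate every possible ordered sample and record its mean.
sample_means = [Fraction(sum(s), n) for s in product(population, repeat=n)]

# Every ordered sample is equally likely, so plain averages give E and Var.
mean_of_means = sum(sample_means) / len(sample_means)
var_of_means = sum((m - mean_of_means) ** 2 for m in sample_means) / len(sample_means)

# Population mean and variance for comparison.
mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

print(mean_of_means, mu)         # 4 4
print(var_of_means, sigma2 / n)  # 4/3 4/3
```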

    Probability in Sampling

    The likelihood of a sample statistic occurring, represented as fractions.

    Sampling Error

    The difference between the sample estimate and the true population value.

    Hypergeometric Distribution

    A distribution where trials have changing success probabilities and are not independent.
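
The card-drawing scenario from the quiz (5 cards drawn without replacement, all hearts) can be sketched as follows. The function name and argument names are illustrative; the formula is the standard hypergeometric probability C(K,k)·C(N−K,n−k)/C(N,n).

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(X = k): k successes in n draws without replacement
    from N items, of which K are successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


# 52 cards, 13 hearts, draw 5, all 5 are hearts.
p_all_hearts = hypergeom_pmf(k=5, N=52, K=13, n=5)

print(round(p_all_hearts, 6))  # 0.000495
```

Because each draw changes the composition of the deck, the success probability shifts from draw to draw, which is why the trials are not independent.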

    Five-Number Summary

    A set of five key statistics: minimum, Q1, median, Q3, and maximum.
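
A minimal sketch of a five-number summary, using hypothetical data. Quartile conventions differ between textbooks and software; this example assumes the "inclusive" method of Python's `statistics.quantiles`, which may not match the rule used in the course.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # Assumption: the "inclusive" quartile convention is used.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical data
summary = five_number_summary(data)

print(summary)  # (1, 3.0, 5.0, 7.0, 9)
```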

    Bias in Estimators

    Occurs when the expected value of a statistic does not equal the true value.
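
One way to see bias is to compare two variance estimators over every possible sample. The sketch below uses a small hypothetical population and samples of size 2 drawn with replacement (both are assumptions for illustration): dividing by n − 1 gives an expected value equal to σ², while dividing by n does not.

```python
from fractions import Fraction
from itertools import product

population = [2, 4, 6]  # hypothetical population
n = 2                   # sample size, drawn with replacement

mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)


def sample_variance(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample) / divisor


samples = list(product(population, repeat=n))

# Expected value of each estimator over all equally likely samples.
expected_div_n = sum(sample_variance(s, n) for s in samples) / len(samples)
expected_div_n_minus_1 = sum(sample_variance(s, n - 1) for s in samples) / len(samples)

print(sigma2, expected_div_n_minus_1, expected_div_n)  # 8/3 8/3 4/3
```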

    Advantages of Median

    The median is easy to interpret and not affected by extreme values.

    Disadvantages of Median

    Requires sorting data and does not consider all item values.

    Properties of Trials

    In hypergeometric trials, outcomes lead to dependent results and changing probabilities.

    Ogives in Statistics

    Graphical method to locate the median in a grouped frequency distribution.

    Mathematical Expectation

    The sum of products of values and their probabilities of a discrete random variable.

    Formula for Expected Value

    E(X) = Σ xi f(xi) where xi are values and f(xi) are probabilities.

    Property of E(c)

    If c is a constant, then E(c) = c; expected value of a constant is itself.

    Property of E(aX + b)

    E(aX + b) = aE(X) + b; relates linear transformations to expectation.
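
The expectation formula and its properties can be checked numerically. The distribution below is hypothetical, chosen so that E(X) = 0.7 as in the quiz question about E(2X); the helper name `expectation` is illustrative.

```python
from fractions import Fraction

# Hypothetical distribution: value x -> probability f(x).
dist = {0: Fraction(5, 10), 1: Fraction(3, 10), 2: Fraction(2, 10)}


def expectation(dist, g=lambda x: x):
    """E[g(X)] = sum of g(x) * f(x) over all values x."""
    return sum(g(x) * p for x, p in dist.items())


e_x = expectation(dist)                             # E(X)
e_2x = expectation(dist, lambda x: 2 * x)           # E(2X)
e_2x_plus_3 = expectation(dist, lambda x: 2 * x + 3)  # E(2X + 3)
e_const = expectation(dist, lambda x: 5)            # E(c) with c = 5

print(e_x, e_2x, e_2x_plus_3, e_const)  # 7/10 7/5 22/5 5
```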

    Statistical Test

    A statistic that tests a null hypothesis with an associated probability distribution.

    Small Sample Size

    Typically defined as sample size n <= 30 for statistical tests.

    Large Sample Size

    A sample size greater than 30, often offering more robust results.

    Combined Proportion Formula

    The formula for pooled proportion is pc = (n1 p1 + n2 p2) / (n1 + n2).
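
A short sketch of the pooled proportion formula. The sample sizes and success counts are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def pooled_proportion(p1, n1, p2, n2):
    """Pooled proportion: pc = (n1*p1 + n2*p2) / (n1 + n2)."""
    return (n1 * p1 + n2 * p2) / (n1 + n2)


# Hypothetical samples: 30 successes out of 100 and 90 successes out of 200.
p1, n1 = 30 / 100, 100
p2, n2 = 90 / 200, 200

pc = pooled_proportion(p1, n1, p2, n2)

print(round(pc, 4))  # 0.4
```

The result is the total number of successes divided by the total sample size, so it always lies between p1 and p2 and sits closer to the proportion from the larger sample.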

    Maximum Likelihood Method

    A statistical method to estimate parameters by maximizing the likelihood function.

    Likelihood Function

    A function that measures how well a statistical model explains the observed data.

    ANOVA Table

    A table that summarizes the analysis of variance between different groups.

    Standard Normal Distribution

    A normal distribution with mean 0 and standard deviation 1.

    Lower Quartile (Q1)

    The value below which 25% of data falls in a dataset.

    Upper Quartile (Q3)

    The value below which 75% of data falls in a dataset.

    Inter-quartile Range

    The range between the upper quartile and lower quartile (Q3 - Q1).
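
The quartiles of the standard normal distribution quoted in the quiz (Q1 = -0.6745, inter-quartile range = 1.349) can be reproduced with the standard library, as in this sketch.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
z = NormalDist(mu=0, sigma=1)

q1 = z.inv_cdf(0.25)  # value below which 25% of the distribution lies
q3 = z.inv_cdf(0.75)  # value below which 75% of the distribution lies
iqr = q3 - q1

print(round(q1, 4), round(q3, 4), round(iqr, 3))  # -0.6745 0.6745 1.349
```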

    Mean Deviation

    The average of absolute deviations from the mean.
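
A minimal sketch of the mean deviation about the mean, using hypothetical data.

```python
def mean_deviation(values):
    """Average of the absolute deviations from the arithmetic mean."""
    center = sum(values) / len(values)
    return sum(abs(x - center) for x in values) / len(values)


data = [2, 4, 6, 8, 10]  # hypothetical data, mean = 6
md = mean_deviation(data)

print(md)  # 2.4
```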

    Null Hypothesis (H0)

    The hypothesis assumed true unless the data give enough evidence against it; it always includes equality (e.g., μ ≤ 2500).

    Alternative Hypothesis (H1)

    The hypothesis accepted when the null hypothesis is rejected; it never includes equality (e.g., μ > 2500).

    Bayes' Theorem

    A formula for the conditional probability of an event Ai given that event B has occurred, combining the prior probability of Ai with the probability of B under each Ai.
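
As a sketch of how the theorem is applied, the example below computes P(Ai | B) for a partition A1, A2, A3. The prior probabilities and the conditional probabilities P(B | Ai) are hypothetical numbers chosen for illustration.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Return P(Ai | B) for each event Ai of a partition.

    priors[i]      = P(Ai)
    likelihoods[i] = P(B | Ai)
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(Ai and B)
    total = sum(joint)  # P(B), by the law of total probability
    return [j / total for j in joint]


# Hypothetical partition with P(A1) + P(A2) + P(A3) = 1.
priors = [Fraction(5, 10), Fraction(3, 10), Fraction(2, 10)]
likelihoods = [Fraction(2, 100), Fraction(3, 100), Fraction(5, 100)]

post = posterior(priors, likelihoods)

print(post)  # [Fraction(10, 29), Fraction(9, 29), Fraction(10, 29)]
```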

    Confidence Interval

    A range that likely contains the population parameter with a specified confidence level.

    Covariance

    A measure indicating the extent to which two variables change together.

    Mutually Exclusive Events

    Events that cannot occur at the same time.

    Kurtosis

    A statistical measure describing the distribution's tails and peakedness.

    Chi-Square Test Statistic

    A formula used to assess how observed frequencies differ from expected frequencies.
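
The goodness-of-fit statistic χ² = Σ (o − e)² / e can be sketched as below. The observed counts are hypothetical (60 rolls of a die assumed fair, so each face is expected 10 times).

```python
def chi_square_statistic(observed, expected):
    """Chi-square goodness-of-fit statistic: sum of (o - e)^2 / e."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 rolls of a die, expected 10 per face if the die is fair.
observed = [8, 12, 10, 9, 11, 10]
expected = [10] * 6

stat = chi_square_statistic(observed, expected)

print(round(stat, 4))  # 1.0
```

The statistic alone does not decide the test; it still has to be compared with a chi-square critical value for the appropriate degrees of freedom.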

    F-Table Value

    The critical value from the F-distribution used for hypothesis testing of variances; it depends on the numerator and denominator degrees of freedom and on α.

    Z-Test Statistic for Proportion

    A statistic used to determine if a sample proportion is significantly different from a hypothesized population proportion.

    Standard Deviation of Proportion

    A measure of variability for the difference between two sample proportions.
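
The formula from the quiz, sqrt(p1·q1/n1 + p2·q2/n2) with q = 1 − p, can be sketched as follows. Only p1 = 0.3 (so q1 = 0.7) comes from the quiz; the sample sizes and p2 are hypothetical.

```python
from math import sqrt


def sd_diff_proportions(p1, n1, p2, n2):
    """Standard deviation of (p1_hat - p2_hat): sqrt(p1*q1/n1 + p2*q2/n2)."""
    q1, q2 = 1 - p1, 1 - p2
    return sqrt(p1 * q1 / n1 + p2 * q2 / n2)


# p1 = 0.3 as in the quiz; n1, p2 and n2 are hypothetical.
sd = sd_diff_proportions(p1=0.3, n1=100, p2=0.2, n2=100)

print(round(sd, 4))  # 0.0608
```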

    Level of Significance (α)

    The threshold for determining statistical significance, commonly set at 0.05, reflecting a 5% risk of concluding a difference exists when there is none.

    Proportion Support Calculation

    Calculation to find the proportion of support for a candidate among different voter groups, used in hypothesis testing.

    Study Notes

    STA301 - Statistics and Probability

    • Sample Size: Small samples have a size of 30 or less; large samples have a size greater than 30.

    • Significance Level: The significance level is the criterion for rejecting the null hypothesis. Common significance levels are 5% (0.05) and 1% (0.01).

    • Significance Level's Purpose: Sets the largest acceptable probability of rejecting a true null hypothesis (a Type I error).

    • Sampling Distribution of the Mean: Has mean μ and variance σ²/n, and approaches a normal distribution as the sample size increases (Central Limit Theorem).

    • Constant vs. Random Variable: A constant keeps the same value once it is assigned. A random variable takes different values depending on the outcome of a random experiment.

    • Unbiased Estimator: An estimator is unbiased if its expected value equals the true value of the parameter being estimated. (E(θ̂) = θ)

    • Quartile Deviation: Half the difference between the third quartile (Q3) and the first quartile (Q1). (Q.D = (Q3 - Q1)/2)

    • Sampling Error: The difference between the sample mean and the population mean. (Sampling error = X̄ - µ)

    • Outcome vs. Event: An outcome is a result of a single trial, an event is an individual outcome or a group of outcomes.

    • Poisson Distribution: Mean and variance are equal in a Poisson distribution. However, this isn't always the case with real-world data.

    • Null and Alternative Hypothesis Example:

      • Null Hypothesis (H0): µ ≤ 16000 (Automobile driven no more than 16000 km per year)
      • Alternative Hypothesis (H1): µ > 16000
    • Normal Distribution Properties: Absolutely symmetrical, asymptotic to the x-axis, and µ₁ = 0.

    • Least Significant Difference (LSD) Test: A procedure to find the smallest difference judged significant when comparing means of multiple groups.

    • Degrees of Freedom in F-distribution: The F-distribution has two, v₁ (numerator) and v₂ (denominator).

    • Continuous vs. Discrete Data Example:

      • Number of passengers: Discrete
      • Hourly temperature: Continuous
      • Inches of rainfall: Continuous
      • Height measurements: Continuous
    • Example of Sample Space for Rolling Two Dice: S = {(1,1), (1,2), (1,3), ..., (6,6)}. The event A = {(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)} is the set of outcomes where the sum is 7.

    • Sampling Error Formula: Sampling error = sample mean - population mean

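    • Hypergeometric Distribution Usage: Sampling without replacement from a finite population, where the trials are dependent and the success probability changes from draw to draw (for example, drawing 5 cards from a deck).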

    • Pooled Proportion Formula: Pc = (n₁P₁ + n₂P₂)/(n₁ + n₂).

    • Class Interval Calculation: Number of classes = Range/Class interval

    • Coefficient of Variation (CV) Calculation: CV = (Standard Deviation/Mean) * 100

    • Poisson Approximation: If n is large and p is small, the Poisson distribution can be used to approximate the hypergeometric distribution for easier calculation, given n < 0.05N, n > 20, and p < 0.05.

    • Statistical Test: A statistic (a value derived from data) used to test a hypothesis.

    • Parameter vs. Statistic: A parameter is a property of a population. A statistic is a property of a sample.

    • Standard Error: The standard deviation of a sampling distribution. If the population from which samples were drawn is normal, then the sampling distribution of means is also normal.

    • Level of Significance: The probability of rejecting a true null hypothesis. A value like 0.05 (5%).

    • Natural Pairing: Two observations are taken from the same unit, for example the weights of recruits before (X) and after (Y) the same physical program. The two measurements are dependent, so the data are paired.

    • Five-Number Summary: Includes minimum, first quartile, median, third quartile, and maximum.

      • Provides a snapshot of data distribution.
    • Chi-Square: A measure for comparing observed frequencies to the expected frequencies for different data categories (especially when studying the relationship between categorical variables).
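
The two-dice sample space from the notes above can be enumerated directly. This sketch lists all 36 ordered outcomes and picks out the event A (sum equals 7); the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes (first die, second die).
sample_space = list(product(range(1, 7), repeat=2))

# Event A: the two faces sum to 7.
A = [(a, b) for a, b in sample_space if a + b == 7]

# With equally likely outcomes, P(A) = |A| / |S|.
p_A = Fraction(len(A), len(sample_space))

print(len(sample_space))  # 36
print(A)                  # [(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)]
print(p_A)                # 1/6
```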

    Description

    Test your understanding of key statistics and probability concepts in STA301. This quiz covers topics like sample sizes, significance levels, central limit theorem, and unbiased estimators. Perfect for reinforcing foundational knowledge in statistics.
