# Statistical Inference

## Point Estimation

### Definition

An estimator is a rule or formula that tells us how to calculate the value of an estimate based on the data in a sample.

### Examples

- Sample mean $\bar{X}$ is a point estimator of the population mean $\mu$.
- Sample variance $S^2$ is a point estimator of the population variance $\sigma^2$.
- Sample proportion $\hat{p}$ is a point estimator of the population proportion $p$.

### Desirable Properties of Estimators

- **Unbiasedness:** An estimator is unbiased if its expected value equals the parameter it estimates: $E(\hat{\theta}) = \theta$.
- **Efficiency:** An estimator is efficient if it has a small variance; among unbiased estimators, the one with the smallest variance is the most efficient.
- **Consistency:** An estimator is consistent if it converges in probability to the true value of the parameter as the sample size increases: $\hat{\theta} \xrightarrow{p} \theta$ as $n \rightarrow \infty$.

## Method of Moments Estimation (MME)

A method of estimating population parameters by equating sample moments to population moments and then solving the equations for the parameters.

### Population Moments

Population moments are the expected values of powers of the random variable $X$:

$\mu_1 = E(X)$, $\mu_2 = E(X^2)$, $\mu_3 = E(X^3)$, ..., $\mu_k = E(X^k)$

### Sample Moments

Sample moments are the averages of powers of the data in a sample:

$m_1 = \frac{1}{n} \sum_{i=1}^n X_i$, $m_2 = \frac{1}{n} \sum_{i=1}^n X_i^2$, $m_3 = \frac{1}{n} \sum_{i=1}^n X_i^3$, ..., $m_k = \frac{1}{n} \sum_{i=1}^n X_i^k$

### Steps for MME

1. Calculate the population moments in terms of the parameters.
2. Calculate the sample moments.
3. Equate the population moments to the sample moments.
4. Solve the equations for the parameters.

### Example

Let $X_1, X_2, \ldots, X_n$ be a random sample from a population with mean $\mu$ and variance $\sigma^2$. Estimate $\mu$ and $\sigma^2$ using MME.

**Solution**

1. Population moments:

   $\mu_1 = E(X) = \mu$

   $\mu_2 = E(X^2) = \mathrm{Var}(X) + [E(X)]^2 = \sigma^2 + \mu^2$

2. Sample moments:

   $m_1 = \frac{1}{n} \sum_{i=1}^n X_i = \bar{X}$

   $m_2 = \frac{1}{n} \sum_{i=1}^n X_i^2$

3. Equate population moments to sample moments:

   $\mu = \bar{X}$

   $\sigma^2 + \mu^2 = \frac{1}{n} \sum_{i=1}^n X_i^2$

4. Solve for the parameters:

   $\hat{\mu} = \bar{X}$

   $\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^n X_i^2 - \bar{X}^2 = \frac{1}{n} \sum_{i=1}^n (X_i - \bar{X})^2$

## Maximum Likelihood Estimation (MLE)

### Definition

A method of estimating the parameters of a probability distribution by maximizing a likelihood function, so that under the assumed statistical model the observed data are most probable.

### Likelihood Function

The likelihood function is the probability of observing the data given the parameters:

$L(\theta) = P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n \mid \theta)$

### Steps for MLE

1. Write the likelihood function. If the $X_i$ are i.i.d.,

   $L(\theta) = \prod_{i=1}^n f(x_i \mid \theta)$

   where $f(x_i \mid \theta)$ is the probability mass function (PMF) or probability density function (PDF) of $X_i$.
2. Take the natural logarithm of the likelihood function:

   $\ell(\theta) = \ln L(\theta) = \sum_{i=1}^n \ln f(x_i \mid \theta)$
3. Take the derivative of the log-likelihood function with respect to the parameter(s) and set it equal to zero:

   $\frac{d\ell(\theta)}{d\theta} = 0$
4. Solve for the parameter(s). The solution is the MLE.
5. Verify that the solution is a maximum by checking the second derivative:

   $\frac{d^2\ell(\theta)}{d\theta^2} < 0$

#### Example

Let $X_1, X_2, \ldots, X_n$ be a random sample from a Bernoulli distribution with parameter $p$. Estimate $p$ using MLE.

**Solution**

1. Likelihood function:

   $L(p) = \prod_{i=1}^n p^{x_i} (1-p)^{1-x_i} = p^{\sum_{i=1}^n x_i} (1-p)^{n - \sum_{i=1}^n x_i}$
2. Log-likelihood function:

   $\ell(p) = \ln L(p) = \left(\sum_{i=1}^n x_i\right) \ln p + \left(n - \sum_{i=1}^n x_i\right) \ln (1-p)$
3. Take the derivative and set it equal to zero:

   $\frac{d\ell(p)}{dp} = \frac{\sum_{i=1}^n x_i}{p} - \frac{n - \sum_{i=1}^n x_i}{1-p} = 0$
4. Solve for $p$:

   $\hat{p} = \frac{\sum_{i=1}^n x_i}{n} = \bar{x}$
5. Verify that the solution is a maximum:

   $\frac{d^2\ell(p)}{dp^2} = -\frac{\sum_{i=1}^n x_i}{p^2} - \frac{n - \sum_{i=1}^n x_i}{(1-p)^2} < 0$

## Interval Estimation

### Definition

An interval estimate is a range of values that is likely to contain the true value of the population parameter with a certain level of confidence.

### Confidence Interval

#### Definition

A confidence interval is an interval estimate of a population parameter, along with a probability that the interval contains the true parameter value.

#### Confidence Level

The confidence level is the probability that the interval contains the true parameter value: $1 - \alpha$, where $\alpha$ is the significance level.

#### Margin of Error

The margin of error is the amount added to and subtracted from the point estimate to obtain the confidence interval.

#### General Formula

Point Estimate $\pm$ (Critical Value) $\times$ (Standard Error)

### Confidence Interval for the Population Mean

#### Known Variance $\sigma^2$

$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

Where:

- $\bar{X}$ is the sample mean
- $Z_{\alpha/2}$ is the critical value from the standard normal distribution
- $\sigma$ is the population standard deviation
- $n$ is the sample size

#### Unknown Variance $\sigma^2$

$\bar{X} \pm t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}}$

Where:

- $\bar{X}$ is the sample mean
- $t_{\alpha/2,\, n-1}$ is the critical value from the t-distribution with $n-1$ degrees of freedom
- $s$ is the sample standard deviation
- $n$ is the sample size

### Confidence Interval for the Population Proportion

$\hat{p} \pm Z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

Where:

- $\hat{p}$ is the sample proportion
- $Z_{\alpha/2}$ is the critical value from the standard normal distribution
- $n$ is the sample size

### Determining Sample Size

#### For Estimating the Population Mean

$n = \left(\frac{Z_{\alpha/2}\, \sigma}{E}\right)^2$

Where:

- $E$ is the desired margin of error

#### For Estimating the Population Proportion

$n = \left(\frac{Z_{\alpha/2}}{E}\right)^2 \hat{p}(1-\hat{p})$

Where:

- $\hat{p}$ is a prior estimate of the population proportion; if no prior estimate is available, use $\hat{p} = 0.5$
- $E$ is the desired margin of error
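As a concrete illustration of the MME example above, the two moment equations can be solved numerically. This is a minimal Python sketch; the data values are made up for illustration and are not from the notes:

```python
# Method-of-moments estimates for a population mean and variance.
# Setting mu_1 = m_1 and mu_2 = m_2 and solving gives
#   mu_hat     = x_bar
#   sigma2_hat = m_2 - x_bar**2  (= (1/n) * sum((x_i - x_bar)^2))

def mme_mean_variance(xs):
    """Return (mu_hat, sigma2_hat) from the first two sample moments."""
    n = len(xs)
    m1 = sum(xs) / n                 # first sample moment: the sample mean
    m2 = sum(x * x for x in xs) / n  # second sample moment
    mu_hat = m1
    sigma2_hat = m2 - m1 ** 2
    return mu_hat, sigma2_hat

# Arbitrary illustrative sample.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu_hat, sigma2_hat = mme_mean_variance(data)
print(mu_hat, sigma2_hat)  # 5.0 4.0
```

Note that $\hat{\sigma}^2$ divides by $n$, not $n-1$, so the MME variance estimator is the biased (population-style) sample variance, matching the formula in step 4.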
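The Bernoulli MLE example can likewise be checked numerically: the closed form $\hat{p} = \bar{x}$ should make the log-likelihood at least as large as at any other value of $p$. A small sketch, with made-up 0/1 data:

```python
import math

# MLE for a Bernoulli parameter p. The closed form derived above is
# p_hat = (sum of x_i) / n, i.e. the sample proportion of successes.

def bernoulli_log_likelihood(p, xs):
    """ell(p) = (sum x_i) ln p + (n - sum x_i) ln(1 - p)."""
    s = sum(xs)
    n = len(xs)
    return s * math.log(p) + (n - s) * math.log(1.0 - p)

def bernoulli_mle(xs):
    """Closed-form MLE: the sample mean of 0/1 observations."""
    return sum(xs) / len(xs)

data = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]   # 7 successes out of 10 (illustrative)
p_hat = bernoulli_mle(data)              # 0.7
# Sanity check: the log-likelihood at p_hat beats nearby candidate values.
assert bernoulli_log_likelihood(p_hat, data) > bernoulli_log_likelihood(0.5, data)
assert bernoulli_log_likelihood(p_hat, data) > bernoulli_log_likelihood(0.9, data)
```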
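The z-based confidence-interval formulas above (mean with known $\sigma$, and proportion) can be sketched with only the standard library, using `statistics.NormalDist` to get $Z_{\alpha/2}$. The t-interval for unknown variance needs t critical values, which the standard library does not provide, so it is omitted here; the input numbers are made up for illustration:

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value Z_{alpha/2}; about 1.96 for 95% confidence."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

def mean_ci_known_sigma(x_bar, sigma, n, confidence=0.95):
    """x_bar +/- Z_{alpha/2} * sigma / sqrt(n)."""
    moe = z_critical(confidence) * sigma / n ** 0.5
    return x_bar - moe, x_bar + moe

def proportion_ci(p_hat, n, confidence=0.95):
    """p_hat +/- Z_{alpha/2} * sqrt(p_hat * (1 - p_hat) / n)."""
    moe = z_critical(confidence) * (p_hat * (1.0 - p_hat) / n) ** 0.5
    return p_hat - moe, p_hat + moe

# Illustrative inputs: x_bar = 50, sigma = 10, n = 100 at 95% confidence.
lo, hi = mean_ci_known_sigma(x_bar=50.0, sigma=10.0, n=100)
# Margin of error is about 1.96 * 10 / 10, so the interval is roughly (48.04, 51.96).
```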
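The two sample-size formulas can be sketched the same way. Since $n$ must be a whole number at least as large as the formula's value, the result is rounded up with `math.ceil`; the example margins and $\sigma$ are made up for illustration:

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value Z_{alpha/2} from the standard normal."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

def sample_size_mean(sigma, margin, confidence=0.95):
    """n = (Z_{alpha/2} * sigma / E)^2, rounded up to an integer."""
    return math.ceil((z_critical(confidence) * sigma / margin) ** 2)

def sample_size_proportion(margin, p_hat=0.5, confidence=0.95):
    """n = (Z_{alpha/2} / E)^2 * p_hat * (1 - p_hat), rounded up.

    With no prior estimate, p_hat = 0.5 maximizes p_hat * (1 - p_hat)
    and therefore gives the most conservative (largest) n.
    """
    return math.ceil((z_critical(confidence) / margin) ** 2 * p_hat * (1.0 - p_hat))

# Estimating a proportion to within +/- 3 percentage points at 95% confidence:
print(sample_size_proportion(margin=0.03))  # 1068
```

The final line reproduces the familiar polling result: roughly a thousand respondents suffice for a 3-point margin of error regardless of population size.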