STAT3301 Lectures 1-5 Complete Notes PDF

University of Minnesota

2024

Aaron J. Molstad

Summary

This document presents comprehensive lecture notes on random variables and probability distributions, covering discrete and continuous variables, uniform, Bernoulli, binomial, and normal distributions. It also discusses estimators, simulation studies, and models.

Full Transcript

STAT3301: Lectures 1–5 Complete Notes
Aaron J. Molstad
2024-09-18

Contents

Models, random variables, and realizations of random variables
More review of random variables
    Discrete and continuous
    A discrete random variable example
    Expected value and variance
    A discrete random variable example continued
    Independent and identically distributed random variables and parameters
Named distributions for random variables
    The Uniform distribution
        Definition and properties
        Drawing from the Uniform distribution in R
    The Bernoulli and Binomial distributions
        Definitions and properties
        Drawing from the Bernoulli and Binomial distributions in R
    The Normal distribution
        Definition and properties
        Drawing from the Normal distribution in R
Using random variables in simple models
    An example using the Normal distribution to model a continuous characteristic
    An example using the Bernoulli distribution to model a categorical characteristic
Estimators and estimates
    Estimating the mean/expected value and variance of a probability distribution
A simulation study to illustrate the distribution of the sample mean
    Some facts about the sample mean
    Simulation study with the Uniform distribution

Models, random variables, and realizations of random variables

Models

Models are oversimplified and approximate representations of real world objects or phenomena.
A model’s quality is judged by comparing its predictions or inferences to real world observations. Statistical models involve mathematical objects called random variables.

(Based on STAT3301 lecture notes by Adam J. Rothman.)

Random variables

A random variable is a numerical measurement of the outcome of an experiment that has yet to be performed. For example, let 𝑌 be the amount of time in minutes it will take me to drive home today. Then 𝑌 is a random variable.

Realization of a random variable

A realization of a random variable is the value the random variable took on after the experiment was performed. For example, as I write this sentence, I am now at home and it took me 𝑦 = 42.61 minutes to drive here. The quantity 42.61 is a realization of 𝑌.

Most statistical analyses of data assume that some of the measurements in a dataset are realizations of random variables. This is an abstract idea, but we can gain intuition by generating artificial data ourselves with R’s pseudo random number generator. We will introduce this in these notes.

More review of random variables

Discrete and continuous

A random variable is discrete if its set of possible values has a finite or countable number of elements, e.g. {0, 1}, {−1.5, 0, 1.5, 2, 3}, {1, 2, 3, …}. For example, suppose we plan to roll a die 4 times, and let 𝑋 be the number of rolls where the die shows an even number. The set of possible values for 𝑋 is {0, 1, 2, 3, 4}, so 𝑋 is a discrete random variable.

A random variable is continuous if its set of possible values has an uncountably infinite number of elements and the probability of the event that the random variable equals 𝑐 is zero for all 𝑐 ∈ ℝ. The set of possible values is typically an interval of real numbers, e.g. [0, 1], (−∞, ∞). For example, let 𝑋 be the exact height of a person we plan to randomly select. The set of possible values of 𝑋 could be the interval (5, 110) inches. Then 𝑋 is a continuous random variable.
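The die-rolling example above (𝑋 = the number of even rolls out of 4) can be explored with R's pseudo random number generator. The following is a minimal sketch, not code from the notes: it assumes a fair six-sided die, and the names roll_once, n_reps, and x_realizations, the seed, and the number of repetitions are all hypothetical choices made for illustration.

```r
# Simulate the die-rolling example: X = number of even rolls out of 4.
# Assumption (not stated in the notes): the die is fair and six-sided.
set.seed(3301)   # arbitrary seed, used only so the run is reproducible

n_reps <- 10000  # number of times the experiment is repeated (arbitrary)

# Perform the experiment once and return one realization of X.
roll_once <- function() {
  rolls <- sample(1:6, size = 4, replace = TRUE)  # four independent rolls
  sum(rolls %% 2 == 0)                            # count the even faces
}

# Each element of x_realizations is one realization of X.
x_realizations <- replicate(n_reps, roll_once())

# Relative frequency of each possible value 0, 1, 2, 3, 4.
rel_freq <- table(factor(x_realizations, levels = 0:4)) / n_reps
print(rel_freq)
```

Every simulated value falls in {0, 1, 2, 3, 4}, which matches the set of possible values given for 𝑋, and the relative frequencies give a rough picture of how likely each value is under the fair-die assumption.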
Continuous random variables are considered approximations to reality because we can only measure on a discrete scale.

Probability mass function

The probability mass function (pmf) for a discrete random variable 𝑋 is the function 𝑝 ∶ ℝ → ℝ defined by 𝑝(𝑥) = 𝑃(𝑋 = 𝑥), 𝑥 ∈ ℝ.

Probability density function

The probability density function (pdf) for a continuous random variable 𝑋 is the function 𝑓 ∶ ℝ → ℝ for which

    𝑓(𝑥) ≥ 0 for all 𝑥 ∈ ℝ,

    $\int_{-\infty}^{\infty} f(x)\,dx = 1$,

    $P(a \le X \le b) = P(a < X < b) = P(a \le X < b) = P(a < X \le b) = \int_{a}^{b} f(x)\,dx$.

Cumulative distribution function

The cumulative distribution function (cdf) for a random variable 𝑋 is the function 𝐹 defined by 𝐹(𝑡) = 𝑃(𝑋 ≤ 𝑡), 𝑡 ∈ ℝ. Both discrete and continuous random variables have a cdf.

A discrete random variable example

The experiment is for a person to shoot two basketball free throws. Let 𝑌 be the yet-to-be-observed number that this person would make. Since 𝑌 is a numerical measurement of the outcome of an experiment (that has yet to be performed), 𝑌 is a random variable. The set of possible values for 𝑌 is {0, 1, 2}, which has a finite number of elements, so 𝑌 is a discrete random variable. Suppose that 𝑌 has pmf 𝑝 ∶ ℝ → [0, 1] defined by

            𝑦 = 0    𝑦 = 1    𝑦 = 2    𝑦 ∉ {0, 1, 2}
    𝑝(𝑦)    0.25     0.5      0.25     0

We can graph 𝑌’s pmf with pmfY
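The pmf table for 𝑌 and the definition of the cdf, 𝐹(𝑡) = 𝑃(𝑌 ≤ 𝑡), can be connected with a short R sketch. This is an illustration under stated assumptions rather than code from the notes: the names y_vals, p_vals, and cdf_Y are hypothetical, and the only inputs are the pmf values 0.25, 0.5, and 0.25 from the table.

```r
# pmf of Y from the table: P(Y = 0) = 0.25, P(Y = 1) = 0.5, P(Y = 2) = 0.25.
# y_vals, p_vals, and cdf_Y are hypothetical names used for illustration.
y_vals <- c(0, 1, 2)
p_vals <- c(0.25, 0.5, 0.25)

# A valid pmf is nonnegative and its values sum to 1.
stopifnot(all(p_vals >= 0), isTRUE(all.equal(sum(p_vals), 1)))

# cdf F(t) = P(Y <= t): add the pmf over the possible values not exceeding t.
cdf_Y <- function(t) sum(p_vals[y_vals <= t])

cdf_Y(-1)   # 0, since no possible value of Y is <= -1
cdf_Y(0)    # 0.25
cdf_Y(1.5)  # 0.75, i.e. P(Y = 0) + P(Y = 1)
cdf_Y(2)    # 1
```

The cdf of a discrete random variable is a step function: it stays flat between possible values and jumps by 𝑝(𝑦) at each possible value 𝑦.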
