Statistics II Exam Study Guide PDF

STATISTICS II @2024-12-06T18:23:45.228Z By (user_2mC1XrrcCLu6v9KKwFId7MFjRmw)

Statistics for CSAI II – Exam Study Guide

1. Basics of Probability

Probabilities form the basis for statistical inference. We use inferential statistics to answer questions about how representative our data are of the population.

The difference between statistics and probability:
- Probability: starting with an animal, and figuring out what footprints it will make
- Statistics: seeing a footprint, and guessing the animal

Frequentists vs Bayesians

Frequentists

Frequentists define probability as long-run frequency. For example, if we have a fair coin (50% probability of landing heads), we expect half of the tosses to land on heads in the long run.

Frequentists require:
- Data
- Model(s)
- Design

Advantages & disadvantages of the frequentist view of probability

Advantages:
- It is objective: the probability of an event is necessarily grounded in the world.
- It is unambiguous: any two people watching the same sequence of events unfold, trying to calculate the probability of an event, must inevitably come up with the same answer.

Disadvantages:
- Infinite sequences don't exist in the physical world.
- The frequentist view has a narrow scope: there are lots of things that human beings are happy to assign probability to in everyday language, but that cannot (even in theory) be mapped onto a hypothetical sequence of events, e.g. a weather forecast.
- Frequentist probability forbids us from making probability statements about a single event.

Bayesian view

The Bayesian view of probability is also called the subjectivist view. It defines probability as the degree of belief that an intelligent and rational agent assigns to the truth of an event. From that perspective, probabilities don't exist in the world, but rather in the thoughts and assumptions of people and other intelligent beings.

Example of operationalising the "degree of belief"

Suppose that I believe that there's a 60% probability of rain tomorrow. Someone offers me a bet: if it rains tomorrow, I win $5, but if it doesn't rain, I lose $5. Clearly, from my perspective, this is a pretty good bet. On the other hand, if I think that the probability of rain is only 40%, then it's a bad bet to take. Thus, we can operationalise the notion of a "subjective probability" in terms of what bets I'm willing to accept.

Bayesians require:
- Prior information
- Data
- Model(s)
- Design

Advantages & disadvantages of the Bayesian view of probability

Advantages:
- It allows you to assign probabilities to any event you want to.

Disadvantages:
- We cannot be purely objective: specifying a probability requires us to specify an entity that has the relevant degree of belief.
- The Bayesian view is sometimes thought to be too broad (it allows too many differences between observers).

Probability distributions

Elementary events – for a given observation, the outcome will be one and only one of these events.

For example: in tossing a coin, E = event of getting a head and F = event of getting a tail are both elementary events. In throwing a die, A = event of getting 5 is an elementary event, while B = event of getting an even number is not an elementary event, because its favourable outcomes are 2, 4, 6 (three outcomes).

Example from the lecture:
- Elementary event: probability of getting e.g. blue jeans
- Non-elementary event: probability of wearing jeans, i.e. the sum of the probabilities of blue jeans, black jeans and green jeans

Sample space – the set of all possible events

Binomial distribution

In a binomial distribution "either something is or isn't": each trial gives a 0 or a 1.
- θ – success probability, e.g. the probability that a single die comes up skulls
- N – number of observations or size parameter, e.g. number of dice rolls
- X – generated randomly from the distribution; it refers to the results of our experiment, e.g. the number of skulls I get when I roll the dice. Since the actual value of X is due to chance, we refer to it as a random variable.

Example: the quantity that we want to calculate is the probability that X = 4 given that we know that θ = .167 and N = 20. The general "form" of the thing I'm interested in calculating could be written as P(X | θ, N).

2. Relationship between models and data

Data = Model + Error

Comparison (relationship) vs Prediction (predicted outcomes vs predicting relationship vs using the model for prediction)

3. Using the different distribution functions in R including binomial and normal (e.g., dbinom, rnorm, etc.)

Binomial distribution

dbinom(x, size, prob)
- x – a number, or vector of numbers, specifying the outcomes whose probability you're trying to calculate
- size – a number telling R the size of the experiment (the number of trials)
- prob – the success probability for any one trial in the experiment

Probability distributions implemented in R
- The d form: you specify a particular outcome x, and the output is the probability of obtaining exactly that outcome (the "d" is short for density).
- The p form calculates the cumulative probability: you specify a particular quantile q, and it tells you the probability of obtaining an outcome smaller than or equal to q.
- The q form calculates the quantiles of the distribution: you specify a probability value p, and it gives you the corresponding percentile, that is, the value of the variable for which there's a probability p of obtaining an outcome lower than that value.
- The r form is a random number generator: it generates n random outcomes from the distribution.

4. Characteristics of the normal distribution

The normal distribution (or Gaussian distribution) is described using two parameters, the mean of the distribution μ and the standard deviation of the distribution σ. The notation that we sometimes use to say that a variable is normally distributed is as follows: X ~ Normal(μ, σ)

Characteristics of normal distribution
- The area under the curve of the normal distribution must equal 1.
- The mean, mode and median are all equal.
- The curve is symmetric around the center (i.e. around the mean, μ).
- Exactly half of the values are to the left of center and exactly half of the values are to the right.
- The standard deviation controls the spread of the distribution. A smaller standard deviation indicates that the data are tightly clustered around the mean; the normal distribution will be taller and narrower. A larger standard deviation indicates that the data are spread out around the mean; the normal distribution will be flatter and wider.

Binomial vs normal distribution
- Binomial: discrete; the plot is "histogram-like" bars
- Normal: continuous; the plot is a smooth curve

5. Running and interpreting R output for correlations

We can use cor.test() to test the null hypothesis that the correlation in the population is 0. We can also specify our alternative hypothesis: whether there should be a negative or a positive association. It also provides p-values and CIs.

cor.test(ads, packets)

The output tells us whether the correlation is different from zero or not. Although we can see a strong correlation (r = 0.87), our confidence interval includes 0 (CI95 = (-0.048, 0.99)). The p-value is above the acceptable value (p > 0.05). Because the test is two-sided (we want to know whether the correlation is either greater or smaller than zero), the p-value adds up the area in both tails of the t-distribution beyond the observed t-value. Hence, we cannot reject the null hypothesis.

lsr::correlate(x, y)

Non-parametric correlations

(Some) assumptions of Pearson's correlation:
- Interval or ratio scale data
- Normally distributed

If the assumptions are broken we can then use:
- Spearman's rho
  - Pearson's correlation on the ranked data
- Kendall's tau
  - Better than Spearman's for small samples
  - For Pearson's correlation a good size for a sample is at least 30.

6. Sample statistics and population parameters

7. Running and interpreting R output for simple linear regression

albumSales.1
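
The guide gives no code or output for this section beyond the object name albumSales.1. As a rough illustration only, here is a minimal sketch of how a simple linear regression is commonly fitted and read in R with lm() and summary(). The data are simulated, and the names adverts, sales, sim_data and fit are hypothetical; they are not taken from the lecture material. The simulation follows the Data = Model + Error idea from section 2.

```r
# Simulated data: 'adverts' and 'sales' are hypothetical names, not from the guide.
set.seed(42)
n <- 200
adverts <- runif(n, min = 0, max = 100)   # predictor
error   <- rnorm(n, mean = 0, sd = 10)    # the Error part of Data = Model + Error
sales   <- 50 + 2 * adverts + error       # the Model part: intercept 50, slope 2
sim_data <- data.frame(adverts, sales)

# Fit the simple linear regression: sales predicted from adverts.
fit <- lm(sales ~ adverts, data = sim_data)

# summary() prints the coefficients (intercept and slope) with their standard
# errors, t-values and p-values, plus R-squared and the overall F-test.
fit_summary <- summary(fit)
print(fit_summary)

# Pull individual pieces out of the fitted model.
coefs     <- coef(fit)                    # named vector: "(Intercept)", "adverts"
r_squared <- fit_summary$r.squared        # proportion of variance explained
ci_95     <- confint(fit, level = 0.95)   # 95% CIs for intercept and slope
```

When reading the output, the slope estimate says how much the outcome is expected to change for a one-unit increase in the predictor, and its p-value tests the null hypothesis that the slope is 0. With a single predictor, R-squared equals the squared Pearson correlation between the two variables, which links this output to cor.test() in section 5.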
