Descriptive Statistics PDF
Uploaded by ImmaculateSaxhorn
Summary
This document explains descriptive statistics, including measures of central tendency (mean, median, mode) and variability (range, variance, standard deviation). It also covers concepts like normal distributions, z-scores, and confidence intervals.
Full Transcript
- Statistics are used to make sense of numbers in research
- Statistics can help make inferences and generalize results to populations
- Population measures are parameters; sample measures are statistics
- Each statistical test yields a p value
  - If p < 0.05, you reject the null
  - If p > 0.05, you fail to reject the null (this is not the same as accepting it)
- Rolling one die many times results in a uniform distribution: ⅙ probability of rolling any number every time
- The sum of two dice rolled many times results in a symmetric, peaked distribution centered on 7 (triangular for two dice); as more dice are summed, the shape approaches the bell-shaped normal curve
- Frequency distributions are a way to summarize data
- Population mean is μ, sample mean is x̄
- Population standard deviation is σ, sample standard deviation is s
- Measures of central tendency are mean, median, mode
  - Mean: add all scores and divide by the number of scores; also called the average
  - Median: middle number in a rank-ordered data set, or the average of the two middle values; divides the data into equal halves; often more reliable than the mean for salary and household income because it is not pulled by extreme values
  - Mode: value that occurs most frequently; easiest to determine by looking at the frequency distribution
- Measures of variability are range, percentiles, variance, standard deviation, coefficient of variation
  - Range: highest value minus lowest value
  - Percentile: a score's position within a distribution; converts an actual score to a comparative one and provides a reference point
  - Variance: variation within a full set of scores, computed as the average squared deviation from the mean
  - Standard deviation: measure of the amount of variation/dispersion/spread of a set of values; the square root of the variance
  - Coefficient of variation: ratio of the standard deviation to the mean; needed when comparing variables measured in different units
- Many biological, psychological, and social phenomena are normally distributed, but non-normal data can still be valid.
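The measures of central tendency and variability listed above can be sketched with Python's standard `statistics` module. This is a minimal illustration, not a prescribed method: the `describe` helper and the `salaries` data (in thousands) are hypothetical, and the sample formulas (dividing by n - 1) are assumed because the notes distinguish s from σ.

```python
import statistics


def describe(scores):
    """Return central-tendency and variability measures for a list of numbers."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation s (divides by n - 1)
    return {
        "mean": mean,
        "median": statistics.median(scores),
        "mode": statistics.mode(scores),
        "range": max(scores) - min(scores),        # highest minus lowest
        "variance": statistics.variance(scores),   # sample variance s^2
        "sd": sd,
        "cv": sd / mean,                           # coefficient of variation (unitless)
    }


# Hypothetical salaries in thousands; one extreme value pulls the mean upward
# while the median stays with the bulk of the data.
salaries = [30, 32, 35, 35, 40, 45, 250]
summary = describe(salaries)

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With this made-up data the mean lands well above most of the salaries while the median does not, which is the reason the notes give for preferring the median for income data.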
- Z scores are standardized scores: how many standard deviations a score lies from the mean
- You can get a percentile from a z score in a normal distribution
- Standard error of the mean (SEM): standard deviation of a theoretical sampling distribution of the mean
  - Estimate is based on the standard deviation and sample size
  - As sample size increases, SEM decreases because the sample mean gets closer to the population mean
  - SEM forms the basis for confidence intervals
- Confidence interval (CI) is a range of scores within specific boundaries that is expected to contain the population parameter
  - Degree of confidence is often represented by percentages (90%, 95%, 99%)
  - You can use the z score associated with the percentage
    - 90% = 1.645
    - 95% = 1.96
    - 99% = 2.58
  - CIs describe the precision of the estimated mean better than mean +/- standard deviation, because they account for sample size
- Statistical inferences allow estimation of population characteristics from sample data
- Assumptions are based on
  - Probability: likelihood that any one event will occur given all possible outcomes
  - Sampling error: tendency for sample values to differ from population values
- Hypothesis testing: estimation of population parameters is only one part of statistical inference.
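The z score, SEM, and confidence interval bullets above can be sketched as follows, using `statistics.NormalDist` from the standard library (Python 3.8+). The function names and the `scores` data are hypothetical. The interval uses z critical values because that is what the notes list; for small samples a t critical value is commonly used instead, which this sketch does not do.

```python
import math
from statistics import NormalDist, mean, stdev


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


def percentile_from_z(z):
    """Percentile of a z score under the standard normal curve."""
    return NormalDist().cdf(z) * 100


def sem(scores):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(scores) / math.sqrt(len(scores))


def confidence_interval(scores, confidence=0.95):
    """z-based interval: mean +/- z * SEM (assumes a normal sampling distribution)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * sem(scores)
    m = mean(scores)
    return m - margin, m + margin


# Hypothetical test scores
scores = [72, 75, 78, 80, 81, 83, 85, 88, 90, 94]
low, high = confidence_interval(scores, 0.95)
print(f"mean = {mean(scores):.2f}, SEM = {sem(scores):.2f}")
print(f"95% CI = ({low:.2f}, {high:.2f})")
print(f"z = 1.0 is at about the {percentile_from_z(1.0):.1f}th percentile")
```

Because SEM divides by the square root of n, quadrupling the sample size at the same standard deviation halves the width of the interval.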
- We also want to see whether one treatment is more effective than another
- Null hypothesis (Ho): states that the group means are not different (=) and that they come from the same population
  - Goal is to test Ho
  - Never actually PROVE Ho
  - Purpose is to give the data a chance to disprove it
- Alternative hypothesis: states that the observed differences are not due to chance and that the parameters are different; can be directional (greater or less than) or non-directional (not = to)
- Type I Error: false positive; the researcher rejects the null when it is true; its probability is denoted by alpha (α), the level of significance, conventionally set at 0.05
- Type II Error: false negative; the researcher does not reject the null when it is false; its probability is denoted by β, with 20% usually considered acceptable
- Statistical power (1 - β): sensitivity; the more sensitive a test is, the more likely it will detect important clinical differences; power is a function of alpha, variance, sample size, effect size, and the design of the test procedure
- Parametric statistics: used to estimate population parameters; assume samples are randomly drawn from normal populations, variances are homogeneous, and scores are subject to arithmetic manipulation
- Non-parametric statistics: used when parametric assumptions cannot be met; work better with smaller samples; can use measures like the median; most importantly, do not assume a particular distribution
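The decision rule and the two error types above can be illustrated with a small simulation. This is a sketch under stated assumptions, not a method taken from the notes: it uses a two-sided one-sample z test with a known population standard deviation, and the values `mu0 = 100`, `sigma = 15`, `n = 25`, and the true mean of 110 are all hypothetical.

```python
import math
import random
from statistics import NormalDist, mean

ALPHA = 0.05  # level of significance (accepted Type I error rate)


def z_test_p_value(sample, mu0, sigma):
    """Two-sided p value for Ho: population mean == mu0, with sigma assumed known."""
    z = (mean(sample) - mu0) / (sigma / math.sqrt(len(sample)))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def decide(p, alpha=ALPHA):
    """Apply the decision rule: reject Ho only when p falls below alpha."""
    return "reject Ho" if p < alpha else "fail to reject Ho"


def rejection_rate(true_mean, mu0=100.0, sigma=15.0, n=25, trials=2000, seed=0):
    """Fraction of simulated samples in which Ho is rejected."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        if z_test_p_value(sample, mu0, sigma) < ALPHA:
            rejections += 1
    return rejections / trials


# Ho is true here, so every rejection is a false positive (Type I error).
type_i_rate = rejection_rate(true_mean=100.0)
# Ho is false here, so the rejection rate estimates power (1 - beta).
power = rejection_rate(true_mean=110.0)

print(f"Estimated Type I error rate: {type_i_rate:.3f}")
print(f"Estimated power: {power:.3f}")
```

Rerunning `rejection_rate` with a smaller `n` or a true mean closer to `mu0` should lower the estimated power, which matches the note that power depends on sample size and effect size.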