Summary

These notes summarize how to examine samples numerically. They cover descriptive statistics: measures of central tendency (mean, median, mode) and measures of dispersion (range, interquartile range, variance, standard deviation, coefficient of variation), along with a first look at the data through a dot plot and frequency distribution.

Full Transcript

2/12/25, 11:55 AM OneNote

Examining samples Numerically
Tuesday, January 28, 2025 9:32 AM

Examining samples:
You have your data and now need to look at it: does it look OK?
  ○ Check for possible outliers, make initial comparisons among samples, and decide what kinds of statistical analyses can be applied
Produce statistics to summarize the data
  ○ Sample statistics are estimators of population parameters
What if we visualize the data with a dot plot?
  ○ Are the values clustered or evenly distributed?
  ○ Arranging the frequency of each value is called a frequency distribution

Descriptive statistics:
n = sample size (count of observations in your sample)
Measures of central tendency (or location)
  ○ Where are your data situated on the scale?
  ○ Mean (average) = sum of values / n
  ○ Median = variate with an equal number of variates above and below it
  ○ Mode = most common variate (for discrete data)

Mean:
Sum of all values divided by n

Median (M, Md, or x̃):
Middle value when the data are placed in order

Why use Median instead of Mean?
We have an asymmetrical (non-normal) distribution
  ○ ½ of the values lie above and ½ below the median, whatever the shape
The data contain extreme values
  ○ Extreme values pull the mean toward them but barely move the median

Descriptive statistics: Mode (Mo)
Most frequent value
Read it from the frequency distribution (for example, of test scores)
Mode is not commonly used for continuous data
Discrete data:
  ○ Mean and median can be meaningless (for example, when the values are categories)
  ○ Mode is useful

Measures of dispersion:
Continuous data: measures of dispersion tell us about the shape of the distribution
  ○ Range, interquartile range: span from smallest to largest values across the whole or part of the distribution
  ○ Standard deviation, variance, coefficient of variation: describe the amount of variation in the sample

Range:
Largest value minus smallest value
Tends to underestimate the population range: how likely is it that the sample contains the largest and smallest values from the population?

Interquartile Range:
Mini-range between the 1st and 3rd quartiles
  ○ Focuses on variation at the center of the distribution (middle 50%) and excludes outliers
  ○ Quartiles divide the data points into 4 equal parts
  ○ Q1, Q2, Q3 = data point at each division

Variance and Standard Deviation:
Variance and standard deviation tell us how clumped or dispersed the data are

Variance (s^2):
Measures the average deviation of variates from the mean
  ○ But negative and positive deviations would cancel out, so the deviations are squared
  ○ The sample statistic is s^2; the population parameter is sigma^2

Standard Deviation (s, sd, or SD):
Problem with variance: the units of measurement are squared, which is not intuitive, so take the square root of the variance = standard deviation

Coefficient of variation (CV):
sd and s^2 are influenced by the size of the measure, so small and large measurements cannot be compared directly
CV adjusts sd as a proportion of the mean (Ȳ)
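The statistics in these notes can be tried out with a short Python sketch using the standard-library statistics module. This is an illustration only, not part of the lecture: the function name describe and the test scores are made up, the sample variance is assumed to divide by n - 1 (the usual convention for s^2), and the quartiles use Python's default interpolation, which may differ slightly from other software.

```python
import statistics


def describe(values):
    """Summarize a sample with the descriptive statistics from these notes."""
    n = len(values)                          # sample size
    mean = statistics.mean(values)           # sum of values / n
    median = statistics.median(values)       # middle value of the ordered data
    mode = statistics.mode(values)           # most frequent value
    value_range = max(values) - min(values)  # largest minus smallest

    # Q1, Q2, Q3 cut the ordered data into 4 equal parts.
    # Assumption: Python's default "exclusive" method; other tools may
    # interpolate differently and report slightly different quartiles.
    q1, q2, q3 = statistics.quantiles(values, n=4)

    variance = statistics.variance(values)   # sample variance s^2 (divides by n - 1)
    sd = statistics.stdev(values)            # s = square root of s^2
    cv = sd / mean                           # sd as a proportion of the mean

    return {
        "n": n,
        "mean": mean,
        "median": median,
        "mode": mode,
        "range": value_range,
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
        "variance": variance,
        "sd": sd,
        "cv": cv,
    }


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical test scores
summary = describe(scores)
for name, value in summary.items():
    print(f"{name}: {value}")

# One extreme value shifts the mean far more than the median.
with_outlier = describe(scores + [60])
print("mean:  ", summary["mean"], "->", with_outlier["mean"])
print("median:", summary["median"], "->", with_outlier["median"])
```

Multiplying cv by 100 gives the CV as a percentage, which is how it is often reported. The last two printed lines show why the median is preferred when a sample has extreme values.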
