PSY 201 Introduction to Statistics for Psychology I Lecture Slides PDF
Document Details
Boğaziçi University
2022
Prof. Güneş Ünal
Summary
These lecture slides cover the basic concepts of statistics for psychology, focusing on variables, measurement scales, data organization, and research design. The document also introduces independent and dependent variables with examples, and different research design types.
Full Transcript
Course introduction – PSY 201 Introduction to Statistics for Psychology I – Dr Nihan Albayrak-Aydemir

Research methods in psychology

Understanding Data in Psychological Research: A Lecture on Variables, Measurement, and Data Organization
Teaching Assistant: Saliha Erman, MA
Source: Adapted from Assoc. Prof. Güneş Ünal, Boğaziçi University, 2022

Variable
A variable is any attribute of objects, people or events that, within the context of a particular investigation, can take on different values, e.g. height, reaction time, test score, eye colour.
The opposite of a variable is a constant: any attribute of objects, people, or events that, within the context of a particular investigation, has a fixed value.
An attribute can be treated as a variable in one context and as a constant in another:
- The study of behaviour in 3-year-old children (age is a constant)
- The study of cognitive development in 2-, 4-, and 6-year-olds (age is a variable)

Within-Subject vs. Between-Subject Variance
Change and variation can be observed both within and between individuals.
Example: mood shows within-subject variation (one person's mood changes over time) and between-subject variation (different people differ in mood).

Preview: Types of Variables
- Based on measurement: qualitative (categorical) vs. quantitative (continuous), measured on one of four levels: nominal, ordinal, interval or ratio scale.
- Based on research design: independent vs. dependent.

Types of Variables
Qualitative (categorical) variables take a value that is one of several possible categories. The values of qualitative variables are categories (although they can sometimes be represented by numbers, those numbers serve as labels only). They express differences in KIND, not amount. Examples: college attended, fruit type, nationality, blood type.
Quantitative (continuous) variables are numerical variables such as height, decision time, GPA, age, or proportion of items correctly answered. The values of quantitative variables are numbers. They express differences in AMOUNT.

Levels of Measurement
The levels at which variables are measured are divided into four measurement scales. The lowest two levels usually belong to categorical/qualitative variables:
1) Nominal Scale: A measurement scale used to categorize or label variables without any quantitative value or order. The categories on a nominal scale represent different groups or types, but there is no inherent ranking or hierarchy among them.
e.g. Handedness: left-handed, right-handed (a dichotomous variable, with only two options)
e.g. Favourite ice cream flavour: vanilla, chocolate, strawberry, etc. (a polytomous variable, with more than two options)
2) Ordinal Scale: Has the properties of a nominal scale, but in addition the observations may be ranked in order of magnitude (with nothing implied about the difference between adjacent steps on the scale).
e.g. Position finished in a race (nothing is implied about the absolute level of merit, and the difference between rankings is unknown).

Levels of Measurement (continued)
The highest two levels usually belong to continuous/quantitative variables:
3) Interval Scale: Has all the properties of an ordinal scale, and a given distance between measures has the same meaning anywhere on the scale. Also called an equal-interval scale.
e.g. Degrees of temperature, calendar years (2002, 2024, etc.)
For interval scales (the Celsius scale, for example) the value of zero is an arbitrary reference point (e.g. the freezing point of water) and does not imply an absence of heat.
4) Ratio Scale: Has all the properties of an interval scale plus an absolute zero point, which allows for the comparison of ratios (e.g., 10 kg is twice as much as 5 kg).
e.g. Length, weight, reaction time, dollars (zero dollars means an absence of money).

Caution on the Ratio Scale: On the interval level of measurement, you cannot draw conclusions about ratios, e.g. 20°C is not twice as hot as 10°C (because of the arbitrary zero point).

Caution on Ambiguities in Classifying a Level of Measurement
There are cases where researchers have conflicting attitudes towards some measurement types. e.g. Likert-type scaling is typically used with options like 1 to 5 or 1 to 7, ordered from 'completely disagree' to 'completely agree' (or from 'never' to 'always', etc.). It is considered an ordinal scale because the intervals between points are not necessarily equal. However, in some analyses, Likert scales may be treated as continuous variables for the purpose of certain statistical techniques.

Independent and Dependent Variables
The independent variable (IV) is the variable that is manipulated or changed by the experimenter. The dependent variable (DV) is measured under each condition/level of the independent variable.
The independent variable is sometimes called the explanatory variable or predictor variable. It is the factor that is expected to explain the changes in the dependent variable.
The dependent variable is sometimes called the response variable or outcome variable. We want to explain the variability in this variable.
Template: the effect of [independent variable] on [dependent variable].
e.g. the effect of psychological stress on blood pressure:
independent variable = the amount of psychological stress
dependent variable = blood pressure

Independent and Dependent Variables: Example
Whether daily intake of vitamin B improves memory performance.
Vitamin B: an 800 µg pill or no pill (0 µg)
Memory performance: number of correctly recalled items from a word list
The effect of [vitamin B intake] on [memory performance]: does the presence of vitamin B intake make a difference?
Independent variable: vitamin B intake, a qualitative/categorical variable measured at the nominal level.
Dependent variable: number of words recalled, a quantitative/continuous variable measured at the ratio level.

Independent and Dependent Variables (continued)
Dependent variables are most commonly quantitative/continuous, e.g. number of words recalled. But they can also be categorical, e.g. voting intention (candidate A, candidate B or candidate C) or behaviour (smoke vs. don't smoke).

Research Design: Natural and Manipulated Independent Variables
- Natural IVs: variability is observed as it occurs naturally, e.g. age, SES, IQ. With natural IVs the experimenter does not have complete control over the variable and "manipulates" it through a process of selection (such as random sampling).
- Manipulated IVs: variability is created through experimental manipulation, e.g. drug dose, stimulus type. The experimenter has complete control over the variable and over the assignment of participants to conditions.
Natural variable + selection = quasi-experiment. Manipulated variable + random assignment = true experiment.
The true experiment allows one to impute causality, while the quasi-experiment often allows one to report only a correlation.
Research Design: Natural and Manipulated Independent Variables (continued)
- Observing variability naturally, e.g. the "effect"(!)* of IQ on memory performance.
- Creating variability through experimental manipulation, e.g. the effect of vitamin B intake on memory performance.
*But can you really talk about an "effect" if the variable is not manipulated? Only the true experiment allows one to impute causality.

Correlations Between Variables
In some studies there is no clear cause-and-effect relationship between the variables, e.g. whether there is a relationship between measures of depression and anxiety. Neither variable can be considered a predictor or outcome variable:
- not "the effect of anxiety on depression"
- not "the effect of depression on anxiety"
- but "the relationship between depression and anxiety"
This is termed a correlational design. It is usually used with continuous data.

Data Organization
Organizing data for analysis is a vital step in any research project. Key steps include:
1. Data Entry: entering raw data into a software tool (e.g., Excel, SPSS, Jamovi).
2. Data Coding: assigning numerical or categorical codes to responses (e.g., 1 = Male, 2 = Female).
3. Data Cleaning: removing or correcting errors, such as duplicate or incomplete entries.
You will see more examples of this in next week's Jamovi class!

Data Organization (continued)
Properly organized data ensures:
1. Efficiency: streamlines the analysis process, saving time and effort.
2. Accuracy: reduces the risk of errors that could distort results.
3. Reproducibility: well-organized data can be more easily verified or reanalyzed by other researchers.
For now, let's look at a real-life psychology research example, a PhD project: Individual Differences Predicting Autobiographical Reasoning among Turkish Immigrants in Berlin.

Recap
Variables are classified based on measurement (qualitative/categorical vs. quantitative/continuous, on a nominal, ordinal, interval or ratio scale) and based on research design (independent vs. dependent).
THANK YOU!

Describing Data
A Lecture on Frequency Distributions, Central Tendency and Dispersion Statistics
Teaching Assistant: Saliha Erman, MA
Source: Adapted from Assoc. Prof. Güneş Ünal, Boğaziçi University, 2022

The Most Basic Statistical Concept: Data
Data are the recorded set of observations from a scientific investigation. A single observation is a datum.
Statistics is the science of classifying, organising and analysing data. It also refers to a set of methods underlying the gathering of data and their interpretation. The main goal is to describe regularities in the data.

Describing Data vs. Making Inferences from the Data
1) Descriptive Statistics: procedures used to summarise, organize and make sense of a set of scores or observations. Typically presented graphically, in tabular form (tables), etc. Involves graphing the data, calculating means (averages) and other measures, and looking for extreme scores or oddly shaped distributions of scores.
2) Inferential Statistics: procedures that allow researchers to infer from, or generalize, observations made within smaller samples to the larger population. Aims to draw a general conclusion, inferring characteristics of populations from characteristics of samples. Uses sampling.
1) Descriptive Statistics

Why We Need Descriptive Statistics
1) Health Check: the use of graphs and summary statistics to capture the basic features of data and to ensure the data do not contain misleading features. Data points that are way out of line with the rest of the sample are called outliers; descriptive statistics help in the identification of outliers.
2) Communication: descriptive statistics are key to communicating data. Display data in a way that makes their important features comprehensible.
3) Estimation: summary statistics computed on sample data serve as estimates of their corresponding values in the population (parameters).

Descriptive Statistics: Example
Let's assume we are interested in how many hours students on campus exercise per week:
Student 1: 6 hrs/week; Student 2: 3 hrs/week; Student 3: 9 hrs/week; Student 4: 2 hrs/week; Student 5: 11 hrs/week; Student 6: 1 hr/week; Student 7: 40 hrs/week; Student 8: 5 hrs/week; Student 9: 7 hrs/week; Student 10: 6 hrs/week.
40 hrs/week is unusually high relative to the others. It is important to determine why: a clerical error (e.g. a typing error)? A unique subject (e.g. a licensed swimmer)? If you decide to exclude the score, your research report should explicitly state this.

Displaying Data as Frequency Distributions
Let's say we have a set of raw scores that we have directly measured in a memory-scanning task: a comparison set of either 1, 3 or 5 digits is shown, then a test digit (e.g. 4), and we measure the reaction time to decide whether the test digit was included in the comparison set.
Reaction times (seconds) for a comparison set of 3 digits:
1.1 1.3 1.0 1.0 1.3 1.1 0.9 1.3 1.5 0.9 0.8 1.3 1.0 1.3 1.3 0.8 0.9 1.4 0.8 0.9
It is difficult to make sense of raw scores like these.

Displaying Data as Frequency Distributions (continued)
A frequency distribution is a compilation of all of the score values and the number of times each occurs. Frequency distributions are often presented in the form of frequency tables. Such tables list the scores on a variable and the number of individuals who obtained each value (f denotes frequency, the number of individuals who received each score):

RT    f
1.5   1
1.4   1
1.3   6
1.2   0
1.1   2
1.0   3
0.9   4
0.8   3

A simple frequency distribution shows the number of times each score occurs in a set of data. Frequency (f) is the number of times a score or value occurs in a data set; N is the total number of outcomes/values/scores. The sum of all frequencies in a sample equals N.
Make a list down the page of each possible value, from highest to lowest. Note that although no one had an RT of 1.2, we still include it.
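To make the construction of such a table concrete, here is a minimal Python sketch (an illustration, not part of the original slides) that builds the frequency table above from the raw RT scores using only the standard library:

```python
from collections import Counter

# Raw reaction times (seconds) for the 3-digit comparison set
rts = [1.1, 1.3, 1.0, 1.0, 1.3, 1.1, 0.9, 1.3, 1.5, 0.9,
       0.8, 1.3, 1.0, 1.3, 1.3, 0.8, 0.9, 1.4, 0.8, 0.9]

freq = Counter(rts)   # score -> frequency (f)
N = len(rts)          # total number of scores

# List every possible value from highest to lowest, including
# values with zero frequency (e.g. 1.2 here).
values = [round(0.8 + 0.1 * i, 1) for i in range(7, -1, -1)]  # 1.5 ... 0.8
for v in values:
    print(f"RT {v}: f = {freq.get(v, 0)}")

assert sum(freq.values()) == N  # the frequencies sum to N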
Graphing a Simple Frequency Distribution
A graph of a frequency distribution shows the scores/values on the X axis and their frequency on the Y axis. The type of measurement scale (nominal, ordinal, interval or ratio) determines whether we graph a frequency distribution as a histogram, a polygon or a bar chart.

Frequency Histogram
Used mostly for interval scores (e.g. degrees Celsius) or ratio scores (e.g. time). For the reaction-time table above, the histogram plots RT in seconds (0.8 to 1.5) on the X axis against frequency on the Y axis.
If a variable is continuous, the vertical boundaries of the bar for a score represent the real limits of that score: those points that fall one-half a measurement unit below the number (lower real limit) and one-half a measurement unit above it (upper real limit). e.g. for a score of 0.9 measured to the nearest 0.1, the real limits are 0.85 and 0.95. In continuous variables, values should be thought of in terms of their real limits.

Grouping Scores
With a large data set whose scores vary widely, the individual values/scores are usually grouped into class intervals and presented as a frequency distribution of grouped scores, e.g.:

RT interval (sec)    f
2.5–2.9              2
2.0–2.4              3
1.5–1.9             20
1.0–1.4             45
0.5–0.9             25
0.0–0.4              5

Guidelines for a Frequency Table of Grouped Scores
- Intervals should not overlap, so no score can belong to more than one interval.
- (Preferably) make all intervals the same width; if not, adjust the bar width accordingly.
- Make the intervals continuous throughout the distribution (even if an interval is empty).
- Place the interval with the highest scores at the top.
- Choose a convenient interval width.
Example (birth weights in grams): 3800–3999 (f=3), 3600–3799 (7), 3400–3599 (5), 3200–3399 (4), 3000–3199 (0), 2800–2999 (2), 2600–2799 (1), 2400–2599 (2), 2200–2399 (1).

Frequency Polygon
A kind of line graph used to depict a large number of interval or ratio scores when a histogram is impractical, e.g. the frequency of each weight of babies born in a hospital:

Weight (g)    f
4000–4199     0
3800–3999     9
3600–3799    15
3400–3599    10
3200–3399     9
3000–3199     8
2800–2999     6
2600–2799     6
2400–2599     3
2200–2399     3
2000–2199     0

Include the next interval below the lowest and the next interval above the highest scores on the X axis.

Histogram vs. Polygon
With large data sets, the bars of a histogram get thinner and thinner; it is easier to draw a polygon when you have a lot of scores. Polygons are more consistent with the notion of illustrating a continuum. When the variables are discrete in nature, a histogram (or bar chart) is usually used. Polygons are also better when two distributions are depicted in one graph (with two different colours).
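The grouping into class intervals described above can be reproduced programmatically. A minimal sketch (illustrative; the reaction-time data here are made up for demonstration) using NumPy's histogram function, which underlies both grouped tables and histogram plots:

```python
import numpy as np

# Hypothetical reaction times (seconds), generated just for illustration
rng = np.random.default_rng(0)
rts = rng.gamma(shape=5.0, scale=0.25, size=100)  # positive, skewed times

# Class intervals 0.0-0.5, 0.5-1.0, ... expressed as bin edges
edges = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
f, _ = np.histogram(rts, bins=edges)

for lo, hi, n in zip(edges[:-1], edges[1:], f):
    print(f"{lo:.1f}-{hi:.1f}: f = {n}")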
Features of Distributions: Symmetry
A symmetrical distribution is a distribution for which each tail is a mirror image of the other. The far left and right portions containing the low-frequency extreme scores are called the tails of the distribution (e.g. very few people in the population are very short or very tall). The normal distribution (bell curve) is a symmetrical distribution.
Asymmetry can arise when there is a constraint at one or the other end of the distribution, either a natural constraint or because a test is too hard or too easy:
- Ceiling effect: the test is too easy.
- Floor effect: the test is too hard.

Describing Frequencies in Categorical Data: Bar Chart
A bar chart is a useful method of summarising categorical data. Frequencies in each category are represented by a bar. It can also be used to represent the means of each category.
e.g. For a sample of birth weights with the sex of each baby recorded (2900 G, 3200 G, 4100 B, 3850 G, 4150 B, 2950 B, 4300 B, 2500 G, 4000 G, 3100 B, 3800 B, 3600 G, 2950 B), the bar chart shows the number of girls and boys in the sample.
Adjacent bars usually do not touch in bar charts, as they represent distinct categories.

Relative Frequency Histogram
Considered alone, an index of frequency is not easily interpreted. Suppose we are only told that in an experimental paradigm 3 participants responded to the test stimulus in 1 second (RT = 1 s): this does not tell us much unless we know the total number of participants. So we need a relative frequency index. This also helps us compare different samples.
A relative frequency histogram indicates the proportion (p) of times that a score occurred. To convert a frequency distribution into a relative frequency distribution, the frequency for each value is divided by the total number of values:

p = f / N

e.g. p (for RT 1.5) = 1 / (1+1+6+0+2+3+4+3) = 1/20 = 0.05

RT    f    p
1.5   1    0.05
1.4   1    0.05
1.3   6    0.30
1.2   0    0.00
1.1   2    0.10
1.0   3    0.15
0.9   4    0.20
0.8   3    0.15

The sum of the p values equals 1.

Relative Frequencies and Probabilities
Relative frequencies have an important relationship to probabilities:

probability(A) = number of observations favouring event A / total number of possible observations

e.g. The probability of rolling a 2 on a die: 1/6 = .17 (only one 2 on a die, out of the 6 possible outcomes from 1 to 6).
e.g. The probability of randomly selecting an RT score of 1.5 s from the 20 RT scores above: 1/20 = 0.05.
The probability of randomly selecting a score from a distribution of scores will always equal the relative frequency of that score. When a relative frequency is multiplied by 100, it reflects the percentage of times that the score occurred, e.g. 0.05 × 100 = 5%.

Probability and Relative Frequency
A relative frequency indicates the proportion of times that some score was previously observed. A probability represents the likelihood of observing that score in the future, which can be used for prediction.
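A minimal sketch of the f-to-p conversion just described (illustrative only), reusing the Counter-based frequency table from earlier:

```python
from collections import Counter

rts = [1.1, 1.3, 1.0, 1.0, 1.3, 1.1, 0.9, 1.3, 1.5, 0.9,
       0.8, 1.3, 1.0, 1.3, 1.3, 0.8, 0.9, 1.4, 0.8, 0.9]

freq = Counter(rts)
N = len(rts)

# Relative frequency p = f / N for each score; p also equals the
# probability of randomly selecting that score from the sample.
rel_freq = {score: f / N for score, f in freq.items()}

for score in sorted(rel_freq, reverse=True):
    print(f"RT {score}: p = {rel_freq[score]:.2f}")

print("sum of p =", sum(rel_freq.values()))  # always 1.0 (up to rounding)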
Central Tendency
Measures of central tendency are statistics describing the centre or location of a distribution. The three most commonly used measures are the mode, the median and the mean.

The Mode
One way to describe where most of the scores in a distribution are located is to find the one score that occurs most frequently. The modal value (or simply the mode) is the most frequently occurring score in a sample of scores. The modal class is the class interval containing the highest frequency of scores (for a quantitative variable). The mode is often used for nominal data.

The Mode in a Bimodal Distribution
What is the mode of a distribution with two equally frequent peaks, at 22 and 51? Answer: both 22 and 51. If there are more than two modes, the distribution is multimodal.

Problem with the Mode
The mode gives us only limited information about a distribution's central point, and it can be misleading depending on the shape of the distribution. e.g. in the data set 1, 1, 1, 15, 17, 18, 19, 20, 21, 22, 23, 24 the mode is 1, yet most of the scores lie around 20.

The Median
The median is the middle score when scores are ranked in order of magnitude. By definition, equal numbers of scores lie above and below the median.
- With an odd number of scores, the median is an actual score in the sample. e.g. 93 98 103 108 116 121 252: arrange the scores in ascending order, count the number of scores (n), and compute x = (n+1)/2; the median is the xth score (here the 4th score, 108).
- With an even number of scores, there is no single middle value. e.g. 91 93 98 103 108 116 121 252: arrange the scores in ascending order, count the number of scores (n), and compute x = (n+1)/2; the median is the average of the (x−0.5)th and (x+0.5)th scores. Here x = 4.5, so the median = (103 + 108)/2 = 105.5.
The median can be used with ordinal, interval and ratio data. (Why?) Because nominal data have no numerical order, the median cannot be used with them.

The Mean
The (arithmetic) mean is simply the sum (total) of the scores divided by the number of scores:

Ȳ = ΣY / N = (Y1 + Y2 + Y3 + ... + YN) / N

where Y1, Y2, ..., YN are the scores, Ȳ is the mean ("Y bar"), Σ is the summation sign (sigma) denoting the operation of addition (ΣY means "add up all the values of Y"), and N is the number of scores.

The Mean: Example
For the frequency table with scores 8 (f=1), 7 (f=0), 6 (f=2), 5 (f=3), 4 (f=3), 3 (f=2), 2 (f=0), 1 (f=1), N = 12:

Ȳ = ΣY / N = (8 + 6 + 6 + 5 + 5 + 5 + 4 + 4 + 4 + 3 + 3 + 1) / 12 = 4.50

Summary of the Central Tendency Measures
- The mean: the sum of all scores divided by the number of scores.
- The median: the score in the middle when the scores are ordered.
- The mode: the most frequent score.
If the distribution is symmetrical (and has no outliers), the mean, median and mode will have the same value, as in the normal distribution. The mean is the most commonly reported.

Comparing the Central Tendency Measures
The mean is sensitive to the exact values of all scores in the distribution, which makes it very sensitive to extreme scores (outliers). The median is less sensitive than the mean to outliers, or to extreme values in the tail of a skewed distribution. e.g.:

Scores            Median   Mean
4, 6, 7, 10       6.5      6.75
4, 6, 7, 100      6.5      29.25
4, 6, 7, 1000     6.5      254.25

Revisiting Skewness (also discussed as ceiling and floor effects)
A skewed distribution is a non-normal or asymmetrical distribution with only one pronounced tail. A positively skewed distribution has a dense concentration at low values, with a few outcomes with very large values. A negatively skewed distribution has a dense concentration at high values, with a few outcomes with very small values.

Comparing the Central Tendency Measures by Measurement Scale

Measurement scale   Measures you can use     Best measure of the middle
Nominal             Mode                     Mode
Ordinal             Mode, Median             Median
Interval            Mode, Median, Mean       Symmetrical data: Mean; Skewed data: Median
Ratio               Mode, Median, Mean       Symmetrical data: Mean; Skewed data: Median

Other Location Statistics: Percentiles and Quartiles
The concept underlying the median can be extended to other locations in a distribution.

Background for Computing Percentiles: Cumulative Frequency
Cumulative frequency (cf) is the frequency of all scores at or below a particular score. It is an additional column in the frequency table, just like the relative frequency column. To compute a score's cumulative frequency, add the simple frequencies for all scores below that score to the frequency for the score itself.
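As a concrete sketch (illustrative, using the same frequency table as the worked example that follows), cumulative frequencies and percentiles can be computed like this:

```python
# Frequency table: score -> f (matches the percentile example below)
freq = {10: 1, 11: 2, 12: 0, 13: 4, 14: 6, 15: 4, 16: 2, 17: 1}

N = sum(freq.values())  # 20

cf = 0
for score in sorted(freq):       # lowest to highest
    cf += freq[score]            # cf = frequency at or below this score
    percentile = cf / N * 100    # score's percentile = (cf / N) * 100
    print(f"score {score}: cf = {cf}, percentile = {percentile:.0f}")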
e.g.:

Score   f    cf
17      1    20
16      2    19
15      4    17
14      6    13
13      4     7
12      0     3
11      2     3
10      1     1

Note that the cf for the highest score equals N.

Other Location Statistics: Percentiles
A score's percentile is the percentage of all scores in the data that are at or below that score. e.g. if the score of 80 is at the 75th percentile, this means that 75% of the sample scored at or below 80.

Formula: score's percentile = (cf / N) × 100

e.g. In the table above, a person scoring 12 or below is at the (3/20) × 100 = 15th percentile. In general: frequency converts to relative frequency (in percentages), and cumulative frequency converts to percentile:

Score   f    cf   Percentile
17      1    20   100
16      2    19    95
15      4    17    85
14      6    13    65
13      4     7    35
12      0     3    15
11      2     3    15
10      1     1     5

Locating Percentiles in a Graph
The percentile for a given score corresponds to the percentage of the total area under the curve that is to the left of that score.

Calculating the Percentile
The percentage of scores in a distribution that occur at or below a given value X is the percentile rank of that score. The value that corresponds to a given percentile rank is called a percentile. With grouped or tied scores the percentile is found by interpolation:

X_P = L + ((N)(P) − n_L) / n_w × i

where
X_P = the Pth percentile (the value of interest)
P   = the percentile expressed as a proportion
L   = lower (real) limit of the category that contains the Pth percentile
N   = total number of scores in the distribution
n_L = number of scores that are less than L
n_w = number of scores within the category that contains the Pth percentile
i   = size of the interval of the category containing the Pth percentile (difference between its upper and lower limits)

e.g. Frequency analysis of intelligence test scores for 200 children: what is the 70th percentile?

Score   f    rf     cf    crf
105     9   .045   200   1.000
104    16   .080   191    .955
103    20   .100   175    .875
102    26   .130   155    .775
101    29   .145   129    .645
100    30   .150   100    .500
 99    25   .125    70    .350
 98    21   .105    45    .225
 97    14   .070    24    .120
 96    10   .050    10    .050

The lowest score whose cumulative relative frequency is greater than or equal to .70 is 102, so:

X_70 = 101.5 + ((200)(.70) − 129) / 26 × 1 = 101.5 + (140 − 129)/26 = 101.92

Calculating the Percentile Rank
Percentile rank is the percentage of scores in a distribution that are at or below X:

PR_X = ((.5)(n_w) + n_L) / N × 100

where
PR_X = percentile rank of a given X score
N    = total number of scores in the distribution
n_L  = number of scores that are less than X
n_w  = number of scores that are equal to X

e.g. What is the percentile rank for a score of 101?
PR_101 = ((.5)(29) + 100) / 200 × 100 = 57.2

Measures of Variability

Variability
Variability is the extent to which scores are alike or different. Distributions with the same mean can have different amounts of spread around the mean (similar central tendency, different variance), and distributions with different means can have the same amount of spread around the mean (different central tendency, similar variance).

Measures of Variability
- The range
- The interquartile range
- The mean absolute deviation
- The sum of squares
- The variance
- The standard deviation

The Range
The range is simply the difference between the highest and the lowest score. e.g. for the scores 5, 7, 12, 16, 40: range = highest score − lowest score = 40 − 5 = 35.
Major disadvantages: it is based on only two scores in the distribution, and it is vulnerable to outliers.

The Interquartile Range
The interquartile range (IQR) is the distance between the values of the third and the first quartiles. (Quartiles: in a set of data ranked from lowest to highest, the three points that divide the data into four equal groups, each comprising a quarter of the data. Note that the second quartile, Q2, is the median.)
- The first (lower) quartile (Q1) divides the lower 25% of the scores from the upper 75%.
- The second quartile (Q2) is the median, which divides the lower 50% from the upper 50%.
- The third (upper) quartile (Q3) divides the lower 75% from the upper 25%.

The Interquartile Range: Example
Set A: 10 13 11 6 12 9 10 14 8 12 10 13
Order the data from lowest to highest: 6 8 9 10 10 10 11 12 12 13 13 14
Find the median, then find the medians of the two halves: Q1 = 9.5 and Q3 = 12.5, so IQR = 12.5 − 9.5 = 3.
Unlike the range, the interquartile range is not as sensitive to distortion from extreme scores. e.g. for 6 8 9 10 10 10 11 12 12 13 13 100: range = 100 − 6 = 94, but IQR = 12.5 − 9.5 = 3.

Measures Based on Deviation Scores
Deviation-based measures take all scores into account (unlike the range and the interquartile range). With n = 10 scores Y = 2, 1, 4, 3, 4, 5, 7, 2, 3, 4 and mean Ȳ = 35/10 = 3.5, the deviation scores Y − Ȳ are −1.5, −2.5, +0.5, −0.5, +0.5, +1.5, +3.5, −1.5, −0.5, +0.5.
A problem with these deviation scores is that they sum to zero, Σ(Y − Ȳ) = 0, no matter how much or how little variability the distribution actually contains. Therefore they cannot be used as an aggregate measure of variability (the mean minimises the sum of signed deviations).

The Mean Absolute Deviation
One way to get around this sum-to-zero problem is to take the absolute deviations about the mean. (In mathematics, the absolute value, or modulus, |x| of a real number x is the non-negative value of x without regard to its sign.) For the scores above, the absolute deviations |Y − Ȳ| are 1.5, 2.5, 0.5, 0.5, 0.5, 1.5, 3.5, 1.5, 0.5, 0.5; their sum is 13.0, so

mean absolute deviation = Σ|Y − Ȳ| / N = 13.0 / 10 = 1.3

The average of the absolute deviations from the mean is a valid (but little used) measure of dispersion.
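A minimal sketch (illustrative, not from the slides) computing these dispersion measures for Set A; note that library quartile functions often use a different interpolation rule, so the median-of-halves method shown above is coded explicitly:

```python
import numpy as np

y = np.array([10, 13, 11, 6, 12, 9, 10, 14, 8, 12, 10, 13])  # Set A

value_range = y.max() - y.min()

# Quartiles via the median-of-halves method used in the slides
s = np.sort(y)
lower, upper = s[:len(s) // 2], s[-(len(s) // 2):]
iqr = np.median(upper) - np.median(lower)   # 12.5 - 9.5 = 3.0

# Mean absolute deviation
mad = np.mean(np.abs(y - y.mean()))

print(value_range, iqr, mad)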
The Sum of Squares
Another (and more common) way to get around the sum-to-zero problem is to square each deviation score. The sum (total) of these squared deviations is called the sum of squares:

SS = Σ(Y − Ȳ)²

For the scores above, the squared deviations (Y − Ȳ)² are 2.25, 6.25, 0.25, 0.25, 0.25, 2.25, 12.25, 2.25, 0.25, 0.25, so SS = 26.50.

The Sum of Squares (SS): Disadvantage
The size of SS depends not only on the amount of variability among the scores but also on the number of scores (N). Compare these two distributions:
- Set A: Y = 2, 4, 6 (mean 4); deviations −2, 0, +2; squared deviations 4, 0, 4; SS = 8.
- Set B: Y = 4, 4, 4, 4, 4, 6, 6, 6, 6, 6 (mean 5); deviations all −1 or +1; squared deviations all 1; SS = 10.
Although there appears to be greater variability in Set A than in Set B (2 and 6 are 2 units distant from 4, while 4 and 6 are only 1 unit distant from 5), the SS of Set B is bigger because Set B has more scores.

Variance
The variance (s²) is the sum of squares (SS) divided by the number of cases (N):

s² = Σ(Y − Ȳ)² / N

For Set A: s² = 8/3 ≈ 2.67. For Set B: s² = 10/10 = 1.00.

Standard Deviation
The standard deviation (s) is the positive square root of the variance: s = √s². Unlike the variance, the standard deviation (SD) is in the same units of measurement as the original scores (the squaring is undone). The SD thus represents an average deviation from the mean.
Note that the sum of squares, the variance, and the standard deviation will always be greater than or equal to zero, because they are all based on squared deviation scores. If any of them takes a value of 0, there is no variability in the data set.

Sample vs. Population Notation (statistic vs. parameter)
When we have just a sample from the population (most of the time), the sample variance is notated as:

s² = Σ(Y − Ȳ)² / N

If our dataset consists of the whole population (a rare occurrence), the population variance is notated as sigma squared:

σ² = Σ(Y − µ)² / N

where µ (the Greek letter mu) is the mean of the population and σ is the Greek small letter sigma.
Likewise for the standard deviation: the sample standard deviation s = √(Σ(Y − Ȳ)² / N) is a statistic; the population standard deviation σ = √(Σ(Y − µ)² / N) is a parameter.
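A minimal sketch of these formulas (illustrative; note that the slides divide by N, whereas many packages default to the N−1 "unbiased" sample formula, controlled in NumPy by the ddof argument):

```python
import numpy as np

y = np.array([2, 1, 4, 3, 4, 5, 7, 2, 3, 4])  # scores from the deviation example

ss = np.sum((y - y.mean()) ** 2)   # sum of squares, SS = 26.5
var = ss / len(y)                  # variance as defined here: SS / N = 2.65
sd = np.sqrt(var)                  # standard deviation, same units as Y

# Equivalent one-liners; ddof=0 divides by N (as in the slides),
# ddof=1 would divide by N - 1 instead.
assert np.isclose(var, np.var(y, ddof=0))
assert np.isclose(sd, np.std(y, ddof=0))
print(ss, var, sd)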
Boxplots
Boxplots provide a simple graphical representation of data that captures features of both the location and the spread of scores in a distribution. A boxplot consists of three main parts:
1) A box that covers the middle 50% of the data. The edges of the box are the 1st (Q1) and 3rd (Q3) quartiles, and a line is drawn in the box at the median value.
2) Whiskers that extend out from the box to indicate how far the data extend on either side of the box.
3) Points that lie outside the whiskers, plotted individually as outlying observations (outliers).

Boxplots: Example
Reaction times in naming colours (children vs. adults; + = mean, lines = median):
- Children named the colours faster than adults.
- Half the children's times are between 17 and 20, whereas half the adults' times are between 19 and 25.
- One child was slower than almost all the adults.
Plotting boxplots of measurements side by side can be illustrative. For example, boxplots of reaction times for each gender in a certain task might indicate that the distributions have quite different shapes.

Boxplots and Distribution Shape
Although a boxplot can tell you whether a distribution is symmetrical or not, it does not tell you the exact shape of the distribution: two quite different frequency histograms can yield similar boxplots.

How do we assess symmetry and skewness in boxplots?
- Whisker length might be misleading with small samples (e.g. N = 20); with large samples (e.g. N = 500) the whiskers of symmetrical, positively skewed and negatively skewed distributions are more informative.
- Check the position of the median in relation to Q3 and Q1.
- Check the position of the mean in relation to the median.

EXERCISE: The Variance (σ²) and Standard Deviation (SD or σ)
Q1: Calculate on your own the variance and the SD of the dataset below, and check your answer against the solution.

Features of Distributions: Pointyness (Kurtosis)
Kurtosis is the degree to which scores cluster at the ends of the distribution (known as the tails) and how pointy the distribution is. In a normal distribution, the values of skew and kurtosis are 0. A distribution with positive kurtosis has many scores in the tails (a so-called heavy-tailed distribution) and is pointy (known as a leptokurtic distribution). A distribution with negative kurtosis has relatively light tails and tends to be flatter than normal (known as a platykurtic distribution).
Q2: Are kurtosis and variance related? How?

Bias and Precision
Bias is the tendency for measurements (observations) to differ systematically from their true value (it describes systematic errors).
Precision is the closeness of agreement between independent test results obtained under stipulated conditions (how close the measurements are to each other). Precision is concerned with the amount of random variability in measurement: the less the random variability, the greater the precision (it describes random errors).
The four combinations: no bias with high precision, no bias with low precision, bias with high precision, bias with low precision.
Repeatability: precision under similar conditions. Reproducibility: precision under different conditions.
Q3: Are central tendency and dispersion measures related to bias and precision? How?

PSY201 INFERENTIAL STATISTICS
Dr Irem Ece Eraydin, Postdoctoral Research Fellow, University of Southampton, UK
BSc Psychology, Middle East Technical University; MSc Cognitive Neuroscience, King's College London; PhD Neuroscience, University of Manchester
Research interests: psychedelics, attention, ADHD, psychopharmacology, neuromodulation
[email protected]

Aims
To introduce inferential statistics; to introduce sampling methods and estimating population parameters.

Learning Outcomes
- Students will define statistical inference and its role in psychological research.
- Students will understand the outcomes of different sampling methods.
- Students will understand population parameters and sample statistics.

Part 1: Probability vs Statistics; Basic Probability Theory; Distributions
Part 2: Samples, Populations and Sampling; Law of Large Numbers and Central Limit Theorem; Estimating Population Parameters; Estimating Confidence Intervals

If you are distracted/bored during the lecture and cannot regain your focus for a while, in the meantime you can:
- Just look at the memes (if there are any) and try to think about how they relate to the content at the time.
- Try to join the interactive bits as much as possible – it'll help, I promise!
- When I am giving examples (some of which are things you might have experienced in your own life), try to think about what the numbers/probabilities would be in your experience (the self-reference effect).

Probability vs Statistics
Probability: the measure of how likely an event is to occur; predicting the likelihood of future events based on known information. e.g. a deck of cards: "What is the probability of drawing a heart?" There are 52 cards, 13 of which are hearts, so the probability of drawing a heart is 13/52 = 1/4, or 25%. We make a prediction about what might happen in the future based on what we know about the deck.
Statistics: the science of collecting, analysing, interpreting, and presenting data; it focuses on understanding past data to make informed decisions. e.g. conduct a survey at the university, asking everyone in the class about their favourite ice cream. If 30 students prefer chocolate, 20 prefer vanilla and 10 choose strawberry, then based on the survey, 50% of the students prefer chocolate, 33% prefer vanilla and 17% prefer strawberry.

Aspect    Probability                               Statistics
Focus     Predicting future events based on         Analysing past data to draw conclusions
          known data
Nature    Theoretical; deals with possible          Empirical; deals with actual data collected
          outcomes
Purpose   Used to calculate the likelihood of       Used to summarise, interpret and present data
          an event
Example   "What is the chance of rain tomorrow?"    "Last month, it rained 10 out of 30 days."
Process   Often involves mathematical models        Involves data collection, organisation and analysis

Example: Deciding Whether to Ask Someone Out
Probability – the probability of getting a date: Jordan is contemplating whether to ask their classmate, Alex, out. Situation: after chatting and hanging out a few times, Jordan thinks there is a 70% chance that Alex will say yes. Probability calculation: this estimate is based on their previous interactions, shared interests, and feedback from mutual friends who think Alex might be interested.
Statistics – analysing classmates' responses: after asking a few friends about their experiences asking someone out, Jordan collects data. Situation: Jordan surveys 10 friends who have asked someone out recently and finds that 6 of them received a positive response. Statistical analysis: from this, Jordan concludes that there is a 60% success rate among friends who have asked someone out.

Probability: The Frequentist View
Probability is the long-run frequency of an event over repeated trials under the same conditions. It is objective: not influenced by prior beliefs or knowledge, it only uses data from the current, observable repetitions. e.g. a coin has a probability of 0.5 of landing heads, over (hypothetically) infinite flips.
e.g. The probability of your roommate cleaning the kitchen each week: they cleaned the kitchen in 8 of the last 20 weeks, so P(Clean) = 8/20 = 0.4. (That they avoid cleaning during exam season is not taken into account.) If you repeated this infinitely many times, you would converge on the actual probability.
The frequentist accepts a fundamental lack of information and proceeds without it.

Probability: The Frequentist View (continued)
The frequentist view focuses on fixed parameters and hypothesis testing. Parameters (an average or proportion, etc.) are fixed values in the population, and hypothesis testing is used to make decisions based on estimates of them. e.g. testing whether your roommate's cleaning frequency is really 40%: null hypothesis vs. alternative hypothesis.
It uses confidence intervals (CIs) rather than probabilistic statements about parameters: a CI estimates where the true parameter might lie, based on repeated sampling; it is not a probability. A 95% CI of 0.25 to 0.55 means that in 95% of similar studies, the roommate's true cleaning frequency would fall between these values.
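A minimal simulation sketch (illustrative, not from the slides) of the long-run frequency idea: the observed proportion of heads wanders at first and settles near 0.5 as flips accumulate:

```python
import random

random.seed(42)

totals = [10, 100, 1_000, 10_000, 100_000]
flips = [random.random() < 0.5 for _ in range(max(totals))]  # True = heads

for n in totals:
    prop = sum(flips[:n]) / n
    print(f"after {n:>6} flips: proportion of heads = {prop:.3f}")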
Probability: The Frequentist View (continued)
Larger sample sizes lead to greater accuracy: they improve the stability and reliability of probability estimates and hypothesis tests. e.g. tracking your roommate for 2 weeks vs. 50 weeks.
The frequentist view emphasises p-values in decision making. A p-value tells us how likely we would be to observe our data if the null hypothesis were true. e.g. you collect data and get a low p-value (0.03), so you conclude that the cleaning behaviour is significantly different from the expected rate.

Flipping Coins: https://observablehq.com/@mattiasvillani/law-large-numbers

Summary of the Frequentist View

Characteristic                      Definition                                     Example
Probability as long-run frequency   Probability defined by repetition and          Observing the roommate's cleaning over 20 weeks
                                    frequency                                      to estimate a 40% cleaning probability
No prior knowledge or beliefs       Only sample data matter; no subjective input   Ignoring exam season and only considering data in
                                                                                   calculating the cleaning probability
Fixed parameters &                  Parameters are fixed; hypotheses are tested    Testing whether the true cleaning rate is below 40%
hypothesis testing                  based on sample data
Confidence intervals                CIs indicate where the parameter would fall    A 95% CI for the cleaning rate may be 0.25–0.55 but
                                    in repeated samples, not a probability         doesn't imply a 95% chance of being correct
Larger samples = greater accuracy   Larger data sets yield more stable,            50 weeks of data provides a clearer estimate than
                                    accurate estimates                             only 2 weeks
Reliance on p-values                P-values guide decisions about rejecting       A low p-value (e.g., 0.03) suggests the roommate's
                                    the null hypothesis                            cleaning is significantly different than expected

Probability: The Bayesian View
Sees probability as a measure of belief, or degree of certainty, about an event. Beliefs are updated with new evidence using Bayes' theorem: new information updates the probability. It is subjective and depends on the observer's knowledge; it incorporates prior knowledge, which makes it intuitive.
e.g. Your roommate has a 50% cleaning rate based on past behaviour. They recently promised to be tidier, so your updated belief is 70%.

Probability: The Bayesian View (continued)
Focus on the posterior distribution: instead of testing hypotheses with p-values, Bayesian analysis focuses on calculating the posterior distribution, which represents the updated probabilities of different outcomes.
Flexibility in decision-making: if the probability of your roommate cleaning is 65%, stick to the current plan; if it falls below 30%, propose a new cleaning plan. There is flexibility to interpret the results in the context of specific needs.
The Bayesian finds an initial number that seems reasonable and hopes for the best: today's posterior is tomorrow's prior.

Probability: The Bayesian View (continued)
Updates with new information: finals are approaching, so there is a drop in the probability of your roommate cleaning.
Requires specification of priors (conditional probability): the analysis requires a prior probability, an initial belief about the event's probability, i.e. the likelihood of an outcome occurring based on previous outcomes in similar circumstances. The prior can significantly influence the posterior if the sample size is small.
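One common way to formalise the update just described is a conjugate Beta-binomial model. This is an illustrative sketch only, not the slides' method; the prior Beta(5, 5) is an assumption standing in for the "50% cleaning rate" belief:

```python
from scipy import stats

# Hypothetical prior belief about the roommate's weekly cleaning rate:
# Beta(5, 5) is centred on 50% and moderately uncertain.
prior = stats.beta(5, 5)

# New evidence: cleaned 8 of the last 20 weeks.
cleaned, weeks = 8, 20

# Beta prior + binomial data -> Beta posterior (conjugate update).
posterior = stats.beta(5 + cleaned, 5 + (weeks - cleaned))

print(f"prior mean:     {prior.mean():.2f}")      # 0.50
print(f"posterior mean: {posterior.mean():.2f}")  # 13/30, about 0.43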
The Bayesian View: A Real-Life Example of Bayesian Thinking
When doctors diagnose patients, they often use a thought process similar to Bayesian thinking, even if they don't call it that. Doctors use a process of continuous updating based on new evidence, just like Bayes' approach: they start with a best guess, gather information, and keep refining their diagnosis as they learn more. Here's how it works step by step:
- Start with a general idea: when a patient first comes in, a doctor might have a rough idea of common illnesses based on the patient's age, symptoms, and the season (like flu season). This is their starting point, or prior belief, about what the patient might have.
- Gather information (new evidence): as the doctor examines the patient and learns more about specific symptoms (like a sore throat, fever, and body aches), they update their initial idea. This new evidence helps the doctor focus on diagnoses that fit better with what they're seeing.
- Refine the diagnosis: the doctor uses this updated information to narrow down the list of possible illnesses, maybe ordering tests to confirm or rule out certain conditions. For example, a test result showing a high level of white blood cells might make an infection more likely.
- Keep updating: if new test results come back or symptoms change, the doctor keeps adjusting their diagnosis. This is just like updating probabilities in Bayesian thinking: the doctor is always looking at new evidence to refine their understanding of what the patient likely has.

Summary of the Bayesian View

Characteristic                     Definition                                   Example
Probability as degree of belief    Probability reflects belief about an event,  You believe your roommate has a 70% chance of
                                   considering both data and subjective input   cleaning, given past data and their promise to improve
Incorporation of prior knowledge   Prior belief combines with data to form a    Your prior belief (50%) updates to 65% based on
                                   new, updated belief                          recent behaviour and your roommate's commitment
Posterior distribution             Updated probabilities of outcomes based      The posterior shows an 80% probability that the
                                   on new data                                  roommate's cleaning rate is better than 50%
Flexible decision-making           Decisions based on posterior probability     You adjust cleaning expectations based on the
                                   instead of strict p-values                   posterior probability
Updates with new information       Beliefs updated as new data arrive           Your probability drops if the roommate's cleaning
                                                                                rate declines during finals
Requires priors                    An initial belief (prior) must be            Your 50% prior impacts the final probability but
                                   specified, impacting results especially      would change with a stronger prior belief (e.g., 70%)
                                   with limited data

Who's Who 1: Bayesian or Frequentist? Predicting whether your friend will spill a secret
1. Your friend is usually pretty tight-lipped, so you start with a 15% chance that they will spill a juicy secret they know. You notice they are being extra chatty today and they even mention, "I really shouldn't say this, but…". With this new behaviour, you raise your belief to a 60% chance that they will spill the secret today.
2. You decide to track how often this classmate actually spills secrets. After a few months, you've noted that they have spilled secrets 3 out of the last 15 times they had one. Based on these observations, you estimate a 3/15 (20%) chance they will spill a secret next time.
Who's Who 2: Bayesian or Frequentist? Guessing whether you'll see your crush in the library
1. You decide to study your crush's library habits by going every day and counting how often they're there (!!! Warning: this is not advice! Proceeding with this idea might result in your friends staging an intervention!!!). After two weeks (14 days), you find that your crush was there 4 out of those 14 days. Based on this, you estimate a 4/14 (about 29%) probability of seeing your crush in the library on any given day.
2. You think there's a 30% chance your crush will be in the library today, based on previous experience. Then you see your crush's best friend heading toward the library, which could mean your crush is there too. With this new clue, you update your belief and increase the probability to 70% that your crush is in the library.

Basic Probability Theory
Understanding how likely events are to occur: the probability of any event is a number between 0 and 1. An impossible event has probability 0; a certain event has probability 1.

Key Probability Terms
- Elementary event: the simplest possible outcome in a probability experiment, one that can't be broken down further.
- Non-elementary event: a combination of two or more elementary events.
- Sample space: the set of all possible elementary events or outcomes for an experiment. It contains everything that could happen within the given scenario.
- Probability distribution: assigns probabilities to each event in the sample space. It allows us to understand the likelihood of each outcome.
- Law of total probability: states that if we add up the probabilities of all possible elementary events in the sample space, the total must equal 1 (or 100%).

Example: Friday Night Plans with Friends
Imagine you are deciding what to do on a Friday night with friends. You have three options: go to a movie, have a game night, or go out for dinner.
- Elementary events: the specific outcomes you might choose for the night. Each choice (movie, game night or dinner) is an elementary event, since they are basic, indivisible options for the evening: "go to a movie," "have a game night," "go out for dinner."
- Non-elementary event: a combination of two or more elementary events, e.g. doing something indoors: {Movie, Game Night}.
- Sample space: the complete list of possible activities, S = {Movie, Game Night, Dinner}.
- Probability distribution: assigns a specific probability to each event. Let's assume you've estimated how likely each activity is based on how you and your friends usually spend Friday nights: P(Movie) = 0.4, P(Game Night) = 0.3, P(Dinner) = 0.3.
- Law of total probability: the probabilities of all events in the sample space sum to 1: P(Movie) + P(Game Night) + P(Dinner) = 0.4 + 0.3 + 0.3 = 1.

DISTRIBUTIONS: The Binomial Distribution
A probability distribution that describes the number of successes in a fixed number of independent trials, each with the same probability of success: "How likely is it that we'll get a certain number of successes in a fixed number of trials?" It is used when there are only two possible outcomes each time we try something: success or failure.
Key terms and definitions:
- Size parameter (N): the total number of trials or events in the experiment.
- Success probability (θ): the probability of achieving a success on any single trial.
- Random variable (X): the number of successes observed in the experiment.
- Probability mass function (PMF): the function that assigns probabilities to the possible values of the random variable X.
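A quick sketch of the binomial PMF in code (illustrative; it uses SciPy's binom, and the numbers anticipate the library example that follows):

```python
from scipy import stats

N, theta = 10, 0.2      # 10 library visits, P(crush there) = 0.2 per visit

# P(X = 3): probability of exactly 3 "successes" in 10 trials
p3 = stats.binom.pmf(3, N, theta)
print(f"P(X = 3) = {p3:.3f}")   # about 0.201

# The full PMF sums to 1 across all possible counts 0..10
total = sum(stats.binom.pmf(k, N, theta) for k in range(N + 1))
print(f"sum over k: {total:.3f}")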
Example: Guessing Whether You'll See Your Crush in the Library
Imagine you have a 0.2 probability of running into your crush each time you go to the library, and you plan to go to the library 10 times this month. Each visit is an independent event with a consistent probability of 0.2 of spotting your crush.
- Size parameter (N): the number of trips you'll make to the library; here N = 10.
- Success probability (θ): the probability that you'll see your crush during a single visit, θ = 0.2.
- Random variable (X): the number of times you'll see your crush over the 10 visits.
- Probability mass function (PMF): calculates the probability of seeing your crush exactly a certain number of times out of 10, based on N and θ.
Say you want to know the probability that you'll see your crush exactly 3 times in those 10 visits. Since this situation fits a binomial distribution with N = 10 and θ = 0.2, we use the binomial probability formula:

P(X = 3) = C(10, 3) × θ³ × (1 − θ)⁷

where C(10, 3) represents the number of ways to choose the 3 visits (out of 10) on which you'll see your crush, θ³ the probability of seeing your crush exactly 3 times, and (1 − θ)⁷ the probability that you don't see your crush during the remaining 7 visits.

DISTRIBUTIONS: The Normal Distribution
Also known as the "bell curve" or "Gaussian distribution". It is described using two parameters: the mean of the distribution (μ, mu) and the standard deviation of the distribution (σ, sigma). It is a smooth curve, not histogram-like as in the binomial distribution: the binomial is discrete (you can't see your crush 4.5 times out of 10), whereas a normally distributed variable can take decimal values. It is widely used in statistics because many natural phenomena, like height, test scores, or reaction times, often follow this pattern.

The mean (μ) defines the centre of the distribution, and the standard deviation (σ) determines how spread out values are around the mean. e.g. in a student dorm, students go to bed around midnight (the mean); some stay up later or turn in earlier, and the SD describes how much they deviate from midnight.

(Quiz: which of the normally distributed cats has a larger standard deviation? What about the means?)

See Distributions: https://seeing-theory.brown.edu/probability-distributions/index.html

Properties of the Normal Curve
- Symmetry: the curve is perfectly symmetrical around the mean.
- Total area equals 1: the total area under the curve represents the entirety of possibilities (or probabilities) and is always equal to 1.
- The 68–95–99.7 rule: approximately 68% of values fall within 1 standard deviation of the mean, 95% within 2, and 99.7% within 3.
- The mean, median and mode are exactly the same.

Example: How long does it take people to text back their crushes?
After polling a bunch of people, you find that, on average, it takes 15 minutes for a person to reply, with a standard deviation of 5 minutes. Some people reply instantly (0 minutes) while others wait hours. Using this data:
- 68% of people reply within 10 to 20 minutes (1 standard deviation).
- 95% reply within 5 to 25 minutes (2 standard deviations).
- The rare few outside the 99.7% range (only about 0.3% of values lie beyond 3 standard deviations) might take longer, making their crush wonder if they're playing hard to get or simply missed the message altogether. If your friend has been waiting more than 30 minutes for a reply, they might be in that rare zone: maybe their crush is taking the "mystery" tactic to new heights!
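A minimal sketch of the 68–95–99.7 rule applied to the texting example (illustrative; it uses SciPy's norm and assumes reply time is normal with μ = 15 and σ = 5):

```python
from scipy import stats

mu, sigma = 15, 5                      # reply time in minutes
reply = stats.norm(mu, sigma)

# Probability of a reply within 1, 2 and 3 SDs of the mean
for k in (1, 2, 3):
    p = reply.cdf(mu + k * sigma) - reply.cdf(mu - k * sigma)
    print(f"within {k} SD: {p:.3f}")   # about 0.683, 0.954, 0.997

# Probability of waiting more than 30 minutes (3 SDs above the mean)
print(f"P(wait > 30 min) = {reply.sf(30):.4f}")  # about 0.0013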
Probability Density
Probability density is a way of describing where values are likely to "bunch up" within a range. For continuous data, things that can take any value (height, distance or time), we can't state the probability of any exact value, so we look at ranges. "How likely is someone to reply to a message in exactly 5 minutes?" The reply might take 4 min 59 sec 10 ms, or 5 min 25 sec, and so on.

What's the difference between probability and probability density?
The height of the curve shows how likely values are to gather around certain areas, but it is the area under the curve (between two points) that represents the actual probability of values falling within that range. The density itself doesn't give us probabilities; it helps us understand how values are distributed. It determines the shape of the curve, and thus the distribution of area underneath it. Where the probability density is high (such as near the mean in a normal distribution), the area under the curve over a small interval will be larger, representing a higher probability of values within that interval.

Area Under the Curve
The total area under the whole curve is always 1. (Why? Remember yesterday's lecture on relative frequencies.) What the AUC tells us: if you want to know the probability that an adult's height falls between 160 cm and 180 cm, the area under the curve between these heights will tell you. If the area is 68%, then 68% of adults fall between 160 cm and 180 cm in height; equivalently, a randomly chosen adult has a 68% chance of being in that range.
Interactive graph of the area under the curve: https://onlinestatbook.com/2/calculators/normal_dist.html

PSY201 INFERENTIAL STATISTICS — PART 2
Dr Irem Ece Eraydin, Postdoctoral Research Fellow, University of Southampton, UK
Estimating Unknown Quantities from a Sample — Chapter 10 (Navarro, 2015)

Descriptive Statistics vs Inferential Statistics
Descriptive: summarising and describing the data you have in hand, describing only the sample itself, with no conclusions beyond it. e.g. the number of times a student forgets to mute themselves in an online lecture and accidentally blurts out something embarrassing mid-lecture during one semester in the psychology department: average 2, maximum 5, minimum 0. (Average, maximum, minimum, median, mode, range, frequency, variance, standard deviation, percentiles, etc.)
Inferential: making predictions or drawing conclusions about a larger population based on data collected from a sample, going beyond the data you have. e.g. "How often do students accidentally send a gossipy text about someone to that person instead of to their best friend?" Survey 45 students, find the average number of "oops, wrong person" mistakes per semester, and estimate how frequently this happens across all students at the university.

Samples, Populations, and Sampling
Population: the entire group you are interested in (all students in the university). Sample: a smaller, manageable group drawn from the population (e.g. the Departments of Psychology and Engineering).
The goal is to make inferences about a population. We have no access to the whole population, so we work with a subset of it. e.g. "The global prevalence of ADHD in children is estimated to be around 5%."
Probability theory is the foundation that all statistical theory builds upon, and sampling theory is the frame: it tells us how to select a sample from a population in a way that accurately represents the entire population. Selecting the right sample is crucial for reliable and valid results that are applicable to the population.
PSY201 INFERENTIAL STATISTICS
DR IREM ECE ERAYDIN
Postdoctoral Research Fellow, University of Southampton, UK
BSc Psychology – Middle East Technical University
MSc Cognitive Neuroscience – King's College London
PhD Neuroscience – University of Manchester
Research interests: psychedelics, attention, ADHD, psychopharmacology, neuromodulation
[email protected]

PART 2
Estimating Unknown Quantities from a Sample
Chapter 10 (Navarro, 2015)

Descriptive Statistics vs Inferential Statistics
Descriptive
Summarising and describing the data you have in hand.
▪ Only describes the sample itself – with no conclusions beyond it.
E.g. the number of times a student forgets to mute themselves in an online lecture and accidentally blurts out something embarrassing mid-lecture during one semester in the psychology department.
Avg: 2 | Max: 5 | Min: 0
Average, maximum, minimum, median, mode, range, frequency, variance, standard deviation, percentiles, etc.

Inferential
Making predictions or drawing conclusions about a larger population based on data collected from a sample.
▪ Goes beyond the data you have.
▪ Makes predictions about a larger group using a smaller sample.
E.g. “How often do students accidentally send a gossipy text about someone to that person instead of their best friend?”
Survey 45 students – find the average number of “oops, wrong person” mistakes per semester.
Estimate how frequently this happens across all students at the university.

Samples, Populations, and Sampling
Population: the entire group you are interested in (all students in the university).
Sample: a smaller, manageable group from the population (Dept. of Psychology + Dept. of Engineering).
Goal – making inferences about a “population”.
No access to the whole population – we have a subset of the population.
E.g. “The global prevalence of ADHD in children is estimated to be around 5%.”
Probability theory is the foundation all statistical theory builds upon, and sampling theory is the frame – how to select a sample from a population in a way that accurately represents the entire population.
Selecting the right sample is crucial for reliable and valid results that are applicable to the population!

Why does the sampling method matter?
Representation – the sample accurately reflects the characteristics of the larger population.
Generalisability – ensures inferences are valid and can be generalised to the population.
Bias – minimising selection bias; making sure no groups are overrepresented or underrepresented.
Statistical power – a well-structured sample allows us to detect true effects and make accurate estimates.
Confidence in results.
Ethical considerations – some groups are impractical or unethical to study as an entire population, e.g. the potential burden on people with compromised health, or the risk of re-evoking trauma.

What is the worst thing that improper sampling could cause?

Sampling Fail: How Gender Bias Skews Medical Research
Under-research and improper sampling on the presentation and symptoms of health problems in women – the majority of medical research data is collected on males and generalised to females.
Cardiovascular disease – symptoms are different in women – misdiagnosis or late diagnosis – women are more likely to suffer major adverse events and higher overall mortality (Merone et al., 2022 – see notes for references).
Pain research – pain management strategies.
Pharmaceutical research – dosing and side effects; how drugs affect women.
Endometriosis – diagnosis takes 4 to 11 years.

Types of Sampling
Probability Sampling
Utilises random selection – all members of the population have a chance of being selected.
▪ Simple Random Sampling
▪ Stratified Sampling
▪ Cluster Sampling
▪ Systematic Sampling

Non-Probability Sampling
Not all members of the population have an equal chance of being selected.
▪ Convenience Sampling
▪ Purposive Sampling
▪ Snowball Sampling
▪ Quota Sampling
▪ Self-Selection Sampling

Simple Random Sampling
Every individual has an equal chance of being selected – the selection of one does not affect another.
Random process – the same procedure leads to different results each time.
Can be with or without replacement – you may or may not observe the same member multiple times.
Tombala (without replacement); drawing names from a bag/hat; random number generators:
https://www.randomlists.com/team-generator
https://www.randomizer.org/
E.g. how I randomised the mice into different medication groups.

Stratified Sampling
Divide the population into subgroups (strata) and then sample from each subgroup – still random.
Schizophrenic and non-schizophrenic groups.
Oversampling – intentionally exaggerating the representation of rare groups.
E.g. how I randomised the mice into different medication groups – this time with 3 subgroups (high attention, medium attention, and low attention).

Other Probability Sampling Types
Cluster Sampling
The population is divided into clusters (groups) and entire clusters are randomly selected.
E.g. research on the mental health behaviours of high school students: each high school in the city is a cluster; randomly select a few schools.

Systematic Sampling
Selecting participants at regular intervals from an ordered list of the population.
▪ The population must be arranged in some logical order (numerical, alphabetical, etc.).
▪ A sampling interval (k) is determined by dividing the total population size (N) by the desired sample size (n).
▪ A random starting point is selected within the sampling interval (k); then select every k-th member until you reach n.
E.g. a company with 1,000 employees – you need to survey 100 employees.
k (sampling interval) = 1000 / 100 = 10.
Randomly select a number between 1 and 10 as a starting point.
Select every 10th employee starting from, say, the 4th employee (4th, 14th, 24th, etc.) – see the sketch below.
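A minimal Python sketch of the employee example, contrasting simple random sampling with systematic sampling (the ID list and seed are illustrative, not from the slides):

```python
import random

random.seed(4)  # illustrative seed, so the output is reproducible

employees = list(range(1, 1001))  # hypothetical employee IDs 1..1000
n = 100                           # desired sample size

# Simple random sampling: every employee has an equal chance, without replacement
srs = random.sample(employees, n)

# Systematic sampling: k = N / n = 1000 / 100 = 10
k = len(employees) // n
start = random.randint(1, k)          # random starting point within the interval
systematic = employees[start - 1::k]  # every k-th employee from the start

print(len(srs), len(systematic))  # 100 100
print(systematic[:3])             # e.g. [4, 14, 24] if start = 4
```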
Non-Probability Sampling
Convenience Sampling
Participants are selected based on their availability and proximity to the researcher.
Quick data collection.
Non-random – potential bias; limited generalisability.
E.g. a psychology student conducting a survey for an assignment: rather than randomly selecting students from all over the campus, the student surveys classmates, friends, etc.

Snowball Sampling
Relies on referrals from existing participants to identify and recruit new participants.
Snowball effect – each participant helps recruit more participants.
Often used in qualitative research, and for hard-to-reach, hidden, or marginalised populations: homeless individuals, substance users, migrants and refugees, people with rare medical conditions.
Non-random – potential bias; difficulty in controlling sample size.

Other Non-Probability Sampling Types
Purposive Sampling
Selecting participants based on specific characteristics or criteria relevant to the research study.
▪ Using judgment to choose individuals who are likely to provide the most informative data.
▪ E.g. people who have been diagnosed with diabetes for at least 5 years.
▪ E.g. market research – frequent users of a specific product.

Quota Sampling
The researcher ensures that specific characteristics of a population are represented in the sample by setting quotas for those characteristics.
▪ Once the quotas are filled, no additional participants are added, even if they meet the criteria.
▪ E.g. a survey on voter preferences: set quotas based on demographic factors such as age, gender, and political affiliation to ensure the sample reflects the overall voter population (60% men, 40% women).

Self-Selection Sampling
Individuals volunteer to participate in a study rather than being randomly selected.
▪ Based on participants' interest or willingness to contribute.
▪ E.g. conducting an online survey on mental health: posting your survey link on social media and inviting people to participate.

Why Aren't Samples Perfect? Population Parameters and Sample Statistics
Even with the best sampling methods, samples rarely reflect the population perfectly, due to sampling error.
The Law of Large Numbers (LLN) says that the larger our sample, the closer our sample mean will get to the true population mean – in other words, more data leads to better estimates.
E.g. ask your friends about their favourite spots on campus. With only a few answers, you might get a few scattered spots. Ask more students, and one or two places start to emerge as favourites, so you can predict which spots will be more crowded.
E.g. the coin-tossing example from the book (Chapter 9) – p = 0.5. A minimal simulation of this is sketched below.

Rolling Dice: https://digitalfirst.bfwpub.com/stats_applet/stats_applet_11_largenums.html
Flipping Coin: https://observablehq.com/@mattiasvillani/law-large-numbers
Other interactive LLN graphs: https://demonstrations.wolfram.com/IllustratingTheLawOfLargeNumbers
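A minimal coin-flip sketch of the LLN, assuming a fair coin (p = 0.5) and an arbitrary seed for reproducibility:

```python
import random

random.seed(201)  # arbitrary seed so the demo is reproducible

flips = [random.random() < 0.5 for _ in range(100_000)]  # True = heads, fair coin

for n in (10, 100, 1_000, 10_000, 100_000):
    prop = sum(flips[:n]) / n
    print(f"n = {n:>6}: proportion of heads = {prop:.4f}")
# With small n the proportion wobbles; with large n it settles near the true p = 0.5
```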
Sampling Distributions and the Central Limit Theorem (CLT)
The Central Limit Theorem says that if you take large enough samples from a population, the sample means will be normally distributed – regardless of the original distribution of the population.
Can anyone guess what the main difference is between the CLT and the LLN?

Interactive Graphs
See Distributions: https://seeing-theory.brown.edu/probability-distributions/index.html
Central Limit Theorem: https://pmplewa.github.io/clt.html
Central Limit Theorem 2: https://stats-interactives.ctl.columbia.edu/central-limit-theorem

CLT vs LLN
In the Central Limit Theorem (CLT):
You're taking multiple samples from the population.
For each sample, you calculate a sample mean.
As you collect more and more of these sample means, their distribution will start to look normal, regardless of the shape of the original population distribution (as long as each sample is large enough).
In the Law of Large Numbers (LLN):
You're working with one large sample (or observing an event repeatedly within a single large sample).
As you keep increasing the size of this single sample, the sample mean gets closer and closer to the population mean.

CLT vs LLN
Feature | Law of Large Numbers (LLN) | Central Limit Theorem (CLT)
Main focus | Convergence of the sample mean to the population mean | Shape of the distribution of sample means
Condition | Works as sample size increases | Works as sample size increases
Key concept | The sample mean approaches the population mean | The distribution of sample means approaches a normal distribution
Application | Relates to the reliability of the sample mean | Allows us to use normal-based statistics for inference
Population distribution | No assumption about the population distribution needed | No assumption needed; works regardless of the original distribution

Estimating Population Parameters
Sample statistics
Summarise a specific characteristic of a sample (a subset of the population).
▪ Describe the data within the sample itself.
▪ Known values, calculated directly from the sample data.
▪ Provide information about the sample – used to make inferences about the population.
Estimates of population parameters
Use sample statistics to approximate an unknown characteristic of the entire population.
▪ Aim to provide insight into the broader population.
▪ An estimate gives the best guess about the actual value.
▪ Generalise sample findings to the entire population – using a confidence interval to express the uncertainty.

Estimating Population Parameters
Measure the IQs of 100 people in a town and find the average score – an estimate of the mean IQ of the town's population.
We don't know the actual population mean – estimates always involve some uncertainty.
So we need more parameters to help us understand the reliability of the mean (or how uncertain we should be).

Standard Deviation
How spread out or dispersed the values in a dataset are around the mean – how far each data point is from the mean.
Low standard deviation – low variability – indicates consistency.
High standard deviation – high variability – possible outliers.

Why the adjustment? Bessel's correction – using n − 1 instead of n
The variance computed from a sample by dividing by n tends, on average, to underestimate the population variance. Dividing by the smaller number n − 1 instead makes the estimate slightly larger, and therefore a better (less biased) estimate of the population variance and standard deviation. A quick simulation of this is sketched below.
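This sketch makes the correction visible under assumed parameters (an IQ-like population with mean 100 and SD 15, and samples of size 10 – none of these numbers come from the slides): averaging many sample variances shows that dividing by n systematically undershoots the true variance, while dividing by n − 1 does not.

```python
import random
import statistics

random.seed(0)
population = [random.gauss(100, 15) for _ in range(100_000)]  # assumed IQ-like scores
true_var = statistics.pvariance(population)                   # divide-by-N population variance

n, reps = 10, 5_000
sum_biased = sum_unbiased = 0.0
for _ in range(reps):
    sample = random.sample(population, n)
    sum_biased += statistics.pvariance(sample)   # divides by n
    sum_unbiased += statistics.variance(sample)  # divides by n - 1 (Bessel's correction)

print(f"True population variance:  {true_var:.1f}")             # ~225
print(f"Mean divide-by-n estimate: {sum_biased / reps:.1f}")    # noticeably too small
print(f"Mean divide-by-(n-1):      {sum_unbiased / reps:.1f}")  # close to the truth
```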
Confidence Interval (CI)
A range of values used to estimate an unknown population parameter, such as a population mean or proportion, with a certain degree of confidence. It provides an interval within which we believe the true population parameter lies, based on the data from a sample.
Provides more information than a single point estimate (e.g. the mean) – it indicates a range of values we believe contains the true population parameter.
Instead of saying “the average score is 75,” a confidence interval allows us to say, “we're 95% confident the average score is between 70 and 80,” giving us a more nuanced understanding of the data.

Interpreting CI
Common misinterpretations and the correct understanding:
Misinterpretation: “There is a 95% chance that the population mean is within the interval.” This is incorrect because each interval either contains the population mean or it doesn't. The 95% confidence refers to the process – if we repeated the sampling, 95% of the intervals calculated would contain the true mean.
Correct interpretation: if we were to take 100 different samples and calculate a 95% CI for each, about 95 of those intervals would contain the true population mean.

Interpreting CI
Imagine you want to estimate the average number of hours students study each week. You survey a random sample of 50 students, and the results give you a sample mean of 12 hours with a 95% confidence interval of [10.89, 13.11].
Incorrect interpretation: “There is a 95% chance that students' true average study time falls between 10.89 and 13.11 hours.” This is misleading because the true average either lies in this range or it doesn't; there is no probability involved for a specific interval after it has been calculated.
Correct interpretation: “If we were to repeat this process many times, 95% of the confidence intervals we construct would contain the true average study time for all students.” This interpretation emphasises that the confidence level applies to the method used to create the interval, not to this particular interval.
This clarification reinforces that confidence intervals are about long-run probability – over many repeated samples, the interval method will contain the true mean 95% of the time.

Interpreting CI – Example
Imagine you're researching how long people typically take to text back their crushes. You survey 40 people and find that, on average, they take 15 minutes to respond, with a standard deviation of 10 minutes.
Calculating a 95% confidence interval:
CI = x̄ ± 1.96 × (s / √n) = 15 ± 1.96 × (10 / √40) ≈ 15 ± 3.1 = [11.9, 18.1]
Interpretation:
Incorrect: “There's a 95% chance that people, on average, take between 11.9 and 18.1 minutes to text back their crush.” This misinterpretation implies probability in a single interval, which isn't accurate.
Correct: “If we repeatedly sampled groups of people and calculated confidence intervals for their texting response times, 95% of those intervals would contain the true average response time for people texting their crushes.” This interpretation clarifies that the confidence level pertains to the method over many samples, not to the specific interval.
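A minimal sketch reproducing the interval from the numbers above (1.96 is the standard-normal critical value implied by the slide's result; with s estimated from n = 40, a t-based critical value of about 2.02 would widen the interval slightly):

```python
from math import sqrt

mean, sd, n = 15, 10, 40  # sample statistics from the texting survey
z = 1.96                  # standard-normal critical value for 95% confidence

se = sd / sqrt(n)         # standard error of the mean
lower, upper = mean - z * se, mean + z * se
print(f"95% CI: [{lower:.1f}, {upper:.1f}]")  # [11.9, 18.1]
```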
Thank you!

Z-scores
A Lecture on Standardization, z-Scores and Normal Distribution
Teaching Assistant: Saliha Erman, MA
Source: Adapted from Assoc. Prof. Güneş Ünal, Boğaziçi University, 2022

Recap
Recap from last week:
Summarizing data: frequency table & frequency histogram.
Additions to the frequency table: relative frequency (rf), cumulative frequency (cf), cumulative relative frequency (crf) -> percentiles.
There are different ways of describing the data: central tendency (mean, median, mode) vs. variance/dispersion (SD, variance, range, IQR, MAD).
Reading a boxplot: the position of quartiles, mean, outliers + skewness.

The Sum of Squares
Another (and more common) way to get around the sum-to-zero problem is to square each deviation score. The sum (total) of these squared deviations is called the sum of squares:
SS = Σ(Y − Ȳ)²

Y | Y − Ȳ | (Y − Ȳ)²
2 | −1.5 | 2.25
1 | −2.5 | 6.25
4 | +0.5 | 0.25
3 | −0.5 | 0.25
4 | +0.5 | 0.25
5 | +1.5 | 2.25
7 | +3.5 | 12.25
2 | −1.5 | 2.25
3 | −0.5 | 0.25
4 | +0.5 | 0.25
Sum: 35 | 0.0 | 26.50
Mean: 3.5 | 0.0 |

Variance
The variance (s²) is the sum of squares (SS) divided by the number of cases (N):
s² = Σ(Y − Ȳ)² / N
For the dataset above: s² = 26.50 / 10 = 2.65.

Standard Deviation
The standard deviation (s) is the positive square root of the variance:
s = √s²
For the dataset above: s = √2.65 ≈ 1.63.
Unlike the variance, the standard deviation (SD) is in the same units of measurement as the original scores, since the squaring is undone.
Unlike the sum of squares (SS), the variance and SD adjust for sample size by dividing by N.
Both the MAD and the SD are in the same units of measurement as the original scores. The difference between them is that the SD squares deviations, which makes it more sensitive to outliers and larger deviations. Additionally, the SD is more frequently used because of its relationship with the normal distribution.
Note that the sum of squares, the variance, and the standard deviation will always be greater than or equal to zero (0), because they are all based on squared deviation scores. If any of them takes a value of 0, it means that there is no variability in the data set.

Exercises
The Variance (σ²) and Standard Deviation (SD or σ)
Q1: Calculate on your own the variance and the SD of the dataset below, and check your answer with the solution below.

Standardization & z-Scores