Statistics Summary PDF
Summary
This document summarizes chapters from an introductory statistics course, providing definitions and examples for descriptive and inferential statistics, measures of central tendency (mean, median, and mode), scales of measurement, variability, probability and normal distributions, hypothesis testing, estimation, ANOVA, correlation, and regression.
Full Transcript
Note: in blue or highlighted in the original document: what was said or emphasized by the lecturers.

Chapter 1 – Introduction to Statistics

Statistics = "the study of how we describe and make inferences from data" (Sirkin); a branch of mathematics used to summarize, analyze and interpret a group of numbers or observations.

Two ways of evaluating information:
o Descriptive statistics = applying statistics to organize, summarize and make sense of information. Typically presented graphically, in tabular form or as summary statistics (single values)
o Inferential statistics = applying statistics to interpret the meaning of information → to answer a question or make an actionable decision

Mark Twain: "There are lies, damned lies and statistics."

Data = measurements or observations that are typically numeric. A datum is a single measurement or observation, usually referred to as a score or raw score.

Remembering ISSR: a variable is a measured property of each of the units of analysis (e.g.: age, GDP, household income, annual revenue).

Descriptive statistics
Typically used to quantify the behavior researchers measure.
Instead of listing each individual score on an exam, we could summarize all scores by stating the average (mean), middle (median) or most common (mode) score among all individuals, which can be more meaningful.

Inferential statistics
A conclusion reached on the basis of evidence and reasoning.
Allow researchers to infer or generalize observations made with samples to the larger population from which they were selected.
Inferential statistics are used to help the researcher infer how well statistics in a sample reflect parameters in a population.
Population parameter = a characteristic (usually numeric) that describes a population.
Sample statistic = a characteristic (usually numeric) that describes a sample.
The characteristics of interest are typically obtained through descriptive analysis.

Research methods and statistics
Science = the study of phenomena, such as behavior, through strict observation, evaluation, interpretation, and theoretical explanation.
Research method (or scientific method) = a set of systematic techniques used to acquire, modify, and integrate knowledge concerning observable and measurable phenomena.

Scales of measurement
Scales of measurement = rules for how the properties of numbers can change with different uses; imply that the extent to which a number is informative depends on how it was used or measured.
Scales of measurement are characterized by three properties:
1. Order: does a larger number indicate a greater value than a smaller number?
2. Difference: does subtracting two numbers represent some meaningful value?
3. Ratio: does dividing (or taking the ratio of) two numbers represent some meaningful value?

Scale      Order  Difference  Ratio
Nominal    No     No          No
Ordinal    Yes    No          No
Interval   Yes    Yes         No
Ratio      Yes    Yes         Yes
S. S. Stevens = Harvard psychologist who coined the terms nominal, ordinal, interval and ratio.

Nominal scales = measurements in which a number is assigned to represent something or someone; often data that have been collected.
o Group classifications, no meaningful ranking possible, numerical coding arbitrary
o E.g.: a person's sex, race, nationality, sexual orientation, hair and eye color, season of birth, marital status, or other demographic or personal information
o Coding (converting a nominal or categorical variable to a numeric value) is useful when entering names of groups for a research study into statistical programs such as SPSS, because it can be easier to enter and analyze data when group names are entered as numbers, not words

Ordinal scales = measurements that convey order or rank alone.
o E.g.: finishing order in a competition, education level and rankings
o Indicate that one value is greater than or less than another
o Differences between ranks are unknown/not equal and have no meaning

Interval scales = measurements that have no true zero and are distributed in equal units (equidistant).
o True zero = when the value of 0 truly indicates nothing on a scale of measurement
o E.g.: rating scales, temperature in degrees Celsius (a temperature equal to zero does not mean that there is no temperature; it is just an arbitrary zero point)
o Implication of not having a true zero: there is no outright value to indicate the absence of the phenomenon you are observing, so ratios of scores are not meaningful

Ratio scales = measurements that have a true zero and are distributed in equal units; the most informative scales of measurement.
o Counts and measures of length, height, weight and time
o Order and differences are informative
o It is meaningful to state that 60 pounds is twice as heavy as 30 pounds

We always first need to know the level of measurement in order to know which statistical techniques we may use for the given variable.
Nominal → ordinal → interval → ratio
Qualitative variables → quantitative variables

Types of variables for which data are measured
Continuous variable = measured along a continuum at any place beyond the decimal point; can be measured in fractional units.
Discrete variable = measured in whole units or categories that are not distributed along a continuum.
Quantitative variable = varies by amount; is measured numerically and is often collected by measuring or counting.
Qualitative variable = varies by class; is often represented as a label and describes nonnumeric aspects of phenomena → only discrete variables.

Chapter 3 – Summarizing Data: Central Tendency

Measures of central tendency = statistical measures for locating a single score that is most representative or descriptive of all scores in a distribution; values at or near the center of a distribution.
o Although we lose some meaning anytime we reduce a set of data to a single score, statistical measures of central tendency ensure that the single score meaningfully represents a set of data
o E.g.: mean, median, mode

Population size = N
Sample size = n

Mean = arithmetic mean or average = balance point in a distribution; works for interval and ratio levels of measurement; its value shifts in a direction that balances a set of scores; it is the sum (Σ) of a set of scores (x) divided by the total number of scores summed, in either a sample (n) (sample mean) or a population (N) (population mean).
o The mean can be misleading when a data set has an outlier, because the mean will shift toward the value of that outlier. Outliers in a data set influence the value of the mean but not the median
o Population mean: μ = Σx / N
o Sample mean: M = Σx / n
o Changing an existing score will change the mean → if you increase the value of an existing score, the mean will increase; if you decrease the value of an existing score, the mean will decrease
o Adding a new score or removing an existing score will change the mean, unless that value equals the mean
§ If the new score added is less than the previous mean, the mean will decrease; if the new score added is greater than the previous mean, the mean will increase
§ Deleting a score below the mean will increase the value of the mean; deleting a score above the mean will decrease the value of the mean
§ The only time that a change in a distribution of scores does not change the value of the mean is when the value that is added or removed is exactly equal to the mean
o Adding, subtracting, multiplying or dividing each score in a distribution by a constant will cause the mean to change by that constant
o Summing the differences of scores from their mean equals 0
§ When the mean is subtracted from each score (x), then summed, the solution is always zero
§ Σ(x − M) = 0
o The sum of the squared (SS) differences of scores from their mean is minimal → the smallest possible positive number greater than 0
§ Σ(x − M)² = minimal

Weighted mean (Mw) = the combined mean of two or more groups of scores in which the number of scores in each group is disproportionate or unequal; can be used to compute the mean for multiple groups of scores when the size of each group is unequal (when some samples have more scores than others).
Mw = Σ(M × n) / Σn
o The weighted mean is larger than the arithmetic mean when the larger sample scored higher

Median = the middle value in a distribution of data listed in numeric order; works for ordinal, interval and ratio levels of measurement.
median position = (n + 1) / 2
o The median is not as sensitive to outliers as the mean
o The 50th percentile of a cumulative percent distribution can be used to estimate the value of the median

Mode = the value in a data set that occurs most often or most frequently; is often reported in research journals with the mean or median; works for nominal, ordinal, interval and ratio levels of measurement.
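To make these definitions concrete, here is a minimal sketch in Python (illustrative only; the course itself uses SPSS). The scores and group sizes are invented example data, not from the lectures.

```python
import statistics

scores = [2, 3, 3, 4, 5, 7]  # hypothetical exam scores

print(statistics.mean(scores))    # balance point: sum of scores / n = 4.0
print(statistics.median(scores))  # middle value of the ordered scores = 3.5
print(statistics.mode(scores))    # most frequent score = 3

# Weighted mean of unequal groups: Mw = sum(M * n) / sum(n)
groups = [(80.0, 20), (90.0, 10)]  # (group mean, group size), invented
mw = sum(m * n for m, n in groups) / sum(n for _, n in groups)
print(mw)  # 83.33..., pulled toward the mean of the larger group

# Property check: deviations of scores from their mean sum to zero
m = statistics.mean(scores)
print(sum(x - m for x in scores))  # 0.0
```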
Choosing an appropriate measure of central tendency

The choice of which measure to select depends largely on the type of distribution and the scale of measurement of the data.

Normal distribution = symmetrical, Gaussian or bell-shaped distribution = a theoretical distribution in which scores are symmetrically distributed above and below the mean, and the mean, the median and the mode are all at the center of the distribution.

The mean is most useful for describing (more or less) normally distributed variables and measures on an interval or ratio scale, because all scores are included in its calculation.
o Describing normal distributions = when your data are continuous. The mean, median and mode are the same value, located at the center of the distribution
o Describing interval and ratio scale data = data that can be described in terms of the distance that scores deviate from the mean. Differences between two scores are meaningfully conveyed for data on an interval or ratio scale only

The median is often used for interval/ratio data that have skewed distributions, and for measures on an ordinal scale.
o Skewed distribution = distribution of scores that includes outliers or scores that fall substantially above or below most other scores in a data set
o Positively skewed distribution = distribution of scores in which outliers are substantially larger (toward the right tail in a graph) than most other scores
o Negatively skewed distribution = distribution of scores in which outliers are substantially smaller (toward the left tail in a graph) than most other scores
o Describing skewed distributions = the value of the mean is pulled toward the skewed data points. In a positively skewed distribution, the mean is greater than the mode; in a negatively skewed distribution, the mean is less than the mode
§ The location of the median is unpredictable and can fall on any side of the mode, depending on how the scores are distributed
§ Outliers distort the value of the mean, making it a less meaningful measure for describing all data in a distribution. The median is not influenced by the value of the outliers and, therefore, is more representative of all data in a skewed distribution, making it the most appropriate measure of central tendency to describe these types of distributions
o Describing ordinal scale data = the median is used to describe ranked or ordinal data that convey direction only (the fifth person to finish a task took longer than the first person to finish a task), because the distance (deviation) of ordinal scale scores from their mean is not meaningful
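A tiny invented example of this effect: one outlier pulls the mean toward the tail while the median stays put.

```python
import statistics

symmetric = [4, 5, 5, 6, 6, 7]
skewed = [4, 5, 5, 6, 6, 40]  # same scores with one large outlier (positive skew)

print(statistics.mean(symmetric), statistics.median(symmetric))  # 5.5 5.5
print(statistics.mean(skewed), statistics.median(skewed))        # 11.0 5.5
# The outlier pulls the mean toward the tail; the median is unchanged,
# which is why the median better represents skewed data.
```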
The mode can be used for nominal, ordinal or interval/ratio variables.
o Describing modal distributions
Modal distribution = distribution of scores in which one or more scores occur most often or most frequently.
§ Unimodal distribution = distribution of scores in which one score occurs most often or most frequently. Has one mode. E.g.: normal distributions (mode + mean) and skewed distributions (mode + median)
§ Bimodal distribution = distribution of scores in which two scores occur most often or most frequently. Has two modes. The mean and the median are typically located between the two modes. E.g.: data for two groups with unique characteristics combined (height of adult American men and women)
§ Multimodal distribution = distribution of scores in which more than two scores occur most often or most frequently. Has more than two modes
§ Nonmodal distribution = rectangular distribution = distribution of scores in which all scores occur at the same frequency. Has no mode. The mean or median can be used to describe the data
o Describing nominal scale data = the mode is used to describe nominal data that identify something or someone. Because a nominal scale value is not a quantity, it does not make sense to use the mean or median to describe these data
§ Anytime you see phrases such as most often, typical or common, the mode is being used to describe these data

Summarized:

Measure of central tendency   Shape of distribution   Measurement scale
Mean                          Normal                  Interval, ratio
Median                        Skewed                  Ordinal
Mode                          Modal                   Nominal

Lecture 11/11/19

Statistics ≠ statistics
Univariate: what was the average grade of the ISA exam last year?
Bivariate: did males and females differ in their grades? Gender → grade
Multivariate: was the grade dependent on initial motivation, the time spent on reading and gender?

Chapter 4 – Summarizing Data: Variability

Measuring variability

Variability = measure of the dispersion or spread of scores in a distribution; ranges from 0 to ∞.
o To determine how dispersed scores are in a set of data

Measures of variability

1. Range (ordinal, interval/ratio) = distance between highest and lowest score.
o Range = largest value − smallest value
o Always reported together with the maximum and minimum score (max − min)
o Sensitive to outliers

2. Interquartile range, IQR (ordinal, interval/ratio) = based on quartiles that split our data into four equal groups of cases.
o IQR = upper quartile (75th percentile) − lower quartile (25th percentile) = Q3 − Q1
o Little influence of outliers, because the IQR excludes the top and bottom 25% of scores in a distribution
o The median for all data = median quartile (50th percentile)
o SIQR = (Q3 − Q1) / 2 → limited estimate of variability, because it excludes half the scores in a distribution
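A short sketch of these two measures using NumPy (assuming its default percentile interpolation); the scores are made up for illustration.

```python
import numpy as np

scores = np.array([2, 4, 5, 7, 8, 9, 11, 30])  # invented data with one outlier

score_range = scores.max() - scores.min()   # range = largest - smallest
q1, q3 = np.percentile(scores, [25, 75])    # lower and upper quartiles
iqr = q3 - q1                               # interquartile range = Q3 - Q1
siqr = iqr / 2                              # semi-interquartile range

print(score_range, q1, q3, iqr, siqr)
# The outlier (30) inflates the range but not the IQR, because the IQR
# drops the top and bottom 25% of scores.
```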
3. Variance (interval/ratio) = a measure of variability for the average squared distance that scores deviate from their mean; based on the sum of squares (SS), i.e. the squared differences from the mean.
o A preferred measure of variability, because all scores are included in its computation. Variance can be computed for data in populations and samples
o For the calculation of the variance, it matters whether we have sample data or population data (typically sample data). The definitional formulas for variance:
s² = Σ(x − M)² / (n − 1)  or  σ² = Σ(x − μ)² / N
o The SS produces the smallest possible value for deviations of scores from their mean, different from 0 → it minimizes error. The SS is computed in the same way for sample variance and population variance
o If a population has a variance of x, on average, any sample selected from this population should also have a variance of x → if not, the sample variance is biased
o Why are there two different variance formulas for sample data and population data?
§ We often use the sample variance as an 'estimator' for the population variance (which is typically unknown)
§ When we calculate sample variance, we therefore divide by n − 1, to arrive at an unbiased estimator of the population variance
§ This is particularly relevant in small samples
o How can we interpret the value of the variance?
§ We don't, but: everything is meaningful in comparison. When comparing variances across groups, we can make comparative statements about more/less dispersion around the mean
§ For the purpose of interpretation, we calculate another measure of variability (the standard deviation, below)
o The computational formula for variance = the raw scores method for variance:
§ SS = Σx² − (Σx)² / N for a population, or SS = Σx² − (Σx)² / n for a sample
§ Squaring decimals often requires rounding, and the mean is often a decimal, resulting in rounding errors. The computational formula for variance is an alternative that solves this problem

Fractiles = measures that divide a set of data into two or more equal parts. Fractiles include the median, quartiles, deciles and percentiles, which split data into 2 parts, 4 parts, 10 parts and 100 parts, respectively.

Deviation = the difference of each score from its mean.
o The sum of the deviations of scores from their mean is zero

Biased estimator = any sample statistic, such as the sample variance when we divide SS by n, obtained from a randomly selected sample, that does not equal the value of its respective population parameter on average.
Unbiased estimator = any sample statistic, such as the sample variance when we divide SS by n − 1, obtained from a randomly selected sample, that equals the value of its respective population parameter on average.

Degrees of freedom (df) for sample variance = the number of scores in a sample that are free to vary (when the sample mean is known). All scores except one are free to vary in a sample: n − 1.
o Example: with a sample mean of 4 and the first two scores known to be 3 and 4, the deviation of the first score from its mean is −1 (3 − 4 = −1), and the deviation of the second score from its mean is 0 (4 − 4 = 0). Therefore, it must be the case that −1 + 0 + (x − 4) = 0. Thus, x = 5, because it is the only value for x that makes the solution to this equation equal to 0

The standard deviation

Standard deviation = the root mean square deviation = a measure of variability for the average distance that scores deviate from their mean. It is calculated by taking the square root of the variance.
o Interval/ratio: the square root of the variance; an appropriate measure of the average distance to the mean
o Calculating the SD for sample data:
s = √( Σ(x − M)² / (n − 1) )
o Calculating the SD for population data:
σ = √( Σ(x − μ)² / N )

When scores are concentrated near the mean, the SD is smaller.
When scores are scattered far from the mean, the SD is larger.

For normally distributed variables, we can use the SD to make statements about the distribution → empirical rule:
o At least 68% of all scores lie within one SD of the mean
o At least 95% of all scores lie within two SDs of the mean
o At least 99.7% of all scores lie within three SDs of the mean

The SD is always positive: SD ≥ 0.
The SD is used to describe quantitative data.
The SD is most informative when reported with the mean.
The value of the standard deviation is affected by the value of each score in a distribution.
Adding or subtracting the same constant to each score will not change the distance that scores deviate from the mean. Hence, the SD remains unchanged.
Multiplying or dividing each score by the same constant will cause the SD to change by that constant.
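The n versus n − 1 point can be checked by simulation. The sketch below (invented parameters, NumPy) draws many small samples and shows that dividing SS by n underestimates the population variance on average, while dividing by n − 1 does not; it also checks the two constant-shift properties of the SD.

```python
import numpy as np

rng = np.random.default_rng(0)
population = rng.normal(loc=50, scale=10, size=100_000)  # sigma^2 = 100

# Average sample variance over many small samples, computed both ways
biased, unbiased = [], []
for _ in range(10_000):
    sample = rng.choice(population, size=5, replace=False)
    biased.append(np.var(sample, ddof=0))    # SS / n
    unbiased.append(np.var(sample, ddof=1))  # SS / (n - 1)

print(np.mean(biased))    # clearly below 100 on average -> biased
print(np.mean(unbiased))  # close to 100 on average -> unbiased

# SD properties: adding a constant leaves the SD unchanged;
# multiplying by a constant scales the SD by that constant
s = np.array([1.0, 3.0, 5.0])
print(np.std(s, ddof=1), np.std(s + 10, ddof=1), np.std(s * 3, ddof=1))  # 2.0 2.0 6.0
```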
Chapter 6 – Probability, Normal Distributions and z Scores

The normal distribution in behavioral science

The behavioral data that researchers measure often tend to approximate a normal distribution (in which scores are symmetrically distributed above and below the mean, with the median and the mode at the center of the distribution).

Characteristics of the normal distribution

Is mathematically defined.
o Rarely do behavioral data fall exactly within the limits of the mathematical formula. When we say that data are normally distributed, we mean that the data approximate a normal distribution. The normal distribution is so exact that it is impractical to think that behavior can fit exactly within the limits defined by the formula

Is theoretical.
o Data can be normally distributed in theory → approximation

The mean, the median and the mode are all located at the 50th percentile.
o Half of the data fall above the mean, the median and the mode, and half fall below

Is symmetrical.
o The distributions of data above and below the mean are the same

The mean can equal any value.
o −∞ ≤ M ≤ +∞

The SD can equal any positive value greater than 0 (if SD = 0, the data do not vary).

The total area under the curve of a normal distribution is equal to 1.0.
o Proportions of area under a normal curve are used to determine the probabilities for normally distributed data

The tails of a normal distribution are asymptotic.
o The tails of a normal distribution never touch the x-axis, so it is possible to observe outliers in a data set that is normally distributed

The standard normal distribution

Standard normal distribution = z distribution = a normal distribution with a mean equal to 0 and a standard deviation equal to 1. The standard normal distribution is distributed in z score units along the x-axis.
o Used to determine the probabilities of a certain outcome in relation to all other outcomes
§ To estimate probabilities under the normal curve, we determine the probability of a certain outcome in relation to all other outcomes

z score = a value on the x-axis of a standard normal distribution. The numerical value of a z score specifies the distance, in number of standard deviations, that a value is above or below the mean.

Standard normal transformation = the z transformation = a formula used to convert any normal distribution with any mean and any variance to a standard normal distribution with a mean equal to 0 and a standard deviation equal to 1.
o To locate where a score in any normal distribution would be in the standard normal distribution
o The mean in any normal distribution corresponds to a z score of 0
z = (x − μ) / σ for a population of scores, or z = (x − M) / SD for a sample of scores

The unit normal table: a brief introduction

The unit normal table = z table = a type of probability distribution table displaying a list of z scores and the corresponding probabilities (or proportions of area) associated with each z score listed (page 655).
o Column A lists the positive z scores. For the negative z scores below the mean, the normal distribution is symmetrical. Scores are listed from z = 0 at the mean to z = 4 above the mean
o Column B lists the area between a z score and the mean. The first value for the area listed in column B is .0000, which is the area between the mean (z = 0) and the mean itself. As the z score moves away from the mean, the proportion of area between that score and the mean increases toward .5000, the total area above the mean
o Column C lists the area from a z score toward the tail. The first value is .5000, which is the total area above the mean. As a z score increases and therefore moves closer to the tail, the area between that score and the tail decreases toward .0000

Locating proportions

We can use the unit normal table to locate the proportion (and therefore the probability) of a score in a normal distribution:
1. Transform a raw score (x) into a z score
2. Locate the corresponding proportion for the z score in the unit normal table

Locating proportions above the mean
o To locate the proportion, look in column A of the unit normal table for a z score equal to the one computed in step 1. The proportion toward the tail is listed in column C

Locating proportions below the mean
o To locate the proportion, look in column A of the unit normal table for a z score equal to the one computed, but positive (a proportion given for a positive z score will be the same for the corresponding negative z score). The proportion for a z score toward the lower tail is listed in column C

Locating proportions between two values
o The proportion between the mean and a z score of 1 is the same as that for −1. The total proportion between the value below the mean and the value above the mean is the sum of the proportions for each score

Locating scores

Finding scores at a given percentile can be useful in certain situations.
See examples in the book p. 182.
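In software, the lookup that the unit normal table does by hand is a one-liner. A sketch with scipy.stats; the μ = 500, σ = 100 distribution is an invented example:

```python
from scipy.stats import norm

# Proportion of scores above x = 650 when mu = 500 and sigma = 100
z = (650 - 500) / 100             # z transformation: z = (x - mu) / sigma
print(z)                          # 1.5
print(norm.sf(z))                 # area from z toward the tail (column C)
print(norm.cdf(z) - norm.cdf(0))  # area between the mean and z (column B)

# Locating a score at a given percentile (the reverse lookup)
z_90 = norm.ppf(0.90)             # z score at the 90th percentile
print(500 + z_90 * 100)           # back-transformed to a raw score
```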
Chapter 7 – Probability and Sampling Distributions

Selecting samples from populations

Inferential statistics and sampling distributions
o Sampling distribution = a distribution of all sample means or sample variances that could be obtained in samples of a given size from the same population
o Researchers use the sample statistics they measure in a sample to make inferences about the characteristics, or population parameters, in a population of interest

Sampling and conditional probabilities
o Use a random procedure to select a sample, to avoid bias
o Sampling without replacement = the most common sampling method in behavioral research = each participant or item selected is not replaced before the next selection. The probabilities of each selection are not the same; they are conditional
§ First draw: p = squares marked A / total number of squares
§ Second draw: p = remaining squares marked A / (total number of squares − 1)
o Sampling with replacement = each participant or item selected is replaced before the next selection, to ensure that each individual or item has the same probability of being selected. This method is used in the development of statistical theory
§ It is typically not necessary to use this method in behavioral research because the populations of interest are large

Selecting a sample: who's in and who's out?

Sample design = specific plan or protocol for how individuals will be selected or sampled from a population of interest.
Does the order of participants matter? → determines how often people in a population can be selected.
Do we replace each selection before the next draw?

Sampling strategy: the basis for statistical theory
o Theoretical sampling is used in the development of the theories that have led to statistics as a branch of mathematics
o In theoretical sampling, the order of selecting individuals matters, and each individual selected is replaced before sampling again → with replacement
o Total number of samples possible = N^n

Sampling strategy: most used in behavioral research
o For most studies in behavioral science, order does not matter because we do not care about the order in which participants are selected
o In experimental sampling, the order of selecting individuals does not matter, and each individual selected is not replaced before selecting again
o Total number of samples possible = N! / (n!(N − n)!)

Sampling distributions: the mean

The population mean (μ) is computed by summing all scores in the population, then dividing by the population size.
The average sample mean (μ_M) is computed by dividing the sum of the sample means (ΣM) by the total number of samples summed.
The sample mean is an unbiased estimator of the value of the population mean, follows the central limit theorem, and has minimum variance.

Central limit theorem = regardless of the distribution of scores in a population, the sampling distribution of sample means selected at random from that population will approach the shape of a normal distribution as the number of samples in the sampling distribution increases → a hypothetical distribution.
o At least 95% of all possible sample means we could select from a population are within two standard deviations of the population mean

A distribution of sample means has minimum variance.
o Has a different SD than the population distribution
o Variance of the sampling distribution of the sample means:
§ M = the sample mean in each possible sample
§ μ_M = the mean of the sampling distribution (equals the population mean)
§ N^n = the total number of possible samples that can be selected

The standard error of the mean = the standard deviation of a sampling distribution of sample means. It is the standard error, or distance, that sample mean values deviate from the value of the population mean.
§ Tells us how far possible sample means deviate from the value of the population mean
§ The distribution of sample means will be skinnier than the distribution of the variable in the population
§ Take the square root of the variance → σ_M = σ / √n
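A quick simulation sketch of the central limit theorem and the standard error (invented exponential population, NumPy):

```python
import numpy as np

rng = np.random.default_rng(1)
population = rng.exponential(scale=2.0, size=200_000)  # clearly non-normal

n = 50
sample_means = [rng.choice(population, size=n).mean() for _ in range(5_000)]

# The distribution of sample means is approximately normal, centered on mu,
# and its SD matches the standard error sigma / sqrt(n)
print(population.mean(), np.mean(sample_means))             # nearly equal
print(population.std() / np.sqrt(n), np.std(sample_means))  # nearly equal
```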
Sampling distributions: the variance

Sampling distribution of the variance: to determine how the sample variances we could select from the population compare to the value of the population variance.

Unbiased estimator
o A sample variance is an unbiased estimator when the sample variance we obtain in a randomly selected sample equals the value of the population variance on average
o The mean of the sampling distribution of the sample variances is the sum of the sample variances we could select divided by the total number of samples summed
o On average, the sample variance is equal to the population variance when we divide SS by the degrees of freedom (df = n − 1). This makes the sample variance an unbiased estimator of the population variance:
s² = σ² on average, when s² = SS / (n − 1) = SS / df
o If we divided SS by n, the result would be that we underestimate the population variance on average, which would make the sample variance a biased estimator of the population variance

Skewed distribution rule
o Regardless of the distribution of scores in a population, the sampling distribution of sample variances selected at random from that population will approach the shape of a positively skewed distribution as the number of samples in the sampling distribution increases

No minimum variance
o Although the sample variance equals the population variance on average, the distribution of all other sample variances can vary far from the population variance when we divide SS by df
o The distribution of sample variances is minimal only when we divide SS by n. But it is better for the sample variance to be unbiased (because the primary use of sample statistics is to estimate the value of population parameters, and this estimate must be unbiased) than for a distribution of sample variances to vary minimally from the population variance. For this reason, the sample variance is calculated by dividing SS by df

The standard error of the mean

Variance of the sampling distribution of sample means:
σ²_M = σ² / n (this formula is not very important; it is the step before the standard error)

The standard error of the mean = a numeric measure of sampling error, with larger values indicating greater sampling error, or greater differences that can exist from one sample to the next.

Sampling error = the extent to which sample means selected from the same population differ from one another. This difference, which occurs by chance, is measured by the standard error of the mean.

Factors that decrease standard error
o The larger the standard deviation in the population, the larger the standard error (bigger dispersion). If the standard deviation is low, it means that the sample is more stable
o The law of large numbers explains that the larger the sample size, the smaller the standard error

Standard normal transformation with sampling distributions

z transformation for a sampling distribution (distribution of means), used to transform a sampling distribution to the standard normal distribution → use this formula when the exercise talks about sample means:
z = (M − μ_M) / σ_M

The z transformation is used to determine the likelihood of measuring a particular sample mean, from a population with a given mean and variance.
To locate the proportion and the probability of selecting a sample mean in any sampling distribution, we (1) transform a sample mean into a z score, and then (2) locate the corresponding proportion for the z score in the unit normal table.
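As a worked sketch of this z transformation (all numbers invented):

```python
import numpy as np
from scipy.stats import norm

mu, sigma, n = 100, 15, 25    # hypothetical population parameters and sample size
se = sigma / np.sqrt(n)       # standard error of the mean = 3.0

M = 106                       # an observed sample mean (invented)
z = (M - mu) / se             # z transformation for the sampling distribution
print(z)                      # 2.0
print(norm.sf(z))             # probability of a sample mean this high or higher
```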
Chapter 8 – Hypothesis Testing: Significance, Effect Size and Power

Inferential statistics and hypothesis testing

Inferential statistics allow us to observe samples to learn more about the behavior in populations that are often too large or inaccessible to observe.
We use samples because we know how they are related to populations. The sample mean is an unbiased estimator of the population mean. On the basis of the central limit theorem, we know that the probability of selecting any sample mean value from the same population is normally distributed.
We expect the sample mean to be equal to the population mean.

Principles of hypothesis testing

Significance testing = we test a hypothesis about a parameter of a population (the mean), using sample data measured in a sample. Hypotheses are therefore always about populations, never about samples. Tests of significance are used to determine the likelihood that a hypothesis about a population parameter (the mean) is true.

The null hypothesis (H0) = hypothesis of equality; a statement about a population parameter, such as the population mean, that is assumed to be true.
o We always start by formulating a null hypothesis about a population parameter (the mean)
o Very often, the H0 is not what the researcher expects: the only reason we are testing the H0 is because we think it is wrong
o Typically, we therefore try to reject the H0
o The opposing hypothesis to the H0 is called the alternative hypothesis (H1), also called the research hypothesis = a statement that directly contradicts a null hypothesis by stating that the actual value of a population parameter is less than, greater than, or not equal to the value stated in the null hypothesis
o We can never prove our alternative/research hypothesis (H1) about the population directly. We can only show that the H0 is unlikely to be true
o In significance testing, we first assume H0 to be true and then see how (un)likely it is, given the sample data that we just obtained
o If H0 turns out to be very unlikely, we reject it, and therefore accept H1

How unlikely is unlikely enough to reject H0? What threshold should we use?
o We call this threshold the "level of significance" (alpha) = based on the probability of obtaining a statistic measured in a sample if the value stated in the H0 were true
o In the social sciences, we typically use a 0.05 level of significance: alpha = 5% → because regardless of the distribution in a given population, the sampling distribution of the sample mean is approximately normal. Hence, the probabilities of all other possible sample means we could select are normally distributed
o Meaning: we make sure that the probability of falsely rejecting a true null hypothesis (type 1 error) is not greater than 5%; or, the chance of the null hypothesis being true, despite the data we have, is no greater than 5%

Level of significance:
Type 1 error = the probability that you make a mistake and accept H1 when you should have retained H0.
Type 2 error = the probability of retaining a null hypothesis that is actually false.
Type 3 error = when we fail to reject a H0 because we placed the rejection region in the wrong tail.

Power in hypothesis testing = the probability of rejecting a false null hypothesis.
Specifically, it is the probability that a randomly selected sample will show that the null hypothesis is false when the null hypothesis is indeed false.

4 steps in significance testing (when done by hand):

1. State H0 and H1
2. Set the level of significance: alpha = 5%
3. Compute the test statistic
o One-independent-sample z-test: z_obtained = (M − μ) / (σ / √n)
o One-independent-sample t-test: t_obtained = (M − μ) / (σ̂ / √n)
4. Make a decision to reject/retain H0 (decisions are about H0)
o Compare the calculated test statistic (obtained value) with a critical value found in a table
§ Critical value = the cutoff value that defines the boundaries beyond which less than 5% of sample means can be obtained if the null hypothesis is true. Sample means obtained beyond a critical value will result in a decision to reject the null hypothesis
§ Obtained value = the value of a test statistic. When the obtained value > critical value, we reject H0
§ z-critical (rejection region) = 1.96 for a two-tailed H1 (alpha = 2.5% per tail)
§ z-critical (rejection region) = 1.65 for a one-tailed H1 (alpha = 5%)
o Rejection region = the region beyond a critical value in a hypothesis test. When the value of a test statistic is in the rejection region, we decide to reject the null hypothesis; otherwise, we retain the H0
State your conclusion, referring to the population.

Question               Population distribution                  Sample distribution                                          Distribution of sample means
What is it?            Scores of all persons in a population    Scores of a select portion of persons from the population    All possible sample means that can be selected, given a certain sample size
Is it accessible?      Typically no                             Yes                                                          Yes
What is the shape?     Could be any shape                       Could be any shape                                           Normal distribution

Idea behind the one-independent-sample z-test

We use the z-test when the standard deviation is known.
Independent = when sampling data, each case is independent of the others; there is no direct effect between data points, no causality from one person to the other.
Non-directional → 2 rejection regions (2.5% level of significance each).
With a higher critical z value, it is more difficult to reject H0 → we retain H0 more often.
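The four steps, scripted as a sketch in Python (the H0 value, σ, and sample numbers are invented):

```python
import numpy as np
from scipy.stats import norm

# Step 1: state H0: mu = 100 and H1: mu != 100 (non-directional)
mu0, sigma = 100, 15              # H0 value and known population SD (invented)

# Step 2: set alpha = 0.05; two-tailed, so z critical = +/- 1.96
z_crit = norm.ppf(1 - 0.05 / 2)

# Step 3: compute the test statistic from the sample
M, n = 107, 30                    # invented sample mean and sample size
z_obt = (M - mu0) / (sigma / np.sqrt(n))

# Step 4: decide about H0 by comparing obtained and critical values
print(round(z_obt, 2), round(z_crit, 2))
print("reject H0" if abs(z_obt) > z_crit else "retain H0")
```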
Chapter 9 – Testing Means: One-Sample and Two-Independent-Sample t Tests

One-sample t-test

Three assumptions for a one-sample t test:
1. Normality: data in the population being sampled are normally distributed. In larger samples (n > 30), the SE is smaller and this assumption becomes less critical as a result
2. Random sampling
3. Independence: one outcome does not influence another. Using random sampling usually satisfies this assumption

We use the t-test when the standard deviation is unknown:
t_obtained = (M − μ) / (σ̂ / √n)
The hat (^) means an estimate of the SD; as the estimate of sigma, we use the sample SD.
o Estimated standard error = an estimate of the SD of a sampling distribution of sample means selected from a population with an unknown variance. It is an estimate of the SE, or the distance that sample means deviate from the value of the population mean stated in the null hypothesis:
σ̂ = √( Σ(x − M)² / (n − 1) )

Follow the same steps as for the z-test.
Know the "degrees of freedom" (df):
o For the one-sample t-test: df = n − 1
o As sample size increases, df also increases
If the obtained value > critical value, we reject H0.

The larger the sample size, the more closely a t distribution approximates a normal distribution.
o There is a greater probability of obtaining sample means that are farther from the value stated in H0 in small samples. As sample size increases, obtaining sample means that are farther from the value stated in the null hypothesis becomes less likely. The result is that critical values get smaller as sample size increases

t-critical (rejection region) = 1.96 for a two-tailed H1 (alpha = 2.5% per tail).
t-critical (rejection region) = (±)1.65 for a one-tailed H1 (alpha = 5%).

t-test in SPSS

The output will not tell us directly whether or not to reject H0.
SPSS calculates a p-value = the probability of obtaining a difference (between the sample mean and the population value we tested it against) at least as large as the one that was obtained, under the assumption that H0 is true; the p-value is the probability of falsely rejecting H0 → type 1 error.
If the p-value ≥ alpha (0.05), we retain H0.
If the p-value < alpha (0.05), we reject H0 and retain H1.
When p is low, H0 has to go.
SPSS always assumes a non-directional H1 and thus gives us the p-value for an assumed two-tailed rejection region. If our H1 is directional, we divide the p-value by two before comparing it to our alpha.
The p-value (Sig.) is the region past the obtained t into the tail. As t gets larger, the region gets smaller, and so the p-value becomes smaller.
The same obtained t-value is more likely to reject the H0 under a directional H1 than under a two-tailed (non-directional) H1 (assuming M is in the correct direction).
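In Python, scipy.stats.ttest_1samp mirrors what SPSS reports: an obtained t and a two-tailed p-value. A sketch with invented scores:

```python
from scipy.stats import ttest_1samp

scores = [12, 15, 11, 14, 13, 16, 12, 15]          # invented sample
t_obt, p_value = ttest_1samp(scores, popmean=12)   # H0: mu = 12; two-tailed p

print(t_obt, p_value)
print("reject H0" if p_value < 0.05 else "retain H0")
# For a directional H1, halve the two-tailed p before comparing to alpha
# (provided M lies in the predicted direction).
```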
it is an estimate of the SE or SD that mean differences can be expected to deviate from the mean difference stated in the null hypothesis. The higher the SE, the more likely it to retain H0, because more difficult it is to generalize it. à that’s all we need to know. We don’t need the formulas below for now. 𝑠/ g 𝑠/ g 𝑠TG FTf = N + 𝑛C 𝑛/ o Pooled sample variance = the mean sample variance of two samples. When the sample size is unequal, the variance in each group or sample is weighted by its respective degrees of freedom. The larger n is, the better the estimate of sample variance will be 𝑠 / C (𝑑𝑓C ) + 𝑠 / / (𝑑𝑓/ ) 𝑠/ g = (𝑑𝑓C ) + (𝑑𝑓/ ) o When we have equal sample sizes, we do not have to weight each sample variance by its respective df 𝑠/C + 𝑠/ / 𝑠/ g = 2 T static for a two-independent-sample t test (TG FTJ )F(LG FLJ) o 𝑡]^_`a)bZ = ijG kjJ The Levene’s test for equality of variances The math behind the independent sample t-test is different depending on if we can assume the sigma of the two groups to be equal. In order to make this decision, we first need the Levene’s test for equality o Is concerned with the dispersion, variance of populations No directionality in H1 𝐻0mbna)b: 𝜎 /pq]rgC = 𝜎 /pq]rg/ 𝐻1mbna)b: 𝜎 /pq]rgC ≠ 𝜎 /pq]rg/ For the Levene’s test, SPSS calculates a p-value: the chance of falsely rejecting the H0 of that tests (the two population variances are equal) If p-value ≥ alpha (0.05) we retain H0 If p-value < alpha (0.05), we reject H0 and retain H1 Once we know whether or not to assume equal variances in the population, we can look at the p-value of the actual t-test and reject or accept its H0 (i.e. that population means are equal) The Levene’s test is not the final test that leads you to your final conclusion If you don’t reject H0, it means equal variances assumed à first row Then turn to p-value of t-test in the “equal variances assumed” row and compare it to alpha In blue or highlighted: what was said or emphasized by the lecturers Chapter 11 – Estimation and confidence intervals Point estimation and interval estimation Estimation = statistical procedure in which a sample statistic is used to estimate the value of an unknown population parameter. We use estimation to measure the mean or mean difference in a sample, but instead of making a decision regarding a null hypothesis, we estimate the limits within which the population mean or mean difference is likely to be contained. Two types of estimation are point estimation and interval estimation Point estimation = is the use of a sample statistic (e.g. sample mean) to estimate the value of a population parameter (e.g. population mean) – sample à population o Is an unbiased estimator: the sample mean will equal the population mean on average o We have no way of knowing for sure whether a sample mean equals the population mean Interval estimation = is a statistical procedure in which a sample of data is used to find the interval or range of possible values within which an unknown population parameter (mean) is likely to be contained Confidence interval (CI) = is the interval or range of possible values within which an unknown population parameter is likely to be contained. Sample means always are some unknown distance away from the population mean, which means the sample mean does not accurately reflect the population mean. 
Chapter 11 – Estimation and Confidence Intervals

Point estimation and interval estimation

Estimation = a statistical procedure in which a sample statistic is used to estimate the value of an unknown population parameter. We use estimation to measure the mean or mean difference in a sample, but instead of making a decision regarding a null hypothesis, we estimate the limits within which the population mean or mean difference is likely to be contained. Two types of estimation are point estimation and interval estimation.

Point estimation = the use of a sample statistic (e.g. the sample mean) to estimate the value of a population parameter (e.g. the population mean) – sample → population.
o It is an unbiased estimator: the sample mean will equal the population mean on average
o We have no way of knowing for sure whether a sample mean equals the population mean

Interval estimation = a statistical procedure in which a sample of data is used to find the interval or range of possible values within which an unknown population parameter (mean) is likely to be contained.

Confidence interval (CI) = the interval or range of possible values within which an unknown population parameter is likely to be contained. Sample means are always some unknown distance away from the population mean, which means the sample mean does not accurately reflect the population mean. So, based on the sample mean, we want to estimate in what general area the population mean is likely to be.
o The final statement will look something like this: we can conclude, with 95% confidence, that the population mean for hours studying for ISA lies between 17.06 and 22.94

Level of confidence = the probability or likelihood that an interval estimate will contain an unknown population parameter (e.g. the population mean).
o Confidence limits/boundaries = upper and lower boundaries of a confidence interval given within a specified level of confidence. Example: 53% ± 3% believe in ghosts → if we add or subtract 3% from each point estimate, we find that we can be confident that, on average, 50% to 56% of Americans believe in ghosts
o If we want to be 95% confident, we use the critical values of z for the 0.05 level (two-tailed, 2.5% on both sides – thus looking for 0.025 in column C)

Level of confidence   Level of significance (alpha level, two-tailed)
99%                   0.01
95%                   0.05
90%                   0.10
80%                   0.20

                          Point estimate                          Interval estimate
What is it estimating?    The population mean                     The population mean
How is it estimated?      The sample mean is used to estimate     A range of values is used within which the
                          the population mean                     population mean is likely to be contained
How precise is it?        Very precise; it states an exact        Less precise; it identifies a range of means, any one
                          estimate of the population mean         of which could be equal to the population mean
We use the estimation formula for a one-sample z test to identify the confidence limits within which the true population mean is likely to be contained S The estimation formula when SD is known is: 𝑀 ± 𝑧(𝜎T ) (à 𝑆𝐸 = 𝜎T = ) √) o Typically we don’t know the SD of the population and have to use the critical values of t and use sigma hat (estimation of SD in population) instead of sigma ˆS The estimation formula when SD is unknown is: 𝑀 ± 𝑡(𝜎T ) (à 𝑆𝐸 = 𝜎T = ) √) %(HFI)J à ˆ𝜎 = \ )FC If we need to be more confidence about our statements, find appropriate t- value for this level of confidence (t-value tend to be bigger) In blue or highlighted: what was said or emphasized by the lecturers Chapter 12 – Analysis of Variance: one-way between-subjects design Increasing k: a shift to analyzing variance The levels of the factor (k) = the number of groups or different ways in which an independent or quasi- independent variable is observed. An introduction to analysis of variance A one-way between-subjects ANOVA is a statistical procedure used to test hypotheses for one factor with two or more levels concerning the variance among group means. This test is used when different participants are observed at each level of a factor and the variance in any population is unknown. o If the means in each group significantly vary o The larger the differences are between group means, the larger the variance of group means will be A between-subjects-design is a research design in which we select independent samples, meaning that different participants are observed at each level of a factor. With the F-test, we investigate to what extent the categories of an independent ordinal/nominal variable help us to explain the variation of a dependent interval/ratio variable o The dependent variable is an interval/ratio variable ANOVA (for interval or ratio) = is a statistical procedure used to test hypotheses for one or more factors concerning the variance among two or more group means (k≥2), where the variance in one or more populations is unknown; is an extension of the independent samples t-test and it can be used for comparing more than two groups The F-test in the main test of significance of the ANOVA and used when comparing two or more groups The H0 for the F-test states that in the population all group means are equal: o H0 = 𝜇C = 𝜇/ = 𝜇B = ⋯ = 𝜇w o In the population the means of happiness are equal across all the groups The H1 states that in the population, at least one group mean differs from the other group(s) mean(s). 
Chapter 12 – Analysis of Variance: One-Way Between-Subjects Design

Increasing k: a shift to analyzing variance

The levels of the factor (k) = the number of groups or different ways in which an independent or quasi-independent variable is observed.

An introduction to analysis of variance

A one-way between-subjects ANOVA is a statistical procedure used to test hypotheses for one factor with two or more levels concerning the variance among group means. This test is used when different participants are observed at each level of a factor and the variance in any population is unknown.
o It tests whether the means in each group significantly vary
o The larger the differences between group means, the larger the variance of the group means will be

A between-subjects design is a research design in which we select independent samples, meaning that different participants are observed at each level of a factor.

With the F-test, we investigate to what extent the categories of an independent ordinal/nominal variable help us explain the variation of a dependent interval/ratio variable.
o The dependent variable is an interval/ratio variable

ANOVA (for interval or ratio) = a statistical procedure used to test hypotheses for one or more factors concerning the variance among two or more group means (k ≥ 2), where the variance in one or more populations is unknown; it is an extension of the independent-samples t-test and can be used for comparing more than two groups.

The F-test is the main test of significance of the ANOVA, used when comparing two or more groups.
The H0 for the F-test states that in the population all group means are equal:
o H0: μ1 = μ2 = μ3 = … = μk
o E.g.: in the population, the means of happiness are equal across all the groups
The H1 states that in the population, at least one group mean differs from the other group(s) mean(s). Note: this H1 is always non-directional (as opposed to the H1 for the independent-sample t-test).
o H1: there are differences across groups on μ
Once we have rejected H0, we need a post-hoc test that tells us which group(s) differ significantly from the rest.

Two ways to select independent samples

Select a sample from two or more populations.
o Used for the quasi-experimental research method (a research design that does not have a comparison group and/or includes a factor that is preexisting – it cannot be manipulated/changed)

Select one sample from the same population and randomly assign participants in the sample to two or more groups.
o Used for the experimental research method (a research design that includes randomization, manipulation, and a control or comparison group)
o The only way to achieve an experiment using the between-subjects design is to randomly assign participants selected from a single population to different groups

For ANOVA: n = number of participants per group; N = total number of participants in a study.
o When n is the same in each group: k × n = N

Sources of variation and the test statistic

Between-groups variation = the variation attributed to mean differences between groups.
A source of variation = any variation that can be measured in a study. In the one-way between-subjects ANOVA, there are two sources of variation: variation attributed to differences between group means, and variation attributed to error.
Within-groups variation = the variation attributed to mean differences within each group. This source of variation cannot be attributed to or caused by having different groups and is therefore called error variation.
An F distribution = a positively skewed distribution derived from a sampling distribution of F ratios.

In analysis of variance, we distinguish between three different types of variance:
o Total variance: how much all x vary around the grand mean M
o Variance between groups (MSB): how much the group means (M1, M2, …, Mk) vary around the grand mean M
o Variance within groups (MSW): how much all the x vary around their respective group means
à df = N – k The one-way between-subjects ANOVA We compute this test when we compare two or more group means, in which different participants are observed in each group Post Hoc Once we reject the H0 of the F-test, we need to conduct a post-hoc test to determine which means differ significantly from each other o Scheffé’s test § Is similar to two-samples t-test: it runs through all possible pairs of group means and examines their means differences § For each pair of group means the H0 of the Scheffé’s test states that their population means are equal § For each pair of groups, the post-hoc test tells us: The sample mean difference (first group mean MINUS second group mean) Whether this difference is statistically significant (at the 0,05 level) In blue or highlighted: what was said or emphasized by the lecturers Chapter 15 – Correlation The structure of a correlational design Correlation = a statistical procedure used to describe the strength and direction of the linear relationship between two factors o The statistics used to measure correlations = correlation coefficients (r) Describing a correlation o A correlation can be used to § Describe the pattern of data points for the values of two factors (described by the direction and strength of the relationship between two factors = correlation coefficient) § Determine whether the pattern observed in a sample is also present in the population from which the sample was selected o Scatter plot = scatter gram = graphical display of discrete data points (x, y) used to summarize the relationship between two (interval/ratio) variables. Pairs of values for x and y are called data points (or bivariate plots). o In a scatter plot, the independent variable (x) is always put in the horizontal axis o Each dot represents a case and its position reflects the case’s value for x and y o Provides an intuitive, straightforward way of examining the relationship between two interval/ratio variable o We need a measure of association to express the strength & direction of the relationship in a single, easily interpretable number: the correlation coefficient Pearson’s r The direction of a correlation o Correlation coefficient (r) is used to measure the strength and direction of the linear relationship or correlation between two factors. The value of r ranges from -1.0 to +1.0. Values closer to ±1.0 indicate a strong correlation. The sign of the correlation (+ or -) indicates the direction of the correlation o A positive correlation (0 < r ≤ + 1.0) is a positive value of r that indicates that the values of two factors change in the same direction: as the values of one factor increase, the values of the second factor also increase. o A negative correlation (-1.0 ≤ r < 0) is a negative value of r that indicates that the values of two factors change in different directions, meaning that as the values of one factor increase, the values of the second factor decrease The strength of a correlation o The closer a correlation coefficient is to r = 0, the weaker the correlation and the less likely that two factors are related o The closer a correlation coefficient is to r = ±1, the stronger the correlation and the more likely that two factors are related o Regression line = the best-fitting straight line to a set of data points. A best-fitting line is the line that minimizes the distance of all data points In blue or highlighted: what was said or emphasized by the lecturers that fall from it. 
Pearson correlation coefficient
Used to determine the strength and direction of the relationship between two factors on an interval or ratio scale of measurement
Each score should be transformed into a z score
Pearson's r formula
r = cov(x, y) / (sx × sy) = SSxy / √(SSx × SSy)
o First calculate the standard deviation of x and y
s = √( Σ(x − M)² / (n − 1) )
Pearson's r = a measure of association for interval/ratio variables
Covariance = the extent to which the values of two factors (x and y) vary together. The closer the data points fall to the regression line, the more that the values of two factors vary together
cov(x, y) = Σ[(x − Mx) × (y − My)] / (n − 1)
Assumptions of tests for linear correlations
1. Linearity
Assumption that the best way to describe a pattern of data is using a straight line
Pearson's correlation coefficient r assumes linear relationships (as do most standard statistical techniques). However, the social world is not always a linear place
o Be critical about the methods you use
o Explore and know your data (scatterplots)
o Know your theory / previous research
2. Homoscedasticity
Equal variance of data points dispersed along the regression line; equal variance of y for different values of x
Graphically: points more or less equally scattered around the correlation/regression line
Violation of homoscedasticity = heteroscedasticity
o Higher values of x result in more variance on y
3. Normality
Pearson's correlation coefficient r assumes normality: data points are normally distributed
o Normal distribution of x
o Normal distribution of y
o Normal distribution of y for each value of x and vice versa
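Putting the formulas above together, here is a minimal Python sketch (hypothetical data) that computes the two standard deviations, the covariance, and then r = cov(x, y) / (sx × sy), checking the result against numpy's built-in np.corrcoef:

```python
# Minimal sketch of Pearson's r computed step by step from the formulas above.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])
n = len(x)

mx, my = x.mean(), y.mean()
sx = np.sqrt(((x - mx) ** 2).sum() / (n - 1))    # s = sqrt( SS / (n - 1) )
sy = np.sqrt(((y - my) ** 2).sum() / (n - 1))
cov_xy = ((x - mx) * (y - my)).sum() / (n - 1)   # covariance of x and y

r = cov_xy / (sx * sy)
print(round(r, 3), round(np.corrcoef(x, y)[0, 1], 3))  # the two values agree
```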
Limitations in interpretation: causality, outliers and restriction of range
Causality
o A significant correlation shows the direction and the strength of the relationship between two factors
o Reverse causality = occurs when the direction of causality for two factors cannot be determined. Hence, changes in factor A could cause changes in factor B, or vice versa; a problem that arises when the direction of causality between two factors can be in either direction
o Confounding variable = third variable = has an effect on both original variables; is an unanticipated variable not accounted for in a research study that could be causing or associated with observed changes in one or more measured variables
o Coffee example (lecture): drinking coffee may prevent depression, scientists say
§ Direction of causality is unclear because the two variables were measured at the same time
o Three preconditions for making a causal claim
§ Empirical evidence for a relationship between the variables
§ Temporal sequence: x occurs before change in (or effect on) y occurs
§ Causality claim supported by reason & theory (can't sound too crazy)
Outliers
o Outlier = a score that falls substantially above or below most other scores in a data set; can change the strength and direction of a correlation coefficient
Restriction of range
o Restriction of range = a problem that arises when the range of data for one or both correlated factors in a sample is limited or restricted, compared to the range of data in the population from which the sample was selected
o To avoid a problem of restriction of range, the direction and strength of a significant correlation should only be generalized to a population within the limited range of measurements observed in the sample
Bivariate statistics
Independent variable = variable that we expect to influence another variable in the model (x)
Dependent variable = variable that we expect to be influenced by at least one (independent) variable in the model (y)
Chapter 16 – Linear regression and multiple regression
From relationships to predictions
Remembering: the correlation coefficient r can be used to measure the extent to which two factors (x and y) are related. The value of r indicates the direction and strength of a correlation. When r is negative, two factors change in opposite directions; when r is positive, two factors change in the same direction. The closer r is to ±1.0, the stronger the correlation and the more closely two factors are related
Linear regression = statistical procedure used to determine the equation of a regression line (straight line) to a set of data points and to determine the extent to which the regression equation can be used to predict values of one factor, given known values of a second factor in a population
Fundamentals of linear regression
Predictor variable or known variable (X) = the variable with values that are known and can be used to predict values of another variable
Criterion variable or to-be-predicted variable (Y) = the variable with unknown values that can be predicted or estimated, given known values of the predictor variable
Linear regression is used to predict values of Y (criterion variable) given values of X (predictor variable)
Assumptions of regression analysis
Linearity
o The type of regression analysis (OLS) we have discussed assumes linear relationships between the IVs and the DV (just like Pearson's r)
o In cases of non-linear relationships, the model will not fit the data well (e.g. high prediction errors, low R2)
Lack of multicollinearity
o There should be no strong correlation between any two IVs (i.e. no multicollinearity)
o General rule: no two IVs should correlate (to one another) stronger than r = 0.80
o Diagnostics available in SPSS that test for multicollinearity (e.g. VIF)
o Multicollinearity is problematic, because it can lead to:
§ Incorrectly high (inflated) p-values of t-tests, i.e. we would conclude that a given effect is not significant even though H0 should have been rejected (type II error)
§ An underestimation of R2
§ Unreliable coefficients
o Typically fixed by removing, from the model, one or more of the variables causing multicollinearity (a simple correlation-matrix screen is sketched below)
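The lecture points to SPSS's VIF diagnostic; as a simpler, assumption-labeled alternative, here is a minimal Python sketch of the r = 0.80 rule of thumb, with invented IVs, one of which is deliberately made near-collinear:

```python
# Minimal sketch of a multicollinearity screen using the r = 0.80 rule of thumb.
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = x1 * 0.95 + rng.normal(scale=0.1, size=100)   # deliberately near-collinear with x1
x3 = rng.normal(size=100)

X = np.column_stack([x1, x2, x3])
corr = np.corrcoef(X, rowvar=False)                # correlation matrix of the IVs

names = ["x1", "x2", "x3"]
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        if abs(corr[i, j]) > 0.80:                 # the rule of thumb from the notes
            print(f"{names[i]} and {names[j]} correlate at r = {corr[i, j]:.2f}: "
                  "consider dropping one of them")
```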
Homoscedasticity
o With homoscedasticity, we assume the variance of the residuals to be constant for all values of the IVs
o In other words: the accuracy of our predictions should not depend on the value of one or more IVs (similar to Pearson's r)
Three questions answered by linear regression and the techniques used to answer each question:
What makes the regression line the best-fitting line?
In simple regression (one IV), we try to predict Y (DV) with X (IV)
We do this by trying to express Y as a linear transformation of X
The regression line represents the predicted values of Y
These predicted values do not always correspond to the actual values of Y
o Prediction error → distance between the actual value and the predicted one
The criterion used to determine the equation of a regression line is the sum of squares (SS), or the sum of the squared distances of data points from a straight line. The line associated with the smallest total value for SS is the best-fitting straight line = regression line
The method of least squares is used to square the distance that each data point falls from the regression line and sum the squared distances in order to determine the line associated with the least squares, or the smallest possible value of SS
SS is calculated by squaring all prediction errors (differences between predicted values and actual values of Y) and adding up those squared differences
The slope and y-intercept of a straight line
Equation of a straight line (an estimation, hence the hat ˆ) → Ŷ = bX + a
The slope (b) measures how Y changes as X increases, and the y-intercept (a) is the value of Y when X = 0
o b (or b1) = the slope of a straight line; a measure of how much a regression line rises or declines along the y-axis as values on the x-axis increase. Indicates the direction of a relationship between two factors, x and y. When values of Y increase as values of X increase, the slope is positive. When values of Y decrease as values of X increase, the slope is negative
§ also called the unstandardized regression coefficient
§ the slope of a straight line is used to measure the change in Y relative to the change in X
slope (b) = change in Y / change in X
o a (or b0) = the y-intercept (where the line crosses the y-axis). The value of Y when X = 0
§ SPSS = constant
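A minimal Python sketch of fitting Ŷ = bX + a by least squares on hypothetical data. The closed-form solutions used here, b = SSxy / SSx and a = My − b·Mx, are the standard least-squares results rather than formulas spelled out in these notes; np.polyfit serves as a cross-check:

```python
# Minimal sketch: fit the least-squares line by hand and verify with np.polyfit.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])

mx, my = x.mean(), y.mean()
b = ((x - mx) * (y - my)).sum() / ((x - mx) ** 2).sum()  # slope: change in Y relative to X
a = my - b * mx                                          # y-intercept: value of Y when X = 0

y_hat = b * x + a
ss_residual = ((y - y_hat) ** 2).sum()   # the quantity that least squares minimizes
print(f"Y^ = {b:.2f}X + {a:.2f}, SS_residual = {ss_residual:.2f}")

print(np.polyfit(x, y, 1))   # numpy's least-squares fit returns [slope, intercept]
```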
Introduction to multiple regression
Often the changes in a single predictor variable do not allow us to accurately predict changes in a criterion variable. So, our predictions of many behaviors improve when we consider more information (more predictor variables)
Multiple regression = statistical procedure that includes two or more predictor variables in the equation of a regression line to predict changes in a criterion variable
To accommodate more predictor variables in the equation of a regression line, we add the slope, b, and the predictor variable, X, for each additional variable
o 1. Y = bX + a (one predictor variable)
o 2. Y = b1X1 + b2X2 + a (two predictor variables)
o 3. Y = b1X1 + b2X2 + b3X3 + a (three predictor variables)
An advantage of including multiple predictors in the regression equation is that we can detect the extent to which two or more predictor variables interact
Intro to regression analysis (Ordinary least squares regression, OLS regression)
Regression analysis is the most commonly used multivariate technique to test causal models in the behavioral sciences (for experimental research, mostly ANOVA)
Used in situations where both the independent variable(s) (IV = predictor variable) and the dependent variable (DV) are measured on interval/ratio level
Some central questions addressed by regression analysis:
o How well could we predict the DV using the IV(s)?
o What effect(s) do(es) the IV(s) have on the DV?
o Can we generalize our findings to the population? → test of significance
Two main types:
o Simple regression (bivariate): one IV, one DV
§ Similar to Pearson correlation r
o Multiple regression (multivariate): more than one IV, one DV
A measure for our model's goodness-of-fit: a first look at R2
Once we have defined the regression line, we will want to know how well our model actually fits the data
For this, we look at a measure of goodness-of-fit: R2
R2 tells us what proportion of the variance of the DV can be explained by the IV(s) → how useful is the model for predicting the DV?
R2 ranges between 0.0 (no prediction possible) and 1.0 (DV is completely determined by the IVs)
o Regardless of what direction the slope is
Graphically: the closer the dots are to the line, the higher is R2 (means less error)
On standardized and unstandardized coefficients
While the unstandardized coefficient can take on any value, the standardized coefficient (beta) is standardized to take on values between -1 and 1 (for simple regression)
This standardized coefficient is used to make statements on the strength and direction of the effects of an IV; it is interpreted similarly to Pearson's r
In a simple regression, the standardized coefficient equals Pearson's r; but note that we are now talking about effects (and effect size) and not merely correlations
o Use the same words used to describe Pearson's r, except correlation → effect; and add "significant" or "insignificant"
Making inferential statements with regression analysis
To make inferential statements, two types of tests of significance are calculated as part of a regression analysis:
1. An F-test to test the significance of the model as a whole
H0: in the population, R2 is zero; Y cannot be predicted by X (fill in the labels for the dependent and independent variables) → in the population, the model is completely useless for predicting the DV
2. One t-test for each individual IV
H0: in the population, the effect of the given IV is zero
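Here is a minimal sketch of these two tests in Python using statsmodels (the lecture itself works in SPSS). The happiness/age/trust/religiosity variable names echo the lecture example, but the data are simulated, so the numbers will not match the slides:

```python
# Minimal sketch of a multiple regression: R-squared, the model F-test, one t-test per IV.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
age = rng.uniform(18, 80, size=n)
trust = rng.normal(5, 2, size=n)
religiosity = rng.uniform(0, 10, size=n)
happiness = (6.7 - 0.02 * age + 0.2 * trust + 0.04 * religiosity
             + rng.normal(scale=1.0, size=n))           # simulated DV

X = sm.add_constant(np.column_stack([age, trust, religiosity]))  # adds the constant (intercept)
model = sm.OLS(happiness, X).fit()

print(model.rsquared)               # proportion of DV variance explained by the IVs
print(model.fvalue, model.f_pvalue) # F-test: H0 is that R-squared is zero in the population
print(model.params)                 # unstandardized coefficients: constant, age, trust, religiosity
print(model.pvalues)                # one t-test per coefficient: H0 is that the effect is zero
```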
R2 revisited
R2 is actually an indicator of how much better our predictions are if we use the predicted value of Y (based on our regression line) instead of just using the mean (the simplest model, a straight horizontal line at the mean)
1. How useful is the mean as a predictor?
a. Look at the differences between the observed values of Y and the mean of Y (red dotted lines)
b. We square those differences and add them up to get the total sum of squares (SST or SSY)
c. The SST tells us how useful the mean is as a predictor
2. How useful is the regression line as a predictor?
a. Look at the differences between observed and predicted values of Y (red dotted lines)
b. We square those differences and add them up to get the sum of squared residuals (SSR or SSresiduals)
c. The SSR tells us how useful our regression model is as a predictor
We were interested in the improvement in prediction when using our regression model instead of the mean
This improvement is expressed in the difference between SST and SSR and is called the model sum of squares (SSM or SSregression)
o SSM = SST – SSR
SSM is the amount of variance in Y that can be explained by the model
SST is the total amount of variance in Y
The value of R2 then expresses the proportion of the total variance in the DV that can be explained by the IV(s)
R2 = SSM / SST = SSregression / SStotal = 1 – SSR / SST
An R2 of 0 thus means that none of the total variance in the DV can be explained by the IV(s)
An R2 of 0.5 would mean that 50% of the total variance in the DV can be explained by the IV(s)
An R2 of 1.0 would mean that all of the total variance in the DV can be explained by the IV(s)
From simple to multiple regression analysis
As we have already seen, in multiple regression we have more than one IV
We therefore also have more than one
o Standardized coefficient
o Unstandardized coefficient
o t-test
This also means that our regression equation gets slightly more complex
Ŷ = b0 + b1X1 + b2X2 + ... + bkXk
Interpretation of the intercept (constant in SPSS)
If all IVs were 0, we would expect Ŷ = constant (in this case 6.683)
Interpretation of unstandardized coefficients:
When age increases by one unit, we expect happiness to decrease by 0.016 (keeping the effect of trust and religiosity constant)
When trust increases by one unit, we expect happiness to increase by 0.220 (keeping the effect of age and religiosity constant)
When religiosity increases by one unit, we expect happiness to increase by 0.041 (keeping the effect of age and trust constant)
Interpretation of multiple regression analysis in SPSS
Looking at the standardized coefficients to determine the strength of the effects, we conclude that trust has the strongest effect: 0.269
Age has a slightly smaller, negative effect (-0.148), suggesting that the older you get, the less happy you become (but now it's stronger in absolute value than religiosity)
For all three IVs, the t-tests produce p-values below 0.05; we can thus reject all of their H0s (that the respective effect is 0 in the population), i.e. all three effects are statistically significant
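Returning to the "R2 revisited" subsection, the decomposition can be verified directly in a few lines. A minimal Python sketch on hypothetical data, computing SST, SSR, SSM and then R2 = SSM / SST = 1 − SSR / SST:

```python
# Minimal sketch of R-squared as the improvement over the mean-only model.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])

b, a = np.polyfit(x, y, 1)     # least-squares slope and intercept
y_hat = b * x + a              # predicted values from the regression line

ss_total = ((y - y.mean()) ** 2).sum()   # SST: squared errors when predicting with the mean
ss_residual = ((y - y_hat) ** 2).sum()   # SSR: squared errors when predicting with the line
ss_model = ss_total - ss_residual        # SSM: the improvement in prediction

r_squared = ss_model / ss_total          # equivalently: 1 - ss_residual / ss_total
print(f"R2 = {r_squared:.2f}")
```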
Interaction effects
When we have more than one IV, we might encounter interaction effects
In multiple regression, we thus distinguish between main effects (effects of IVs on the DV) and interaction effects
Interaction effect = the type/strength of the effect one IV has on the DV is dependent on another IV
Green arrows (in the lecture diagram) = both IVs have a main effect on the DV
Blue arrow = interaction effect:
o The effect that playing a distressing video game has on aggressive behavior might be stronger for those who have a general aggressive disposition
o In other words: if you are the aggressive type, playing a distressing video game may add more to your aggressiveness (as it resonates with existing patterns of behavior). If you are the peace-loving type, playing a distressing video game might only have a marginal (or negative) effect
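A common way to model an interaction in regression (not spelled out in these notes) is to add the product of the two IVs as an extra predictor, so Ŷ = b0 + b1X1 + b2X2 + b3(X1 × X2). A minimal Python sketch with simulated data; the disposition/gaming variable names are invented to echo the video-game example:

```python
# Minimal sketch of an interaction term: the product of two IVs becomes a third predictor.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 300
disposition = rng.normal(size=n)            # general aggressive disposition (simulated)
gaming = rng.normal(size=n)                 # exposure to distressing video games (simulated)
aggression = (0.3 * disposition + 0.2 * gaming
              + 0.4 * disposition * gaming  # interaction built in for illustration
              + rng.normal(scale=1.0, size=n))

X = sm.add_constant(np.column_stack([disposition, gaming, disposition * gaming]))
model = sm.OLS(aggression, X).fit()
print(model.params)    # b0, b1 (main effect), b2 (main effect), b3 (interaction)
print(model.pvalues)   # a significant b3 indicates an interaction effect
```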
Chapter 17 – Nonparametric tests: Chi-square tests
Ordinal/nominal level data → non-parametric tests
o Parametric tests = hypothesis tests that are used to test hypotheses about parameters in a population in which the data are normally distributed and measured on an interval or ratio scale
o Nonparametric tests = hypothesis tests that are used (1) to test hypotheses that do not make inferences about parameters in a population, (2) to test hypotheses about data that can have any type of distribution, and (3) to analyze data on a nominal or ordinal scale of measurement
§ E.g.: research in which we count the number of participants or items in two or more categories
§ The variance can only meaningfully convey differences when data are measured on a scale in which the distance that scores deviate from their mean is meaningful → this is not the case on nominal or ordinal scales
§ Nonparametric tests do not require that the data in the population be normally distributed
Chi-square (X2) test = statistical procedure used to test hypotheses about the discrepancy between the observed and expected frequencies for the levels of a single categorical variable or two categorical variables observed together
Chi-square goodness-of-fit test
Chi-square goodness-of-fit test = used to determine whether the observed frequencies at each level of one categorical variable are similar to or different from the frequencies expected at each level of that categorical variable
o used to test the H0 that an ordinal/nominal variable has a certain distribution in the population (e.g. equal numbers of men and women in the population, or 10% men/90% women in the population)
The frequency observed (fo) is the count or frequency of participants recorded in each category or at each level of the categorical variable
The frequency expected (fe) is the count or frequency of participants in each category, or at each level of the categorical variable, as determined by the proportion expected in each category
o Multiply the total number of participants (N) by the proportion expected in each category (p) → fe = Np
H0 = in the population, (state the expected proportions) 80% are satisfied, 10% are not satisfied and 10% have no opinion
H1 = in the population, it is not true that (state the expected proportions) 80% are satisfied, 10% are not satisfied and 10% have no opinion
The test statistic measures the size of the discrepancy between an observed and expected frequency at each level of a categorical variable. The larger the difference between the empirically observed frequencies fo (from the sample) and the expected frequencies fe (that we would expect if H0 is true), the smaller the probability that H0 is true
o X2obtained = Σ (fo – fe)² / fe
The degrees of freedom for each chi-square distribution are equal to the number of levels of the categorical variable (k) minus 1 → df = k – 1
The critical value of a chi-square test increases as the number of levels of the categorical variable (k) increases (as the df increases, the critical value also increases)
Chi-square distribution = positively skewed distribution of chi-square values for all possible samples when the H0 is true. The rejection region is always placed in the upper tail of the positively skewed chi-square distribution
Hypothesis testing for goodness of fit
We compare the value of the test statistic with the critical value. If the test statistic falls beyond the critical value, then we reject the null hypothesis
Conclusion (example if retaining H0): a chi-square goodness-of-fit test showed that the frequency of ..... was similar to what was expected (state numerical results of chi-square and p).
Interpreting the chi-square goodness-of-fit test
Interpreting a significant chi-square goodness-of-fit test
o Compare observed and expected frequencies at each level of the categorical variable (k comparisons), because this is how the test statistic measured the discrepancies
o It cannot be interpreted in terms of differences between categories
Using the chi-square goodness-of-fit test to support the null hypothesis
A decision to retain the H0 is the goal of the hypothesis test
There is no reason to think H0 will not be true, based on previous research
Independent observations and expected frequency size
The observed frequencies are recorded independently, meaning that each observed frequency must come from different and unrelated participants
The chi-square test for independence
Chi-square test for independence = used to determine whether the frequencies observed at the combination of levels of two categorical variables are similar to the frequencies expected; used to test the H0 that two ordinal/nominal variables are not related to each other in the population (e.g. religion and nationality are not related)
H0 = there is no relationship between the two variables in the population
H1 = there is a relationship between the two variables in the population
To find the expected frequencies if the row and column totals are equal, we divide the total number of participants observed (N) by the number of cells to find the expected frequency in each cell. The expected frequencies in each cell will then also be equal
To find the expected frequencies if the row and column totals are not equal:
fe = (row total × column total) / N
The test statistic is the same as the one for the chi-square goodness-of-fit test
Degrees of freedom → df = (number of rows – 1) × (number of columns – 1)
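Both chi-square tests are also available in scipy (not part of the lecture, which works in SPSS). A minimal sketch with invented counts, reusing the 80/10/10 satisfaction example for the goodness-of-fit part:

```python
# Minimal sketch of both chi-square tests with scipy (all counts are hypothetical).
import numpy as np
from scipy import stats

# Goodness of fit: H0 says 80% satisfied, 10% not satisfied, 10% no opinion.
f_obs = np.array([74, 15, 11])                 # observed frequencies (N = 100)
f_exp = 100 * np.array([0.80, 0.10, 0.10])     # fe = N * p
chi2, p = stats.chisquare(f_obs, f_exp)        # df = k - 1 = 2
print(f"goodness of fit: X2 = {chi2:.2f}, p = {p:.3f}")

# Test for independence: a contingency table of counts for two categorical variables.
table = np.array([[30, 10],
                  [20, 40]])
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"independence: X2 = {chi2:.2f}, df = {dof}, p = {p:.3f}")
print(expected)   # fe per cell = (row total * column total) / N
```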
Hypothesis testing for independence
We compare the value of the test statistic to the critical value. If the test statistic falls beyond the critical value, then we reject the null hypothesis; otherwise, we retain the null hypothesis
If X2obtained < X2critical → retain H0 / If X2obtained > X2critical → reject H0
If the p-value of the chi-square is smaller than alpha → reject H0
Conclusion (example if rejecting H0): A chi-square test for independence showed a significant relationship between ...; (state numerical results of chi-square and p). The data indicate ... is associated with ...
Chi-square (both)
Steps in significance testing:
1. Formulate H0 and H1
a. For both chi-square tests, H1 is always non-directional
b. H0 = no relationship between two categories
c. H1 = there is a relationship
2. Calculate the test statistic
a. In order to calculate chi-square, we need to construct an fe table and apply the results to the formula X2 = Σ (fo – fe)² / fe
3. Find the appropriate critical value (given alpha and df)
a. For the chi-square test for independence, the df is: df = (number of rows – 1) × (number of columns – 1)
b. For the chi-square goodness-of-fit test, the df is: df = k – 1
4. Compare the test statistic to the critical value and decide: reject H0 if the test statistic falls beyond the critical value, otherwise retain H0
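The four steps can also be walked through by hand. A minimal Python sketch (invented counts, alpha = 0.05) in which only the critical value comes from scipy:

```python
# Minimal sketch of the four significance-testing steps for a goodness-of-fit test.
import numpy as np
from scipy import stats

f_obs = np.array([74, 15, 11])                         # step 1: H0 = 80/10/10 split in the population
f_exp = f_obs.sum() * np.array([0.80, 0.10, 0.10])     # fe table via fe = N * p

chi2_obtained = ((f_obs - f_exp) ** 2 / f_exp).sum()   # step 2: X2 = sum of (fo - fe)^2 / fe
df = len(f_obs) - 1                                    # df = k - 1
chi2_critical = stats.chi2.ppf(1 - 0.05, df)           # step 3: critical value for alpha = 0.05

decision = "reject H0" if chi2_obtained > chi2_critical else "retain H0"  # step 4: compare
print(f"X2 = {chi2_obtained:.2f}, critical = {chi2_critical:.2f} -> {decision}")
```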