Data Analysis for Marketing Decisions - Lecture Notes PDF
Document Details
Copenhagen Business School
Georgios Halkias
Summary
These lecture notes provide an overview of data analysis for marketing decisions. They discuss statistical inference, hypothesis testing, and the role of null and alternative hypotheses. The presentation features examples and visualizations related to the concepts.
Full Transcript
Data Analysis for Marketing Decisions
Session 3: Statistical inference II: NHST and all that jazz…
Priv.-Doz. Dr. Georgios Halkias
Associate Professor, Department of Marketing | Copenhagen Business School
Adjunct Professor, Department of Marketing and International Business | University of Vienna
E: [email protected]
Priv.-Doz. Dr. Georgios Halkias © 1/29

What is a "hypothesis"?
A hypothesis is a prediction about the state of the world. It is a scientific statement that must be able to be empirically disproved, i.e., be falsifiable: testable and able to be disconfirmed based on evidence. It is translated into relationships between variables that can be measured empirically (in a valid and reliable manner).

Hypothesis: Being in a bad mood makes people spend more money.
Independent variable (predictor variable) → mood (good/bad)
Dependent variable (outcome/criterion variable) → money spending

Which of the following statements represents a hypothesis, and which does not?
► Small and large companies evoke different levels of consumer trust.
► Psychotherapy leads to improved well-being.
► Most people who commit suicide regret doing so.
► If one had studied medicine, they would make more money.
► Dreaming duration for males is longer than that for females.

Types of hypotheses
Directional hypotheses (relate to 1-tail testing): the researcher indicates a priori the direction (either positive or negative) of the expected relationship.
e.g., Global brands evoke higher perceptions of quality than local brands. Advertising creativity makes consumer attitudes more favorable.
Non-directional (exploratory) hypotheses (relate to 2-tail testing): the researcher expects an effect but has no a priori expectation about its direction.
e.g., Global and local brands evoke different perceptions of quality. Advertising creativity influences consumer attitudes.
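The 1-tail vs. 2-tail distinction maps directly onto how a p-value is computed from a test statistic. A minimal standard-library Python sketch (not from the slides; the function names are illustrative):

```python
from math import erf, sqrt

def phi(z: float) -> float:
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def p_value(z: float, directional: bool) -> float:
    """p-value for a z statistic: one-tailed (H1: effect > 0) or two-tailed."""
    if directional:
        return 1.0 - phi(z)            # probability mass above z
    return 2.0 * (1.0 - phi(abs(z)))   # mass in both tails

z = 1.88
print(p_value(z, directional=True))    # one-tailed p ≈ .030
print(p_value(z, directional=False))   # two-tailed p ≈ .060
```

A directional hypothesis spends all of α in one tail, so for the same data it yields half the two-tailed p-value; this is why the direction must be stated a priori, not after seeing the results.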
[Figure: conceptual model linking brand globalness to quality perception (H1, +) and quality perception to willingness to pay (H2, +); perceived quality of global brands vs. perceived quality of local brands]
Priv.-Doz. Dr. Georgios Halkias © 3/29

Types of hypotheses (…they come in pairs)
Alternative hypothesis (H1): our prediction/expectation of how things in the real world are. Usually, that there is an effect (e.g., a difference or a relationship) in the population.
Each alternative hypothesis has a corresponding null hypothesis (H0), which is the opposite of H1 (it "nullifies" it) and usually states that no effect exists.
The null (H0) together with the alternative hypothesis (H1) account for all potential outcomes regarding the relationship being studied.
Example:
H1: Heavy metal fans have above-average IQ.
H0: Heavy metal fans do not have above-average IQ.

Types of hypotheses (…they come in pairs, but why?)
We NEVER prove the alternative hypothesis using statistical testing; we ONLY collect evidence against H0.
Rejecting H0 doesn't prove H1 (it merely maintains it). Failing to reject H0 doesn't prove H0 (it merely maintains it).
NHST considers the chances of observing our sample data (results), assuming that the null hypothesis is true.
► How likely is it to find 20 (out of 100) metalheads with above-average IQ, if the null were true?
► How likely is it to find 95 (out of 100) metalheads with above-average IQ, if the null were true?
Hypothesis testing is based on the probability (p-value) of obtaining such sample data, or more extreme data, if, hypothetically speaking, the null is true.

1 Process behind NHST | 2 Practical example | 3 …and all that jazz

The process behind NHST
► Formulate the alternative (H1: there is some effect) and the null (H0: there is no effect) hypothesis.
► Model H1 (in the form of a test statistic) to later see how well it fits the data.
► Specify the acceptable risk/error rate, or significance level (α = 1 − confidence level).
► To determine how well the hypothesized model fits the data, calculate the test statistic and use its probability distribution to find the probability (p-value) of getting this "model" (test statistic) or a more extreme one, assuming that the null hypothesis is true.
► If the probability associated with the test statistic (p-value) is less than the significance level (α), reject H0 in favor of H1 (the effect is statistically significant).

Test statistic
A numerical summary of the dataset that "models" the expected effect (hypothesis). It is defined by a formula/equation that depends on the statistical test applied: z-test (one sample), t-test (two samples), ANOVA, chi-square test. Because the probability distribution of a test statistic is known, we can calculate how frequently different results occur.

Type I and II error
No statistical test is certain. There is always a chance of drawing incorrect conclusions.

Decision (based on sample)           | Reality: H0 true (no effect) | Reality: H0 false (there is an effect)
Fail to reject H0 (no effect found)  | Correct (1 − α)              | Type II error (β)
Reject H0 (effect found)             | Type I error (α)             | Correct (1 − β)

α → the likelihood of making a Type I error (false positive: finding an effect that does not exist)
β → the likelihood of making a Type II error (false negative: not finding an effect that exists)

Type I error: falsely rejecting H0. Type II error: falsely "accepting" (i.e., failing to reject) H0.

Significance level (α, alpha)
Would you put an innocent person in jail or fail to sentence a guilty one? The same decision table applies:
Decision (based on sample)        | Reality: H0 true (innocent, no effect) | Reality: H0 false (guilty, effect)
Fail to reject H0 (find innocent) | Correct (1 − α)                        | Type II error (β): false negative
Reject H0 (find guilty)           | Type I error (α): false positive       | Correct (1 − β)

▪ The maximum risk we are willing to take of rejecting a true null hypothesis (a Type I error) is known as the significance level (α, alpha).
▪ The probability of our results ("model", or test statistic) is contrasted against the significance level (the maximum acceptable likelihood of a Type I error) to determine statistical significance.
▪ Usual (yet arbitrary) levels: α = .05 (5%, minimum), .01 (1%, strong), and .001 (0.1%, stronger).

Test statistic, critical value, and p-value
The probability distribution of a test statistic under the null hypothesis is known; therefore, we can calculate how frequently different results occur. The probability associated with obtaining a given test statistic (or a bigger one) is called the p-value: it shows how likely it is to get a test statistic at least as big as the one observed, if the null hypothesis is true (there is no effect).
► Assuming no effect, it is very unlikely (p ≤ α) that I would get these results (test statistic). I reject the null → there must be an effect: statistically significant results.
► Assuming no effect, it is likely (p > α) that I would get these results (test statistic). I cannot reject the null → there seems to be no effect: statistically non-significant results.
Statistical significance can be determined either way: p-value vs. α-level, or test statistic vs. critical value (of the test statistic).

Regions of rejection
One-tail (directional): H1: x̄ < μ (or a negative relationship); H1: x̄ > μ (or a positive relationship).
Two-tail (non-directional): H1: x̄ ≠ μ (or some relationship).
If |statistic| > |critical|, then p < α (reject H0); otherwise, "accept" (fail to reject) H0.
Typical α levels: .05 (5%), .01 (1%), .001 (0.1%).
Critical values are determined by your confidence/α level.
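The two equivalent routes to a decision (p-value vs. α-level, or test statistic vs. critical value) can be sketched for a z-test with Python's standard-library `statistics.NormalDist`; `decide` is a hypothetical helper, not from the slides:

```python
from statistics import NormalDist

N = NormalDist()  # standard normal: the distribution of z under H0

def decide(z: float, alpha: float = 0.05, tails: int = 1) -> str:
    z_crit = N.inv_cdf(1 - alpha / tails)  # ≈1.645 one-tail, ≈1.960 two-tail at α=.05
    p = tails * (1 - N.cdf(abs(z)))        # p-value of the observed statistic
    # The two routes agree: |z| > z_crit exactly when p < alpha.
    return "reject H0" if p <= alpha else "fail to reject H0"

print(decide(2.003))           # one-tail: 2.003 > 1.645
print(decide(1.882, tails=2))  # two-tail: 1.882 < 1.960
```

Note how the same statistic (z = 1.882) is significant under a 1-tail test but not under the stricter 2-tail criterion, which is exactly the point made above about directional hypotheses.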
1 Process behind NHST | 2 Practical example | 3 …and all that jazz

Example…
I believe that, on average, customers spend more than €18 in restaurants.
H1: Spending is higher than €18. H0: Spending is not higher than €18.
[Figure: observed spendings (19, 18, 20, 19, 22, 18, 21, 18, 22, 19) from which samples of size n = 3 are drawn, each with its own mean and standard deviation]

Model H1 (using the appropriate test)
Comparing a sample mean against a fixed value (is x̄ greater than 18?) calls for a z-test. A z-test for a sample mean indicates by how many standard errors the sample mean and the hypothesized (population) mean differ.
H1: x̄ > 18; H0: x̄ ≤ 18.
Note: if x̄ is greater than 18, then z > 0; if x̄ is equal to 18, then z = 0; if x̄ is smaller than 18, then z < 0.

Set the α (alpha) level for the z-test
For α = 5% (and 95% confidence), the critical value of a z-test is z_critical = 1.645 (1.96 for 2-tail).
1-tail (x̄ > μ): 95% of the z values lie below (to the left of) z_critical = 1.645 (customers spend more than €18).
2-tail (x̄ ≠ μ): 95% of the z values lie within z_critical = ±1.96 (customers do not spend €18).
We have a directional hypothesis, so 1-tail testing (|z_critical| = 1.645) is appropriate. Using the 2-tail critical value would make us more strict (as if α = 2.5%).

Test statistic (p-value) vs. critical value (α-level)
Calculate the test statistic and find its p-value, or compare it with the critical value for the α-level; software makes the comparison automatically and gives you p for significance.
DECISION:
Sample A: z = 2.003 > 1.645 = z_critical, so p ≤ .05 (α): reject H0 → H1 accepted.
Sample C: z = 1.882 > 1.645 = z_critical, so p ≤ .05 (α): reject H0 → H1 accepted.
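The decisions above can be reproduced from the reported sample means and standard deviations. A minimal sketch, assuming the one-sample z formula z = (x̄ − μ0)/(s/√n) with the slide's values (n = 3, μ0 = 18):

```python
from math import sqrt

MU0 = 18.0      # hypothesized population mean (€)
Z_CRIT = 1.645  # 1-tail critical value at α = .05

def z_stat(mean: float, sd: float, n: int) -> float:
    """How many standard errors the sample mean lies above MU0."""
    return (mean - MU0) / (sd / sqrt(n))

# Reported slide values: (label, sample mean, sample SD), each with n = 3
for label, mean, sd in [("A", 20.67, 2.309), ("C", 19.66, 1.528)]:
    z = z_stat(mean, sd, 3)
    verdict = "reject H0" if z > Z_CRIT else "fail to reject H0"
    print(label, round(z, 3), verdict)   # A ≈ 2.003, C ≈ 1.882: both reject
```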
Significance level and statistical significance
Same test (z-test), different research settings: Sample C has mean = 19.66 and z = 1.882; Sample A has mean = 20.67 and z = 2.003. What if I want to be more confident (e.g., 97.5%)? What if I have a non-directional hypothesis (2-tail)? Then z_critical = 1.96: Sample C (z = 1.882) would no longer be significant, while Sample A (z = 2.003) still would.

Statistical significance and Confidence Intervals (CIs)
Calculate the confidence interval for the sample you drew (see the previous session) and see whether it includes the H0 value. We use the critical value for a 2-tail test (z_{α/2}) and a 95% confidence level to be a bit more conservative; note that this is similar to applying a 1-tail test with a 97.5% confidence level. If the interval overlaps with what H0 says, reject H1 (= fail to reject the null).
H1: x̄ > 18; H0: x̄ ≤ 18.
DECISION:
Sample A: 20.67 ± 1.96 × (2.309/√3) → 18.06 < μ < 23.28. The 95% CI (α = .05) does not contain 18 → H1 accepted.
Sample C: 19.66 ± 1.96 × (1.528/√3) → 17.93 < μ < 21.39. The 95% CI (α = .05) contains 18 → H1 rejected.

Statistical significance and Confidence Intervals (CIs): A note…
[Figure: CI with one sample vs. CIs from two samples; with two samples, a moderate overlap of the intervals, roughly up to half of the average arm length, (length_a/2 + length_b/2)/2, i.e., ≤ 50%, can still correspond to a significant difference]

Inferential rules
statistic*                 | p-value (sig.) | Confidence Interval                           | Decision
|test| ≥ |test_critical|   | p ≤ α          | the (1 − α)% CI does NOT include the H0 value | Reject H0 / Accept H1
|test| < |test_critical|   | p > α          | the (1 − α)% CI DOES include the H0 value     | Accept H0 / Reject H1
* The test-statistic (and critical) values depend on the test applied (e.g., z, t, χ², F, …).

1 Process behind NHST | 2 Practical example | 3 …and all that jazz

Statistical and substantive significance
Statistical significance is not the same thing as actual importance, or substantive significance.
Small and unimportant effects can turn out statistically significant just because of huge samples, while large and important effects can be missed simply because of small samples (and, thus, a lot of sampling error). The problem with the p-value is that it gives virtually no information about whether the results really matter. Hypothesis testing and statistical significance tell us nothing about the importance or magnitude of an effect, the so-called effect size.

Effect size
An effect size assesses the magnitude of an observed effect. It is a standardized measure of the size of an effect, so we can compare effect sizes across studies that have measured different variables or used different scales of measurement. There are several effect-size measures, such as Cohen's d and Pearson's r:
► r = .1, d = .2 (small effect)
► r = .3, d = .5 (medium effect)
► r = .5, d = .8 (large effect)
Beware: the size of an effect should be placed within the research context!

Statistical power
Power is the ability of a test to detect an effect of a particular size: statistical power is the probability that a test will find an effect, assuming that one exists in the population. It is the complement of the probability that a given test will not find an effect that exists in the population, i.e., the Type II error rate (β-level). Therefore, power = 1 − β.
Statistical power of (at minimum) .80 (β = .20) is desirable. Power depends on:
► size of sample
► effect size
► α-level
► methods

Sample size and statistical significance
[Figure: CIs from samples with the same means and SDs but different sizes ("50% off" vs. "1+1" offers); the interval widths differ because of sampling error]

Type I (α) and II (β) errors, p-values, power & effect size
Imagine that you find a statistically significant effect… However, there is always a chance you are making a mistake.
How likely is it that this effect will occur again in the future? Future research might not be able to reproduce and corroborate results obtained from underpowered studies.
(Note: z-test, 1-tail, α = 5%. Graphs are on the same scale.)

Beyond statistical significance (p-value)…
Would you make a decision based on something that is not likely to happen again? What if your blood tests told you that you are OK and do not need treatment (p < .05, statistical power of 50%)?
Given the α-level and the effect size, as the sample size increases, the sampling distributions under H0 and H1 change (get narrower) and statistical power increases!

ANY QUESTIONS?
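The final point, that power grows with sample size for a given α and effect size, can be sketched numerically for a 1-tail z-test. The effect size (a "medium" d = .5) and the sample sizes below are assumed for illustration; they are not from the slides:

```python
from statistics import NormalDist

N = NormalDist()

def power(delta: float, n: int, alpha: float = 0.05) -> float:
    """Power of a 1-tail z-test for standardized effect size delta = (μ1 − μ0)/σ."""
    z_crit = N.inv_cdf(1 - alpha)          # ≈1.645 at α = .05
    # Under H1 the z statistic is centered at delta * sqrt(n), not at 0:
    return 1 - N.cdf(z_crit - delta * n ** 0.5)

# With a medium effect (d = .5), power climbs toward the desired .80 as n grows
for n in (10, 25, 50):
    print(n, round(power(0.5, n), 2))      # ≈ .47, .80, .97
```

This mirrors the narrative above: at n = 10 the test is underpowered (worse than a coin flip at detecting a real medium effect), while around n = 25 it reaches the conventional .80 target.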