Field Experiments - Digital and Social Media Strategies - PDF
Tilburg University
Lachlan Deer
Summary
This document is a presentation on field experiments, a marketing research methodology, with an application example of testing promotional strategies for a new burger. It discusses the importance of control and random allocation. The document also covers hypothesis testing through analyzing data in experiments, the significance of properly measuring the data structure, and using graphs to visualize findings.
Full Transcript
Suggested citation: Lachlan Deer (2024). "Digital and Social Media Strategies: Field Experiments". Updated: 2024-09-11.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Learning Goals for this Week

- Define the term "field" experiment
- Identify and explain the key features of field experiments
- Discuss how field experiments allow marketers to make causal claims about their marketing intervention
- Compute descriptive statistics and visualize data from a field experiment
- Use statistical hypothesis testing to analyze data from a field experiment to answer a strategic question of relevance to managers

Preliminaries

Where Are We Now?

Course Themes:
1. Measuring Advertising Effects
   - Attribution Models
   - Media Mix Models
   - Incrementality Experiments
2. User Generated Content & Social Media

Our learning journey...

Previously: Marketing Attribution Rules & Media Mix Modelling
- Relies on observational data

Today: Intro to Field Experiments
- Conceptual: What are they? Why do we run them? How to analyse the data?
- Application: Search Ranking Experiment at Expedia

Next Week: Experiments in Online Display Advertising & Search Engine Marketing

Hypothetical Scenario: The Burger Promotion Problem

The Burger Promotion Problem: Set up

- New burger to be introduced to all stores
- Three different promotion strategies used (promo 1, 2 or 3)
- Managers care about sales
- No access to previous sales data

Which promotion strategy is the most effective?

The Burger Promotion Problem

Access to information about store characteristics:
- Market size: Small / Medium / Large
- Age of store: in years since opening

You can track:
- Weekly sales (in hundreds of thousands of dollars) per store
- For up to 4 weeks

Discussion Questions:
1. How would you allocate promotion strategies to stores?
2. How would you evaluate the effectiveness of each promotion strategy?
(Two blank slides to sketch out an answer)

Potential Issues Encountered

- Lurking Variables: a variable that is not included as an explanatory or response variable in the analysis but can affect the interpretation of relationships between variables. Also called a confounding variable
- Sample Selection Bias: failing to ensure that the sample obtained is representative of the population intended to be analyzed
- No Control Group: effectiveness compared to no intervention (e.g. promotional strategy)

Remark: These problems also plagued our interpretations of attribution models and media mix models

Gold Standard Data Driven Marketing

"Traditional" Data Driven Marketing   | Gold Standard Data Driven Marketing
Anchor on data that is available      | Anchor on a decision that needs to be made
Finds a purpose for data              | Finds data for a purpose
Starts from what is known             | Starts from what is unknown
Empowers data analysts/scientists     | Empowers decision making

adapted from De Langhe and Puntoni (2020)

Field Experiments 101

Causal Data Driven Marketing

What is the impact of a marketing intervention (X) on an outcome (Y)?
- Hard to evaluate
- Need to compute counterfactuals
- Challenge: the same person cannot both get the treatment and not get the treatment

Field Experiments

Field experimentation represents the conjunction of two methodological strategies: experimentation and fieldwork.

Core idea of Experiments: along with the observation/measurement of an outcome variable

Key features of a field experiment:
1. Authenticity of treatments
2. Representativeness of participants
3. Real world context
4. Relevant outcome measures

In most field experiments, participants are not even conscious that they are taking part in an experiment

"True" Experiments

Three identifiable aspects:
1. Comparison of outcomes between treatment and control
2. Assignment of subjects to groups is done through a randomization device
3. Manipulation of treatment is under control of a researcher / analyst

Dunning, T. (2012). Natural experiments in the social sciences: A design-based approach. Cambridge: Cambridge University Press.

When done correctly: isolates the incremental effect of the treatment on an outcome variable

Eight Steps of an Experiment

1. Write down a testable hypothesis. Generally advocate a "no change" hypothesis
2. Decide on two or more treatments that might impact the outcome variable(s) of interest. Generally, include a control treatment where nothing is changed to use as a baseline
3. Compute how many subjects to include in the experiment
4. Randomly divide subjects (people/stores) into groups. Also need to decide on the sample size for each group
5. Expose each group to a different treatment
6. Measure the response in terms of an outcome variable(s) for subjects in each group. Outcomes must be chosen in advance!
7. Compare responses via a (correct) statistical test
8. Conclude whether to reject or "fail to reject" your hypothesis based on (7)

Remark: Good experiments have a hypothesis that answers a strategic question for a business

The Burger Promotion Problem Redux

- Go back to your proposed solution to measure the effectiveness of the three promotional strategies
- Update your plan using what we have discussed

Causal Data Driven Marketing Redux

What is the impact of a marketing intervention (X) on an outcome (Y)?
- Hard to evaluate
- Need to compute counterfactuals
- Challenge: the same person cannot both get the treatment and not get the treatment

How have field experiments helped us do causal data driven marketing?
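
Step 4 of the eight steps (randomly dividing subjects into groups) can be sketched in R for the burger problem. This is only an illustration under stated assumptions: the data frame `stores`, its columns `store_id`, `market_size` and `promotion`, and the choice of 12 stores are all hypothetical and do not come from the slides.

```r
# Minimal sketch: randomly allocate stores to three promotion strategies.
# All object and column names here are hypothetical.
set.seed(42)  # fix the seed so the random allocation can be reproduced

n_stores <- 12
stores <- data.frame(
  store_id    = seq_len(n_stores),
  market_size = rep(c("Small", "Medium", "Large"), each = 4)
)

# Build a balanced vector of promotion labels (1, 2, 3) and shuffle it,
# so each promotion is assigned to the same number of stores and the
# allocation does not depend on any store characteristic.
stores$promotion <- sample(rep(1:3, length.out = n_stores))

# Check the allocation: number of stores per promotion
table(stores$promotion)
```

Because the labels are shuffled by a randomization device rather than chosen by the analyst, store characteristics such as market size or age cannot systematically line up with a particular promotion. Randomizing separately within each market size would be one possible refinement.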

Analyzing Field Experiment Data

The Expedia Hotel Ranking Experiment

Expedia ran an experiment where they compare consumer behaviour from the existing algorithm that ranks hotels for a given search to behaviour when the hotels were randomly ranked and displayed.

Expedia has provided a dataset that includes shopping and purchase data as well as information on price competitiveness. The data are organised around a set of "search result impressions", i.e. the ordered list of hotels that a user sees after they search for a hotel on the Expedia website.

Does their ranking algorithm outperform random ordering?

Hotel Search & Rankings on Expedia

The Data

# A tibble: 15 × 14
   srch_id date_time  prop_id prop_coun…¹ treat…² booked clicked prop_…³ prop_…⁴
 1      61 2013-06-22   16517         219       1      0       0       3     4.5
 2      61 2013-06-22   31767         219       1      0       0       2     3.5
 3      61 2013-06-22   39799         219       1      0       0       2     3.0
 4      61 2013-06-22   68629         219       1      0       0       2     2.0
 5      61 2013-06-22   73060         219       1      0       0       3     3.0
 6      61 2013-06-22   78003         219       1      0       0       3     4.5
 7      61 2013-06-22  107306         219       1      0       0       3     3.5
 8      61 2013-06-22  123254         219       1      0       0       3     4.5
 9      61 2013-06-22  124295         219       1      0       0       3     3.5
10      61 2013-06-22  124573         219       1      0       1       3     4.0
11      61 2013-06-22  131016         219       1      0       0       2     4.0
12      61 2013-06-22  138014         219       1      0       0       3     4.0
13      81 2013-06-25    5439         219       1      0       0       2     3.5
14      81 2013-06-25    5449         219       1      0       0       0     4.5
15      81 2013-06-25    6997         219       1      0       0       3     4.5
# … with 5 more variables: prop_brand_bool, prop_location_score1,
#   price_usd, promotion_flag, position, and abbreviated
#   variable names ¹prop_country_id, ²treatment, ³prop_starrating,
#   ⁴prop_review_score

Steps to Analyze Experimental Data

1. Be explicit on the business question you are trying to answer
2. Build an understanding of the data structure
3. Compute some descriptive statistics
4. Visualize the Data
5. Run (the correct) statistical test
6. Use the results to inform decision making

The Business Question(s)

How to go from:

Does their ranking algorithm outperform random ordering?

to something we can measure...
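
One possible way to make the question measurable is to collapse the impression-level rows to one row per search session and record whether the session ended in a booking. The sketch below uses base R on made-up toy data. The column names `srch_id`, `treatment` and `booked` follow the tibble shown above, while `sessions`, `booking_made` and `prop_booked` are hypothetical names, and whether the lecture builds its session-level data this way is an assumption.

```r
# Toy impression-level data: one row per hotel shown in a search.
# Values are invented for illustration only.
search_data <- data.frame(
  srch_id   = c(61, 61, 61, 81, 81, 95, 95),
  treatment = c(1, 1, 1, 1, 1, 0, 0),
  booked    = c(0, 0, 0, 0, 1, 0, 0)
)

# One row per search session: the maximum of `booked` within a session
# is 1 if at least one displayed hotel was booked, and 0 otherwise.
sessions <- aggregate(booked ~ srch_id + treatment,
                      data = search_data, FUN = max)
names(sessions)[names(sessions) == "booked"] <- "booking_made"

# Proportion of sessions with a booking, by treatment group
prop_booked <- tapply(sessions$booking_made, sessions$treatment, mean)
prop_booked
```

With an outcome defined per session, "does the algorithm outperform random ordering?" becomes a comparison of booking proportions between the two treatment groups.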

Descriptive Statistics

What descriptive statistics do you want to know?

Things you might want to know:
(i) How many observations in total?
(ii) How many search sessions in total?
(iii) How many hotels are in the data?
(iv) Across how many countries?
(v) How many search sessions in the treatment and control groups?

There are definitely more...

# (i)
nrow(search_data)
158269

# (ii)
search_data %>% select(srch_id) %>% n_distinct()
6286

# (iii)
search_data %>% select(prop_id) %>% n_distinct()
29375

# (iv)
search_data %>% select(prop_country_id) %>% n_distinct()
1

Descriptive Statistics By Treatment

# A tibble: 2 × 2
# Groups: treatment
  treatment     n
1         0  4510
2         1  1776

Differences in Booking Proportions

How to best visualize differences in the proportion of bookings between treatments?

A Statistical Test

What is the "right" statistical analysis to run?
1. Two-sample tests of means
   - Limit to binary comparisons
2. Two-sample test of proportions
   - We don't have proportions in our data
3. ANOVA
4. Linear Regression

Comparing Proportions

Form null and alternative hypotheses:
H0: p0 − p1 = 0
HA: p0 − p1 ≠ 0

Set a level of significance: α = 0.05

Test Statistic:

$$ z = \frac{\hat{p}_1 - \hat{p}_0}{\sqrt{\dfrac{\bar{p}\,\bar{q}}{n_1} + \dfrac{\bar{p}\,\bar{q}}{n_0}}} $$

Warning: The statistic is based on a difference or ratio; by default, for difference based statistics, the explanatory variable is subtracted in the order "1" - "0", or divided in the order "1" / "0" for ratio based statistics. To specify this order yourself, supply `order = c("1", "0")`.

# A tibble: 1 × 6
  statistic chisq_df p_value alternative lower_ci upper_ci
1     3890.        1       0 two.sided      0.766    0.803

How to interpret the output?

Intermezzo: Type I and Type II errors

In experiments at large tech companies, power (1 − β) is usually set to 0.8, i.e. β = 0.2

Comparing Means

Is click behaviour different between treatments?
- Click behaviour → number of clicks
- Other ideas likely valid too

Form null and alternative hypotheses:
H0: μ0 − μ1 = 0
HA: μ0 − μ1 ≠ 0

Set a level of significance: α = 0.05

Test Statistic (assuming unequal variances):

$$ t\text{-stat} = \frac{\hat{\mu}_1 - \hat{\mu}_0}{\sqrt{\dfrac{\hat{\sigma}_1^2}{n_1} + \dfrac{\hat{\sigma}_0^2}{n_0}}} $$

# A tibble: 2 × 2
  treatment hotels_viewed
1 1                  1.11
2 0                  1.09

Warning: The statistic is based on a difference or ratio; by default, for difference based statistics, the explanatory variable is subtracted in the order "1" - "0", or divided in the order "1" / "0" for ratio based statistics. To specify this order yourself, supply `order = c("1", "0")`.

# A tibble: 1 × 7
  statistic  t_df p_value alternative estimate lower_ci upper_ci
1      1.30 2780.   0.193 two.sided     0.0199  -0.0100   0.0498

How to interpret the output?

What have we discovered so far?

Regression Equivalents

$$ \text{Booked}_i = \beta_0 + \beta_1 \, \text{Expedia Algorithm}_i + \varepsilon_i $$

Call:
lm(formula = as.numeric(booking_made_1) ~ treatment, data = sessions_data)

Residuals:
     Min       1Q   Median       3Q      Max
-0.94213 -0.15766  0.05787  0.05787  0.84234

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.157658   0.006570    24.0