2024 CMA Lecture 7&8_StudentsAfter - Lecture Notes PDF


Document Details


Uploaded by ManageableIvy

Vrije Universiteit Amsterdam

Jiska Eelen

Tags

customer marketing analytics, logistic regression, database marketing

Summary

These lecture notes cover customer marketing analytics, including predicting customer response and logistic regression. They also introduce database marketing and data mining concepts, and classify different types of marketing research.

Full Transcript


Week 4, Lecture 7&8
Customer Marketing Analytics: Predicting Customer Response Using Logistic Regression
Jiska Eelen, Associate Professor of Marketing
https://research.vu.nl/en/persons/jiska-eelen

Agenda
§ Introduction to database marketing and data mining
§ Predicting customer response and the RFM paradigm
§ Introduction to logistic regression

Classifying Marketing Research
§ By type of data
- Quantitative research
- Qualitative research
§ By research design
- Exploratory research
- Descriptive research
- Causal research
§ By data source
- Secondary data
o Syndicated research
- Primary data

Exercise: classify each of the datasets used in this course (Beer data, Mensa Service Quality data, Perceptual mapping – motorcycles data, Cereal data) along the three dimensions above: quantitative vs. qualitative; exploratory vs. descriptive vs. causal; secondary vs. primary. [In-class hand-raising exercise.]

Primary Data
Data collected specifically to answer the question(s) posed by the current research objectives.
Types of primary data:
§ Demographic / socioeconomic / lifestyle characteristics
§ Attitudes / opinions
§ Awareness / knowledge
§ Motivation
§ Intentions and behaviour
Collecting primary data:
§ Communication: questioning respondents to secure the desired information
§ Observation: the situation of interest is watched and the relevant facts, actions, or behaviours are recorded

Secondary Data
Data previously collected for purposes other than the research at hand (e.g., for market response models).
Internal sources:
- Accounting records (e.g., sales invoices, marketing expenditures)
- Customer transaction databases (the focus of this lecture)
- Sales force reports
- Operating records (e.g., warranty cards, customer complaint services)
External sources: market and industry research publishers, trade associations, etc.

The Big Data Challenge
Data ≠ Information. Data is raw, unorganized, and not worth much when it comes to improving business. Information is what data becomes when you organize it, analyze it, and give it some sense of structure.

Examples of Large Databases
§ Online transactions (e-commerce)
- Amazon: 300 million customer accounts
§ Web browsing / clickstream data
§ Purchases at department/grocery/convenience stores
- Albert Heijn: 16 million transactions per week
- Consumer panel data (Kantar NIPObase, GfK): purchases made by households across all retail outlets
§ Subscription data
- Netflix: 200 million-plus subscribers

Predictive Model for the "Perfect TV Show"
Netflix determined that the overlap of three content areas (shown as a Venn diagram on the slide) would make "House of Cards" a successful entry into original programming.

Why Data Mining?
§ Lots of data is being collected and warehoused
§ Computers have become cheaper and more powerful
§ Competitive pressure is strong
- Provide better, customized services for an edge (e.g., in Customer Relationship Management)
§ There is often information "hidden" in the data that is not readily evident

Data Mining: 80% Data and 20% Mining
§ Data in the real world is dirty
- Incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate data
- Noisy: containing errors or outliers
- Inconsistent: containing discrepancies in codes or names
§ No quality data, no quality mining results!
- Quality decisions must be based on quality data
- The data warehouse needs consistent integration of quality data

Major Tasks in Data Preprocessing
§ Data cleaning
- Deal with missing values, smooth noisy data, identify or remove outliers, and resolve inconsistencies
§ Data integration
- Integration of multiple databases or files
§ Data transformation
- Normalization, aggregation, date or time transformations
§ Data reduction
- Obtain a representation that is reduced in volume but produces the same or similar analytical results

[Forms of data preprocessing: diagram slide]

What is Data Mining?
(1) The efficient discovery of previously unknown, valid, potentially useful, understandable patterns in large datasets
(2) The analysis of (often large) observational data sets to find unsuspected relationships and to summarize the data in novel ways that are both understandable and useful to the data owner

Examples of Marketing Applications
§ Predicting customer response (e.g., which customers are most likely to purchase in the future; the likelihood that a customer will respond to a marketing stimulus; the likelihood of churn)
§ Market basket analysis (e.g., uncovering which products are likely to be purchased together)
§ Panel data analysis (tracking buying behaviour over time across all retail outlets in a market)
§ Web mining: what are my customers doing on my website? (clickstream analytics)

Agenda
§ Introduction to database marketing and data mining
§ Predicting customer response and the RFM paradigm
§ Introduction to logistic regression

Predicting Customer Response and the Recency, Frequency, and Monetary (RFM) Approach: The Case of CDNow

Objectives
§ Can past purchase behaviours predict future purchase behaviours?
§ Summarizing (past) customer transaction behaviour
§ Identifying target customers

Master Dataset
§ Purchase history (Jan 1997 – June 1998) of 23,570 individuals who made their first-ever purchase at CDNOW in the first quarter of 1997.
§ Data includes customer id, date of transaction, and amount spent.
We will focus on data for 1/10th of the whole cohort (i.e., 2,357 customers), spanning 13 months. For the purpose of this study, the data is divided into two parts: Year 1 purchases (Jan–Dec 1997) and Year 2 purchases (Jan 1998). These 2,357 customers made 5,545 transactions in 1997; 149 of them made a purchase in Jan 1998.

Can past purchase behaviours predict future purchase behaviours?
What statistical technique could we use? A regression-type model, but with which predictors? We can extract recency, frequency, and monetary value from the "raw" data provided by the company and use them as predictors of Year 2 purchase behaviour.

Year 1 raw data (excerpt):
ID   Date       $ Amount Spent
1    19970101   29.33
1    19970118   29.73
1    19970802   14.96
1    19971212   26.48
2    19970101   63.34
2    19970113   11.77
3    19970101   6.79
4    19970101   13.97
5    19970101   23.94

RFM Idea: Three Behavioural Attributes
We can summarize customer transaction behaviour in Year 1 in terms of:
§ Recency: time elapsed since the last purchase. When was his last purchase?
§ Frequency: number of purchases in a given period. How many purchases did he make?
§ Monetary value: amount spent on purchases in a given period. How much did he spend on average?

Key Steps
1) Summarize past behaviour (Year 1) in terms of RFM
2) Use regression methods to predict future behaviour (Year 2 purchase) based on these RFM variables

Exercise: consider the different variables (and their measurement):
1) Year 2 purchase
2) Recency of purchases in Year 1
3) Frequency of purchases in Year 1
4) Monetary value of purchases in Year 1
[In-class hand-raising exercise: Yes / Don't know / No.]

The Summary Data
Recency is measured as the number of months since the last purchase in Year 1.
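The Year 1 RFM summary described above can be computed directly from the raw transaction log. A minimal sketch in Python (standard library only): the transaction list mirrors the ID / Date / $ Amount Spent excerpt from the slides, while the 1997-12-31 cut-off date and all function names are our own assumptions for illustration.

```python
from datetime import date

# Raw Year 1 transactions (customer id, YYYYMMDD date, amount spent),
# mirroring the ID / Date / $ Amount Spent excerpt above.
transactions = [
    (1, "19970101", 29.33), (1, "19970118", 29.73),
    (1, "19970802", 14.96), (1, "19971212", 26.48),
    (2, "19970101", 63.34), (2, "19970113", 11.77),
    (3, "19970101", 6.79),  (4, "19970101", 13.97),
    (5, "19970101", 23.94),
]

# Assumed end of the Year 1 observation window (Jan-Dec 1997).
CUTOFF = date(1997, 12, 31)

def parse(yyyymmdd):
    """Turn a YYYYMMDD string into a date object."""
    return date(int(yyyymmdd[:4]), int(yyyymmdd[4:6]), int(yyyymmdd[6:]))

def rfm_summary(txns, cutoff):
    """Summarize each customer as (recency in months, frequency, average spend)."""
    by_customer = {}
    for cust, d, amount in txns:
        by_customer.setdefault(cust, []).append((parse(d), amount))
    summary = {}
    for cust, rows in by_customer.items():
        recency = (cutoff - max(d for d, _ in rows)).days / 30.4  # months since last purchase
        frequency = len(rows)                                     # purchases in Year 1
        monetary = sum(a for _, a in rows) / len(rows)            # average spend per purchase
        summary[cust] = (recency, frequency, monetary)
    return summary

summary = rfm_summary(transactions, CUTOFF)
# Customer 1 last bought on 1997-12-12, made 4 purchases, and spent ~$25.13 on average.
print(summary[1])
```

In the lecture itself the same summary table is built in Excel and copied to SPSS; the sketch only illustrates the logic of collapsing a transaction log into one R, F, M row per customer.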
Note: R, F and M are the predictors in our regression model.

The Summary Data: Dependent Variable
Dependent variable: Year 2 purchase (1 = Yes, 0 = No). Match the IDs of Year 1 customers to those of Year 2 customers. If there is a match, assign a value of 1, else a value of 0.
The summary data was created in Excel and copied to SPSS.

Key Steps
1) Summarize past behaviour (Year 1) in terms of RFM. The data is processed and ready for analysis!
2) Use regression methods to predict future behaviour (Year 2 purchase) based on these RFM variables. What type of regression is suitable for predicting categorical outcomes from continuous (or categorical) predictors?

To Summarize: RFM Analysis
§ One of the earliest segmentation techniques used by direct marketers.
§ An accurate way to predict future behaviour (Berry and Linoff 2004, Bolton 1998, Malthouse 2003).
§ Easy to perform and comprehend: the three measures can be computed for any database that has a purchase history.
Any limitations?
§ Does not take into account other factors that would help predict future purchases
§ Predicts customer behaviour only in the next period
§ Past behaviour may be the result of the firm's past marketing activities

Using Customer Transaction Databases for Answering Simpler Questions
We can calculate some general statistics (frequencies):
§ How many customers bought just once? Twice? Four times?
§ How many customers spent less than $20? More than $100?

Calculating Frequencies and Creating Histograms from the Year 1 Summary Data

N transactions   Frequency   Percentage   Cum. percentage
1                1317        55.9%        55.9%
2                437         18.5%        74.4%
3                206         8.7%         83.2%
4                123         5.2%         88.4%
5                78          3.3%         91.7%
6                58          2.5%         94.1%
7                41          1.7%         95.9%
8                19          0.8%         96.7%
9                18          0.8%         97.5%
10               13          0.6%         98.0%
More             47          2.0%         100.0%
Total            2357

Avg. $ spent     Frequency   Percentage   Cum. percentage
0                8           0.3%         0.3%
10               97          4.1%         4.5%
20               835         35.4%        39.9%
30               515         21.8%        61.7%
40               337         14.3%        76.0%
50               200         8.5%         84.5%
60               130         5.5%         90.0%
70               58          2.5%         92.5%
80               52          2.2%         94.7%
90               31          1.3%         96.0%
100              26          1.1%         97.1%
More             68          2.9%         100.0%
Total            2357        100.0%

How many customers bought just once? Twice? Four times? (On average, 2.35 transactions per customer.)
How many customers spent less than $20? More than $100? (On average, $32.84 spent per customer.)

Agenda
§ Introduction to database marketing and data mining
§ Predicting customer response and the RFM paradigm
§ Introduction to logistic regression

What is Logistic Regression?
§ Logistic regression is a specialized form of regression designed to predict categorical outcomes with two or more categories.
§ Independent variables can be categorical, continuous, or a mix of both.
§ The dependent variable is categorical rather than continuous:
- Two categories > binary logistic regression (e.g., purchased/not purchased; responded/not responded; clicked/not clicked; option A chosen/option B chosen)
- More than two categories > multinomial logistic regression (not covered in this course!)

Motivation
Response to marketing efforts: did the person buy after being sent an email advertisement?
§ Y = 1 (yes, responded)
§ Y = 0 (no, did not respond)
Category purchase incidence: did the person buy in the orange juice category on a given supermarket visit? How is this related to marketing variables such as prices, display, and feature advertising?
§ Y = 1 (yes, bought)
§ Y = 0 (no, did not buy)

Logistic Regression – Objectives
§ Identifying the independent variables that impact group membership in the dependent variable (i.e., the likelihood or probability of an event occurring) -> explain
§ E.g., what is the impact of the recency, frequency, and monetary value (RFM) variables on the likelihood of a customer purchasing in the future?
§ Establishing a classification system based on the logistic model for determining group membership -> predict
§ E.g., for any customer with certain values of R, F and M, what is the likelihood that s/he will purchase in the future?

Logistic Regression – A General Model
§ Instead of predicting the value of a variable Y from predictor variables Xn, we predict the probability of Y occurring given known values of Xn:
Y = 1 with probability p; Y = 0 with probability 1 − p
§ Transforming a probability into odds and logit values:
Odds = Prob_i ÷ (1 − Prob_i)
§ In logistic regression, the dependent variable is in fact a logit value, which is the log of the odds:
Logit(P) = Log(Odds) = B0 + B1*X1 + B2*X2 + ... + Bn*Xn
Coefficients are estimated using this as the dependent measure.

Logistic Regression – Assumptions
General lack of assumptions:
§ Logistic regression does not require any specific distributional form for the independent variables.
§ Homoscedasticity (an equal variance/covariance structure for the independent variables across groups) is not required.
§ Yet it is still sensitive to high correlations among the independent variables (multicollinearity).

EXERCISE: Customer Acquisition Example
Research question: what is the impact of price, email, coupon, and ad spending on customer acquisition?
§ What is/are the dependent variable(s), their unit of analysis, and their level of measurement?
§ What is/are the explanatory variable(s), their unit of analysis, and their level of measurement?

Customer Acquisition Example
Research question: what is the impact of price, email, coupon, and ad spending on customer acquisition?
Dependent variable
§ Y = 1 (customer acquired)
§ Y = 0 (customer not acquired)
Explanatory variables
§ price – price of the product in $
§ email – dummy variable (sent/not sent)
§ coupon – dummy variable (sent/not sent)
§ ad spend – $ spent on advertising per person

Interpretation of Output
§ Block 0: baseline model – results of the analysis without any of the independent variables in the model
§ Block 1: our model – results of the analysis with our set of independent variables in the model

Interpretation of Output (Overall Model Fit)
§ Model significance: how well the model performs over and above the Block 0 results. We want significant results!
§ Usefulness of the model (the pseudo-R² measures): between 10% and 16.8% of the variability in the dependent variable is explained by this set of IVs.
§ Goodness of fit: measures the correspondence between the actual and predicted values of the dependent variable. We want nonsignificant results!

Interpretation of Output (Overall Model Fit, continued)
Classification accuracy of the model: how well the model predicts the category (acquired/not acquired) for each case. Here, 83 of 83 non-acquired cases and 1 of 17 acquired cases were classified correctly: the model correctly classified (83 + 1)/100 = 84% of the cases overall, but only 1/17 = 5.9% of the people who were acquired.

Interpretation of Output (Interpretation of the Coefficients)
1. Contribution or importance of each independent variable
§ The Wald statistic is used to assess significance (similar to the t-test used in multiple regression).
§ The p-value in the "Sig." column tests the null hypothesis that the coefficient (B) is 0. We are looking for p-values < 0.05.
2. Direction of the relationships
§ The original coefficients in the B column tell us the relationship between the independent variables and the dependent variable: check whether the sign is positive or negative.
§ The exponentiated coefficients in the Exp(B) column: values above 1.0 indicate a positive relationship and values below 1.0 a negative relationship [rationale: exp(0) = 1, so B < 0 corresponds to Exp(B) < 1 and B > 0 corresponds to Exp(B) > 1].
3. Magnitude of the relationships
§ Assessed via the exponentiated coefficient; the percentage change in the odds is:
Percentage change in odds = (Exponentiated Coefficient − 1.0) * 100
This is the change in the odds of being in one of the outcome categories when the value of an independent variable increases by one unit, all other factors being equal.
§ Email: Exp(1.438) = 4.212, implying that when an email is sent, the odds of being acquired increase by a factor of 4.212 (or by 321%), all other factors being equal.
§ AdSpend: Exp(0.072) = 1.075, implying that for every unit increase in ad spending, the odds of being acquired increase by a factor of 1.075 (or by about 8%), all other factors being equal.

Calculating the Probability of Acquiring Each Customer
Remember the logistic transformation of the dependent variable:
Odds = Prob_i ÷ (1 − Prob_i)
Logit value = Log(Odds) = B0 + B1*Price + B2*Email + B3*Coupon + B4*Ad
Coefficients are estimated using this as the dependent measure. Knowing the logistic regression coefficients, we can calculate backwards: first the odds [odds = exp(logit value)], then the probabilities [p = odds / (1 + odds)]. With Y = 1 with probability p and Y = 0 with probability 1 − p:
p = exp(b0 + b1*Price + b2*Email + b3*Coupon + b4*Ad) / (1 + exp(b0 + b1*Price + b2*Email + b3*Coupon + b4*Ad))

Application and Managerial Use
§ Imagine that the company wants to predict the likelihood that someone offered a price of $2, sent no email, given a coupon, and targeted with $30 of ad spend will become a customer. What is that probability?
§ Price = $2; Email = No = 0; Coupon = Yes = 1; Ad spend = $30
§ p = exp(−4.111 − 1.061*2 + 1.438*0 + 0.447*1 + 0.072*30) / (1 + exp(−4.111 − 1.061*2 + 1.438*0 + 0.447*1 + 0.072*30)) = 0.026
§ Thus, the probability of customer acquisition at this spend is 0.026, or 2.6%.
§ The expected ad spend to yield one customer is $30 / 0.026 = $1,153.85.

Application and Managerial Use – Inferences (Sensitivity to Change)
§ Imagine that the company is considering a change to $40 ad spend per customer at the same price, email, and coupon settings as before. Does this make sense?
§ p = exp(−4.111 − 1.061*2 + 1.438*0 + 0.447*1 + 0.072*40) / (1 + exp(−4.111 − 1.061*2 + 1.438*0 + 0.447*1 + 0.072*40)) = 0.052
§ The expected cost to yield one customer is $40 / 0.052 = $769.23.
§ Thus, it might make sense to increase the ad spend.

Agenda
§ Introduction to database marketing and data mining
§ Predicting customer response and the RFM paradigm
§ Introduction to logistic regression

Video Tutorial: Predicting Customer Response – CDNOW Transaction Data

Thursday, September 26th: Q&A session (digital; computer room available)
Friday, September 27th: Assignment 4 deadline
Week 5 (Monday, September 30th and Tuesday, October 1st): Understanding Individual Customer Preferences Using Conjoint Analysis
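The coefficient interpretations and probability calculations from the customer acquisition example above can be checked with a short script. A minimal sketch in Python: the coefficients (−4.111, −1.061, 1.438, 0.447, 0.072) are the ones reported in the lecture, while the function and variable names are our own.

```python
import math

# Estimated coefficients from the lecture's acquisition example.
B0, B_PRICE, B_EMAIL, B_COUPON, B_AD = -4.111, -1.061, 1.438, 0.447, 0.072

def acquisition_probability(price, email, coupon, ad_spend):
    """Invert the logit: odds = exp(logit), then p = odds / (1 + odds)."""
    logit = B0 + B_PRICE * price + B_EMAIL * email + B_COUPON * coupon + B_AD * ad_spend
    odds = math.exp(logit)
    return odds / (1 + odds)

# Exp(B) as odds multipliers:
print(round(math.exp(B_EMAIL), 3))  # 4.212: sending an email multiplies the odds by ~4.2
print(round(math.exp(B_AD), 3))     # 1.075: each extra $ of ad spend multiplies the odds by ~1.075

# Scenario: price $2, no email, coupon sent, $30 ad spend per person.
p30 = round(acquisition_probability(price=2, email=0, coupon=1, ad_spend=30), 3)
print(p30, round(30 / p30, 2))      # 0.026 and 1153.85: expected ad spend per acquired customer

# Sensitivity: raise ad spend to $40 per person.
p40 = round(acquisition_probability(price=2, email=0, coupon=1, ad_spend=40), 3)
print(p40, round(40 / p40, 2))      # 0.052 and 769.23: cheaper per acquired customer
```

Like the slides, the script divides the budget by the probability rounded to three decimals; dividing by the unrounded value gives roughly $1,157 for the $30 scenario, so the managerial conclusion is unchanged.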
