Week 3 - Segmentation and Positioning - Fall 2024 Marketing Analytics PDF
Columbia Business School
2024
Hortense Fong
Summary
This document is lecture notes for a marketing analytics course titled "Marketing Analytics." It covers topics such as segmentation, positioning, factor analysis, and perceptual maps.
B9651 – Marketing Analytics
Session 3: Segmentation & Positioning
Professor Hortense Fong

Logistics
Clustering Concept Check solutions will be posted Wed 1PM
Random_state implementation depends on hardware and package version
Group Assignment 1 (Ford Ka) due Monday, Sep 23 at 8PM
No need to format as a report; just answer the questions
The 4-page max does not include tables and graphs, which can be in an appendix
Contribution ratings

Last Time…
Where will we play?
Segmentation (S): discovering and profiling groups of customers with similar needs and preferences
Targeting (T): evaluating segment attractiveness and targeting the most attractive ones

Last Time…
Segmentation
Dividing the market into meaningful subsets of customers
Cluster analysis as a technique to group entities such that:
Objects within a group should be as similar as possible
Objects belonging to different groups should be as dissimilar as possible
Large, Identifiable, Distinctive, Stable (LIDS) and actionable (managerially relevant)
Targeting
Choosing attractive segments (customers, company, competitors)
Key learnings:
Segmentation and targeting and type of data used
How to segment the market (intuition + implementation)
Today: segmentation and positioning

Course Roadmap
Module 1: STP Analytics (Identify Value)
What datasets can we use?
How can we segment and target our customers?
How should we position our products/services?
Module 2: Customer Analytics (Deliver Value)
How much are our customers worth?
Are our customers leaving?
How do our customers make choices?
Module 3: 4P Analytics (Capture Value)
How do we build a new product?
How should we price our products?
How do we distribute them?
How do we quantify the impact of our promotions?

Today: Segmentation + Positioning
Part 1: Dimension Reduction Techniques
1. Big Data, Companies’ Perspectives
2. Factor Analysis + Implementation in Python
3. Novel Techniques
   1. Latent Dirichlet Allocation
   2. (Variational) Autoencoders
Part 2: Application to Segmentation
1.
DuPont
Part 3: Application to Positioning
1. Perceptual Maps (Beers)

Today’s Goals
Understand:
Factor analysis and recent dimension reduction techniques
How to perform segmentation with a large number of variables
How to build a perceptual map
Be able to:
Implement factor analysis in Python
Choose a suitable number of factors
Interpret the results of factor analysis

Dimension Reduction: Why?

Big Datasets
Last time: cluster analysis using 5 variables (questions)
But an important question remains: what if we have 800 variables?
Example: how much data can Netflix collect about you?
What you watch? When? Do you stop often? “Old-school” watcher vs binge watcher?
What is the issue?
Hard to interpret
Complexity
Some variables may be correlated

Company’s Perspective
Imagine a fashion brand performed a segmentation analysis of luxury brands
Segments along low vs high price and mass market vs exclusivity
What does the brand need?
Understand where the brand lives in customers’ minds

Multicollinearity - Bank Service Survey
A national bank wants to create a new “spin-off” brand, to target certain segments of the personal banking market.
To help position this new brand, they conducted a survey about consumers’ attitudes toward banking.
Scale from 1 (Strongly Disagree) to 10 (Strongly Agree):
Q1: Small banks charge less than large banks.
Q2: Large banks are more likely to make mistakes than small banks.
Q3: Tellers do not need to be extremely courteous and friendly; it’s enough for them simply to be civil.
Q4: I want to be known personally at my bank and be treated with special courtesy.
Q5: If a financial institution treated me in an impersonal or uncaring way, I would never patronize that organization again.

Regression Analysis: Bank Data
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \cdots + \beta_p X_p$$
$Y$ = dependent variable (also: outcome variable)
$X$s = predictor variables (also: features, independent variables)
$\beta$s = coefficients (also: parameters)
Note the jargon!
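To make the regression notation above concrete, here is a minimal sketch in plain NumPy. Everything in it is an assumption for illustration: the data are synthetic (two hypothetical latent attitudes generate five survey answers), the variable names are invented, and the variance inflation factor (VIF) is a standard diagnostic that the slides themselves do not name. It shows how questions that measure the same thing end up strongly correlated with each other.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500  # hypothetical number of respondents

# Two hypothetical latent attitudes drive the five answers (synthetic data).
small_banks = rng.normal(size=n)
personal_touch = rng.normal(size=n)


def noisy(signal):
    """Return the signal plus independent measurement noise."""
    return signal + 0.3 * rng.normal(size=n)


X = np.column_stack([
    noisy(small_banks),      # stands in for Q1
    noisy(small_banks),      # stands in for Q2
    noisy(-personal_touch),  # stands in for Q3 (worded in the opposite direction)
    noisy(personal_touch),   # stands in for Q4
    noisy(personal_touch),   # stands in for Q5
])

# Synthetic outcome Y (e.g. number of bank visits), driven by both attitudes.
y = 2.0 + 1.0 * small_banks + 0.5 * personal_touch + 0.5 * rng.normal(size=n)

# Ordinary least squares for Y = b0 + b1*X1 + ... + b5*X5.
design = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)

# Correlation matrix of the predictors and their VIFs
# (VIF_j is the j-th diagonal entry of the inverse correlation matrix).
corr = np.corrcoef(X, rowvar=False)
vif = np.diag(np.linalg.inv(corr))

print("coefficients:", np.round(beta, 2))
print("corr(Q1, Q2):", round(corr[0, 1], 2))
print("VIFs:", np.round(vif, 1))
```

In this simulated setting the within-block correlations are high and every VIF is well above 1, which is the symptom that motivates replacing the raw questions with a few uncorrelated factors.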
In the bank data: Y = number of times you go to the bank; Xs = survey questions

Linking Attitudes to Behaviors
Question: do consumers’ attitudes explain how much customers use the bank?
We can use a linear regression to estimate the linear relationship between these variables and banking activity
(Regression output annotated with $Y$, the $X$s, and the $\beta$s)
Multicollinearity!

Multicollinearity – Correlated Questions
Are these five questions truly independent, or are they measuring the same thing?
Can we convert many questions into a few independent factors?

Dimension Reduction: How?
Factor Analysis

Factor Analysis
Factor analysis is a statistical technique that aims to:
Replace an initial set of variables with a smaller number of “factors”
Factors reflect what sets of variables have in common with one another
One of multiple data or dimension reduction techniques:
Multidimensional Scaling
t-SNE
Autoencoders…

Factor Analysis: Intuition
Starting point: a set of variables (questions): $X_1, X_2, \ldots, X_p$
$X_1, X_2, \ldots, X_p$ may be derived from a few underlying “concepts” or factors
(Diagram: Factor 1, Factor 2, Factor 3 above $X_1, X_2, X_3, X_4, X_5, \ldots, X_p$)
We have to work backwards!
Problem: we observe $X_1, X_2, \ldots, X_p$, but not the factors!
→ Goals of factor analysis: what are these factors, how many are there, and how do they relate to the original X’s?

What are the Factors?
Our customers say we are doing well on Q1, Q2, Q3, but poorly on Q4, Q5, Q6, and on Q7 and Q8.
vs.
Our customers say we are doing well on value but poorly on quality and experience.
These are factors!
Interpretation:
Q1, Q2, Q3 are measuring value
Q4, Q5, Q6 are measuring quality
Q7, Q8 are measuring experience

Back to Banking: Survey
A national bank wants to create a new “spin-off” brand, to target certain segments of the personal banking market.
Scale from 1 (Strongly Disagree) to 10 (Strongly Agree):
Q1: Small banks charge less than large banks.
Q2: Large banks are more likely to make mistakes than small banks.
Q3: Tellers do not need to be extremely courteous and friendly; it’s enough for them simply to be civil.
Q4: I want to be known personally at my bank and be treated with special courtesy.
Q5: If a financial institution treated me in an impersonal or uncaring way, I would never patronize that organization again.
To help position this new brand, they conducted a survey about consumers’ attitudes toward banking.

What is the Factor Underlying Q1 & Q2?
On a scale from 1 (Strongly Disagree) to 10 (Strongly Agree):
Q1: Small banks charge less than large banks.
Q2: Large banks are more likely to make mistakes than small banks.

What is the Factor Underlying Q3, Q4 & Q5?
Q3: Tellers do not need to be extremely courteous and friendly; it’s enough for them simply to be civil.
Q4: I want to be known personally at my bank and be treated with special courtesy.
Q5: If a financial institution treated me in an impersonal or uncaring way, I would never patronize that organization again.

Back to Banking…
Two factors:
Q1-2: “Smaller banks are better”
Q3-5: “Personal touch”
Factor analysis looks for these “blocks” of correlation:
High correlation within blocks
Low correlation across blocks

Intuition: Clusters vs Factors
Cluster Analysis: using the columns to group the rows
Factor Analysis: using the rows to group the columns

Factor Analysis: The Math
Assumption: each variable ($X_1, X_2, \ldots, X_p$) can be represented as a linear combination of K underlying factors, $F_1, F_2, \ldots, F_K$
$$X_1 = l_{11}F_1 + l_{12}F_2 + \cdots + l_{1K}F_K + \epsilon_1$$
$$X_2 = l_{21}F_1 + l_{22}F_2 + \cdots + l_{2K}F_K + \epsilon_2$$
$$\vdots$$
$$X_p = l_{p1}F_1 + l_{p2}F_2 + \cdots + l_{pK}F_K + \epsilon_p$$
“Coefficients” = factor loadings (i.e., how much does that factor explain the X?)
Basically, a regression where X is the dependent variable and the factors are the independent variables!
But we do not know the factors.

What Makes a Useful Factor Structure?
Factors retain as much of the original information as possible, with the fewest number of factors
Dimensionality reduction: # factors < # variables

Steps to Factor Analysis with PCA
1. Estimate all the principal components (without rotation)
2. Determine the number of components (factors) to keep:
   1. Eigenvalues > 1, cumulative variance > 80%, scree plot, managerial relevance
3. Compute the rotated factor loading matrix to understand (and name!) the underlying factors
4. Compute the factor scores:
   1. For each observation, what are the values of the factors?

Banking: Full Analysis
Step 1: PCA, all components, no rotation
We use as many factors as variables
We will use the package factor_analyzer

Banking: Full Analysis
Step 2: Determine the number of components to keep
Criteria for keeping a component:
Variance (sum of squared loadings / eigenvalue) > 1
Cumulative variance > 80%
Scree plot elbow (elbow – 1)
(Scree plot, with the selected # factors marked)

Banking: Full Analysis
Step 3: Understand the retained, rotated factors
Factor loadings:
Q1, Q2 load highly on factor 2: the “Small feel” factor
-Q3, Q4, Q5 load highly on factor 1: the “Personal touch” factor
Q5 = 0.98 * RC1 – 0.08 * RC2

Communalities (h2)
How much of the variance in the original variable is captured by the common factors?
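Steps 1 to 3 above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the class notebook: it uses plain NumPy rather than the factor_analyzer package mentioned on the slides, the data are synthetic stand-ins for Q1 to Q5, and the `varimax` helper is a hypothetical hand-written version of the usual iterative algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
f1, f2 = rng.normal(size=n), rng.normal(size=n)  # hypothetical latent factors


def noisy(signal):
    """Return the signal plus independent measurement noise."""
    return signal + 0.3 * rng.normal(size=n)


# Synthetic stand-ins for Q1-Q2 (driven by f2) and Q3-Q5 (driven by f1).
X = np.column_stack([noisy(f2), noisy(f2), noisy(-f1), noisy(f1), noisy(f1)])

# Step 1: all principal components of the correlation matrix, no rotation.
corr = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)      # eigh returns ascending order
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Step 2: decide how many components to keep.
cum_var = np.cumsum(eigvals) / eigvals.sum()  # cumulative variance explained
k = int((eigvals > 1).sum())                  # "eigenvalue > 1" rule

# Unrotated loadings of the k retained components.
loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])


def varimax(L, n_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a loading matrix (hypothetical helper)."""
    p, m = L.shape
    R = np.eye(m)
    d = 0.0
    for _ in range(n_iter):
        LR = L @ R
        target = LR ** 3 - LR @ np.diag((LR ** 2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(L.T @ target)
        R = u @ vt
        d_new = s.sum()
        if d_new < d * (1 + tol):
            break
        d = d_new
    return L @ R


# Step 3: rotate, then read loadings and communalities.
rotated = varimax(loadings)
communalities = (rotated ** 2).sum(axis=1)    # row sums of squared loadings

print("eigenvalues:", np.round(eigvals, 2))
print("cumulative variance:", np.round(cum_var, 2))
print("rotated loadings:\n", np.round(rotated, 2))
print("communalities:", np.round(communalities, 3))
```

Because the rotation is orthogonal, the communalities are the same before and after rotating; only the way the explained variance is split across factors changes.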
97.4% of the variance in Q1 is explained by RC1 and RC2

Varimax Clarification
Varimax does not change the total amount of variance explained
(Loadings shown with no rotation vs with varimax rotation)
Important: PCA without rotation should be used to determine how many factors there are (once we rotate, we are changing the structure of the data); rotation helps with interpretation

Banking: Full Analysis
Step 4: Compute the factor scores (translation of original data into factors)
Multiply standardized data by principal components
These are the standardized scores (z-scores)
Respondent 3’s factor scores: RC1 = 1.28, RC2 = 1.02
Interpretation: Respondent 3 scores 1.28 SDs above the mean for RC1, 1.02 SDs above the mean for RC2.

Jargon Summary
Loadings = how the original variables relate to the factors
E.g., Q5 = 0.98 * RC1 – 0.08 * RC2
Communalities = how much variability in the original variables is explained by the factors
E.g., communality of Q1 is 0.974 → using 2 components is sufficient to approximate 97.4% of the variation in Q1 (i.e., the error that remains is small)
Scores = translation of original data into factors
E.g., Respondent 3 scores 1.28 SDs above the mean for RC1, 1.02 SDs above the mean for RC2

Let’s go to Python
PCA

Back to our Problematic Regression
Multicollinearity! How can we fix this?
One solution: Principal Components Regression
Super easy: replace original X variables with factor scores (reduced dimension X’s)!
Remember the goals of factor analysis:
o Reduce # variables
o Retain same information
Factors are uncorrelated → no multicollinearity

Back to our Problematic Regression
Regression on the factor scores
Interpretation?
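The factor-score step and the principal components regression described above can be sketched as follows. This is an illustration under assumptions, not the class solution: the data are synthetic, the sketch uses plain NumPy instead of factor_analyzer, and it scores the unrotated components to stay short (rotated scores would be an orthogonal re-mix of these).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
f1, f2 = rng.normal(size=n), rng.normal(size=n)  # hypothetical latent factors


def noisy(signal):
    """Return the signal plus independent measurement noise."""
    return signal + 0.3 * rng.normal(size=n)


# Synthetic survey answers and a synthetic outcome (e.g. bank activity).
X = np.column_stack([noisy(f2), noisy(f2), noisy(-f1), noisy(f1), noisy(f1)])
y = 2.0 + 0.5 * f1 + 1.0 * f2 + 0.5 * rng.normal(size=n)

# Standardize the data and keep the two largest principal components.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
corr = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1][:2]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Step 4: factor scores = standardized data times the components,
# rescaled so that each score column is a z-score.
scores = Z @ eigvecs / np.sqrt(eigvals)
score_corr = np.corrcoef(scores, rowvar=False)

# Principal components regression: regress Y on the scores, not on Q1-Q5.
design = np.column_stack([np.ones(n), scores])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)
fitted = design @ beta
r2 = 1 - ((y - fitted) ** 2).sum() / ((y - y.mean()) ** 2).sum()

print("scores of respondent 3:", np.round(scores[2], 2))
print("correlation between score columns:", round(score_corr[0, 1], 6))
print("coefficients on scores:", np.round(beta, 2))
print("R^2:", round(r2, 2))
```

The score columns are uncorrelated by construction, so the regression no longer suffers from multicollinearity. The sign of each component is arbitrary, so coefficient signs should be read together with the loadings.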
Interpretation:
Scoring higher on both factors is significantly associated with higher activity
Favoring small banks and need for personal touch are both significantly associated with higher activity

Takeaway: Factor Analysis Basics
The goal of factor analysis: uncover underlying structure between many variables
Good factors: uncorrelated, capture as much of the original variance as possible
Factors are often intuitive, easier to use, and managerially interesting

Dimension Reduction Techniques for Unstructured Data

Topic Modeling
Automatic summarization of documents through topics
Statistical definition: topic = set of commonly co-occurring words
Example: in tablet reviews, “Apple, iPad, iTunes, Mac” = Apple topic
Intuition: factor analysis for documents! Many words → few interpretable topics
Uses:
o Information retrieval and automatic labeling
o Discovering patterns
o Predicting outcomes from topics
Most common model: Latent Dirichlet Allocation (LDA)
In Python: sklearn, nltk, gensim
One use: as input to regression! “Which topics are predictive of my outcome?”

Latent Dirichlet Allocation (LDA)
Output 1: Which words belong to which topics (i.e., what are the topics)?
Note: You have to set the number of topics in advance!
Example topics and stemmed top words, in the order extracted from the slide:
Topic 2: "kindl" Topic 4 "screen" Topic 7 "great" Topic 8 "ipad" Topic 10 "problem"
"fire" "amazon" "good" "touch" "product" "love" "like" "much" "work" "day" "read" "book" "nice" "like" "purchas" "bought" "appl" "use" "back" "tri"
Output 2: Which topics best describe each document (i.e., what percentage of the words in a given document are from topic 1, topic 2, …)?
Example review:
“I love my fire and highly recommend it to anyone who wants to watch videos (netflix, hulu, amazon), read ebooks (purchased or from the local library), surf the net and play games. I work in the tech field and I LOVE apple entertainment products (I own many apple products and at work I work with several). I am very thrilled with my fire (I LOVE IT TOO!) because it works great as an entertainment product (and more affordable than my apple products). I also think the fire is a great product because of Amazons cloud and support”
Topic   Proportion
1       0.09
2       0.15
3       0.10
4       0.05
5       0.05
6       0.07
7       0.17
8       0.12
9       0.07
10      0.12

Image Analysis

What Makes Art Valuable?
Mark Rothko (1903-1970), No. 17 (1957): $32,645,000 (Christies 2016), Post-War and Contemporary Art Evening Sale
Theodoros Stamos (1922-1997), Infinity Field, Lefkada Series #4 (1978): $10,625 (Christies 2013), Interiors

Convolutional Neural Networks
Go-to algorithms for computer vision tasks
Dominate the ImageNet competition
“Convnets” learn:
translation invariant patterns
spatial hierarchy of patterns

How Do “Convnets” Work?

Variational Auto-Encoders
Deep generative model: assumes images are generated by a statistical process
A VAE contains two parts:
Encoder: takes an image as input and compresses its information into latent parameters
Decoder: takes the latent space representation as input and outputs a reconstruction of the original image
Latent parameters are used as predictors

Example – 100 Factors - The Scream
(Image pair: decrease factor 2 vs increase factor 2)
Parameter captures reddish hue in upper part of the painting

What Can We Do With This Information?
For art:
See which features correlate with higher prices
See which paintings were most influential and creative over time
For Marketing?

Application of Factor Analysis to Segmentation

Context: DuPont and B2B Marketing
DuPont is a large chemical company that sells industrial chemicals, synthetic fibers, pharmaceuticals, building materials, agricultural chemicals, etc. as inputs to other businesses’ manufacturing.
Context: DuPont and B2B Marketing
DuPont collected mail survey data for 58 respondents
One set of questions was about interests of the company, their size, etc.
Demographic/Background:
Exp1 = interest in exporting (1=L, 2=M, 3=H)
Size = Number of employees in thousands
Revenue = Amount sold to that company by DuPont in $MM
Years = Number of years as a DuPont customer
Numprod = Number of products that they buy from DuPont
One set of questions involved satisfaction with DuPont in 5 primary areas
Survey:
Q1-Q4 = questions about quality
TS1-TS3 = questions about tech support
SM1-SM2 = questions about sales and marketing support
SD1-SD7 = questions about supply and delivery
INN1-INN3 = questions about innovation

DuPont Analysis Goals
Segmentation: are there different segments in the existing customer base, and if so, how do they differentially drive revenue?
Feedback: Can DuPont make this survey better next time?

Cluster Analysis
Perhaps the drivers of revenue differ by segment
K-means clustering with 2 clusters
Too many variables! It won’t be easy to come up with clear segment descriptions.
Solution: factor analysis

Why do we need factor analysis?
Example questions (PRODUCT QUALITY), each followed by a rating blank:
01 THE RANGE OF CHOICES IN THE (product) PRODUCT LINE
02 THE CONSISTENCY OF (product) QUALITY FROM LOT TO LOT
03 THE WAY (product) PROCESSES IN YOUR MANUFACTURING OPERATIONS
04 THE WAY (product) PERFORMS IN YOUR FINISHED PRODUCTS

Let’s go to Python
DuPont Factor Analysis
In groups, perform k-means clustering with 2 groups on DuPont survey data.
15 minutes
Set random_state = 1690

Steps to Factor Analysis with PCA
1. Estimate all the principal components (without rotation)
2. Determine the number of components (factors) to keep:
   1. Eigenvalues > 1, cumulative variance > 80%, scree plot, managerial relevance
3. Compute the rotated factor loading matrix to understand (and name!) the underlying factors
4. Compute the factor scores:
   1.
For each observation, what are the values of the factors?

Four Factor Solution
Unrotated PCA to determine # of factors
68% variance explained

Four Factor Solution
Notice: several communalities are pretty low (< 0.6)
Don’t give high weight to these questions when interpreting factors
Consider including these separately in analysis

Four Factor Solution
Interpreting the factors:
1. Combination of: sales and marketing support, complaints handling, innovativeness
   Basically: a “general competence factor” or maybe “customer centricity”
2. Quality
3. Delivery
4. Technical expertise
Not what the survey designer expected! The survey was designed around:
Q1-Q4 = questions about quality
TS1-TS3 = questions about tech support
SM1-SM2 = questions about sales and marketing support
SD1-SD7 = questions about supply and delivery
INN1-INN3 = questions about innovation

Interpretable Segmentation
Idea: run cluster analysis on the factor scores
Remember: factor scores are already normalized
Interpretation: segments differ in their ratings along the four factors!
Segments
1. The Mainstream
2.
The Haters

Summary: Segmentation
Factor analysis can be used to:
Uncover structure in survey questions
Reduce dimensionality for more interpretable segmentation
Focus attention on key factors
Factor and cluster (and regression) analysis can all be linked!
Real data is messy: very rarely is the story clean cut

Application to Positioning
Where will we play?
Segmentation (S): discovering and profiling groups of customers with similar needs and preferences
Targeting (T): evaluating segment attractiveness and targeting the most attractive ones
How will we win?
Positioning (P): defining the value proposition for target segments and developing a marketing plan

What is Positioning?
Placing the product/service with respect to alternatives in the mind of the customer (Ries and Trout)
Positioning statement:
Who is the product for?
What does the product have to offer?
How is the product different?
In other words: “How does my company deliver value to my (target) customer better than the competition?”

Product Differentiation & Positioning
“There is no such thing as a commodity” (Theodore Levitt)
“No matter how commonplace a product may appear, it does not have to be a commodity. Every product, every service can be differentiated” (Dermot Dunphy, CEO, Sealed Air)
Differentiation can be achieved on:
Product benefits
Quality of customer service
Psychological associations of brand, …
Positioning: the image created in the minds of target consumers relative to other brands in the category

Factor Analysis: Relevance to Positioning
Understanding how many variables in a dataset capture a few unique constructs can be helpful for brand positioning
What factors are relevant in determining a brand’s position in the marketplace?
Tool: Perceptual Maps for Brands

What Actually is a Perceptual Map?
Visual representation of how target customers view competing alternatives
Dimensionality reduction: attitudes, opinions, survey questions, … → two dimensions
Characteristics:
Axes: underlying dimensions that characterize how customers differentiate among alternatives
Distance: pairwise distances between alternatives directly indicate how close or far apart the products are in the minds of customers

Uses of Perceptual Maps
Understanding market structure in the minds of consumers:
How do my customers perceive my brand?
Who are my key competitors?
How can I communicate my brand positioning in a way that is consistent with my customers’ views, or with how I want my brand to be seen?
Problem detection: do people see us like we see us?
Differentiated positioning and/or new product development:
Look for a “hole” in the map! (But also ask: why is there a hole?)

Perceptual Maps – Beer Brands
Example: Rate 20 different beers on 6 dimensions: Taste, Refreshing, Quality, Alcohol Content, High Class, Expensive
Average scores by beer:
Beer            Taste  Refreshing  Quality  Alcohol  Class  Expensive
Budweiser       1.6    2.4         1.4      2.5      1.1    1.4
Bud Light       1.1    2.8         1.1      1.4      1.5    1.1
Miller Light    1.4    2.5         1.1      1.1      1.4    1.5
…
Stella Artois   4.1    3.4         2.8      2.6      4.6    4.4
Victory Lager   4.9    4.1         4.9      3.8      3.6    4.6
Chimay          4.4    2.5         4.9      4.9      4.8    4.6
What dimensions underlie consumers’ judgements?

Steps: Factor Analysis for Perceptual Maps
1. Use factor analysis to convert many judgments into 2+ underlying dimensions
   1. Two factors is ideal: results in a single perceptual map (two dimensions)
   2. More than two factors requires one map per pair of factors
2. Name and interpret the dimensions → axes of the map
3. Plot the factor scores → positions on the map

Let’s make a map using Python!
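The three steps above can be sketched with the six beers whose average scores appear in the table. This is a rough illustration under assumptions: it uses only the six rows shown (the other beers are elided on the slide and are not invented here), plain NumPy instead of factor_analyzer, and unrotated components, so the axes need not match the factors named in class.

```python
import numpy as np

beers = ["Budweiser", "Bud Light", "Miller Light",
         "Stella Artois", "Victory Lager", "Chimay"]
attributes = ["Taste", "Refreshing", "Quality", "Alcohol", "Class", "Expensive"]

# Average scores by beer, copied from the rows shown in the table above.
ratings = np.array([
    [1.6, 2.4, 1.4, 2.5, 1.1, 1.4],
    [1.1, 2.8, 1.1, 1.4, 1.5, 1.1],
    [1.4, 2.5, 1.1, 1.1, 1.4, 1.5],
    [4.1, 3.4, 2.8, 2.6, 4.6, 4.4],
    [4.9, 4.1, 4.9, 3.8, 3.6, 4.6],
    [4.4, 2.5, 4.9, 4.9, 4.8, 4.6],
])

# Step 1: force two components of the attribute correlation matrix.
Z = (ratings - ratings.mean(axis=0)) / ratings.std(axis=0)
corr = np.corrcoef(ratings, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1][:2]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
explained = eigvals.sum() / len(attributes)  # share of variance kept

# Step 2: loadings say how each attribute relates to each axis.
loadings = eigvecs * np.sqrt(eigvals)

# Step 3: factor scores are the map positions of the beers.
scores = Z @ eigvecs / np.sqrt(eigvals)

print(f"variance explained by two axes: {explained:.0%}")
for attr, (a1, a2) in zip(attributes, loadings):
    print(f"{attr:<12} axis1={a1:+.2f} axis2={a2:+.2f}")
for beer, (x1, x2) in zip(beers, scores):
    print(f"{beer:<14} x={x1:+.2f} y={x2:+.2f}")
```

To draw the map, the `scores` rows can be passed to any scatter-plot routine as (x, y) points labeled with the beer names, with the `loadings` rows drawn as arrows from the origin.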
Beer: Factor Analysis
Force two factors
Two factors explain a high proportion of the variance
Factor 1: Quality
Factor 2: Refreshing

Perceptual Map: Plot the Factor Scores
Points are the beers’ factor scores (Blue Moon’s factor scores are highlighted)
Projection of variables onto 2D space of first 2 components
Angle: closer angle = higher positive association
Direction: association of variables with components

Perceptual Map: Plot the Factor Scores
If Corona wants to be viewed as higher quality, what are some marketing strategies for doing so?

Summary: Positioning
Perceptual maps are useful tools for understanding positioning and developing brand and product strategy
Understanding the competitive environment in the minds of consumers
Look for “holes” (but always ask why!)
Data-driven perceptual maps can be created through factor analysis:
Dimensions = factors
Map positions = factor scores

Takeaways: Factor Analysis
Factor analysis: turn complex data into intuitive and meaningful factors
Lots of jargon: loadings, variance explained, scores, …
How it works: principal components analysis
Many applications:
Simplifying data for easy description
Factor + cluster analysis for segmentation
Positioning through perceptual maps

Next Class
Positioning Concept Check due before next class
We will review Ford Ka and study CLV