Multiple Discriminant Analysis PDF
Summary
This document defines Multiple Discriminant Analysis (MDA) and illustrates it with examples. The slides cover the definition of MDA, a set of survey results, and the six stages of the analysis: objectives, research design, assumptions, estimation and assessment of overall fit (calculating discriminant Z scores, classifying objects, and evaluating classification accuracy), interpretation of the results, and validation.
Full Transcript
Discriminant Analysis Defined

Multiple discriminant analysis... is an appropriate technique when the dependent variable is categorical (nominal or nonmetric) and the independent variables are metric. The single dependent variable can have two, three or more categories.

Examples:
Gender – Male vs. Female
Heavy Users vs. Light Users
Purchasers vs. Non-purchasers
Good Credit Risk vs. Poor Credit Risk
Member vs. Non-Member
Attorney, Physician or Professor

Survey Results for the Evaluation* of a New Consumer Product

Purchase Intention            Subject Number   X1 Durability   X2 Performance   X3 Style
Group 1: Would purchase       1                8               9                6
                              2                6               7                5
                              3                10              6                3
                              4                9               4                4
                              5                4               8                2
Group Mean                                     7.4             6.8              4.0
Group 2: Would not purchase   6                5               4                7
                              7                3               7                2
                              8                4               5                5
                              9                2               4                3
                              10               2               2                2
Group Mean                                     3.2             4.4              3.8
Difference between group means                 4.2             2.4              0.2

*Evaluations made on a 0 (very poor) to 10 (excellent) rating scale.

Stage 1: Objectives of Discriminant Analysis

1. Determine if statistically significant differences exist between the two (or
2. Identify the relative importance of each of the independent variables in predicting group membership.
3. Establish the number and composition of the dimensions of discrimination
4. Develop procedures for classifying objects (individuals, firms, products, etc.) into groups, and then examining the predictive accuracy (hit ratio) of the discriminant function to see if it is acceptable (> 25% increase).

Stage 2: Research Design for Discriminant Analysis

Selection of dependent and independent variables.
Sample size (total & per variable).
Sample division for validation.

Rules of Thumb 5–1 Continued...

✓ have 20 cases per independent variable, with a minimum recommended level of 5 observations per variable.
✓ have at least one more observation per group than the number of independent variables, but striving for at least 20 cases per group.

Stage 3: Assumptions of Discriminant Analysis

Other Assumptions
Minimal multicollinearity among independent variables.
Group sample sizes relatively equal.
Linear relationships.
Elimination of outliers.

Stage 4: Estimation of the Discriminant Model and Assessing Overall Fit

Selecting An Estimation Method:
1. Simultaneous Estimation – all independent variables are considered concurrently.
2. Stepwise Estimation – independent variables are entered into the discriminant function one at a time.

Estimating the Discriminant Function

The stepwise procedure begins with all independent variables not in the model, and selects variables for inclusion based on:
Statistically significant differences across the groups (.05 or less required for entry), and
The largest Mahalanobis distance (D²) between the groups.

Assessing Overall Model Fit

Calculating discriminant Z scores for each observation,
Evaluating group differences on the discriminant Z scores, and
Assessing group membership prediction accuracy.

Classification Matrix
HBAT’s New Consumer Product

                  Predicted Group
Actual Group      Would Purchase   Would Not Purchase   Actual Total   Percent Correct Classification
(1)               22               3                    25             88%
(2)               5                20                   25             80%
Predicted Total   27               23                   50

Percent Correctly Classified (hit ratio) = 100 x [(22 + 20)/50] = 84%

Stage 5: Interpretation of the Results

Three Methods:
1. Standardized discriminant weights,
2. Discriminant loadings (structure correlations), and
3. Partial F values.

Interpretation of the Results

Two or More Functions:
1. Rotation of discriminant functions.
2. Potency index.

Graphical Display of Discriminant Scores and Loadings

Territorial Map = most common method.
Vector Plot of Discriminant Loadings, preferably the rotated loadings = simplest approach.

Plotting Procedure for Vectors

Three Steps:
1. Selecting variables,
2. Stretching the vectors, and
3. Plotting the group centroids.
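The hit ratio reported with the classification matrix above can be reproduced with a few lines of Python. This is only a sketch: the dictionary layout and the function names are illustrative, and the acceptability check assumes that the "> 25% increase" criterion from Stage 1 means 1.25 times a chance accuracy of 50%, which is the chance rate for two equal-sized groups. The transcript does not spell out how chance is defined, so treat that part as an assumption.

```python
# Classification matrix from the transcript:
# outer key = actual group, inner key = predicted group.
matrix = {
    "would purchase": {"would purchase": 22, "would not purchase": 3},
    "would not purchase": {"would purchase": 5, "would not purchase": 20},
}


def hit_ratio(m):
    """Percent of all cases that fall on the diagonal (correctly classified)."""
    total = sum(sum(row.values()) for row in m.values())
    correct = sum(m[group][group] for group in m)
    return 100 * correct / total


def group_accuracy(m):
    """Percent correctly classified within each actual group."""
    return {group: 100 * m[group][group] / sum(m[group].values()) for group in m}


overall = hit_ratio(matrix)
per_group = group_accuracy(matrix)

# Assumption (not stated in the transcript): with two groups of 25 cases each,
# chance accuracy is taken as 50%, and "acceptable" means at least 25% above it.
chance = 50.0
threshold = 1.25 * chance
acceptable = overall > threshold

print(f"hit ratio: {overall:.0f}%")
for group, pct in per_group.items():
    print(f"{group}: {pct:.0f}% correct")
print(f"threshold: {threshold}%, acceptable: {acceptable}")
```

The per-group figures matter as much as the overall ratio: a function can reach a high hit ratio while classifying one group poorly, which the single overall number would hide.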
Rules of Thumb 5–4

Interpreting and Validating Discriminant Functions

Discriminant loadings are the preferred method to assess the contribution of each variable to a discriminant function because they are:
✓ a standardized measure of importance (ranging from 0 to 1).
✓ available for all independent variables whether used in the estimation process or not.
✓ unaffected by multicollinearity.

Loadings exceeding ±.40 are considered substantive for interpretation purposes.

If there is more than one discriminant function, be sure to:
✓ use rotated loadings.
✓ assess each variable’s contribution across all the functions with the potency index.

The discriminant function must be validated either with a holdout sample or one of the “Leave one out” procedures.

Stage 6: Validation of the Results

Utilizing a Holdout Sample.
Cross-Validation.
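To make the "leave one out" idea concrete, the sketch below applies it to the ten survey ratings listed earlier in the transcript. It is an illustration under stated assumptions, not the procedure behind the slides: it fits Fisher's two-group linear discriminant function in plain Python, uses the midpoint between the two group centroids as the cutting score (which implicitly assumes equal prior probabilities), and all function names are hypothetical. Each case is classified by a function estimated from the other nine cases.

```python
# Survey ratings from the transcript: (X1 durability, X2 performance, X3 style).
GROUP_1 = [(8, 9, 6), (6, 7, 5), (10, 6, 3), (9, 4, 4), (4, 8, 2)]  # would purchase
GROUP_2 = [(5, 4, 7), (3, 7, 2), (4, 5, 5), (2, 4, 3), (2, 2, 2)]   # would not purchase


def mean_vector(rows):
    """Column means of a list of equal-length tuples."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def pooled_scatter(groups):
    """Sum of within-group cross-product matrices.
    Rescaling this matrix rescales Z but does not change any classification."""
    p = len(groups[0][0])
    s = [[0.0] * p for _ in range(p)]
    for rows in groups:
        m = mean_vector(rows)
        for r in rows:
            d = [x - mu for x, mu in zip(r, m)]
            for i in range(p):
                for j in range(p):
                    s[i][j] += d[i] * d[j]
    return s


def solve(a, b):
    """Solve the linear system a w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (aug[i][n] - tail) / aug[i][i]
    return w


def z_score(w, x):
    """Discriminant Z score: weighted sum of the independent variables."""
    return sum(wi * xi for wi, xi in zip(w, x))


def fit(g1, g2):
    """Return (weights, cutting score) for the two-group discriminant function."""
    m1, m2 = mean_vector(g1), mean_vector(g2)
    w = solve(pooled_scatter([g1, g2]), [a - b for a, b in zip(m1, m2)])
    # Assumption: cutting score is the midpoint between the group centroids.
    cut = (z_score(w, m1) + z_score(w, m2)) / 2
    return w, cut


def leave_one_out(g1, g2):
    """Classify each case with a function estimated without that case."""
    cases = [(x, 1) for x in g1] + [(x, 2) for x in g2]
    predictions = []
    for i, (x, _) in enumerate(cases):
        train = cases[:i] + cases[i + 1:]
        w, cut = fit([c for c, g in train if g == 1],
                     [c for c, g in train if g == 2])
        # Group 1's centroid always scores above the cut with these weights.
        predictions.append(1 if z_score(w, x) > cut else 2)
    hits = sum(p == g for p, (_, g) in zip(predictions, cases))
    return predictions, 100 * hits / len(cases)


predictions, loo_hit_ratio = leave_one_out(GROUP_1, GROUP_2)
print("leave-one-out predictions:", predictions)
print(f"leave-one-out hit ratio: {loo_hit_ratio:.0f}%")
```

With only ten cases the resulting hit ratio is unstable and falls well short of the sample sizes recommended in Stage 2, so the example shows the mechanics of the validation step rather than a trustworthy accuracy estimate.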