Multinomial Probit and Logit Models
Ani Katchova
Summary
This document provides an overview of multinomial probit and logit models, including conditional and mixed models. It details different types of dependent and independent variables and model applications, such as the choice of insurance contract, products, or occupations.
Full Transcript
Multinomial Probit and Logit Models
Conditional Logit Model
Mixed Logit Model

© 2013 by Ani Katchova. All rights reserved.

Multinomial, Conditional, and Mixed Models

Overview

Multinomial outcome dependent variable (in wide and long form of data sets)
Independent variables (alternative-invariant or alternative-variant)
Multinomial logit model (coefficients, marginal effects, IIA) and multinomial probit model
Conditional logit model (coefficients, marginal effects)
Mixed logit model

Multinomial outcome examples

The type of insurance contract that an individual selects.
The product that an individual selects (say, type of cereal).
Occupational choice by an individual (business, academic, non-profit organization).
The choice of fishing mode (beach, pier, private boat, charter boat).

Multinomial outcome dependent variable

The dependent variable y is a categorical, unordered variable. An individual may select only one alternative.
The choices/categories are called alternatives and are coded as $j = 1, 2, \ldots, m$. The numbers are only codes and their magnitude cannot be interpreted (to summarize the dependent variable, use the frequency of each category instead of the mean).
The data are usually recorded in one of two formats: a wide format or a long format.

When using the wide format, the data for each individual i are recorded on one row. The dependent variable is:

\[ y = j \]

When using the long format, the data for each individual i are recorded on m rows, where m is the number of alternatives. The dependent variable is:

\[ y_j = \begin{cases} 1 & \text{if } y = j \\ 0 & \text{if } y \neq j \end{cases} \]

Therefore, $y_j = 1$ if alternative j is the observed outcome and the remaining $y_k = 0$. For each observation, only one of $y_1, y_2, \ldots, y_m$ will be non-zero.
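The long-format coding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`id`, `alt`, `y_j`, `income`, `price`) and the helper `wide_to_long` are hypothetical and do not come from the notes; the values are those of the three-person juice example used in these notes.

```python
# Wide-format records: one row per person; y holds the code of the chosen alternative.
# Values follow the juice example in these notes (1 = apple juice, 2 = orange juice).
# All field names here are hypothetical, chosen for illustration only.
wide = [
    {"id": 1, "y": 1, "income": 40000, "price": {1: 2.5, 2: 1.5}},
    {"id": 2, "y": 2, "income": 38000, "price": {1: 2.7, 2: 1.7}},
    {"id": 3, "y": 2, "income": 50000, "price": {1: 2.9, 2: 1.6}},
]


def wide_to_long(rows, alternatives):
    """Return one row per (person, alternative) with y_j = 1 if y == j, else 0."""
    long_rows = []
    for row in rows:
        for j in alternatives:
            long_rows.append({
                "id": row["id"],
                "alt": j,
                "y_j": 1 if row["y"] == j else 0,
                "income": row["income"],    # case-specific: repeated on every row
                "price": row["price"][j],   # alternative-specific: differs by row
            })
    return long_rows


long_data = wide_to_long(wide, alternatives=[1, 2])
for record in long_data:
    print(record)
```

Each person contributes m = 2 rows, and exactly one of those rows carries `y_j = 1`.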
Example for multinomial data in wide form

Person ID (i) | Dependent variable (y) | Code for y | w_i (income) | x_i1 (price of alternative 1) | x_i2 (price of alternative 2)
1 | apple juice (alternative 1) | y = 1 | 40,000 | 2.5 | 1.5
2 | orange juice (alternative 2) | y = 2 | 38,000 | 2.7 | 1.7
3 | orange juice (alternative 2) | y = 2 | 50,000 | 2.9 | 1.6

Example for multinomial data in long form

Person ID (i) | Dependent variable (y_j) | Code for y_j | w_i (income) | x_ij (price)
1 | apple juice (alternative 1) | y_1 = 1 | 40,000 | 2.5
1 | orange juice (alternative 2) | y_2 = 0 | 40,000 | 1.5
2 | apple juice (alternative 1) | y_1 = 0 | 38,000 | 2.7
2 | orange juice (alternative 2) | y_2 = 1 | 38,000 | 1.7
3 | apple juice (alternative 1) | y_1 = 0 | 50,000 | 2.9
3 | orange juice (alternative 2) | y_2 = 1 | 50,000 | 1.6

The multinomial density for one observation is defined as:

\[ f(y) = p_1^{y_1} \times \cdots \times p_m^{y_m} = \prod_{j=1}^{m} p_j^{y_j} \]

The probability that individual i chooses the jth alternative is:

\[ p_{ij} = \Pr[y_i = j] = F_j(x_i, \beta) \]

The functional form of $F_j$ should be selected so that the probabilities lie between 0 and 1 and sum to one over j.
Different functional forms of $F_j$ lead to multinomial, conditional, mixed, and ordered logit and probit models.

Independent variables

There are two types of independent variables.

Alternative-invariant or case-specific regressors: the regressors $w_i$ vary over the individual i but do not vary over the alternative j.
  o Income, age, and education are different for each individual but they do not vary based on the type of product that the individual selects.
  o Used in the multinomial logit model.

Alternative-variant or alternative-specific regressors: the regressors $x_{ij}$ vary over the individual i and the alternative j.
  o Prices for products vary for each product, and individuals may also pay different prices.
  o Salaries may differ between occupations and also for each individual.
  o Used in the conditional and mixed logit models.

Multinomial logit model

The multinomial logit model is used with alternative-invariant regressors.
The probability that individual i will select alternative j is:

\[ p_{ij} = p(y_i = j) = \frac{\exp(w_i' \gamma_j)}{\sum_{k=1}^{m} \exp(w_i' \gamma_k)} \]

This model is a generalization of the binary logit model.
The probabilities for choosing each alternative sum up to 1: $\sum_{j=1}^{m} p_{ij} = 1$.
One set of coefficients needs to be normalized to zero to estimate the model (usually $\gamma_1 = 0$), so there are (m-1) sets of coefficients estimated.
The coefficients of other alternatives are interpreted in reference to the base outcome.
Coefficient interpretation for alternative j: in comparison to the base alternative, an increase in the independent variable makes the selection of alternative j more or less likely.

Marginal effects

The marginal effect of an increase of a regressor on the probability of selecting alternative j is:

\[ \frac{\partial p_{ij}}{\partial w_i} = p_{ij} (\gamma_j - \bar{\gamma}_i), \qquad \bar{\gamma}_i = \sum_{k=1}^{m} p_{ik} \gamma_k \]

where $\bar{\gamma}_i$ is the probability-weighted average of the coefficients.
The marginal effects do not necessarily correspond in sign to the coefficients (unlike the binary logit or probit model).
There are (m-1) sets of coefficients because one set is normalized to zero, but there are m sets of marginal effects.
Depending on which alternative we select as the base category, the coefficients will be different (in reference to the base category), but the marginal effects will be the same regardless of the base category.
The marginal effects of each variable on the different alternatives sum up to zero.
Marginal effects interpretation: each unit increase in the independent variable increases/decreases the probability of selecting alternative j by the marginal effect, expressed in percentage points.

Independence from Irrelevant Alternatives (IIA) property

The odds ratios in the multinomial logit model are independent of other alternatives. For choices j and k, the odds ratio depends only on the coefficients for choices j and k.

Odds ratio:

\[ \frac{p_{ij}}{p_{ik}} = \exp\{ w_i' (\gamma_j - \gamma_k) \} \]

This weakness of the multinomial logit model is known as the red bus-blue bus problem.
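The multinomial logit formulas above can be checked numerically. The following is a minimal sketch, not an estimation routine: the coefficient values, the regressor vector `w`, and the function names `mnl_probs` and `mnl_marginal_effects` are all hypothetical and chosen only for illustration. It computes the choice probabilities, the marginal effects, and the odds ratio of two alternatives with and without a third alternative in the choice set.

```python
import math


def mnl_probs(w, gammas):
    """p_j = exp(w'gamma_j) / sum_k exp(w'gamma_k) for one individual."""
    scores = [sum(wi * gi for wi, gi in zip(w, g)) for g in gammas]
    top = max(scores)  # subtracting the max leaves the ratios unchanged
    expo = [math.exp(s - top) for s in scores]
    total = sum(expo)
    return [e / total for e in expo]


def mnl_marginal_effects(w, gammas):
    """dp_j/dw = p_j * (gamma_j - gamma_bar), gamma_bar = sum_k p_k * gamma_k."""
    p = mnl_probs(w, gammas)
    n_reg = len(w)
    gamma_bar = [sum(p[k] * gammas[k][r] for k in range(len(gammas)))
                 for r in range(n_reg)]
    return [[p[j] * (gammas[j][r] - gamma_bar[r]) for r in range(n_reg)]
            for j in range(len(gammas))]


# Hypothetical values: w = (constant, income in $10,000s).
# The first coefficient set is normalized to zero (base alternative).
w = [1.0, 4.0]
gammas = [[0.0, 0.0], [0.5, -0.1], [-1.0, 0.3]]

p = mnl_probs(w, gammas)
me = mnl_marginal_effects(w, gammas)
print("probabilities:", p)
print("marginal effects of income, by alternative:", [row[1] for row in me])

# IIA: the odds of alternative 2 versus alternative 1 do not change
# when alternative 3 is removed from the choice set.
p_two = mnl_probs(w, gammas[:2])
odds_with_three = p[1] / p[0]
odds_with_two = p_two[1] / p_two[0]
print("odds (3 alternatives):", odds_with_three)
print("odds (2 alternatives):", odds_with_two)
```

In this sketch the marginal effect of income on the base alternative is non-zero even though its coefficient is normalized to zero, which illustrates why coefficient signs and marginal-effect signs need not agree.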
If the choice is between a car and a blue bus, then according to the model the introduction of a red bus will not change the odds of choosing the car relative to the blue bus, even though in practice the red bus would be expected to draw riders mainly from the blue bus.

Multinomial probit model

The multinomial probit model is similar to the multinomial logit model, just like the binary probit model is similar to the binary logit model.
The difference is that it uses the standard normal cdf.
The probability that observation i will select alternative j is:

\[ p_{ij} = p(y_i = j) = \Phi(x_{ij}' \beta) \]

It takes longer for a probit model to obtain results, because the choice probabilities involve normal integrals that have no closed form.
The coefficients differ from those of the logit model by a scale factor.
The marginal effects will be similar.

Conditional logit model

The conditional logit model is used with alternative-invariant and alternative-variant regressors.
The probability that observation i will choose alternative j is:

\[ p_{ij} = p(y_i = j) = \frac{\exp(x_{ij}' \beta + w_i' \gamma_j)}{\sum_{k=1}^{m} \exp(x_{ik}' \beta + w_i' \gamma_k)} \]

where $x_{ij}$ are alternative-specific regressors and $w_i$ are case-specific regressors.
The conditional logit model has (m-1) sets of coefficients ($\gamma_j$) for the case-specific regressors (with one set being normalized to zero) and only one set of coefficients ($\beta$) for the alternative-specific regressors.
The probabilities for choosing each alternative sum up to 1.

Coefficients for the alternative-invariant regressors $\gamma_j$ (similar treatment as in the multinomial logit model).
  o One set of coefficients for the alternative-invariant regressors is normalized to zero (say $\gamma_1 = 0$); this is the base outcome. The rest of the coefficients are interpreted in relation to this base category.
  o There are (m-1) sets of coefficients (corresponding to the number of alternatives minus 1 for the base).
  o Coefficient interpretation for alternative j: in comparison to the base alternative, an increase in the independent variable makes the selection of alternative j more or less likely.

Coefficients for the alternative-specific regressors ($\beta$).
  o No normalization is needed.
  o One set of coefficients across all alternatives.
  o Coefficient interpretation: when the price coefficient is negative, an increase in the price of one alternative decreases the probability of choosing that alternative and increases the probability of choosing other alternatives.

Marginal effects

The marginal effect of an increase of a regressor on the probability of selecting alternative j is:

\[ \frac{\partial p_{ij}}{\partial x_{ik}} = p_{ij} (\delta_{ijk} - p_{ik}) \beta \]

where $\delta_{ijk} = 1$ if j = k and 0 otherwise.
There are m sets of marginal effects for both the alternative-specific and case-specific regressors.
For each alternative-specific variable $x_{ij}$, there are m x m sets of marginal effects.
The marginal effects of each variable on the different alternatives sum up to zero.
Marginal effects interpretation: when $\beta$ is positive, each unit increase in the independent variable for the kth alternative increases the probability of selecting the kth alternative and decreases the probability of the other alternatives, by the marginal effect expressed in percentage points. The signs reverse when $\beta$ is negative.

Mixed logit model

The mixed logit model (also called the random parameters logit model) specifies the utility to the ith individual for the jth alternative to be:

\[ U_{ij} = x_{ij}' \beta_i + w_i' \gamma_{ji} + \varepsilon_{ij} = x_{ij}' \beta + w_i' \gamma_j + x_{ij}' v_i + w_i' \delta_{ji} + \varepsilon_{ij} \]

where $\varepsilon_{ij}$ are iid extreme value (similar to the errors in the conditional logit model).
The mixed logit model allows for the parameters $\beta_i$ and $\gamma_{ji}$ to be random.
A common assumption is that $\beta_i = \beta + v_i$ where $v_i \sim N[0, \Sigma_\beta]$, and $\gamma_{ji} = \gamma_j + \delta_{ji}$ where $\delta_{ji} \sim N[0, \Sigma_{\gamma_j}]$.
The introduction of the random parameters has the attractive property of inducing correlation across alternatives.
The combined error $x_{ij}' v_i + w_i' \delta_{ji} + \varepsilon_{ij}$ is now correlated across alternatives. For example, when only $\beta_i$ is random, $\mathrm{Cov}[x_{ij}' v_i + \varepsilon_{ij},\; x_{ik}' v_i + \varepsilon_{ik}] = x_{ij}' \Sigma_\beta x_{ik}$ for $j \neq k$.
The probability that individual i selects alternative j represents a mixed logit model:

\[ p_{ij} = p(y_i = j) = \frac{\exp(x_{ij}' \beta + w_i' \gamma_j + x_{ij}' v_i + w_i' \delta_{ji})}{\sum_{k=1}^{m} \exp(x_{ik}' \beta + w_i' \gamma_k + x_{ik}' v_i + w_i' \delta_{ki})} \]

This probability is conditional on the random terms $v_i$ and $\delta_{ji}$; the unconditional probability averages it over their distribution.
The mixed logit model relaxes the IIA assumption by allowing parameters in the conditional logit model to be normally (or log-normally) distributed.
When estimating the mixed logit model, the researcher needs to specify which parameters will be estimated as random.
If a parameter is random, this implies that the effect of a particular regressor on the chosen alternative varies across individuals.
For each random parameter, the mixed logit model produces two estimates: the mean coefficient on the regressor (x_i) and the standard deviation of that coefficient (sd(x_i)).
Coefficient interpretation for the regressors (x_i): when the independent variable increases, consumers are more or less likely to choose this alternative.
Coefficient interpretation for the standard deviation of a regressor (sd(x_i)): a non-zero standard deviation indicates heterogeneity across individuals in the effect of the independent variable on the alternative chosen.
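To illustrate how a random coefficient changes choice probabilities, the sketch below approximates mixed logit probabilities by simulation: it averages conditional logit probabilities over draws of the coefficient. This is one common way to approximate such probabilities, not necessarily the procedure behind these notes. It assumes a single alternative-specific regressor (price) with a normally distributed coefficient and no case-specific regressors; the prices, parameter values, and function names are hypothetical.

```python
import math
import random


def logit_probs(utilities):
    """Conditional logit probabilities for one set of utilities."""
    top = max(utilities)
    expo = [math.exp(u - top) for u in utilities]
    total = sum(expo)
    return [e / total for e in expo]


def mixed_logit_probs(prices, beta_mean, beta_sd, n_draws=5000, seed=0):
    """Average logit probabilities over draws beta_i = beta_mean + v_i,
    with v_i ~ N(0, beta_sd^2). A simulation sketch, not an estimator."""
    rng = random.Random(seed)
    sums = [0.0] * len(prices)
    for _ in range(n_draws):
        beta_i = rng.gauss(beta_mean, beta_sd)
        p = logit_probs([beta_i * x for x in prices])
        sums = [s + pj for s, pj in zip(sums, p)]
    return [s / n_draws for s in sums]


def odds_first_vs_second(probs):
    return probs[0] / probs[1]


prices3 = [2.5, 1.5, 2.0]   # hypothetical prices for three alternatives
prices2 = prices3[:2]       # same choice set without the third alternative

# sd = 0 reduces the model to the conditional logit: IIA holds exactly.
fixed3 = mixed_logit_probs(prices3, beta_mean=-1.0, beta_sd=0.0)
fixed2 = mixed_logit_probs(prices2, beta_mean=-1.0, beta_sd=0.0)

# sd > 0: the odds of alternative 1 vs 2 depend on whether alternative 3 is present.
mixed3 = mixed_logit_probs(prices3, beta_mean=-1.0, beta_sd=1.5)
mixed2 = mixed_logit_probs(prices2, beta_mean=-1.0, beta_sd=1.5)

print("fixed coefficient odds: ", odds_first_vs_second(fixed3), odds_first_vs_second(fixed2))
print("random coefficient odds:", odds_first_vs_second(mixed3), odds_first_vs_second(mixed2))
```

With the standard deviation set to zero, the two odds coincide as IIA requires; with a positive standard deviation they differ, which is the sense in which the mixed logit relaxes IIA.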