Lecture 7: Sales Revenue Analysis PDF

# Lecture 7

## Monday, 18 November 2024 1:08 PM

- Recall the example on sales revenues of a shop

$S_i = β_1 + β_2P_i + β_3A_i + e_i$

| Variable | Coefficient | Std. Error | t-Statistic | Prob. |
|---|---|---|---|---|
| C | 118.9136 | 6.3516 | 18.7217 | 0.0000 |
| PRICE | -7.9079 | 1.0960 | -7.2152 | 0.0000 |
| ADVERT | 1.8626 | 0.6832 | 2.7263 | 0.0080 |

$R^2 = 0.4483$, $SSE = 1718.943$, $\hat{σ} = 4.8861$, $s_y = 6.48854$ (sample standard deviation of sales)

## Testing a single coefficient in MLR model

- We want to verify the importance of price

$H_0: β_2 = 0$

$H_1: β_2 ≠ 0$

- Test statistic under $H_0$

$t = \frac{b_2}{se(b_2)} \sim t_{N-K}$

- Logic: check whether $b_2$ is further from zero than what could be obtained simply by chance.
- Distinguish statistical significance from numerical magnitude.
- With $α = 0.05$, the two critical values that leave probability 0.025 in each tail of the distribution are $t(0.975,72) = 1.993$ and $t(0.025,72) = -1.993$
- The sample value of the t statistic is $t = \frac{-7.908}{1.096} = -7.215$
- The associated two-sided p-value is $P(t_{72} > 7.215 | H_0) + P(t_{72} < -7.215 | H_0) = 2 \cdot (2.2 \times 10^{-10}) \approx 0.000$
- We reject $H_0$ since $-7.215 < -1.993$, or since $pv(t) = 0.000 < 0.05 = α$
- Hence, the data suggest that sales depend on price; the hypothesis that price does not matter is rejected at the 5% significance level.
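The decision rule for the price coefficient can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the critical value 1.993 is copied from the lecture rather than computed from the t distribution.

```python
# Two-sided t-test for a single coefficient, using the PRICE row of the
# regression table. The helper names are illustrative, not from the lecture.

def t_statistic(estimate, std_error, hypothesized=0.0):
    """Return (estimate - hypothesized value) / standard error."""
    return (estimate - hypothesized) / std_error


def reject_two_sided(t_value, critical_value):
    """Reject H0 when |t| exceeds the positive critical value."""
    return abs(t_value) > critical_value


# Critical value t(0.975, 72) quoted in the lecture (N - K = 75 - 3 = 72).
T_CRIT_975_72 = 1.993

t_price = t_statistic(-7.9079, 1.0960)
print(round(t_price, 3), reject_two_sided(t_price, T_CRIT_975_72))
```

The same two helpers apply to any other row of the table by swapping in its coefficient and standard error.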
## Testing a single coefficient in MLR model

- If we want to test whether sales revenues are related to advertising expenditure

$H_0: β_3 = 0$

$H_1: β_3 ≠ 0$

- Test statistic under $H_0$

$t = \frac{b_3}{se(b_3)} \sim t_{N-K}$

- Using a 5% significance level we reject $H_0$ if $|t| > 1.993$, or alternatively if $pv(t) < 0.05$
- The sample value of the t statistic is $t = \frac{1.8626}{0.6832} = 2.726$
- The associated two-sided p-value is $P(t_{72} > 2.726 | H_0) + P(t_{72} < -2.726 | H_0) = P(|t| > 2.726 | H_0) = 2 \cdot 0.004 = 0.008$
- We reject $H_0$ since $2.726 > 1.993$, or since $pv(t) = 0.008 < 0.05$
- Hence, the data support the conjecture that sales are affected by advertising expenditure.

## Testing a single coefficient in MLR model

$S_i = β_1 + β_2P_i + β_3A_i + e_i$

- We want to test whether demand is price-elastic, knowing that revenues are defined as $S = Q \cdot P$ and $ε = \frac{\Delta Q \cdot P}{\Delta P \cdot Q}$
- Since $S = Q \cdot P$, a price increase lowers revenue ($β_2 < 0$) only when demand is elastic.

$H_0: β_2 ≥ 0$ (demand is not elastic, $|ε| ≤ 1$)

$H_1: β_2 < 0$ (demand is elastic, $|ε| > 1$)

- Test statistic under $H_0$

$t = \frac{b_2}{se(b_2)} \sim t_{N-K}$

- Using a 5% significance level we reject $H_0$ if $t < -1.666$, or alternatively if $pv(t) < 0.05$
- The sample value of the t statistic is $t = \frac{-7.908}{1.096} = -7.215$
- The associated p-value is $P(t_{72} < -7.215 | H_0) = 0.000$
- We reject $H_0$ since $-7.215 < -1.666$, or since $pv(t) = 0.000 < 0.05$

## Goodness-of-fit

- Coefficient of determination in MLR model

$R^2 = \frac{SSR}{SST} = \frac{\sum_{i=1}^{N}(\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{N}(y_i - \bar{y})^2} = 1 - \frac{SSE}{SST} = 1 - \frac{\sum_{i=1}^{N}\hat{e}_i^2}{\sum_{i=1}^{N}(y_i - \bar{y})^2}$

- Fitted value

$\hat{y}_i = b_1 + b_2x_{i2} + b_3x_{i3} + \dots + b_Kx_{iK}$

- Sample standard deviation of the dependent variable

$\hat{σ}_y = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(y_i - \bar{y})^2} = \sqrt{\frac{SST}{N-1}}$

- So $SST = (N - 1)\hat{σ}_y^2$
- In the shop example

$SST = 74 \cdot 6.4885^2 = 3115.485$

$SSE = 1718.943$

- So, we obtain

$R^2 = 1 - \frac{SSE}{SST} = 1 - \frac{1718.943}{3115.485} = 0.448$

- Advantages of $R^2$
  * Unit-free
  * Bounded measure
  * Concise
- Problems of $R^2$
  * Adding a regressor never reduces $R^2$, so there is a tendency to over-fit.
  * It does not say whether appropriate regressors have been chosen.
  * It is not suitable for comparing models with different dependent variables.
  * The model must include an intercept, otherwise $SST ≠ SSR + SSE$.
  * What counts as a satisfactory $R^2$ depends on the field.
- The adjusted $R^2$ is fairer since it penalizes over-parameterized models.

$\bar{R}^2 = 1 - \frac{SSE/(N-K)}{SST/(N-1)}$

- Adding a regressor increases $\bar{R}^2$ if the associated $|t| > 1$.
- $\bar{R}^2$ can no longer be interpreted as the proportion of explained variation.
- It is possible to test one or more joint hypotheses using an approach based on the loss of fit, usually known as the F test.
- We first consider the F test for one single coefficient and then extend it to a set of joint hypotheses.
- Recall our MLR model on sales

$S_i = β_1 + β_2P_i + β_3A_i + e_i$

- Assume we want to test the following hypothesis

$H_0: β_3 = 0$

$H_1: β_3 ≠ 0$

- We know that adding a regressor causes an increase in fit, while removing a regressor causes a drop in fit.
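Since the F test builds on these measures of fit, it may help to reproduce the goodness-of-fit arithmetic for the shop example first. The sketch below assumes $N = 75$ and $K = 3$ (implied by the 72 degrees of freedom) and takes $SSE = 1718.943$ and the sample standard deviation $6.48854$ from the regression output; the function names are illustrative. The adjusted $R^2$ it prints is not reported in the lecture.

```python
# Goodness-of-fit measures from summary statistics of the shop regression.

def total_sum_of_squares(n_obs, sample_std_y):
    """SST = (N - 1) * (sample standard deviation of y)^2."""
    return (n_obs - 1) * sample_std_y ** 2


def r_squared(sse, sst):
    """R^2 = 1 - SSE / SST."""
    return 1.0 - sse / sst


def adjusted_r_squared(sse, sst, n_obs, n_coef):
    """Adjusted R^2 = 1 - [SSE / (N - K)] / [SST / (N - 1)]."""
    return 1.0 - (sse / (n_obs - n_coef)) / (sst / (n_obs - 1))


N, K = 75, 3        # observations and coefficients (incl. intercept)
SSE = 1718.943      # sum of squared residuals from the output
S_Y = 6.48854       # sample standard deviation of sales

sst = total_sum_of_squares(N, S_Y)
r2 = r_squared(SSE, sst)
adj_r2 = adjusted_r_squared(SSE, sst, N, K)
print(round(sst, 2), round(r2, 4), round(adj_r2, 4))
```

The adjusted measure comes out below $R^2$, as it must whenever $K > 1$ and the fit is not perfect.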
- The idea is to compare the unrestricted model

$S_i = β_1 + β_2P_i + β_3A_i + e_i$

with the restricted model

$S_i = β_1 + β_2P_i + e_i$

and assess how large the loss of fit (the increase in SSE) is when the exclusion restriction on $A_i$ is imposed.
- We estimate both models (unrestricted and restricted); for each model we calculate the sum of squared residuals (SSE), and then we compare them.
- We build the F test statistic, where $J$ is the number of restrictions and $K$ the number of coefficients in the unrestricted model

$F = \frac{(SSE_R - SSE_U)/J}{SSE_U/(N-K)} = \frac{(SSE_R - SSE_U)/J}{\hat{σ}^2}$

- Under $H_0$

$F \sim F(J, N-K)$

- If $H_0$ is not true, $SSE_R - SSE_U$ becomes large, implying that the constraints placed on the model by $H_0$ have a large effect on the ability of the model to fit the data.
- In our sales example we have

$F = \frac{(SSE_R - SSE_U)/1}{SSE_U/(75 - 3)} \sim F(1,72)$

- Using $α = 0.05$ the critical value is $F(0.95,1,72) = 3.97$
- The sample value of the test statistic is

$F = \frac{(SSE_R - SSE_U)/J}{SSE_U/(N-K)} = \frac{(1896.391 - 1718.943)/1}{1718.943/72} = 7.43$

$pv(F) = P(F(1,72) ≥ 7.43 | H_0) = 0.008$

- Since $7.43 > 3.97$, we reject the restriction in $H_0$.
- The F statistic is non-negative and only large values contradict $H_0$, so the F test is one-tailed.
- When testing a single "equality" null hypothesis (a single restriction) against a "not equal to" alternative hypothesis, using the t-test or the F-test is equivalent: here $F = 7.43 = t^2 = 2.726^2$, and both p-values equal 0.008.

## Testing the significance of the model

- Consider the general MLR model

$y_i = β_1 + β_2x_{i2} + β_3x_{i3} + \dots + β_Kx_{iK} + e_i$

- We can test the overall significance of the model.

$H_0: β_2 = β_3 = \dots = β_K = 0$

$H_1: \text{at least one } β_k ≠ 0 \text{ for } k = 2, 3, \dots, K$

- This is clearly a joint hypothesis, so we can use the F test.
- The unrestricted model is the original model

$y_i = β_1 + β_2x_{i2} + β_3x_{i3} + \dots + β_Kx_{iK} + e_i$

- The restricted model under $H_0$ is

$y_i = β_1 + e_i$

- The OLS estimate of $β_1$ in the restricted model is $b_1^* = \bar{y}$
- So we have

$SSE_R = \sum_{i=1}^{N}(y_i - b_1^*)^2 = \sum_{i=1}^{N}(y_i - \bar{y})^2 = SST$

- Hence the F test statistic becomes

$F = \frac{(SSE_R - SSE_U)/J}{SSE_U/(N-K)} = \frac{(SST - SSE)/(K-1)}{SSE/(N-K)}$

which under $H_0$ is distributed as

$F \sim F(K-1, N-K)$

- In our example of sales

$S_i = β_1 + β_2P_i + β_3A_i + e_i$

- The test on the overall significance

$H_0: β_2 = β_3 = 0$

$H_1: β_2 ≠ 0 \text{ or } β_3 ≠ 0 \text{ or both nonzero}$

$F = \frac{(SST - SSE)/(3 - 1)}{SSE/(75-3)} \sim F(2,72)$

- The sample value of the test statistic

$F = \frac{(3115.485 - 1718.943)/2}{1718.943/(75 - 3)} = 29.25$

- The critical value at 5% is 3.12
- Hence, we reject $H_0$ and conclude that the model as a whole is significant: at least one of price and advertising helps to explain sales.

## Testing an extended model

- The theoretical assumption that advertising has diminishing returns can be accommodated by a quadratic specification

$S_i = β_1 + β_2P_i + β_3A_i + β_4A_i^2 + e_i$

- Here the marginal effect of advertising depends on the level of advertising.

$\frac{\Delta E[S|P, A]}{\Delta A} = β_3 + 2β_4A$ ($P$ held constant)

- We can test the importance of advertising as a joint hypothesis

$H_0: β_3 = β_4 = 0$

$H_1: β_3 ≠ 0 \text{ or } β_4 ≠ 0 \text{ or both nonzero}$

- The two models are

$S_i = β_1 + β_2P_i + β_3A_i + β_4A_i^2 + e_i$ (unrestricted model)

$S_i = β_1 + β_2P_i + e_i$ (restricted model)

- The F test statistic is

$F = \frac{(SSE_R - SSE_U)/2}{SSE_U/(75-4)} = 8.44$

- The critical value at 5%

$F_{0.95; 2,71} = 3.126$

- The p-value

$P(F_{2,71} > 8.44 | H_0) = 0.0005$

- We conclude that advertising has a statistically significant effect on sales revenues.

## Testing hypothesis from economic theory

- Economic theory tells us that profit-maximizing firms act so that marginal benefits equal marginal costs.
- Marginal cost of advertising = \$1 (excluding the additional cost of producing the extra quantity demanded).
- Marginal benefit = $∂S/∂A = β_3 + 2β_4A$
- The firm's equilibrium requires

$β_3 + 2β_4A = 1$

- For a given level of advertising this is a linear restriction on the coefficients, so it is testable using either the t-test or the F-test.
- One specific shop has spent \$1,900 monthly on advertising ($A_0 = 1.9$, with advertising measured in thousands of dollars). Is this amount consistent with the optimal choice?
- The set of hypotheses is

$H_0: β_3 + 2 \cdot β_4 \cdot 1.9 = 1$

$H_1: β_3 + 2 \cdot β_4 \cdot 1.9 ≠ 1$

which simplifies to

$H_0: β_3 + 3.8β_4 = 1$

$H_1: β_3 + 3.8β_4 ≠ 1$

- To test a linear restriction on the coefficients we can use the t test

$t = \frac{b_3 + 3.8b_4 - 1}{se(b_3 + 3.8b_4)}$

- For the denominator we have to calculate the variance of a linear combination, where the estimated covariance is $cov(b_3, b_4) = -3.289$

$var(b_3 + 3.8b_4) = var(b_3) + 3.8^2 var(b_4) + 2 \cdot 3.8 \cdot cov(b_3, b_4)$

$= 12.646 + 3.8^2 \cdot 0.885 + 2 \cdot 3.8 \cdot (-3.289)$

$= 0.428$

so that $se(b_3 + 3.8b_4) = \sqrt{0.428} = 0.654$
- The sample value of the test statistic is

$t = \frac{1.633 - 1}{0.654} = 0.968$

- The 5% critical value

$t(0.975,71) = 1.994$

- Since $|t| < 1.994$ we cannot reject $H_0$ that \$1,900 is the optimal level of advertising.
- There is no evidence suggesting that the shop should change its advertising strategy.
- Let's test the restriction $β_3 + 2β_4A_0 = 1$ using the F test.
- The unrestricted model

$S_i = β_1 + β_2P_i + β_3A_i + β_4A_i^2 + e_i$

- The restricted model, obtained by substituting $β_3 = 1 - 3.8β_4$

$S_i = β_1 + β_2P_i + (1 - 3.8β_4)A_i + β_4A_i^2 + e_i$

$S_i - A_i = β_1 + β_2P_i + β_4(A_i^2 - 3.8A_i) + e_i$

- After running the two estimations we get

$F = \frac{(SSE_R - SSE_U)/J}{SSE_U/(N-K)} = \frac{(1552.286 - 1532.084)/1}{1532.084/71} = 0.936$

- Critical value at 5%

$F(0.95;1,71) = 3.976 = t(0.975,71)^2 = 1.994^2$

- The F statistic is the square of the t statistic, up to rounding

$t = \sqrt{0.936} = 0.967$

- The two p-values

$pv(F) = P(F(1,71) > 0.936 | H_0) = 0.336$

$pv(t) = P(|t| > 0.967 | H_0) = 0.336$

- Let's see the case of a one-tail test (inequality in $H_1$).
- Assume we want to test whether the optimal value of advertising is greater than \$1,900.
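The t statistic for the restriction serves both the two-sided test and the one-tail test, so it may help to see the calculation in one place. The sketch below uses only the rounded figures quoted in the lecture (variances 12.646 and 0.885, covariance -3.289, point estimate 1.633, and the two SSE values); the function names are hypothetical. Because the inputs are rounded, the results agree with the lecture's 0.968 and 0.936 only to about two decimals.

```python
import math

# t and F statistics for the restriction b3 + 3.8*b4 = 1.


def var_linear_combination(var_a, var_b, cov_ab, c_a, c_b):
    """var(c_a*a + c_b*b) = c_a^2 var(a) + c_b^2 var(b) + 2 c_a c_b cov(a, b)."""
    return c_a ** 2 * var_a + c_b ** 2 * var_b + 2 * c_a * c_b * cov_ab


def f_statistic(sse_restricted, sse_unrestricted, n_restrictions, df_resid):
    """Loss-of-fit F statistic: [(SSE_R - SSE_U)/J] / [SSE_U/(N - K)]."""
    numerator = (sse_restricted - sse_unrestricted) / n_restrictions
    return numerator / (sse_unrestricted / df_resid)


# Rounded inputs quoted in the lecture; the covariance is negative.
variance = var_linear_combination(12.646, 0.885, -3.289, 1.0, 3.8)
std_error = math.sqrt(variance)
t_value = (1.633 - 1.0) / std_error

f_value = f_statistic(1552.286, 1532.084, 1, 71)

print(round(variance, 3), round(std_error, 3))
print(round(t_value, 2), round(math.sqrt(f_value), 2))
```

With a single restriction the square root of F and the absolute t value should coincide, which is what the last line compares.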
- The restriction is $β_3 + 2β_4A_0 = 1$.
- Estimates are the following (standard errors in parentheses)

$\hat{S}_i = 109.72 - 7.64P_i + 12.15A_i - 2.77A_i^2$

$(6.80) \quad (1.05) \quad (3.56) \quad (0.94)$

- The set of hypotheses

$H_0: A_{opt} ≤ 1.9$

$H_1: A_{opt} > 1.9$

- Since $A_{opt} = (1 - β_3)/(2β_4)$ and $β_4 < 0$, multiplying through by $2β_4$ reverses the inequality, so the hypotheses become

$H_0: β_3 + (2 \cdot 1.9)β_4 ≤ 1$

$H_1: β_3 + (2 \cdot 1.9)β_4 > 1$

- Which we can rewrite as

$H_0: β_3 + (2 \cdot 1.9)β_4 - 1 ≤ 0$

$H_1: β_3 + (2 \cdot 1.9)β_4 - 1 > 0$

- The F test cannot handle a one-sided alternative, so we use the t statistic calculated earlier.

$t = \frac{b_3 + 3.8b_4 - 1}{se(b_3 + 3.8b_4)} = \frac{1.633 - 1}{0.654} = 0.968$

- The 5% critical value of the right-tail test

$t(0.95,71) = 1.667$

- Since $0.968 < 1.667$ we fail to reject $H_0$.
- Conclusion: there is not sufficient evidence that the optimal advertising level is beyond \$1,900.

## Using non-sample information

- Whenever we have non-sample information that we are confident about, we should use it because it improves the precision of the estimates.
- Non-sample information enters the model in the form of restrictions on the model's parameters.
- Assume for instance the following model from economic theory, where demand for beer depends on its price, the price of liquor, the price of remaining goods, and income.

$\ln(Q) = β_1 + β_2\ln(PB) + β_3\ln(PL) + β_4\ln(PR) + β_5\ln(I) + e$

- If we believe that economic agents are rational and don't suffer from money illusion, the same proportional change $λ$ in all prices and income should leave demand unaltered

$\ln(Q) = β_1 + β_2\ln(λPB) + β_3\ln(λPL) + β_4\ln(λPR) + β_5\ln(λI) + e$

$= β_1 + β_2\ln(PB) + β_3\ln(PL) + β_4\ln(PR) + β_5\ln(I) + (β_2 + β_3 + β_4 + β_5)\ln(λ) + e$

- The "no money illusion" assumption translates into the restriction

$β_2 + β_3 + β_4 + β_5 = 0$

- We can impose the restriction by solving for one of the parameters in terms of the others

$β_4 = -β_2 - β_3 - β_5$

and substituting it into the model

$\ln(Q) = β_1 + β_2\ln(PB) + β_3\ln(PL) + (-β_2 - β_3 - β_5)\ln(PR) + β_5\ln(I) + e$

$= β_1 + β_2[\ln(PB) - \ln(PR)] + β_3[\ln(PL) - \ln(PR)] + β_5[\ln(I) - \ln(PR)] + e$

$= β_1 + β_2\ln(\frac{PB}{PR}) + β_3\ln(\frac{PL}{PR}) + β_5\ln(\frac{I}{PR}) + e$

- If we run OLS estimation on this restricted model we obtain Restricted Least Squares estimators.
- Economic theory (in the form of parameter restrictions and model specification) is an important ingredient in empirical research.
- The Restricted LS estimator is biased unless the restriction we impose is exactly true.
- Imposing restrictions on parameters reduces the variance of the estimators since it reduces the estimation variability caused by random sampling.
  * Typical trade-off between variance and bias
  * Recall that restrictions can be tested
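The substitution step can be checked numerically. The sketch below uses made-up coefficient values and made-up prices (they are not estimates from any data set) purely to illustrate two points: once $β_4 = -β_2 - β_3 - β_5$ is imposed, the model in relative prices gives the same prediction as the original form, and scaling all prices and income by a common factor leaves predicted demand unchanged.

```python
import math

# Hypothetical values for b1, b2, b3, b5; b4 is implied by the restriction.
b1, b2, b3, b5 = -4.0, -1.2, 0.3, 0.8
b4 = -b2 - b3 - b5


def ln_demand(pb, pl, pr, income):
    """Original log-log form with all four price/income terms."""
    return (b1 + b2 * math.log(pb) + b3 * math.log(pl)
            + b4 * math.log(pr) + b5 * math.log(income))


def ln_demand_restricted(pb, pl, pr, income):
    """Restricted form written in prices and income relative to PR."""
    return (b1 + b2 * math.log(pb / pr) + b3 * math.log(pl / pr)
            + b5 * math.log(income / pr))


# Illustrative prices and income, and a common scaling factor lambda.
pb, pl, pr, income = 3.0, 10.0, 2.0, 40.0
lam = 1.5

base = ln_demand(pb, pl, pr, income)
restricted = ln_demand_restricted(pb, pl, pr, income)
scaled = ln_demand(lam * pb, lam * pl, lam * pr, lam * income)

print(math.isclose(base, restricted, abs_tol=1e-9))
print(math.isclose(base, scaled, abs_tol=1e-9))
```

In an actual application the three transformed regressors would be built from the data and the restricted model estimated by OLS; the check here only confirms the algebra behind that transformation.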
