Saez (2010) Do Taxpayers Bunch at Kink Points PDF
2010
Emmanuel Saez
Summary
This study analyzes bunching behavior of taxpayers at kink points in the US income tax schedule and the Earned Income Tax Credit (EITC). It uses tax return data to estimate compensated elasticities of reported income with respect to marginal tax rates. Results suggest clear bunching around the first EITC kink point among the self-employed, potentially due to reporting effects rather than labor supply effects.
American Economic Journal: Economic Policy 2 (August 2010): 180–212
http://www.aeaweb.org/articles.php?doi=10.1257/pol.2.3.180

Do Taxpayers Bunch at Kink Points?†

By Emmanuel Saez*

This paper uses tax return data to analyze bunching at the kink points of the US income tax schedule. We estimate the compensated elasticity of reported income with respect to (one minus) the marginal tax rate using bunching evidence. We find clear evidence of bunching around the first kink point of the Earned Income Tax Credit but concentrated solely among the self-employed. A simple tax evasion model can account for those results. We find evidence of bunching at the threshold of the first income tax bracket where tax liability starts but no evidence of bunching at any other kink point. (JEL H23, H24, H26)

A large body of empirical work in labor and public economics analyzes the behavioral response of earnings to taxes and transfers using the standard static model where agents choose to supply hours of work until the marginal disutility of work equals marginal utility of disposable (net-of-tax) income.1 This model, which from now on we call the standard model, predicts that, if individual preferences are convex and smoothly distributed in the population, we should observe bunching of individuals at convex kink points of the budget set. Taxes and government transfers create such kink points. The progressive US individual income tax generates a piecewise linear budget set with kinks at each point where the marginal tax rate jumps. Means-tested government transfer programs also introduce piecewise-linear constraints because transfer benefits are “taxed” away as income rises. In particular, the Earned Income Tax Credit (EITC) creates two large convex kink points at the points where the credit is fully phased-in, and where it starts being phased-out.
Looking for bunching evidence around kinks provides a simple test of the widely used standard model and of the presence of behavioral responses to taxation along the intensive margin. Furthermore, the amount of bunching generated by budget set kinks is proportional to the size of the compensated elasticity of income with respect to the net-of-tax rate.

* University of California, Department of Economics, 549 Evans Hall #3880, Berkeley, CA 94720 (e-mail: [email protected]). This paper builds upon an initial National Bureau of Economic Research working paper Saez (1999), subsequently revised for the NBER-TAPES 2002 conference but never published. I thank two anonymous reviewers, Richard Blundell, Raj Chetty, Peter Diamond, Esther Duflo, Dan Feenberg, Roger Gordon, Jonathan Gruber, Roger Guesnerie, Jerry Hausman, Jeffrey Liebman, Costas Meghir, Bruce Meyer, James Poterba, Ian Preston, Todd Sinai, and TAPES conference participants for helpful comments and discussions. Financial support from the Alfred P. Sloan Foundation and NSF Grant SES-0850631 is thankfully acknowledged.

† To comment on this article in the online discussion forum, or to view additional materials, visit the articles page at http://www.aeaweb.org/articles.php?doi=10.1257/pol.2.3.180.

1 Martin Feldstein (1999) shows that this model can be extended to analyze not only the choice of hours of work, but, more generally, the response of overall income to marginal tax rates.

Vol. 2 No. 3 Saez: Do Taxpayers Bunch at Kink Points?

The present paper therefore has two goals. First, we investigate thoroughly whether there is evidence of bunching at the kink points of the US federal income tax, and in particular at the large kink points created by the EITC. Second, we develop an econometric method to estimate compensated elasticities of reported income with respect to net-of-tax rates using bunching evidence.
Our empirical analysis uses the large annual tax return data publicly released by the Internal Revenue Service (IRS) since 1960. Those administrative data are ideally suited for the analysis because they provide information about the exact location of taxpayers on the tax schedule and, in contrast to standard survey data, have almost no measurement error.

We obtain three main empirical results. First, we find clear evidence of bunching around the first kink point of the EITC (the point at which the credit reaches its maximum level) with an implied elasticity of earnings around 0.25. Such bunching evidence constitutes perhaps the most compelling evidence to date of behavioral responses created by the EITC along the intensive margin.2 However, we find that bunching is concentrated among EITC recipients with self-employment income with a very large implied elasticity around one. EITC recipients with only wage earnings display no evidence of bunching and thus the implied elasticity for wage earners is zero and precisely estimated. This suggests that most of the bunching response might be due to reporting effects rather than real labor supply effects.3 We develop a simple model of reporting which can account for our empirical findings. Furthermore, the amount of bunching grows over time (the EITC schedule has been stable since the major expansion of 1993–1996) perhaps because tax filers learn slowly about the EITC schedule.

Second, we also find evidence of bunching at the threshold of the first tax bracket where tax liability starts, especially in the 1960s when the tax schedule was stable and very simple. The implied elasticities are also significant in that case, around 0.2. Part of the elasticity is due to the response of itemized deductions as the evidence of bunching is not as sharp for taxable income recomputed using the standard deduction (instead of the maximum of the standard deduction and itemized deductions).
Third, however, we cannot find any bunching evidence for other kink points of the tax schedule, even when jumps in marginal tax rates are large and stable over many years, and even when restricting the sample to more responsive subgroups such as those reporting self-employment income.

Therefore, our evidence shows that taxpayers behave as in the standard labor supply model only in very specific cases. The first kink of the EITC is special because it is the level of earnings that maximizes the tax refund and should be the focal point for tax filers misreporting their incomes. The first kink point of the income tax schedule is the income level where tax liability starts, and hence might be more visible on tax tables than kink points at higher income levels. Indeed, survey evidence suggests that many taxpayers do not know their marginal tax rate or report it with substantial error (e.g., Edwin T. Fujii and Clifford B. Hawley 1988 for US evidence).

2 A large body of work has shown strong evidence of behavioral responses to the EITC along the extensive margin, i.e., the decision to participate in the labor force. Evidence of behavioral responses along the intensive margin (i.e., hours of work conditional on working) is weak or absent (see Nada Eissa and Hilary Hoynes 2006 and V. Joseph Hotz and John Karl Scholz 2003 for recent surveys).

3 Those results are consistent with the recent and complementary analysis by Sara Lalumia (2009) which shows that EITC expansions lead to an increase in the fraction of low-income filers with children reporting self-employment income.

Our analysis relates to the nonlinear budget set labor supply estimation method.4 This method was originally developed to address the endogeneity of the marginal tax rate to the labor supply choice (as higher labor supply may push the individual into a higher tax bracket).
In that context, nonlinear budget sets created by the tax system were seen as a source of endogeneity problems which had to be solved using a structural model rather than an opportunity to identify behavioral responses to taxation as in our analysis of bunching.

No study has carefully examined the evidence of bunching at the kink points of the US income tax schedule to uncover evidence of behavioral responses, in spite of data availability.5 A recent study by Raj Chetty et al. (2009) uses tax return data from Denmark and uncovers substantial bunching at a large kink point of the Danish income tax schedule where the top rate starts to apply. This kink point is simple and salient because it is large and the same for all individuals.6 Consistent with our US results, they do not find much evidence of bunching at smaller kink points of the Danish tax schedule.

A few other studies have documented evidence of bunching at the kink points generated by other government programs. First, Burtless and Moffitt (1984) and Leora Friedberg (2000), using Current Population Survey data, observed bunching behavior in the case of elderly individuals who receive Social Security benefits but are still working and subject to the Social Security earnings test.7 Those studies, however, do not use bunching to estimate compensated elasticities, as we do here. They instead rely on the standard nonlinear budget set method for estimating behavioral responses. Second, Richard Blundell and Hoynes (2004) document clear evidence of bunching at exactly 16 hours per week for individuals likely to be eligible for the UK family credit, which imposes a 16 hour minimum working requirement. Finally, pension programs also generate kinks (or cliffs) in the lifetime budget set.
As is well known, retirement hazard rates display bunching at certain ages related to the parameters of the retirement programs.8 Those studies point out that bunching is evidence of behavioral responses to pension programs, although they do not directly use bunching to estimate elasticities. A recent notable exception is Kristine Brown (2007) who uses changes in the kink points due to reforms in the California Teachers' retirement program to estimate elasticities of retirement age with respect to price incentives. In contrast to our study, Brown (2007) uses reforms and corresponding changes in bunching behavior to estimate elasticities while we focus on a more basic cross-sectional estimation method.

4 First developed by Gary Burtless and Jerry Hausman (1978) to study the Negative Income Tax experiments, Hausman (1981) applied the method to study the effect of the US income tax on labor supply. Robert Moffitt (1986, 1990) provides a survey of the method and its many subsequent applications.

5 US tax return data have been available for a long time, but have been rarely used by labor economists. As pointed out by Hausman (1982), in defense of the nonlinear budget set methods and in response to a criticism by James Heckman (1982), who argued that no bunching evidence could be found in the data, survey data have too much measurement error to study bunching precisely.

6 Chetty et al. (2009) argue that part of the bunching might be driven by employers’ pay policies which are tailored to avoid the top bracket, which is feasible in Denmark as the top bracket threshold is uniform across all individuals and taxes are based on individual income (as opposed to family income as in the United States).

7 Social Security benefits are taxed away (actually deferred) when earned income exceeds an exemption amount. Tax rates vary from 33 percent to 50 percent and thus generate substantial kinks in the budget set of the elderly. This phasing-out structure is simple and hence likely to be salient to social security beneficiaries.

8 For example, in the United States, there is bunching at the early retirement age of 62 (when workers become eligible to claim Social Security benefits) and bunching at the normal retirement age (see Jonathan Gruber and David A. Wise 1999 for an analysis across a number of countries).

The paper is organized as follows. Section I presents the conceptual framework, the data, and discusses the estimation methodology. Section II presents the EITC based results. We start with the EITC analysis because the EITC creates the largest kinks in the budget constraint, and this is precisely where the bunching evidence is most striking. Section III presents results based on the regular federal income tax schedule where bunching evidence is much weaker. Section IV concludes.

I. Model, Data, and Methodology

A. Standard Model and Small Kink Analysis

We consider the standard model with two goods where individuals’ utility functions depend positively on after-tax income $c$ (individuals value consumption) and negatively on before-tax income $z$ (earning income requires effort). We assume that, with a linear budget set with constant marginal tax rate $t$, individual incomes $z$ are distributed according to a smooth density distribution $h(z)$. The heterogeneity in earnings $z$ is due to differences in preferences or ability, both of which are captured by heterogeneity in the utility function $u(c, z)$ across individuals. Suppose that a (small) kink is introduced in the budget set at income level $z^*$ by increasing the marginal tax rate from $t$ to $t + dt$ for incomes above $z^*$ as depicted in Figure 1 (panel A). Such a kink is going to produce bunching of individuals whose incomes were falling into a segment $[z^*, z^* + dz^*]$ before the kink was introduced as displayed in Figure 1 (panel B).
The individual (denoted by L in Figure 1A) with earnings $z^*$ before the tax change is not affected, and his indifference curve remains tangent to the lower part of the budget set (with slope $1 - t$). Let us denote by H the highest income earner (before the tax change) who is now bunching at the kink. Before the tax change, individual H had earnings $z^* + dz^*$, and his indifference curve was tangent to the linear budget with slope $1 - t$ as shown in Figure 1A. After the tax change, his indifference curve is exactly tangent to the upper part of the budget set (slope $1 - t - dt$) as depicted in Figure 1A. For a small change in the marginal tax rate $dt$, by definition of the compensated elasticity $e$ of earnings with respect to one minus the tax rate, we have

$$\frac{dz^*}{z^*} = e\,\frac{dt}{1 - t}. \tag{1}$$

Thus, the total number of taxpayers bunching at $z^*$ is simply $h(z^*)\,dz^*$, where $h(z^*)$ is the density of incomes at $z^*$ when there is no kink point, and $dz^*$ is given by equation (1). This derivation shows that bunching is proportional to the compensated elasticity $e$ and to the net-of-tax ratio $dt/(1 - t)$. Note that in the case of large jumps (when $dt/(1 - t)$ is no longer small), there would be income effects and the elasticity $e$ would no longer be a pure compensated elasticity, but a mix of the compensated elasticity and the uncompensated elasticity.

Figure 1. Bunching Theory
Panel A. Indifference curves and bunching (horizontal axis: before tax income $z$; vertical axis: after-tax income $c = z - T(z)$). Individual L chooses $z^*$ before and after reform. Individual H chooses $z^* + dz^*$ before and $z^*$ after reform. Budget set slopes: $1 - t$ below $z^*$ and $1 - t - dt$ above $z^*$; $dz^*/z^* = e\,dt/(1 - t)$ with $e$ the compensated elasticity.
Panel B. Density distributions and bunching (horizontal axis: before tax income $z$; vertical axis: density distribution). Pre-reform incomes between $z^*$ and $z^* + dz^*$ bunch at $z^*$ after reform.

Notes: Panel A displays the effect on earnings choices of introducing a (small) kink in the budget set by increasing the tax rate $t$ by $dt$ above income level $z^*$. Individual L who chooses $z^*$ before the reform stays at $z^*$ after the reform. Individual H chooses $z^*$ after the reform and was choosing $z^* + dz^*$ before the reform. Panel B depicts the effects of introducing the kink on the earnings density distribution. The pre-reform density is smooth around $z^*$. After the reform, all individuals with income between $z^*$ and $z^* + dz^*$ before the reform bunch at $z^*$, creating a spike in the density distribution. The density above $z^* + dz^*$ shifts to $z^*$ (so that the resulting density is no longer smooth at $z^*$).

Four points should be noted. First, the larger the behavioral elasticity, the more bunching we should expect. Unsurprisingly, if there are no behavioral responses to marginal tax rates, there should be no bunching at all. Thus, within the standard model, this central elasticity could, in principle, be estimated by measuring the amount of bunching at kinks of the tax schedule.

Second, the size of the jump in marginal tax rates is measured by the change in marginal tax rates relative to the base net-of-tax rate $1 - t$. Thus, everything else being equal, a change in marginal tax rates from 0 percent to 10 percent should produce the same amount of bunching as a change from 90 percent to 91 percent.

Third, our derivation assumed implicitly that all individuals had the same elasticity $e$. In the case of heterogeneous elasticities across individuals, the amount of bunching remains proportional to the average compensated elasticity at income level $z^*$.
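As a rough numerical illustration of equation (1) and of the second point above, the following sketch computes the bunching mass $h(z^*)\,dz^*$ for two kinks that share the same relative jump $dt/(1 - t)$. The function name and every number used (elasticity, kink location, density) are made up for illustration and are not estimates from the paper; the small-kink formula is also applied here to jumps that are not literally infinitesimal.

```python
def small_kink_bunching(e, z_star, t, dt, density_at_kink):
    """Bunching mass implied by equation (1).

    dz*/z* = e * dt / (1 - t), and the mass of bunchers is h(z*) * dz*.
    All arguments are hypothetical inputs chosen by the caller.
    """
    dz_star = e * z_star * dt / (1.0 - t)
    return density_at_kink * dz_star


# Illustrative values only (assumptions, not taken from the data).
e = 0.25          # compensated elasticity
z_star = 10000.0  # kink location
h = 1e-5          # density of incomes at z* absent the kink

# A jump from 0% to 10% and a jump from 90% to 91% have the same dt / (1 - t).
bunching_low = small_kink_bunching(e, z_star, t=0.0, dt=0.10, density_at_kink=h)
bunching_high = small_kink_bunching(e, z_star, t=0.90, dt=0.01, density_at_kink=h)
print(bunching_low, bunching_high)
```

Both calls return the same share of the population (up to floating-point rounding), because only the ratio $dt/(1 - t)$ enters equation (1).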
To see this, note that our previous derivation shows that an individual with compensated elasticity $e$ (at $z^*$) bunches at the (small) kink if and only if she chooses earnings $z \in [z^*, z^* + e\,z^*\,dt/(1 - t)]$ under the linear tax at rate $t$.9 With a heterogeneous population and the linear tax at rate $t$, there will be a joint distribution of earnings $z$ and compensated elasticities $e$ across individuals (as individuals with earnings $z$ might have different elasticities), which we denote by $\psi(z, e)$. We have $h(z^*) = \int_e \psi(z^*, e)\,de$, and we denote by $\bar{e} = \int_e e\,\psi(z^*, e)\,de / h(z^*)$ the average compensated elasticity at earnings level $z^*$ (again under the linear tax rate scenario). When a small kink is introduced at $z^*$, the number of individuals bunching at $z^*$ is

$$dB = \int_e e\,z^*\,\frac{dt}{1 - t}\,\psi(z^*, e)\,de = \bar{e}\,h(z^*)\,z^*\,\frac{dt}{1 - t},$$

which generalizes (1).

9 This is true even if the compensated elasticity $e$ for the individual is not constant as changes in $e$ in the small segment $[z^*, z^* + e\,z^*\,dt/(1 - t)]$ would only introduce second order negligible effects.

Finally, note that we have considered a static model. Our results easily extend to a dynamic model, in which case bunching is proportional to the Frisch elasticity (instead of the compensated elasticity). Importantly however, if career concerns are important and current labor supply affects not only current earnings, but also earnings in the future (through promotions, etc.), the Frisch elasticity would be much smaller and the corresponding bunching would be smaller as well. We will come back to this point when interpreting our empirical results.

B. Empirical Estimation of the Elasticity using Bunching

Because actual kink points are not necessarily small, as in our previous analysis, it is useful to consider a simple parametrized model with a quasi-linear and iso-elastic utility function of the form

$$u(c, z) = c - \frac{n}{1 + 1/e}\left(\frac{z}{n}\right)^{1 + 1/e},$$

where $n$ is an ability parameter distributed with density $f(n)$ (and cumulative distribution $F(n)$) in the population (normalized to one). The quasi-linearity assumption implies that there are no income effects so that compensated and uncompensated elasticities are equal. This simplifies considerably the presentation at little cost because bunching essentially identifies the compensated elasticity as we discussed above. The iso-elasticity assumption implies that the elasticity is constant and equal to $e$, which simplifies the presentation without affecting the substance of the results.

Maximization of $u(c, z)$, subject to a linear budget constraint $c = (1 - t)z + R$, leads to the first order condition $1 - t - (z/n)^{1/e} = 0$, which can be rewritten as

$$z = n\,(1 - t)^e. \tag{2}$$

Therefore, with no marginal tax rates ($t = 0$), we have $z = n$, so that $n$ can be interpreted as potential earnings. Positive tax rates depress earnings $z$ below potential earnings $n$ as shown in (2).

Let $H_0(z)$ be the cumulative distribution of earnings when there is a constant marginal tax rate $t_0$ throughout the distribution. Let us denote by $h_0(z) = H_0'(z)$ the corresponding density distribution. We have $z = n\,(1 - t_0)^e$ and therefore $H_0(z) = \Pr(n\,(1 - t_0)^e \le z) = F(z/(1 - t_0)^e)$, and hence $h_0(z) = f(z/(1 - t_0)^e)/(1 - t_0)^e$.

Let us introduce a (convex) kink in the budget set by increasing the marginal tax rate to $t_1$ (with $t_1 > t_0$) above earning level $z^*$. Let us denote by $h(z)$ the density of realized earnings and $H(z)$ the cumulative distribution under this kinked budget set scenario. With the kink, we still have $z = n\,(1 - t_0)^e$ below $z^*$, i.e., for $n < z^*/(1 - t_0)^e$, so that $h(z) = h_0(z)$ for $z < z^*$. However, we have $z = n\,(1 - t_1)^e$ above $z^*$, i.e., for $n > z^*/(1 - t_1)^e$. Therefore, for $z > z^*$, we have $H(z) = \Pr(n\,(1 - t_1)^e \le z) = F(z/(1 - t_1)^e)$, and hence

$$h(z) = \frac{f\!\left(z/(1 - t_1)^e\right)}{(1 - t_1)^e} = h_0\!\left(z\left(\frac{1 - t_0}{1 - t_1}\right)^e\right)\left(\frac{1 - t_0}{1 - t_1}\right)^e.$$

Let us denote by $h(z^*)_-$ (resp. $h(z^*)_+$) the left (right) limit of $h(z)$ when $z \to z^*$. We have $h(z^*)_- = h_0(z^*)$ and

$$h(z^*)_+ = h_0\!\left(z^*\left(\frac{1 - t_0}{1 - t_1}\right)^e\right)\left(\frac{1 - t_0}{1 - t_1}\right)^e.$$

Individuals with $n \in [z^*/(1 - t_0)^e,\ z^*/(1 - t_1)^e]$ choose $z = z^*$ and hence bunch at the kink point. The highest ability person who bunches has $n = z^*/(1 - t_1)^e$ and hence had earnings $z^*((1 - t_0)/(1 - t_1))^e$ under the linear tax $t_0$ scenario. As a result, any individual earning between $z^*$ and $z^* + \Delta z^*$ under the linear tax $t_0$ bunches at the kink under the piecewise linear tax $(t_0, t_1)$, where

$$\frac{\Delta z^*}{z^*} = \left(\frac{1 - t_0}{1 - t_1}\right)^e - 1. \tag{3}$$

This equation generalizes equation (1) to a large kink. Therefore, the fraction of the population bunching is

$$B = \int_{z^*}^{z^* + \Delta z^*} h_0(z)\,dz \simeq \Delta z^*\,\frac{h_0(z^*) + h_0(z^* + \Delta z^*)}{2} = \Delta z^*\,\frac{h(z^*)_- + h(z^*)_+ \Big/ \left(\frac{1 - t_0}{1 - t_1}\right)^e}{2}, \tag{4}$$

where we have used the standard trapezoid approximation for the integral. Hence, combining (3) and (4) leads to a quadratic equation in $((1 - t_0)/(1 - t_1))^e$:

$$B = z^*\left[\left(\frac{1 - t_0}{1 - t_1}\right)^e - 1\right]\frac{h(z^*)_- + h(z^*)_+ \Big/ \left(\frac{1 - t_0}{1 - t_1}\right)^e}{2}, \tag{5}$$

which can be solved explicitly to express $e$ as a function of observable or empirically estimable variables: (1) the kink threshold $z^*$, (2) the net-of-tax ratio $(1 - t_1)/(1 - t_0)$ associated to the kink, (3) the density of the distribution just below and just above the kink $h(z^*)_-$, $h(z^*)_+$, (4) the amount of bunching $B$ at $z^*$. Parameters (1) and (2) are directly observable. Therefore, we only need to estimate parameters (3) and (4), then apply the delta method to estimate $e$ with standard errors.

To estimate $B$, we need to evaluate how much excess density there is at the kink point $z^*$. For a given empirical distribution $h(z)$, we can define an income band around the kink, $(z^* - \delta, z^* + \delta)$, and two surrounding income bands $(z^* - 2\delta, z^* - \delta)$ and $(z^* + \delta, z^* + 2\delta)$ below and above the kink as depicted in Figure 2. The parameter $\delta$ measures the width of those income bands.
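Returning to equation (5), the explicit solution can be sketched as follows. Writing $x = ((1 - t_0)/(1 - t_1))^e$ and multiplying (5) by $2x/z^*$ gives the quadratic $h(z^*)_-\,x^2 + \left(h(z^*)_+ - h(z^*)_- - 2B/z^*\right)x - h(z^*)_+ = 0$, whose positive root yields $e = \ln x / \ln((1 - t_0)/(1 - t_1))$. The Python sketch below is one possible way to code this inversion; the function names are hypothetical, the densities are invented, and the tax rates are the EITC-only one-child phase-in and plateau rates reported in Table 1, used purely as an example.

```python
import math


def elasticity_from_bunching(B, z_star, t0, t1, h_minus, h_plus):
    """Invert equation (5) for e, given bunching B and the two densities.

    Solves h_minus * x**2 + (h_plus - h_minus - 2B/z*) * x - h_plus = 0
    for the positive root x = ((1 - t0) / (1 - t1))**e.
    """
    b = h_plus - h_minus - 2.0 * B / z_star
    x = (-b + math.sqrt(b * b + 4.0 * h_minus * h_plus)) / (2.0 * h_minus)
    return math.log(x) / math.log((1.0 - t0) / (1.0 - t1))


def bunching_from_elasticity(e, z_star, t0, t1, h_minus, h_plus):
    """Right-hand side of equation (5), used here as a round-trip check."""
    x = ((1.0 - t0) / (1.0 - t1)) ** e
    return z_star * (x - 1.0) * (h_minus + h_plus / x) / 2.0


# Illustrative inputs: kink at 8,580 with the rate moving from -34% to 0%;
# the densities below are assumptions, not estimates.
z_star, t0, t1 = 8580.0, -0.34, 0.0
h_minus, h_plus = 4e-5, 3e-5

B = bunching_from_elasticity(0.25, z_star, t0, t1, h_minus, h_plus)
e_hat = elasticity_from_bunching(B, z_star, t0, t1, h_minus, h_plus)
print(B, e_hat)
```

With $B = 0$ the positive root is $x = 1$, so the implied elasticity is zero, in line with the first of the four points in Section I.A.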
The simplest estimate of excess bunching is the difference between the number of individuals in the band around the kink, and the number of individuals in the two surrounding bands:

$$B = \int_{z^* - \delta}^{z^* + \delta} h(z)\,dz - \int_{z^* - 2\delta}^{z^* - \delta} h(z)\,dz - \int_{z^* + \delta}^{z^* + 2\delta} h(z)\,dz. \tag{6}$$

Many taxpayers are unable to control perfectly their incomes (due, for example, to random components such as year-end bonuses or risky returns on assets, or the difficulty of exactly estimating income for tax purposes), or may not be aware of the exact location of kink points. There might also be measurement error in the data. In those cases, we would expect taxpayers to cluster around the kinks instead of bunching exactly at the kink, as depicted in Figure 2 (Emmanuel Saez (1999) develops this point with a formal model of labor supply with uncertain outcomes). In that case, the choice of the parameter $\delta$ matters when estimating excess bunching $B$ using (6). If $\delta$ is too small, the amount of excess bunching due to the behavioral response to the kink will be underestimated. However, increasing $\delta$ might also introduce bias as equation (6) ignores second-order effects due to the curvature of the underlying density function $h_0(z)$ (assuming no kink at $z^*$). When $\delta$ is small, such curvature effects are negligible, but may be significant when $\delta$ is large. If, as depicted in Figure 2, the underlying density $h_0(z)$ is convex at $z^*$, then formula (6) overestimates excess bunching as convexity implies that $B$ from (6) is positive, even in the absence of behavioral responses. Conversely, if $h_0(z)$ is concave at $z^*$, formula (6) underestimates excess bunching.
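Estimator (6) can be sketched on a sample of incomes as below. This is a minimal illustration, assuming earnings held in a NumPy array with optional sampling weights; the function names are hypothetical, and the convention for filers located exactly on a band edge is an assumption the text does not specify.

```python
import numpy as np


def band_shares(z, z_star, delta, weights=None):
    """Weighted population shares in the lower, middle, and upper bands."""
    z = np.asarray(z, dtype=float)
    w = np.ones_like(z) if weights is None else np.asarray(weights, dtype=float)
    total = w.sum()
    # Edge convention (assumed): points at z* +/- delta belong to the middle band.
    lower = w[(z >= z_star - 2 * delta) & (z < z_star - delta)].sum() / total
    middle = w[(z >= z_star - delta) & (z <= z_star + delta)].sum() / total
    upper = w[(z > z_star + delta) & (z <= z_star + 2 * delta)].sum() / total
    return lower, middle, upper


def excess_bunching(z, z_star, delta, weights=None):
    """Equation (6): middle-band share minus the two surrounding band shares."""
    lower, middle, upper = band_shares(z, z_star, delta, weights)
    return middle - lower - upper


# Synthetic check: a flat density gives no excess bunching, and adding
# filers exactly at the kink raises B by their population share.
grid = np.arange(5.0, 20000.0, 10.0)      # 2,000 evenly spaced incomes
bunchers = np.full(100, 10000.0)          # 100 filers at z* = 10,000
sample = np.concatenate([grid, bunchers])

print(excess_bunching(grid, 10000.0, 1000.0))
print(excess_bunching(sample, 10000.0, 1000.0))
```

Because the synthetic density is flat, there is no curvature bias in this example; with a convex or concave underlying density the same function would return a nonzero value even without any bunchers, which is the bias discussed above.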
In principle, it should be possible to correct for curvature bias by estimating curvature in the density below and above the kink by using a Taylor expansion for $h_0$ around $z^*$, a refinement that can be implemented with larger sample size.10 As we shall see, in some cases, the elasticity estimate is sensitive to the choice of $\delta$. The simplest method to select $\delta$ is graphical to ensure that the full excess bunching is included in the band $(z^* - \delta, z^* + \delta)$ as in Figure 2.

Empirically, $h(z^*)_-$ can be estimated as the fraction of individuals in the lower surrounding band $(z^* - 2\delta, z^* - \delta)$ divided by $\delta$. Similarly, $h(z^*)_+$ can be estimated as the fraction of individuals in the upper surrounding band $(z^* + \delta, z^* + 2\delta)$ divided by $\delta$. We estimate the number of individuals in each of the three bands, which we denote by $\hat{H}^*$, $\hat{H}^*_-$, $\hat{H}^*_+$, by regressing (simultaneously) a dummy variable for belonging to each band on a constant in the sample of individuals belonging to any of those three bands. We can then compute $\hat{h}(z^*)_+ = \hat{H}^*_+/\delta$, $\hat{h}(z^*)_- = \hat{H}^*_-/\delta$ and $\hat{B} = \hat{H}^* - (\hat{H}^*_+ + \hat{H}^*_-)$ to estimate $\hat{e}$.

10 Chetty et al. (2009) use much larger samples in Denmark and take into account such curvature by estimating the density nonparametrically outside the bunching segment $[z^* - \delta, z^* + \delta]$.

Figure 2. Estimating Excess Bunching Using Empirical Densities
Horizontal axis: before tax income $z$, with band edges at $z^* - 2\delta$, $z^* - \delta$, $z^*$, $z^* + \delta$, $z^* + 2\delta$; vertical axis: density distribution. Before reform: linear tax rate $t_0$, density $h_0(z)$. After reform: tax rate $t_0$ below $z^*$, tax rate $t_1$ above $z^*$ ($t_1 > t_0$), density $h(z)$. Marked on the figure: $h(z^*)_-$, $h(z^*)_+$, $H^*_-$, $H^*_+$, and $B = H^* - (H^*_- + H^*_+)$ = excess bunching.

Notes: The figure illustrates the excess bunching estimation method using empirical densities. We assume that, under a constant linear tax with rate $t_0$, the density of income $h_0(z)$ is smooth. A higher tax rate $t_1$ is introduced above $z^*$, creating a convex kink at $z^*$. The reform will induce tax filers to cluster at $z^*$, creating a spike in the post-reform density distribution $h(z)$. As illustrated on the figure, bunching might not be perfectly concentrated at $z^*$ because of inability of tax filers to control or forecast their incomes perfectly or imperfect information about the exact kink location. For estimation purposes, we define three bands of income around the kink point $z^*$ using the bandwidth parameter $\delta$. The lower band is the segment $(z^* - 2\delta, z^* - \delta)$, it has average density $h(z^*)_-$ and hence includes $H^*_- = \delta\,h(z^*)_-$ tax filers (dashed left area). The upper band is the segment $(z^* + \delta, z^* + 2\delta)$, it has average density $h(z^*)_+$ and hence includes $H^*_+ = \delta\,h(z^*)_+$ tax filers (dashed right area). The middle band is the segment $(z^* - \delta, z^* + \delta)$ and includes $H^*$ tax filers. Excess bunching is defined as $B = H^* - (H^*_- + H^*_+)$ and is the upper dashed area on the figure. If clustering of tax filers around $z^*$ is tight, excess bunching will be estimated without bias with a small $\delta$. If clustering is not tight around $z^*$, a small $\delta$ will underestimate the amount of excess bunching (as the lower and upper bands will include tax filers clustering around $z^*$). However, a large $\delta$ will overestimate (underestimate) excess bunching if the before reform density $h_0(z)$ is convex (concave) around $z^*$.

We can estimate standard errors using the delta method. Alternatively, we can also compute standard errors using a bootstrap method where we draw a large number $N$ of earnings distributions according to the empirical earnings distribution $h(z)$, estimate the corresponding elasticity $e$ for each of the $N$ draws, and estimate a 5 percent interval standard error using the distribution of estimated elasticities $e$ across the $N$ draws.
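The bootstrap procedure just described might be sketched as follows. This is only an illustration under stated assumptions: it resamples observations with equal probability (ignoring the sampling weights the actual data carry), takes any user-supplied statistic in place of the elasticity estimator, and reports both the standard deviation across draws and a 2.5 to 97.5 percentile interval. The function name, the default number of draws, and the seed are hypothetical choices.

```python
import numpy as np


def bootstrap_se(z, statistic, n_draws=200, seed=0):
    """Bootstrap standard error and percentile interval for statistic(z).

    z         : 1-D array of earnings (unweighted, by assumption)
    statistic : function mapping a resampled array to a number, for
                example an elasticity estimate based on equation (5)
    """
    rng = np.random.default_rng(seed)
    z = np.asarray(z, dtype=float)
    draws = np.empty(n_draws)
    for i in range(n_draws):
        # Resample the empirical distribution with replacement.
        resample = rng.choice(z, size=z.size, replace=True)
        draws[i] = statistic(resample)
    se = draws.std(ddof=1)
    lo, hi = np.percentile(draws, [2.5, 97.5])
    return se, (lo, hi)


# Toy usage: the statistic is the share of filers below 500 in a
# synthetic sample; a real application would plug in the elasticity.
z = np.arange(1000.0)
se, (lo, hi) = bootstrap_se(z, lambda s: np.mean(s < 500.0))
print(se, lo, hi)
```

A weighted version would resample with probabilities proportional to the sampling weights, or resample within strata; which of these matches the paper's implementation is not stated in the text.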
As we shall see, because our sample size is large, the delta method and the bootstrap methods generate very similar standard errors. C. Data and Graphical Methodology As discussed at the outset of this paper, the large publicly available annual cross- sections of individual tax returns constructed by the IRS, known as the Individual Public Use Tax Files, are the ideal data to carry out this study. The data are available quasi-annually from 1960 to 2004. The number of tax returns per year is between 80,000 and 200,000. The annual cross sections are stratified random samples with higher sampling rates for high-income taxpayers or taxpayers with business income. The data include the corresponding sampling weights, and all of our estimations use those weights so as to reflect population averages. Therefore, the data span a long time and a number of different tax schedules. This is of interest because we do not expect taxpayers to adapt immediately to changes in the location of kink points, and repeated cross-sections may allow us to study the dynamics of bunching following a tax change. To detect exact bunching at kink points, the simplest method consists of produc- ing histograms of the distribution with small bins, and checking whether spikes appear at kink points. Because taxpayers may not be able to bunch perfect at kink points, we might only observe clustering or humps around kink points. In such a scenario, kernel density estimates are helpful to smooth noisy histograms and visu- ally detect excess clustering. Note that other transfer programs, or state income taxes, introduce additional kinks in the budget constraint of tax filers. However, because those programs vary geographically and do not use the same income definition as the EITC or the fed- eral taxable income, those kinks will not be located uniformly across tax filers in our samples. Hence, any bunching they generate should be smoothed out in the aggregate. II. 
EITC Empirical Results

We start with the analysis of the EITC results because the kink points created by the EITC are largest and, as we shall see, this is where the bunching evidence is most striking and the easiest to interpret.

The EITC is a transfer for low-income earners that introduces substantial kinks in the budget constraint as described in Table 1. The EITC, first introduced in 1975, was expanded after the Tax Reform Act of 1986, and then again very substantially expanded in 1993–1995, and has been quite stable since then (see Hotz and Scholz (2003) and Eissa and Hoynes (2006) for detailed descriptions of the EITC and its history). The EITC is a function of family earnings defined as the sum of wages and salaries and self-employment income, and the number of qualifying children.11 The EITC first increases linearly with earnings in the phase-in range, is maximum in the plateau range, and then decreases linearly with earnings in the phase-out range. As shown in Table 1, since 1995, the phase-in subsidy rate is 34 percent for those with 1 child and 40 percent for those with 2 or more children. After a short plateau range where the EITC is maximum, the EITC is phased out at a rate of 16 percent (21 percent) for beneficiaries with one qualifying child (two or more qualifying children). Therefore, the EITC creates very large changes in marginal incentives.

11 Since 1994, tax filers with no children are also eligible for a modest EITC (maximum benefit of $438 in 2008) with small phase-in and phase-out rates of 7.65 percent. Because the EITC with no children is so small, we do not include it in our analysis.

Table 1: EITC Structure and Schedules

Panel A. Earned income tax credit schedule from 1988–1993 (in 2008 $) [a]
One or more qualifying children

EITC range    Bracket             Marginal tax rate
Phase-in      $0–$11,416          −14%
Plateau       $11,416–$18,002     0%
Phase-out     $18,002–$33,985     10%

Panel B. Earned income tax credit schedule from 1995–2008 (in 2008 $) [b]

              One qualifying child                    Two or more qualifying children
EITC range    Bracket             Marginal tax rate   Bracket             Marginal tax rate
Phase-in      $0–$8,580           −34%                $0–$12,060          −40%
Plateau       $8,580–$15,740      0%                  $12,060–$15,740     0%
Phase-out     $15,740–$33,995     16%                 $15,740–$38,646     21%

Notes:
[a] In 1991–1993, EITC subsidy and tax rates depend on number of children (but brackets do not). In 1991: phase-in rates are 16.7 (1 child) and 17.3 percent (2+ children). Phase-out rates are 11.9 and 12.4 percent. In 1992: phase-in rates are 17.6 (1 child) and 18.4 percent (2+ children). Phase-out rates are 12.6 and 13.1 percent. In 1993: phase-in rates are 18.5 (1 child) and 19.5 percent (2+ children). Phase-out rates are 13.2 and 13.9 percent.
[b] In 2002–2004, EITC plateau for married filers is extended by $1,000 ($2,000 in 2005–2007, and $3,000 in 2008). In 1995, phase-in and phase-out rates were 36 and 20.2 percent (instead of 40 and 21 percent) for EITC beneficiaries with two or more children.
Source: US Treasury, Internal Revenue Service, Statistics of Income: Individual Income Tax Returns (annual).

A. Graphical Evidence

Figure 3 reports the histograms for earnings of tax filers with one child dependent (panel A) and two or more children dependents (panel B) by bins of $500. All histograms are presented using population weights. The graphs also depict the corresponding EITC schedules (as a function of earnings) in dashed lines using the right y-axis as well as the location of the kinks in vertical lines. To obtain a large sample size and smoother histograms, the figure combines all years from 1995 to 2004 and indexes earnings to 2008 using the IRS inflation parameters, so that the EITC kinks are perfectly aligned for all years.

[Figure 3. Earnings Density Distributions and the EITC. Panel A: One child. Panel B: Two children or more. Horizontal axis: Earnings (2008 $), $0 to $50,000. Left axis: Earnings density ($500 bins). Right axis: EIC amount (2008 $), $0 to $5,000.]

Notes: The figure displays the histogram of earnings (by $500 bins) for tax filers with one dependent child (panel A) and tax filers with two or more dependent children (panel B). The histogram includes all years 1995–2004 and inflates earnings to 2008 dollars using the IRS inflation parameters (so that the EITC kinks are aligned for all years). Earnings are defined as wages and salaries plus self-employment income (net of one-half of the self-employed payroll tax). The EITC schedule is depicted in dashed line and the three kinks are depicted with vertical lines. Panel A is based on 57,692 observations (representing 116 million tax returns), and panel B on 67,038 observations (representing 115 million returns).

Two elements are worth noting in Figure 3. First, there is a clear clustering of tax filers around the first kink point of the EITC. In both panels, the density is maximum exactly at the first kink point. The fact that the location of the first kink point differs between EITC recipients with one child, versus those with two or more children, constitutes strong evidence that the clustering is driven by behavioral responses to the EITC as predicted by the standard model. Second, however, we cannot discern any systematic clustering around the second kink point of the EITC. Similarly, we cannot discern any gap in the distribution of earnings around the concave kink point where the EITC is completely phased-out. This differential response to the first kink point, versus the other kink points, is surprising in light of the standard model predicting that any convex (concave) kink should produce bunching (gap) in the distribution of earnings.

[Figure 4. Earnings Density and the EITC: Wage Earners versus Self-Employed. Panel A: One child. Panel B: Two or more children. Horizontal axis: Earnings (2008 $), $0 to $50,000. Left axis: Earnings density. Right axis: EIC amount ($), $0 to $5,000. Series: Wage earners, Self-employed, EIC amount.]

Notes: The figure displays the kernel density of earnings for wage earners (those with no self-employment earnings) and for the self-employed (those with nonzero self-employment earnings). Panel A reports the density for tax filers with one dependent child and panel B for tax filers with two or more dependent children. The charts include all years 1995–2004. The bandwidth is $400 in all kernel density estimations. The fraction self-employed is 16.1 percent and 20.5 percent in the population depicted on panels A and B (in the data sample, the unweighted fraction self-employed is 32 percent and 40 percent). We display in dotted vertical lines around the first kink point the three bands used for the elasticity estimation with δ = $1,500.

In Figure 4, we break down the sample of earners into those with nonzero self-employment income versus those with zero self-employment income (and hence whose earnings come only from wages and salaries). We now report kernel density estimates (instead of histograms) to compare densities on the same graph.12 The densities are normalized to sum to one.13 The contrast between the two groups is striking.
The self-employed densities display a huge spike exactly at the first kink point, while there is no evidence of bunching at all in the sample of pure wage earners. There is no evidence of bunching at the second kink point, or of a dip at the end of the phase-out, even for the self-employed sample. Surprisingly, note that the self-employed with one qualifying child display even more bunching than those with two or more qualifying children, even though the size of the kink is larger for the latter group.14

Figure 5 focuses on the self-employed and breaks down the sample into two periods, 1995–1999 (dashed line density) and 2000–2004 (solid line density). The graphs show that bunching grows dramatically from 1995–1999 to 2000–2004. The most plausible explanation is that information and knowledge on how to game the EITC with self-employment income diffuses slowly in the population. An alternative explanation could be that there are large adjustment costs to changing labor supply (such as finding a new job, adjusting hours of work, etc.), which create a slow dynamic response to the EITC expansion. The information explanation seems more plausible because bunching growth is present only among the self-employed and never happens for wage earners (figures omitted, see elasticity estimates below).

B. Elasticity Estimation

Table 2 presents elasticity estimates using bunching evidence around EITC kink points for various samples. The table is organized in six columns. The first two columns are for the full sample including both wage earners and the self-employed. The next two columns consider the self-employed (defined, as above, as those with nonzero self-employment income). The last two columns consider the sample of wage earners (those with no self-employment income). In each of those three groups, the first column is for tax filers with one child, while the second column is for tax filers with two or more children.
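The estimator behind Table 2 is described in Section IB, which is not reproduced in this section. As a rough illustration only, the sketch below assumes a three-band estimator of the following form: excess bunching B is the population share in the band (z* − δ, z* + δ) minus the shares in the two adjacent bands of width δ, and the elasticity e solves B = z* [r^e − 1] (h− + h+ / r^e) / 2 with r = (1 − t0)/(1 − t1). The function name, argument names, and the bisection bounds are hypothetical choices made for this example, not the paper's code.

```python
def bunching_elasticity(z_star, delta, share_center, share_lower, share_upper, t0, t1):
    """Solve an assumed three-band bunching formula for the elasticity e.

    share_center: weighted share of filers in (z* - delta, z* + delta)
    share_lower:  weighted share in (z* - 2*delta, z* - delta)
    share_upper:  weighted share in (z* + delta, z* + 2*delta)
    t0, t1:       marginal tax rates below and above the kink
    """
    ratio = (1.0 - t0) / (1.0 - t1)  # net-of-tax ratio, above 1 at a convex kink
    if ratio <= 1.0:
        raise ValueError("expected a convex kink (t1 > t0)")

    # Excess bunching: centre band share minus the two surrounding band shares.
    excess = share_center - (share_lower + share_upper)
    dens_lower = share_lower / delta  # density just below the kink
    dens_upper = share_upper / delta  # density just above the kink

    def predicted_minus_observed(e):
        r = ratio ** e
        return z_star * (r - 1.0) * (dens_lower + dens_upper / r) / 2.0 - excess

    # The predicted bunching is increasing in e, so bisection is sufficient.
    lo, hi = -5.0, 10.0  # hypothetical search bounds
    if predicted_minus_observed(lo) > 0 or predicted_minus_observed(hi) < 0:
        raise ValueError("elasticity outside the search bounds")
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if predicted_minus_observed(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

For the first kink of the one-child schedule in Table 1, the inputs would be z_star = 8580, t0 = −0.34, and t1 = 0, with δ = 1500 as the baseline bandwidth; the three shares would come from the weighted tax return data.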
Panel A displays the elasticity estimates for the first, second, and third kink points (respectively) pooling all years 1995–2004 together as in Figures 3 and 4. Our results confirm the findings from the figures. We find significant elasticities around the first kink point for the full sample (0.21 and 0.15 for one child and 2+ children, respectively). Those significant elasticities are driven entirely by the self-employed who display very large and precisely estimated elasticities (1.1 and 0.8 for one child and 2+ children, respectively). In contrast, we find insignificant elasticities very close to zero and precisely estimated for the sample of wage earners around the first kink point. We also find insignificant and precisely estimated elasticities around the second and third kink points in all samples (all workers, the self-employed, or wage earners).

12 The artificial drop in the kernel densities at each end of the graph is an artifact of the estimation method due to data truncation.
13 The fraction of self-employed in the population depicted in Figure 4 is 16.1 percent (for those with one child in panel A) and 20.2 percent (for those with two or more children in panel B), on average, for the years 1995–2004. In the data sample, the fractions of self-employed are higher (32 percent and 40 percent, respectively) because the data samples overweight tax filers with more complex tax returns.
14 This pattern of bunching is similar across heads of household and married tax filers. There is never bunching among wage earners, while there is sharp bunching among the self-employed, but only at the first kink point.

[Figure 5. Earnings Density and the EITC: Bunching Increases Over Time. Panel A: One child (self-employed only). Panel B: Two or more children (self-employed only). Horizontal axis: Earnings (2008 $), $0 to $50,000. Left axis: Earnings density. Right axis: EIC amount ($), $0 to $5,000. Series: 1995–1999, 2000–2004, EIC amount.]

Notes: The figure displays the kernel density of earnings for the self-employed (those with nonzero self-employment earnings) for 1995–1999 (dotted line) and 2000–2004 (solid line). Panel A reports the density for tax filers with one dependent child and panel B for tax filers with two or more dependent children. The bandwidth is $600 in all estimations. The data sample sizes underlying those density estimates are 8,436 (1995–1999, one child), 9,883 (1995–1999, 2+ children), 12,256 (2000–2004, one child), 14,308 (2000–2004, 2+ children). The corresponding full population sizes are 8.8, 9.9, 11.2, and 12.4 million respectively.

Table 2: Earnings Elasticity Estimates Using EITC Bunching Evidence

Columns: (1) all filers (self-employed and wage earners), 1 child; (2) all filers, 2+ children; (3) self-employed only, 1 child; (4) self-employed only, 2+ children; (5) wage earners only, 1 child; (6) wage earners only, 2+ children.

Years      Kink                           Bandwidth δ   (1)      (2)      (3)      (4)      (5)      (6)

Panel A. Estimates around EITC kinks (1995–2004)
1995–2004  First kink                     $1,500        0.213    0.152    1.101    0.755    0.025    0.003
           Delta method standard errors                (0.033)  (0.020)  (0.092)  (0.054)  (0.036)  (0.022)
           Bootstrap standard errors                   (0.037)  (0.024)  (0.107)  (0.057)  (0.039)  (0.022)
1995–2004  Second kink (end of plateau)   $1,000       −0.004   −0.007   −0.018    0.016   −0.001   −0.012
                                                       (0.021)  (0.015)  (0.040)  (0.027)  (0.025)  (0.019)
1995–2004  Third kink (end of EITC)       $1,500       −0.019    0.003   −0.032   −0.007   −0.017    0.006
                                                       (0.011)  (0.006)  (0.021)  (0.010)  (0.013)  (0.008)

Panel B. Sensitivity analysis with bandwidth δ
1995–2004  First kink                     $2,000        0.320    0.186    1.639    0.954    0.076    0.010
                                                       (0.040)  (0.024)  (0.116)  (0.064)  (0.044)  (0.026)
1995–2004  First kink                     $1,000        0.165    0.107    0.751    0.379    0.022    0.017
                                                       (0.027)  (0.016)  (0.071)  (0.035)  (0.030)  (0.019)

Panel C. Elasticity grows over time
1995–1999  First kink                     $1,500        0.106    0.091    0.703    0.526   −0.002   −0.003
                                                       (0.044)  (0.028)  (0.117)  (0.070)  (0.049)  (0.031)
2000–2004  First kink                     $1,500        0.327    0.211    1.424    0.937    0.056    0.008
                                                       (0.050)  (0.030)  (0.137)  (0.081)  (0.054)  (0.032)

Panel D. Small EITC (1988–1993), 1 child or more in every column
Years      Kink                           Bandwidth δ   All filers   Self-employed only   Wage earners only
1988–1993  First kink                     $1,000        0.092        0.207                0.071
                                                       (0.038)      (0.077)              (0.045)
1988–1993  Second kink                    $1,000        0.008        0.036                0.003
                                                       (0.029)      (0.058)              (0.034)
1988–1990  First kink                     $1,000                     0.165
                                                                    (0.139)
1991–1993  First kink                     $1,000                     0.229
                                                                    (0.088)

Notes: This table presents estimates of the elasticity of earnings with respect to the net-of-tax rate based on bunching evidence around EITC kink points as described in Section IB in the text. Standard errors are computed using the delta method. Bootstrap standard errors are also reported for the first row of estimates. The self-employed only sample (columns 3 and 4) is defined as EITC recipients with nonzero self-employment income. The wage earners only sample (columns 5 and 6) is defined as EITC recipients with zero self-employment income. In all cases, earnings are adjusted to 2008 dollars using the IRS inflation indexation (so that kinks are aligned across years). From 1995–2004, the EITC is large (phasing-in rates of 34 and 40 percent and maximum benefit of $2,917 and $4,824 in 2008 for tax filers with 1 child and 2+ children). From 1988 to 1993, the EITC was smaller and did not vary between 1 child versus 2+ children beneficiaries. As a result, we pool together the 1 child and 2+ children groups for those years. See Table 1 for complete EITC structure details.

Panel B shows the sensitivity of our results with respect to the choice of the bandwidth δ around the first kink point.15 The bands corresponding to the baseline δ = $1,500 were depicted in Figure 4 (light dotted vertical lines). They represent a conservative choice of bandwidth leading to an underestimate of the elasticities as Figure 4 shows that the lower and upper band densities are already affected by the clustering around the kink. Therefore, it is not surprising that increasing the bandwidth to $2,000 leads to even higher estimates, which we view as probably closer to the true elasticity revealed by bunching behavior. Conversely, lowering the bandwidth to $1,000 leads to smaller (but still significant) estimates.16 This confirms that, in the presence of clustering around a kink point instead of exact bunching, the choice of the bandwidth matters. In this paper, we use a simple visual approach for selecting the bandwidth, which is a significant limitation. As mentioned above, with larger datasets and hence smoother density estimates, it could be possible to devise a method to detect bunching humps statistically and hence choose the bandwidth δ with a systematic econometric method.

15 The estimates around the second and third kink points are always small and insignificant and not sensitive to δ, and hence omitted to save space.

Panel C breaks the sample into two periods 1995–1999 versus 2000–2004, as we did in Figure 5. The results confirm the graphical evidence and show that the elasticities more than double from the early period to the late period (again the difference in elasticities across the two periods is significant in the first four columns, formal test results omitted).
However, the elasticities for wage earners remain close to zero and insignificant even in the late period, suggesting that labor supply is unresponsive along the intensive margin even in the long run.

Finally, panel D presents results for the smaller EITC for the period 1988–1993, when the phase-in subsidy rate was much smaller, as reported in Table 1 (14 percent from 1988 to 1990 and growing slowly to almost 20 percent from 1991 to 1993). The elasticity estimates are also significant for the first kink point and close to 0.2 for the self-employed. Those elasticity estimates are much lower than for the large EITC period. There is interesting heterogeneity during this “small EITC” period. The elasticity estimate for the self-employed is not significant for the early period 1988–1990 when the EITC rate was 14 percent, while it becomes significant during the years 1991, 1992, 1993 when the EITC rate is 17 percent, 18 percent, and 19 percent (on average). Graphical evidence (omitted for sake of space) shows, indeed, that there is no sharp spike at the kink point in the period of 1988–1990, but that a spike develops exactly at the kink point in the years 1991–1993.17

16 It is possible to test for equality of estimated elasticities across various bandwidth specifications by estimating the difference in elasticities simultaneously, and again using the delta method. Differences in elasticities between δ = $1,000 and δ = $2,000 are always significant for the first four rows.
17 Using one single annual cross-section of the same tax data we have used in this study, Jeffrey Liebman (1998) did not find evidence of bunching at the EITC kink points in 1992 (before the large EITC expansion). However, he did not break down recipients by self-employment status, explaining why, in contrast to our findings, he did not uncover evidence of bunching.

C. Interpretation: A Model of Tax Reporting

Our first finding is that wage earners do not display any evidence of responses to the marginal incentives created by the EITC even when the change in marginal incentives is very large and the EITC schedule is stable (as is the case since 1995). There are several possible explanations. First, wage earners may have a very low intensive elasticity of earnings with respect to marginal tax rates. Second, wage earners may not understand the marginal incentives created by the EITC.18 Third, low-income wage earners may not have the flexibility to adjust their labor supply (as employers may impose hours constraints) or may not be able to control their earnings accurately (as work opportunities may be highly stochastic). Finally, wage earners may not be able to misreport their earnings to take advantage of the EITC because wage income is third-party reported by employers, making tax evasion difficult.

Our second finding is that the self-employed display significant bunching evidence. Consistent with the standard model, the self-employed are likely to be more responsive to marginal incentives because they may have more flexibility to adjust their labor supply or can misreport their income with less risk of being caught evading.19 As shown by Feldstein (1999), such tax-avoidance behavior can easily be modeled using the standard two good utility framework introduced in Section I, as long as the costs of evasion are convex with the level of evasion. However, our bunching evidence among the self-employed remains inconsistent with the standard model in two important ways. First, bunching is found only at the first kink point and not at the second kink point (nor is there evidence of a gap in the earnings density at the third concave kink point). Second, even at the first kink point, bunching arises only when the EITC subsidy rate is over 15 percent.
Bunching evidence starts in 1991 when the EITC subsidy rate is above 16 percent, and becomes very large with the modern EITC with subsidy rates of 34 percent and 40 percent.20

Self-employment income reported on individual tax returns also faces the Social Security and Medicare payroll tax at a rate of 15.3 percent.21 For the self-employed, the payroll tax is administered with the individual income tax return. Therefore, the first kink of the EITC is the point that maximizes the net transfer received from the government when the EITC subsidy rate is over 15.3 percent.22 Many low-income earners do have some informal self-employment income because they perform informal services for pay such as child care, cleaning, landscaping, house or car maintenance, etc. This informal income cannot be monitored by the IRS and is therefore, in general, not reported on tax returns (which explains the extremely large 80 percent evasion rate for informal supplier business income mentioned above). However, with an EITC subsidy rate above the payroll tax rate, it is to the advantage of low-income earners to report such informal self-employment income. As the IRS cannot monitor the amounts, it is also possible to over-report such informal self-employed income.23

18 Lack of knowledge about the structure of the EITC is indeed confirmed by surveys of low- and moderate-income families (see e.g., Lynn M. Olson and Audrey Davis 1994; Jennifer L. Romich and Thomas S. Weisner 2002).
19 Indeed, in the case of small informal business suppliers, the IRS estimates that the rate of income under-reporting is extremely high, over 80 percent for tax year 1992 (US Treasury, Internal Revenue Service, 1996, 8, table 3).
20 We have also verified that no bunching develops with the EITC for filers with no children with a subsidy rate of 7.65 percent.
21 In reality, the effective rate is slightly lower as the payroll tax applies to a base of 92.35 percent of self-employment income (to be equivalent to the sum of employer and employee payroll taxes in the case of wage earnings). However, the EITC is also based on 92.35 percent of self-employment income. Hence, the rate of 15.3 percent is the relevant number to compare to the nominal EITC subsidy rate.
22 The EITC first kink point remains the income level maximizing the net transfer received from the government even after the introduction of the refundable child tax credit in 2001.
23 Interestingly, we find that the spike around the first kink point is entirely due to tax filers reporting positive self-employment income and not at all to tax filers reporting negative self-employment income, suggesting that tax filers in the phasing-out range do not avoid taxes by reporting exaggerated business losses. An explanation could be that net business losses are much more likely to trigger an IRS audit than business gains.

The following simple model of behavior can account for the facts we have uncovered in our empirical analysis of bunching around EITC kink points. Each individual has formal earnings \(w \ge 0\) (wages and salaries or formal self-employment income reported to the IRS on third party W2 or 1099 forms) and informal earnings \(y \ge 0\) (self-employment income not reported on third party 1099 forms). Let us assume that both \(w\) and \(y\) do not respond to taxes along the intensive margin (responses along the extensive margin would not affect the analysis and are ignored) and are smoothly distributed in the population. Individuals cannot misreport formal earnings \(w\), but can misreport \(y\). Let us denote \(\hat{y}\) the amount of informal earnings reported for tax purposes. Taxes and transfers are based on \(w + \hat{y}\) so that net disposable income is \(c = w + y - T(w + \hat{y})\).

The critical element to obtain bunching solely at the first kink point (the global maximum) and not at the other kink points, as we observed in our empirical analysis, is to assume away convex preferences and, instead, use linear preferences with fixed costs. Let us therefore assume that there is an administrative fixed cost \(q_A\) to report \(\hat{y} > 0\) which represents record keeping, filing additional tax schedules, as well as determining the corresponding tax liability. We further impose the constraint that \(\hat{y} \ge 0\) by assuming that \(\hat{y} < 0\) would trigger an audit. As mentioned above, EITC bunching is not generated by those reporting negative self-employment income.

There is also a moral fixed cost \(q_M\) of misreporting \(y\) that is paid whenever \(\hat{y} \neq y\). This latter cost can be a moral cost of cheating the government or can represent the risk of being audited and caught misreporting income by the IRS. Importantly, the fixed cost \(q_M\) is paid whenever some evasion happens.24 It could be possible to add, on top of the fixed cost \(q_M\), a variable cost depending linearly on the amount of tax evasion without affecting the nature of the results. We omit such a variable cost to simplify the presentation.25 We assume for simplicity (and without affecting the analysis) that utility is quasi-linear in disposable income \(c\). Therefore, an individual chooses \(\hat{y}\) to maximize

\[ w + y - T(w + \hat{y}) - q_A \cdot 1(\hat{y} > 0) - q_M \cdot 1(\hat{y} \neq y). \tag{7} \]

Let us assume that \(-T(z)\) is single peaked, and that \(z^*\) is the unique reported income which maximizes the government net transfer \(-T(z)\). The US federal, state, and payroll tax system does generate such single peaked transfers. If the EITC subsidy rate is above the payroll tax rate, then \(z^*\) is at the first kink point of the EITC and otherwise, \(z^* = 0\). Let us assume further that the distribution of costs \((q_A, q_M)\) is smooth in the population. We can state the following formal proposition.

24 If the IRS can demonstrate that self-employment income was misreported (which is actually difficult in the case of informal business income), then the tax filer has to pay the evaded tax. As fines are rarely imposed in the case of small amounts, there is little monetary cost to cheating. Therefore, the fixed cost represents the psychological cost of going through the process of being audited by the IRS. Janet McCubbin (2000) reports that, although less than 20 percent of EITC recipients report self-employment income, 50 percent of audited EITC returns which have income errors reported some self-employment income.
25 A convex cost of evasion as in the conventional tax evasion model would bring back convex preferences, and would generate bunching at the second kink point of the EITC, at odds with our empirical findings.

Proposition 1: In our reporting model with nonconvex preferences, we have

(i) The optimal report for self-reported income \(\hat{y}\) can take only three values: (1) truthful reporting \(\hat{y} = y\), (2) complete evasion \(\hat{y} = 0\), or (3) maximizing the tax refund \(\hat{y} = z^* - w\).

(ii) If \(z^* > 0\) (i.e., the EITC subsidy rate is larger than the payroll tax rate), then an atom of tax filers will bunch at the first kink point of the EITC \(z^*\).

Proof: For (i), consider first the case where \(z^* - w \ge 0\), and suppose that the optimal report satisfies \(\hat{y} > 0\) and \(\hat{y} \neq y\). Then \(\hat{y}\) delivers at least as much utility as reporting \(z^* - w\), hence \(-T(w + \hat{y}) \ge -T(z^*)\). As \(z^*\) uniquely maximizes \(-T(z)\), it must be the case that \(w + \hat{y} = z^*\). Hence, if \(z^* - w \ge 0\), we are necessarily in scenario (1), (2), or (3).

Second, consider the case \(z^* - w < 0\) where the first kink point is not reachable with any \(\hat{y} \ge 0\). Because \(-T(z)\) is single peaked, \(\hat{y} = 0\) maximizes \(-T(w + \hat{y})\). Therefore, if \(\hat{y} \neq y\), then \(\hat{y} = 0\) is the optimal choice. So, we are necessarily in scenario (1) or (2).

For (ii), from (7), the tax filer chooses scenario (1), (2), or (3) depending on which of the three expressions,

\[ (1)\quad w + y - T(w + y) - q_A \cdot 1(y > 0), \]
\[ (2)\quad w + y - T(w) - q_M \cdot 1(y \neq 0), \]
\[ (3)\quad w + y - T(z^*) - q_A \cdot 1(z^* - w > 0) - q_M \cdot 1(y \neq z^* - w), \]

is largest. Assuming that \(z^* > 0\), then all filers such that \(q_M < T(w + y) - T(z^*)\) and \(q_A < T(w) - T(z^*)\) will bunch at the kink. Because \(z^*\) is the single global maximum of \(-T(z)\), we have \(T(w + y) - T(z^*) > 0\) and \(T(w) - T(z^*) > 0\) for \(w \neq z^*\) and \(w + y \neq z^*\). Therefore, as \(q_M\) and \(q_A\) are smoothly distributed, there is a positive measure of tax filers bunching at the kink point \(z^*\).

Three points are worth noting. First, if the EITC subsidy rate is smaller than the payroll tax rate, then \(z^* = 0\). Tax filers are either truthful or do not report informal earnings at all and there is no bunching at the first kink of the EITC. This represents the standard case described in tax compliance studies where most informal self-employment earners do not report their self-employment income.

Second, when the EITC subsidy rate becomes larger than the payroll tax rate, then bunching develops at the first kink point of the EITC. Bunching tax filers are tax filers who previously did not report at all their self-employment income (for whom the gain of maximizing the net credit is larger than the administrative cost \(q_A\)) or tax filers who were previously truthful (for whom the gain of maximizing the net credit is larger than the moral cost \(q_M\)). Consistent with the evidence, there will be no bunching at the second kink point because that point does not maximize the net tax credit.

Third, the amount of bunching depends positively on the size of the net tax credit \(-T(z^*)\). The increased bunching that we observed over time in Figure 5 could be modeled as a learning process whereby tax filers learn slowly over time from others or from tax preparers that reporting \(z^*\) maximizes net tax transfers.
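To make the three-way choice in Proposition 1 concrete, here is a minimal numerical sketch. It assumes a stylized schedule T(z) made up only of the 15.3 percent payroll tax and the one-child EITC of Table 1 (panel B), and it ignores income tax, state taxes, and the 92.35 percent base adjustment. The function names and the example cost values are hypothetical and chosen purely for illustration.

```python
PAYROLL_RATE = 0.153   # payroll tax rate on self-employment income, as cited in the text
Z_STAR = 8580.0        # first EITC kink, one qualifying child (Table 1, panel B)

def eitc_one_child(z):
    """One-child EITC: 34% phase-in, plateau, then 16% phase-out from $15,740."""
    credit = 0.34 * min(z, 8580.0)
    if z > 15740.0:
        credit -= 0.16 * (z - 15740.0)
    return max(credit, 0.0)

def net_tax(z):
    """Stylized T(z): payroll tax minus EITC (income tax ignored by assumption)."""
    return PAYROLL_RATE * z - eitc_one_child(z)

def best_report(w, y, q_a, q_m):
    """Return (scenario, reported informal income) maximizing objective (7)."""
    candidates = {"truthful": y, "evade": 0.0}
    if Z_STAR - w >= 0:                       # the kink is reachable with y_hat >= 0
        candidates["kink"] = Z_STAR - w

    def utility(y_hat):
        # w + y - T(w + y_hat) - q_A * 1(y_hat > 0) - q_M * 1(y_hat != y)
        return (w + y - net_tax(w + y_hat)
                - q_a * (y_hat > 0) - q_m * (y_hat != y))

    scenario = max(candidates, key=lambda name: utility(candidates[name]))
    return scenario, candidates[scenario]
```

With w = 3,000 and y = 1,000, low fixed costs (q_a = q_m = 100) make the filer report 5,580 of informal income to land exactly on the kink, a large moral cost makes truthful reporting optimal, and a large administrative cost with a small moral cost leads to reporting nothing.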
Importantly, increasing the EITC at the margin generates deadweight burden in our model because tax filers who change their behavior because of the (marginal) EITC change generate a first order fiscal cost but experience a second order welfare gain. Therefore, as in the standard model, the indirect fiscal costs due to behavioral reporting responses are deadweight burden. The size of the marginal deadweight burden is proportional to the size of the behavioral response.

III. Federal Income Tax Empirical Results

Last, we turn to the analysis of bunching around the kinks of the regular federal income tax schedule. We first provide background on federal income tax computation, and then turn to the empirical analysis. In contrast to our previous EITC analysis, the results for the regular federal income tax are not as striking and it does not seem possible to provide a simple theoretical model accounting for all the empirical facts, explaining why we discuss the regular federal income tax in this separate section.

A. Background on Federal Income Tax Computation

For federal income tax purposes, taxable income is defined as adjusted gross income (AGI) less personal exemptions (a fixed amount per person in the tax unit) and deductions. Deductions can take the form of a standard deduction (a fixed amount depending on marital status: single, married, or head of household) or of itemized deductions, whichever is larger. Itemized deductions include state and local income and property taxes, mortgage interest payments, charitable contributions, and other smaller items. Income tax is computed as a function of taxable income using a piecewise linear schedule with increasing marginal tax rates.26 The size of the tax brackets depends on marital status. The relevant income measure to study bunching around the kinks of the regular federal income tax schedule is therefore taxable income.
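The computation just described can be sketched in a few lines. The example below assumes the schedule for single filers in 2008 dollars as read from Table 3 (exemption of $3,500, standard deduction of $5,450, and bracket thresholds from panel B); the function and constant names are hypothetical, and special cases such as capital gains treatment, the alternative minimum tax, and tax credits are deliberately left out.

```python
# Assumed inputs, read from Table 3 (singles, 2008 dollars): (bracket start, marginal rate).
BRACKETS_SINGLE_2008 = [
    (0.0, 0.10), (8025.0, 0.15), (32550.0, 0.25),
    (78850.0, 0.28), (164550.0, 0.33), (357700.0, 0.35),
]

def taxable_income(agi, n_exemptions, itemized=0.0,
                   exemption=3500.0, standard_deduction=5450.0):
    """AGI less personal exemptions and the larger of standard or itemized deductions."""
    deductions = max(standard_deduction, itemized)
    return max(0.0, agi - n_exemptions * exemption - deductions)

def income_tax(ti, brackets=BRACKETS_SINGLE_2008):
    """Piecewise linear tax: each slice of taxable income is taxed at its bracket rate."""
    tax = 0.0
    for i, (start, rate) in enumerate(brackets):
        end = brackets[i + 1][0] if i + 1 < len(brackets) else float("inf")
        if ti > start:
            tax += rate * (min(ti, end) - start)
    return tax

# A single filer with one exemption starts owing tax once AGI exceeds 3,500 + 5,450.
first_kink_agi = 3500.0 + 5450.0
```

Under these assumptions the first kink, where tax liability starts, sits at an AGI of $8,950 for a single filer with one exemption, and the later kinks sit at the bracket thresholds measured in taxable income.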
Before the Tax Reform Act (TRA) of 1986, there was a large number of tax brackets (between 15 and 25 depending on the year), and thus jumps in marginal tax rates from bracket to bracket were small (from 1 to 5 percentage points in general), except for the first bracket, where tax liability begins. As shown in Table 3 (panel A), the first bracket had a tax rate between 14 percent and 20 percent depending on the year. Moreover, before TRA 1986, the tax schedule was not indexed for inflation, and thus the real location of kinks changed substantially from year to year during the inflationary episodes of the 1970s, a phenomenon called “bracket creep.” The exemption and standard deduction amounts were also adjusted periodically to mitigate “bracket creep.” Thus, we limit our study of bunching in the pre-TRA era primarily to the years from 1960 to 1969, when inflation was low and the tax schedule stable, and only to the vicinity of the first kink point.²⁷ As described in panel A of Table 3, the income tax structure was remarkably stable from 1948 to 1963, with the exemption level per person fixed at $600 (in nominal dollars); the standard deduction defined as 10 percent of AGI (up to a maximum standard deduction limit of $1,000); and the first marginal tax rate equal to 20 percent. In 1964, a more advantageous standard deduction equal to $200 plus $100 times the number of exemptions was introduced. Finally, due to inflationary pressures, the modern standard deduction was introduced in 1970 and the exemption levels increased.

26 There are a number of exceptions to that rule, such as favorable treatment of realized capital gains or the alternative minimum tax.

27 An earlier version of the paper, Saez (1999), analyzes in detail years 1979 to 1986 and finds no evidence of bunching or clustering, except around the first kink point.

Table 3. Federal Income Tax: Structure and Schedules

Panel A. Tax structure: exemptions, standard deduction, and first tax rate

| | 1948–1963 | 1964–1969 | 1970 | 1971 | 1972–1974 | 1988–2002 | 2003–2008 |
|---|---|---|---|---|---|---|---|
| A1. Personal exemptions (per person in household; $600 in 1965 = $4,000 in 2008) | $600 (nominal) | $600 (nominal) | $625 (nominal) | $675 (nominal) | $750 (nominal) | $3,500 (indexed, 2008 $) | $3,500 (indexed, 2008 $) |
| A2. Standard deduction (a) | 10% of AGI up to a max deduction of $1,000 | Max(0.1*AGI, $200+$100*exemptions) up to $1,000 | $1,100 (nominal) | $1,050 (nominal) | $1,300 (nominal) | Indexed, 2008 $. Married: $9,100; Singles: $5,450; Heads: $8,000 | Indexed, 2008 $. Married: $10,900; Singles: $5,450; Heads: $8,000 |
| A3. First bracket tax rate | 20% (since 1954) | 14% (16% in 1964) | 14% | 14% | 14% | 15% (b) (10% in 2002) | 10% (b) |

Panel B. Tax schedules from 1988 to 2008 (in 2008 dollars). Each entry gives the taxable income at which the bracket starts.

B1. 1988–1990

| Tax rate | Married | Singles | Heads |
|---|---|---|---|
| 15% | $0 | $0 | $0 |
| 28% | $54,350 | $32,550 | $43,650 |
| 33% | $131,450 | $78,850 | $112,650 |
| 28% | $272,660 | $163,300 | $226,100 |

B2. 1991–1992

| Tax rate | Married | Singles | Heads |
|---|---|---|---|
| 15% | $0 | $0 | $0 |
| 28% | $54,350 | $32,550 | $43,650 |
| 31% | $131,450 | $78,850 | $112,650 |

B3. 1993–2001 (c)

| Tax rate | Married | Singles | Heads |
|---|---|---|---|
| 15% | $0 | $0 | $0 |
| 28% | $54,350 | $32,550 | $43,650 |
| 31% | $131,450 | $78,850 | $112,650 |
| 36% | $200,300 | $164,550 | $182,400 |
| 39.6% | $357,700 | $357,700 | $357,700 |

B4. 2002–2008 (d)

| Tax rate | Married | Singles | Heads |
|---|---|---|---|
| 10% | $0 | $0 | $0 |
| 15% | $16,050 | $8,025 | $11,450 |
| 25% | $65,100 | $32,550 | $43,650 |
| 28% | $131,450 | $78,850 | $112,650 |
| 33% | $200,300 | $164,550 | $182,400 |
| 35% | $357,700 | $357,700 | $357,700 |

Notes:
(a) Since 1987, exemptions and standard deduction are sharply reduced for tax filers who can be claimed as dependents (typically minor children and students supported by their parents). Such filers are always excluded from our estimates.
(b) The Child Tax Credit introduced in 1998 shifts the first tax bracket for tax filers with qualifying children as tax liability does not start until the child tax credit is fully phased-in. Even when the child tax credit becomes partly refundable in 2001, the first tax bracket is shifted as the child tax credit is not fully phased-in when the nominal first bracket starts.
(c) For 2001, rates were reduced to 15, 27.5, 30.5, 35.5, 39.1 percent. For 2002, bottom tax brackets for married filers are smaller (aligned to previous years): $0, $13,980, $54,350 in 2008 dollars.
(d) For 2002, rates were 10, 15, 27, 30, 35, 38.6 percent.

Source: US Treasury, Internal Revenue Service, Statistics of Income: Individual Income Tax Returns (annual).

After TRA 1986, the number of tax brackets was drastically reduced, and exemptions, standard deductions, tax, and EITC brackets have all been indexed to the Consumer Price Index. Table 3 (panel B) describes the tax schedule for the years 1988–2008 (expressed in 2008 dollars). The tax structure changed relatively little from 1988 to 2001, and includes two major kinks: the first kink, where the marginal tax rate jumps from 0 percent to 15 percent, and the second kink, with a jump from 15 percent to 28 percent. Note that two extra tax brackets were introduced in 1993, creating jumps from 31 percent to 36 percent and 36 percent to 39.6 percent for high-income earners. In 2002, a bottom bracket with a lower rate of 10 percent was introduced so that the jump in marginal tax rate at the first kink point is from 0 percent to 10 percent. Furthermore, the upper rates were also reduced from 2001 to 2003.

Nonrefundable tax credits are items that reduce positive tax liability.²⁸ As long as the net-of-credits tax liability remains positive, such nonrefundable tax credits have no impact on the marginal tax rate and hence the location of the kink point.
However, if the nonrefundable tax credits are large enough to reduce the tax to zero, then they change the marginal tax rate from the statutory rate down to zero (as an extra dollar of income would no longer translate into a net tax increase). Therefore, the first kink point of the tax schedule starts at zero taxable income only for tax filers with no tax credits. If the tax filer has nonrefundable credits equal to d, the first kink point is at taxable income z = d/τ1, where τ1 is the marginal tax rate in the first tax bracket (this is the income at which the first-bracket tax τ1·z just equals the credit d). This fact is important as the child tax credit, introduced in 1998, effectively shifts the first kink away from zero taxable income for most low-income tax filers with children, a point we explore below.

B. Bunching Evidence around the First Kink Point from 1960–1972

Figure 6 displays the density distributions of taxable income, expressed in 2008 dollars and aggregating years 1960–1969,²⁹ for married joint filers (panel A), and singles and heads of household (panel B). The marginal tax rate schedules are also displayed (for year 1960) in a dashed line. In all years, the first kink point is at zero as depicted by the vertical line. Both panels display visual evidence of bunching at the first kink point of the tax schedule, although bunching is less pronounced in panel B. In both cases, the density peaks just before the first kink point, providing compelling evidence that the change in marginal tax rates around the first kink point produces a behavioral response of reported taxable income. A potential objection is that individuals may not systematically file tax returns in the negative range of taxable income as no tax liability is due. Figure 6, however, shows that there is no missing density just below the kink, as the density is actually higher just below the kink. Indeed, in practice, withholding on wage income starts below the zero taxable income threshold so that most filers with low wage income and negative taxable income file to obtain a tax refund.

28 Nonrefundable credits cannot reduce tax liability below zero. In contrast, refundable tax credits (such as the EITC) are paid even if the tax liability falls down to zero.

29 Years 1961, 1963, and 1965 are not included because no micro file was created for these years.

[Figure 6 appears here. Panel A: Married tax filers. Panel B: Single tax filers. Horizontal axis: taxable income (2008 $), from –20,000 to 60,000. Vertical axes: taxable income density and marginal tax rate.]

Figure 6. Taxable Income Density, 1960–1969: Bunching around First Kink

Notes: The figure displays the histogram of taxable income for married joint tax filers (panel A) and single tax filers, defined as all filers other than married joint filers (panel B). The data include years 1960, 1962, 1964, 1966–1969 (no data are available for 1961, 1963, and 1965). Histograms are based on bins $800 wide and are computed using population weights (unweighted sample sizes for panels A and B histograms are 185,161 and 82,859 respectively). Taxable income is defined as Adjusted Gross Income minus personal exemptions minus the maximum of the standard or itemized deductions, and is expressed in 2008 dollars. The marginal tax rate schedule is displayed for 1960 in dashed line. In all years, the first kink point is at zero taxable income, displayed by the vertical line on the graph (other kink points move from year to year as tax brackets are not adjusted for inflation from 1960 to 1969). The first bracket marginal tax rate is 20 percent from 1960 to 1963, 16 percent in 1964, and 14 percent in 1965–1969 (see Table 3, panel A for complete details).
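The population-weighted histogram underlying Figure 6 can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the bins are assumed to be aligned so that the first kink (zero taxable income) is a bin edge, the output is normalized to population shares rather than to the density scale of the figure, and the sample values are made up rather than drawn from the tax return files.

```python
import math
from collections import defaultdict

BIN_WIDTH = 800.0  # bin width in 2008 dollars, as stated in the Figure 6 notes


def weighted_histogram(incomes, weights, bin_width=BIN_WIDTH):
    """Return {lower bin edge: weighted population share}.

    Bins are [k * bin_width, (k + 1) * bin_width), so zero taxable income
    is a bin edge (an assumption made for this sketch).
    """
    if len(incomes) != len(weights):
        raise ValueError("incomes and weights must have the same length")
    mass = defaultdict(float)
    for z, w in zip(incomes, weights):
        # math.floor handles negative taxable income correctly.
        lower_edge = math.floor(z / bin_width) * bin_width
        mass[lower_edge] += w
    total = sum(mass.values())
    return {edge: m / total for edge, m in sorted(mass.items())}


if __name__ == "__main__":
    # Made-up taxable incomes (2008 $) and sampling weights, illustration only.
    incomes = [-900.0, -500.0, -100.0, 50.0, 300.0, 700.0, 1500.0, 2500.0]
    weights = [1.0, 2.0, 3.0, 3.0, 2.0, 1.0, 1.0, 1.0]
    for edge, share in weighted_histogram(incomes, weights).items():
        print(f"[{edge:>8.0f}, {edge + BIN_WIDTH:>8.0f}): {share:.3f}")
```

Bunching would show up as unusually large shares in the bins adjacent to the kink relative to the bins further away on either side.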
Figure 7 casts further light on the mechanism behind the bunching uncovered in Figure 6 by plotting the kernel density of taxable income (as in Figure 6) along with the density of taxable income computed using the standard deduction, i.e., defined as

[Figure 7 appears here. Panel A: Married tax filers. Panel B: Single tax filers. Series: taxable income, standard deduction, and marginal tax rate (MTR). Horizontal axis: taxable income (2008 $), from –20,000 to 60,000. Vertical axes: taxable income density and marginal tax rate.]

Figure 7. Taxable Income Density, 1960–1969: Itemizing Effects on Bunching

Notes: The figure displays the kernel density of income (in 2008 dollars) for married joint tax filers (panel A) and single tax filers (panel B). In each panel, the solid line is the density of actual taxable income defined as in Figure 6 as adjusted gross income minus personal exemptions minus the maximum of the standard or itemized deductions, while the dotted line is the density of taxable income computed solely with the standard deduction, i.e., defined as adjusted gross income minus personal exemptions min