Applied Public Economics PDF

4. Optimal Labor Income Taxation
Michael Gerfin, University of Bern, Spring 2024

Contents
1. Social Welfare Function
2. Optimal Linear Tax
3. Optimal Top Income Tax
4. A Tale of Three Elasticities

Overview of Optimal Taxation
- From an efficiency perspective, government should be financed purely through lump-sum taxation.
- With redistributional concerns, the ideal is to levy individual-specific lump-sum taxes: tax higher-ability individuals a larger lump sum.
- Problem: individuals' types cannot be observed.
- Therefore the government must tax economic outcomes such as income or consumption, which leads to distortions.
- Note: realised earnings are functions of people being shown how [...]

Mirrlees (1971)
- Pioneering work on optimal income taxation based on a structural approach.
- The Mirrlees model had a profound impact on information economics, e.g. models with asymmetric information in contract theory.
- But until the late 1990s it had little impact on practical tax policy.
- Since the late 1990s, e.g. Diamond (1998), Piketty (1997), and Saez (2001) have connected the Mirrlees model to practical tax policy and empirical tax studies:
  - sufficient statistic formulas in terms of labor supply/taxable income elasticities instead of primitives
  - new approach summarized in Diamond-Saez JEP'11 and Piketty-Saez Handbook'13

Social Welfare Function

- Throughout this chapter we assume that the goal of the government is to maximize social welfare.
- The only reason to tax is to redistribute income.
- Transfers are negative taxes, so the sum of all taxes is zero.
- The tax function is given by $T(z)$, where $z$ is taxable income.
- Individual $i$ has quasi-linear preferences of the form $u_i = u(c_i, z_i) = c_i - v(z_i)$, where $c = z - T(z)$ is after-tax income and $v(\cdot)$ is an increasing convex function.
- Population is of size one (sum equals mean).
- Note: they could be just two different dimensions of labour supply (reduce your overtime, which is easier).

- The social welfare function, $W$, is
  $$W = \int G(u_i)\, f(w_i)\, dw$$
- Note: $G$ transforms individual utility into social welfare. It is a concave function because we want to have a taste for redistribution in society: an increase in utility counts for more when it goes to poor people rather than rich people.
- $G(\cdot)$ is a function that transforms individual utilities into individual social welfare contributions.
- With redistributive tastes $G(u)$ is increasing and concave.
- With quasi-linear utility we need $G$ in order to have social preferences for redistribution.
- $f(w_i)$ is the distribution of wages $w$ (skills) in society (exogenous).

Government's revenue constraint
- Government's revenue constraint is
  $$\int T(z_i)\, f(w_i)\, dw = 0$$
- This model can also be written in sums:
  $$W = N^{-1} \sum_{i=1}^{N} G(u_i), \qquad 0 = N^{-1} \sum_{i=1}^{N} T(z_i)$$

Social Welfare Weights
- Let $g_i = g(z_i)$ be the social marginal welfare weight for an individual with earnings $z_i$, in terms of public funds.
- It is defined as $g_i = G'(u_i)\, u_{i,c} / \alpha$.
- $\alpha$ is the marginal value of one dollar of tax revenue, which corresponds to the multiplier on the government budget constraint in the Lagrangian for the social welfare maximization.
- $u_{i,c}$ is the marginal utility of consumption, equal to 1 for all $i$ with quasi-linear preferences.
- Intuitively, $g_i$ measures the dollar value to society (in terms of public funds) of increasing the consumption of individual $i$ by $1.
- Note: $g_i$ is the increase in the total value of society if you give individual $i$ one USD.
- With quasi-linear preferences:
  $$g_i = G'(u_i) / \bar{G}'$$
  $$\alpha = \int G'(u_i)\, f(w_i)\, dw = \bar{G}' \quad \text{(mean of } G'(u_i)\text{)}$$
  $$\int g_i\, f(w_i)\, dw = 1 \quad \text{(mean of } g_i\text{)}$$

Properties of $g_i$
- Distinguish two polar cases and an intermediate case.
- Polar case 1: society has no preferences for redistribution
  - $g_i = 1$ for all $i$
- Polar case 2: society has Rawlsian preferences
  - maximize the welfare of the worst-off individuals (here $z_i = 0$)
  - then $g_i = 0$ for all $z_i > 0$
  - $g_i = N/N_0$ for all $z_i = 0$, where $N_0$ is the number of people with $z_i = 0$ (need to make sure $g_i$ averages to 1)
- Intermediate case: society has Utilitarian preferences
  - $g_i > 1$ for individuals with low income and $g_i < 1$ for individuals with high income
- This is illustrated in the following graph.

[Figure: properties of $g_i$. Note on the figure: weight less than one.]

Derivation of $\alpha = \bar{G}'$
Derivation of the marginal social welfare weight with quasi-linear utility $u_i = c_i - v(z_i)$:
$$\max_{T(z)} L = \int G(u_i)\, f(w_i)\, dw + \alpha \int T(z_i)\, f(w_i)\, dw$$
$$\frac{dL}{dT} = \int \frac{\partial G(u_i)}{\partial u} \frac{\partial u}{\partial c} \frac{dc}{dT}\, f(w_i)\, dw + \alpha \int \frac{dT}{dT}\, f(w_i)\, dw = 0$$
$$= \int G'(u_i)(1)(-1)\, f(w_i)\, dw + \alpha = 0$$
$$\alpha = \int G'(u_i)\, f(w)\, dw = \bar{G}'$$

Derivation of $\alpha = \bar{G}'$, discrete case
Derivation of the marginal social welfare weight with quasi-linear utility $u_i = c_i - v(z_i)$:
$$\max_{T(z)} L = \sum G(u_i) + \alpha \sum T(z_i)$$
$$\frac{dL}{dT} = \sum \frac{\partial G(u_i)}{\partial u} \frac{\partial u}{\partial c} \frac{dc}{dT} + \alpha \sum \frac{dT}{dT} = 0$$
$$= \sum G'(u_i)(1)(-1) + \alpha N = 0$$
$$= \sum -G'(u_i) + \alpha N = 0$$
$$\alpha = \sum G'(u_i) / N = \bar{G}'$$

Optimal Linear Tax

Linear Tax Rate
- The tax system consists of
  1. a lump-sum grant $R$ given to everybody (universal basic income)
  2. a marginal tax rate $\tau$ on taxable income $z$
- $T(z) = \tau \cdot z - R$ with linear tax rate $\tau$ and lump-sum transfer $R$ funded by taxes $\tau \cdot Z$, where $Z$ denotes aggregate earnings.
- After-tax income is $c = (1 - \tau) \cdot z + R$.
- Population of size one (continuum) with quasi-linear preferences $u_i = u(c_i, z_i) = c_i - v(z_i)$.

Linear Tax Rate: individual choice
- Individual $i$ chooses $z$ to maximize $u_i = (1 - \tau) \cdot z_i + R - v(z_i)$.
- The FOC is $(1 - \tau) = v'(z_i)$.
- This implicitly defines taxable income $z$ as a function of the net-of-tax rate $(1 - \tau)$: $z = z(1 - \tau)$.
- Individual choices of taxable income $z_i(1 - \tau)$ aggregate to economy-wide earnings $Z(1 - \tau) = \int z_i\, f(w_i)\, dw$.
- Note that $Z(1 - \tau)$ means $Z$ is a function of $(1 - \tau)$.
- Note: why is $v'$ the slope of the curve? It is the marginal rate of substitution between the two arguments of the utility function. The slope of an indifference curve is $-MRS$, the ratio of the marginal utility of the variable on the x-axis to the marginal utility of the variable on the y-axis.

Linear Tax Rate: budget lines
[Figure: after-tax income $c$ against taxable income $z$, without tax and with tax (red line).]
- Note: without tax the slope is 1, which means that for each additional unit of taxable income $z$, $c$ increases by the same amount. In other words, if you make one more dollar, you get to keep that dollar.
- Note: the red curve is less steep because the tax effectively removes a portion of each additional dollar earned. The steeper the slope, the more of each dollar of income you keep. Thus, a slope of less than 1 reflects the impact of taxes taking away part of your income, resulting in a less steep line for the relationship between your taxable income and your disposable income after taxes.
- Note: for each dollar I am earning, consumption goes up by $1 - t$.
- Note: the red line represents the situation with a tax. The slope of this line is $1 - t$, where $t$ is the tax rate. If the tax rate is, for instance, 20%, then $t = 0.2$ and the slope of the red line is $1 - 0.2 = 0.8$. This means that for each additional dollar of taxable income, disposable income only increases by 80 cents because 20 cents are paid in tax.

Revenue-Maximizing Linear Tax Rate
- The tax rate $\tau^{max}$ maximizes tax revenue $R = \tau Z$, so
  $$R' = Z + \tau^{max}\, \frac{dZ}{d\tau} = Z - \tau^{max}\, \frac{dZ}{d(1 - \tau)} = 0$$
- Solving for $\tau^{max}$ yields
  $$\tau^{max} = \frac{1}{1 + e} \quad \text{with} \quad e = \frac{1 - \tau}{Z} \frac{dZ}{d(1 - \tau)}$$
- $e$ is the elasticity of aggregate earnings with respect to the net-of-tax rate $(1 - \tau)$.
- $\tau > \tau^{max}$ is inefficient: cutting $\tau$ increases individuals' welfare and government revenue.
- This is the so-called Laffer curve.

Laffer Curve
[Figure: Laffer curve, tax revenue against the tax rate.]
- Note: principle of diminishing returns from taxation. At a 0% tax rate, the government would collect no revenue, but at a 100% tax rate, the government also would collect no revenue because there would be no incentive for people to earn taxable income.
- Note: the curve suggests that there is a tax rate ($\tau^*$) where tax revenue is maximized. Beyond this point, any increase in tax rates would actually lead to a decrease in total revenue.
- Note: there must be a tax level that maximizes tax revenue.
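The Laffer curve can be illustrated with a small numerical sketch. It assumes, which the slides do not, that aggregate earnings respond with a constant elasticity, $Z(1 - \tau) = Z_0 (1 - \tau)^e$. Under that assumption a grid search over $\tau$ should land close to the closed form $\tau^{max} = 1/(1 + e)$. The function names and the grid size are illustrative choices.

```python
# Sketch: Laffer curve under an ASSUMED constant-elasticity earnings response,
# Z(1 - tau) = z0 * (1 - tau) ** e. This functional form is not from the slides.


def revenue(tau: float, e: float, z0: float = 1.0) -> float:
    """Tax revenue R = tau * Z(1 - tau)."""
    return tau * z0 * (1.0 - tau) ** e


def revenue_maximizing_rate(e: float, steps: int = 10_000) -> float:
    """Grid search for the tau in [0, 1] that maximizes revenue."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda tau: revenue(tau, e))


if __name__ == "__main__":
    for e in (0.25, 0.5, 1.0):
        tau_grid = revenue_maximizing_rate(e)
        tau_formula = 1.0 / (1.0 + e)  # closed form from the slide
        print(f"e = {e:.2f}: grid search {tau_grid:.4f}, formula {tau_formula:.4f}")
```

Revenue is zero at both $\tau = 0$ and $\tau = 1$ in this sketch, which matches the two endpoints described in the notes on the curve.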
- Note: Laffer wanted to convince Reagan that taxes were too high.

Optimal Linear Tax Rate
- The government chooses $\tau$ to maximize
  $$W = \int G[\underbrace{(1 - \tau) z_i + \tau Z(1 - \tau) - v(z_i)}_{u(c_i, z_i)}]\, f(w)\, dw = \int G[(1 - \tau) z_i - v(z_i) + \tau Z]\, f(w)\, dw$$
- Note: the argument of $G$ is the labour income individual $i$ can keep plus the transfer, that is, utility plus tax revenues.
- The argument of the $G$ function can be interpreted as an analogue of the definition of social welfare in the pure efficiency analysis:
  - $(1 - \tau) z_i - v(z_i)$ is private surplus
  - $\tau Z$ is tax revenue
  - here, tax revenue enters private utility because it is transferred back to individuals

Perturbation Argument
- Apply a perturbation argument to derive the optimal tax rate:
  - start from optimal individual choices $z_i$ conditional on $\tau$
  - marginally change $\tau$ by $d\tau$
  - this has a mechanical effect, a behavioral effect, and a private utility effect
  - the first two relate to changes in tax revenue, the third is the private surplus change
  - we know from the efficiency analysis how to compute these effects
  - at the optimal $\tau$ the social-welfare-weighted net effect of these changes must be zero
- Notes: change in tax; people's reaction; all effects are effects on the argument within the $G$ function.

Effect of $d\tau$ on $u_i$
- Mechanical effect on tax revenue $\tau Z$: $dM = Z \times d\tau$
- Behavioral effect on tax revenue $\tau Z$: $dB = \tau \times dZ$
- Utility effect at income level $z_i$: $du_i = -z_i \times d\tau$
- Difference to the pure efficiency analysis: here $dM$ relates to average income $Z$, while $du_i$ relates to $z_i$, and there is variation in $z_i$ (otherwise there is no need for redistribution).
- Utility enters social welfare through the function $G$.

$dW/d\tau$
Analyze the change in $W$ caused by $d\tau$ (leaving $f(w)\,dw$ implicit). At the optimum $dW = 0$:
$$
\begin{aligned}
dW = 0 &= \int G'_i\, (dM + dB + du_i) \\
&= \int (G'_i\, dM + G'_i\, dB + G'_i\, du_i) \\
&= \int G'_i\, dM + \int G'_i\, dB + \int G'_i\, du_i \\
&= dM \int G'_i + dB \int G'_i + \int G'_i\, du_i \\
&= dM\, \bar{G}' + dB\, \bar{G}' + \int G'_i\, du_i \\
&= dM + dB + \int (G'_i / \bar{G}')\, du_i \\
&= dM + dB + \int g_i\, du_i
\end{aligned}
$$
The second-to-last line divides by $\bar{G}' > 0$, which is permitted because the whole expression equals zero.

Condition for optimal tax rate
$$dM + dB + \int g_i\, du_i = 0$$
- Replacing $dM$, $dB$ and $du_i$ by their expressions, we get
  $$d\tau\, Z + \tau\, dZ + \int g_i\, (-z_i\, d\tau) = 0$$
- Denote $\int g_i z_i$ by $Z^g$ (welfare-weighted average income).
- Then the weighted average private utility effect is $dU = -Z^g \times d\tau$.
- Putting it together:
  $$Z \times d\tau + \tau\, dZ - Z^g \times d\tau = 0$$

Rewrite $dB$
- Taxable income elasticity with respect to the net-of-tax rate $(1 - \tau)$:
  $$e = \frac{1 - \tau}{Z} \frac{dZ}{d(1 - \tau)}$$
- Then
  $$
  \begin{aligned}
  dB = \tau \times dZ &= -\tau\, \frac{dZ}{d(1 - \tau)}\, d\tau \\
  &= -\frac{\tau}{1 - \tau} \times \frac{1 - \tau}{Z} \frac{dZ}{d(1 - \tau)} \times Z\, d\tau \\
  \Rightarrow\ dB &= -\frac{\tau}{1 - \tau} \times e \times Z\, d\tau
  \end{aligned}
  $$
- In the first line the second equality follows from the fact that $d\tau = -d(1 - \tau)$.
- In the second line multiply by $Z/Z$ and $(1 - \tau)/(1 - \tau)$, then rearrange to get the elasticity.

Condition for optimal tax rate
- Using the rewritten $dB$ we get
  $$Z \times d\tau - \frac{\tau}{1 - \tau} \times e \times Z\, d\tau - Z^g \times d\tau = 0$$
- This simplifies to
  $$Z - Z^g = \frac{\tau}{1 - \tau} \times e \times Z$$
- Then
  $$\frac{\tau}{1 - \tau} = \frac{Z - Z^g}{e \times Z} = \frac{1}{e} \frac{Z - Z^g}{Z} = \frac{1}{e} (1 - \bar{g})$$
  where $\bar{g} = Z^g / Z$ (ratio of welfare-weighted average income to actual average income).
- $0 \le \bar{g} < 1$ if $g(z)$ is decreasing with $z$.

Optimal Linear Tax Rate
- Solving for $\tau$ gives, after some rearrangements,
  $$\tau^* = \frac{1 - \bar{g}}{1 - \bar{g} + e}$$
- Note: $\tau^*$ is only a function of $e$ (the elasticity of income) and $\bar{g}$ (equity preferences in society).
- The formula captures the equity-efficiency trade-off robustly ($\tau^*$ decreases with $\bar{g}$ and with $e$).
- Without efficiency costs ($e = 0$), $\tau^* = 1$.
- If society does not value redistribution at all, $\bar{g} = 1$ and $\tau^* = 0$.
- If social preferences are Rawlsian, $\bar{g} = 0$ and $\tau^*$ is the revenue-maximizing tax rate.
- The formula is implicit because both $e$ and $\bar{g}$ vary with $\tau$.
- The formula is general as it applies to many models for the income generating process (only the aggregate elasticity $e$ matters).

Optimal Top Income Tax

Top 1% Share Over Time
[Figure.] Source: Lampart et al. (2023), Verteilungsbericht, SGB Dossier 154

Top 1% Share and Top Marginal Tax Rate
[Two figures.] Source: Piketty, Saez, & Stantcheva (2014), "Optimal Taxation of Top Labor Incomes: A Tale of Three Elasticities", American Economic Journal: Economic Policy

CEO Compensation and Top Marginal Tax Rate
[Figure.] Source: Piketty, Saez, & Stantcheva (2014), "Optimal Taxation of Top Labor Incomes: A Tale of Three Elasticities", American Economic Journal: Economic Policy

Growth and Change in Top Marginal Tax Rate
[Figure.] Source: Piketty, Saez, & Stantcheva (2014), "Optimal Taxation of Top Labor Incomes: A Tale of Three Elasticities", American Economic Journal: Economic Policy

Optimal Top Income Tax Rate
- Derivation of the optimal top tax rate:
  - Consider a constant $\tau$ above a fixed income threshold $z^*$.
  - Use the perturbation argument again.
  - Assume there is a continuum of measure one of individuals above $z^*$.
  - Let $z(1 - \tau)$ be their average income (it depends on the net-of-tax rate $1 - \tau$), with top taxable income elasticity $e^{top} = [(1 - \tau)/z] \cdot dz/d(1 - \tau)$.

[Figure: optimal top income tax rate.]

- Mechanical effect on tax revenue: $dM = [z - z^*]\, d\tau$
- Behavioral effect:
  $$
  \begin{aligned}
  dB = \tau \cdot dz &= -\tau\, \frac{dz}{d(1 - \tau)}\, d\tau \\
  &= -\frac{\tau}{1 - \tau} \cdot \frac{1 - \tau}{z} \frac{dz}{d(1 - \tau)} \cdot z\, d\tau \\
  \Rightarrow\ dB &= -\frac{\tau}{1 - \tau} \cdot e^{top} \cdot z\, d\tau
  \end{aligned}
  $$
- Weighted private utility effect: $dU = -g^{top}\, dM = -g^{top} [z - z^*]\, d\tau$, where $g^{top}$ is the average social welfare weight of the rich.

- The optimal $\tau$ is such that $dW = dM + dU + dB = 0$:
  $$
  \begin{aligned}
  dW &= dM + dU + dB \\
  &= d\tau \left[ (1 - g^{top})(z - z^*) - e^{top}\, z\, \frac{\tau}{1 - \tau} \right] \\
  &= d\tau\, (z - z^*) \left[ (1 - g^{top}) - e^{top}\, \frac{z}{z - z^*}\, \frac{\tau}{1 - \tau} \right] \\
  &= dM \left[ 1 - g^{top} - e^{top}\, \frac{z}{z - z^*}\, \frac{\tau}{1 - \tau} \right]
  \end{aligned}
  $$
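The expression for $dW$ can be checked numerically in the special case $g^{top} = 0$, where $dW$ reduces to the revenue change $dM + dB$. The sketch below is only an illustration: it assumes a constant-elasticity response of average top income, $z(1 - \tau) = z_0 (1 - \tau)^{e^{top}}$, and the income level `Z0` is a hypothetical number; neither comes from the slides. It compares a finite-difference derivative of top-bracket revenue with the bracketed formula.

```python
# Sketch: finite-difference check of dW/dtau for the top bracket when g_top = 0.
# ASSUMPTIONS (not from the slides): average top income follows
# z(1 - tau) = Z0 * (1 - tau) ** E_TOP, and Z0 is a hypothetical value.

Z0 = 1_500_000.0    # hypothetical average top income at tau = 0
Z_STAR = 400_000.0  # bracket threshold z*
E_TOP = 0.25        # top taxable income elasticity


def avg_top_income(tau: float) -> float:
    """Average income above the threshold as a function of tau."""
    return Z0 * (1.0 - tau) ** E_TOP


def top_revenue(tau: float) -> float:
    """Revenue per top taxpayer from the rate tau applied above z*."""
    return tau * (avg_top_income(tau) - Z_STAR)


def dw_formula(tau: float) -> float:
    """(dM + dB) / dtau = (z - z*) - e_top * z * tau / (1 - tau)."""
    z = avg_top_income(tau)
    return (z - Z_STAR) - E_TOP * z * tau / (1.0 - tau)


def dw_numeric(tau: float, h: float = 1e-6) -> float:
    """Central finite difference of top-bracket revenue with respect to tau."""
    return (top_revenue(tau + h) - top_revenue(tau - h)) / (2.0 * h)


if __name__ == "__main__":
    for tau in (0.3, 0.5, 0.7):
        print(f"tau = {tau}: formula {dw_formula(tau):.2f}, numeric {dw_numeric(tau):.2f}")
```

With $g^{top} = 0$ the utility loss of top earners receives no weight, so the welfare change coincides with the change in revenue, which is why a derivative of revenue is the right object to compare against.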
Optimal Top Income Tax Rate: solving for $\tau$
- Define $a = z/(z - z^*)$. Then
  $$
  \begin{aligned}
  dW &= dM \left[ 1 - g^{top} - e^{top}\, \frac{z}{z - z^*}\, \frac{\tau}{1 - \tau} \right] \\
  &= dM \left[ 1 - g^{top} - e^{top}\, a\, \frac{\tau}{1 - \tau} \right]
  \end{aligned}
  $$
- The optimal $\tau$ requires $dW = 0 \Rightarrow 1 - g^{top} - e^{top}\, a\, \frac{\tau}{1 - \tau} = 0$.
- Solving for $\tau$ gives
  $$\tau^*_{top} = \frac{1 - g^{top}}{1 - g^{top} + e^{top}\, a}$$
- $\tau^*_{top}$ decreases with $g^{top}$ (redistributive tastes).
- $\tau^*_{top}$ decreases with $e^{top}$ (efficiency).
- $\tau^*_{top}$ decreases with $a$ (thickness of the top tail; a larger $a$ means a thinner top tail).

Empirical estimate of $a = z/(z - z^*)$
[Figure.]

Optimal Top Income Tax Rate: calibration
- In US tax return data, $a = z/(z - z^*)$ is very stable above $z^* = \$400\text{K}$ with $a \approx 1.5$ (CH: $a \approx 1.9$, DK: $a \approx 3$).
- The top income distribution can be very well approximated by a Pareto distribution with fixed parameter $a$.
- Pareto distribution: $f(z) = a \cdot k^a / z^{1 + a}$ with $k$ and $a$ constant.
- Assuming $g^{top} = 0$ and $e^{top} = 0.25$ (mid-range estimate in the literature), the optimal top tax rate is $1/(1 + 0.25 \cdot 1.5) = 0.73$.

Top income shares in Switzerland
[Two figures.] Source: Schaltegger and Gorgas (2011), The Evolution of Top Incomes in Switzerland
$$a = \beta/(\beta - 1)$$

Tax Reforms in Switzerland
[Figure.] Source: Lampart et al. (2023), Verteilungsbericht, SGB Dossier 154
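The top-rate formula is easy to evaluate for the Pareto parameters quoted above. The following sketch reproduces the US calculation and, as an assumption that goes beyond the slides, applies the same $e^{top} = 0.25$ and $g^{top} = 0$ to the Swiss and Danish values of $a$. The function names are illustrative.

```python
# Sketch: evaluate tau*_top = (1 - g_top) / (1 - g_top + a * e_top).
# The values of a are quoted in the slides; using e_top = 0.25 and g_top = 0
# for CH and DK as well is an ASSUMPTION made here for illustration only.


def optimal_top_rate(a: float, e_top: float, g_top: float = 0.0) -> float:
    """Optimal top marginal tax rate for Pareto parameter a."""
    return (1.0 - g_top) / (1.0 - g_top + a * e_top)


def pareto_a_from_beta(beta: float) -> float:
    """Convert the coefficient beta into a, using a = beta / (beta - 1)."""
    return beta / (beta - 1.0)


PARETO_A = {"US": 1.5, "CH": 1.9, "DK": 3.0}

if __name__ == "__main__":
    for country, a in PARETO_A.items():
        rate = optimal_top_rate(a, e_top=0.25)
        print(f"{country}: a = {a}, tau*_top = {rate:.2f}")
```

Setting `a = 1.0` in `optimal_top_rate` and passing $\bar{g}$ as `g_top` gives an expression of the same form as the optimal linear tax rate $(1 - \bar{g})/(1 - \bar{g} + e)$, so one helper can serve both formulas.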
A Tale of Three Elasticities

Piketty, Saez & Stantcheva (2014)
- Piketty et al. (2014) analyze the channels through which the top 1% income share has increased:
  1. Standard supply side channel (with elasticity $e_1$)
  2. Avoidance and income shifting (with elasticity $e_2$)
  3. Compensation bargaining and rent seeking (with elasticity $e_3$)
- In the simplest case $e = e_1 + e_2 + e_3$, where $e$ is the total taxable income elasticity.
- In the standard case, $e_2 = e_3 = 0$, and the previous result holds (assuming $g^{top} = 0$):
  $$e = e_1 = \frac{dz/z}{d(1 - \tau)/(1 - \tau)} \quad \text{and} \quad \tau^* = 1/(1 + a \cdot e_1)$$
Source: Piketty, Saez, & Stantcheva (2014), "Optimal Taxation of Top Labor Incomes: A Tale of Three Elasticities", American Economic Journal: Economic Policy

Tax avoidance
- Assume the agent can shelter $x$ dollars with cost $f(e)$, and taxable income is $z = y - x$, where $y$ is real income.
- Examples of avoidance:
  - reductions in current cash compensation in favor of e.g. deferred compensation such as stock options
  - outright tax evasion such as using offshore accounts
- $z$ is taxed at a constant rate $\tau$, $x$ is taxed at a constant rate $0 \le t$ [...]

[...] $> 0$: pay for luck
- $\beta_l \ge \beta$: no filtering of the luck component

Piketty et al., 2014: empirical analysis
- In order to measure $\beta$, estimate by OLS
  $$pay_{it} = \beta p_{it} + \gamma_i + \mu_t + \alpha X_{it} + \varepsilon_{it}$$
  where $\gamma_i$ is a firm FE, $\mu_t$ is a time FE and $X_{it}$ are CEO controls.
- To decompose $p$, regress $p$ on the luck measure $l$
  $$p_{it} = \pi l_{it} + \gamma_i + \mu_t + \delta X_{it} + \nu_{it}$$
  and generate $\hat{p}_{it} = \hat{\pi} l_{it} + \hat{\gamma}_i + \hat{\delta} X_{it}$, which is the part of $p$ attributed to luck.
- Finally, to estimate $\beta_l$, regress
  $$pay_{it} = \beta_l \hat{p}_{it} + \gamma_i + \mu_t + \alpha X_{it} + \varepsilon_{it}$$
- The last two steps are nothing else but 2SLS with the luck measure $l$ as instrument for performance $p$.

Piketty et al., 2014: empirical analysis (data)
- Performance measures: (1) net income and (2) shareholder wealth
- Measure of pay: total CEO pay
- Measure of luck: mean asset-weighted performance of other firms in the industry
- Data: Forbes 800 + Execucomp, COMPUSTAT-CRSP
- Years: 1970-2010

Piketty et al., 2014: empirical analysis (results)
- There is pay for luck: CEOs are rewarded for industry-wide performance (cols. 2 and 5).
- Pay for luck has increased since 1987, when top tax rates are lower (Panel A vs Panel B).
- The sensitivity of workers' wages to industry performance has not been affected by the change in top tax rates (cols. 3 and 6).
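The two-step pay-for-luck procedure described above can be sketched on simulated data. This is a deliberately stripped-down illustration, not the paper's specification: it drops the firm and time fixed effects and the CEO controls, and every number in the data-generating process (sample size, true coefficients, noise) is a hypothetical choice made here. Performance is simulated as luck plus skill, and pay rewards the two components at different rates.

```python
# Sketch: pay-for-luck regressions on SIMULATED data, without fixed effects or
# controls. All parameters of the data-generating process are hypothetical.
import random


def mean(xs: list[float]) -> float:
    return sum(xs) / len(xs)


def cov(xs: list[float], ys: list[float]) -> float:
    """Sample covariance (dividing by n)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def ols_slope(y: list[float], x: list[float]) -> float:
    """Slope from a bivariate OLS regression of y on x with an intercept."""
    return cov(x, y) / cov(x, x)


random.seed(0)
N = 20_000
TRUE_PAY_FOR_SKILL = 1.0  # hypothetical
TRUE_PAY_FOR_LUCK = 0.6   # hypothetical

luck = [random.gauss(0.0, 1.0) for _ in range(N)]
skill = [random.gauss(0.0, 1.0) for _ in range(N)]
perf = [l + s for l, s in zip(luck, skill)]
pay = [
    TRUE_PAY_FOR_SKILL * s + TRUE_PAY_FOR_LUCK * l + random.gauss(0.0, 0.5)
    for l, s in zip(luck, skill)
]

# Step 1: OLS of pay on performance gives beta.
beta = ols_slope(pay, perf)

# Step 2: first stage, performance on luck; keep the fitted (luck) part.
pi_hat = ols_slope(perf, luck)
perf_hat = [pi_hat * l for l in luck]

# Step 3: second stage, pay on the luck-driven part of performance.
beta_l = ols_slope(pay, perf_hat)

# Equivalent instrumental-variables ratio form of the same estimator.
beta_l_iv = cov(pay, luck) / cov(perf, luck)

if __name__ == "__main__":
    print(f"beta (OLS) = {beta:.3f}")
    print(f"beta_l (two-step) = {beta_l:.3f}, beta_l (IV ratio) = {beta_l_iv:.3f}")
```

In this simulated design the two-step estimate and the IV ratio coincide, which is the sense in which the last two steps amount to 2SLS with luck as the instrument. The real analysis would add fixed effects and controls to both stages.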
