
Applied Macroeconomics
Dynamic Programming in Discrete Time
Harald Fadinger
M.Sc. Applied Economics, University of Vienna

References

These notes are based on:
– O.J. Blanchard & S. Fischer (1989): Lectures on Macroeconomics, MIT Press
– G. Femminis (2016): "From Simple Growth to Numerical Simulations: A Primer in Dynamic Programming," Dipartimento di Economia e Finanza Working Paper n. 50, Università Cattolica del Sacro Cuore

Outline

Introduction
Recursive problems
Value function
Guess & verify: discrete-time Ramsey model
Envelope Theorem
Uncertainty
Guess & verify: stochastic Ramsey model
Envelope Theorem: lifetime consumption & portfolio choice
Appendix (Appendices A and C are not required material.)

Introduction

The Lagrangian and Hamiltonian approaches solve dynamic optimisation problems by finding the entire paths of the corresponding control variables. This strategy is often impractical if optimal choices depend on random state variables (uncertainty).
⇒ A complete description of optimal behaviour along the lines above may become complicated, if not impossible.

Dynamic programming sets up dynamic problems in a "recursive" manner:
– This turns the many-period problem into one with just two components, referring to (i) today and (ii) the rest of the time horizon starting tomorrow.
– Under certain conditions, the functional form of the component starting tomorrow is identical to that of the entire problem (starting today).
– This simplifies the task of finding a solution.

Recursive problems: an example without optimisation

An infinitely-lived (non-optimising) mouse is searching for some food and needs to cross a wall in order to reach it. There are three holes in the wall. If the mouse chooses to enter hole:
– 1, it gets back to where it started from after 5 min.
– 2, it reaches its target after 10 min and stops searching.
– 3, it gets back to where it started from after 15 min.
The mouse chooses to enter each hole with probability 1/3, and has no memory: after choosing 1 or 3 and getting back to the starting point, it immediately chooses any of the three holes, again with probability 1/3 each. How long do you expect the mouse to take to reach its goal?

Recursive problems: an example without optimisation

𝔼[T] = (1/3) ∑_{i=1}^{3} 𝔼[T | i]: expected time to goal
𝔼[T | i]: expected time to goal conditional on the mouse entering hole i.

When the mouse comes back to the starting point after entering holes 1 or 3, the situation it faces ahead looks identical to the situation it faced initially. Suppose the mouse chooses to enter hole:
– 1: after 5 min the mouse must start again. ⇒ 𝔼[T | 1] = 5 + 𝔼[T]
– 2: after 10 min the search is over. ⇒ 𝔼[T | 2] = 10
– 3: after 15 min the mouse must start again. ⇒ 𝔼[T | 3] = 15 + 𝔼[T]

𝔼[T] = (1/3)[(5 + 𝔼[T]) + 10 + (15 + 𝔼[T])] = 10 + (2/3)𝔼[T] ⇒ 𝔼[T] = 30

Recursive problems

max_{c_t} U_0 = max_{c_t} ∑_{t=0}^{T} β^t u(c_t), β ∈ (0,1), s.t. k_{t+1} = g(k_t, c_t), k_0 given

c_t: "control" variable; k_t: "state" variable; k_0: "initial condition"
c_t* = ϕ(k_t): solution or "policy function"
V_0(k_0) = ∑_{t=0}^{T} β^t u(c_t*): "(maximum) value function"
V_t(k_t): maximum overall utility that can be obtained at time t when all c_t are optimally chosen, given (i) the initial k_t and (ii) the constraint k_{t+1} = g(k_t, c_t).

U_0 can be rewritten as U_0 = u(c_0) + β ∑_{t=1}^{T} β^(t−1) u(c_t).
∑_{t=1}^{T} β^(t−1) u(c_t): the decision-maker's objective function as of t = 1

Recursive problems

max_{c_t} ∑_{t=1}^{T} β^(t−1) u(c_t) s.t. k_{t+1} = g(k_t, c_t), k_1 = g(k_0, c_0*)

k_0, c_0*: initial condition and optimal choice for the problem starting at t = 0
k_1 = g(k_0, c_0*): resulting state; initial condition of the problem starting at t = 1

V_1(k_1) = ∑_{t=1}^{T} β^(t−1) u(c_t*)
[In general, V_0(·) and V_1(·) have different functional forms.]

Both problems should yield the same c_1*, c_2*, …, given that k_1 = g(k_0, c_0*).
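The rewriting of U_0 above, U_0 = u(c_0) + β ∑_{t=1}^{T} β^(t−1) u(c_t), can be checked numerically: for any consumption path, summing β^t u(c_t) directly must equal evaluating utility backward through the recursion U_t = u(c_t) + βU_{t+1}. A minimal sketch (the value of β and the path are illustrative assumptions):

```python
import math

beta = 0.95                      # discount factor, beta in (0,1)
c = [1.0, 1.2, 0.9, 1.5, 1.1]    # an arbitrary feasible consumption path

def u(ct):
    return math.log(ct)          # log utility, as in the Ramsey example below

# Direct evaluation: U0 = sum_t beta^t u(c_t)
U0_direct = sum(beta**t * u(ct) for t, ct in enumerate(c))

# Recursive evaluation: U_t = u(c_t) + beta * U_{t+1}, with U_{T+1} = 0
U_next = 0.0
for ct in reversed(c):
    U_next = u(ct) + beta * U_next

print(abs(U0_direct - U_next) < 1e-12)  # the two evaluations coincide
```

The backward loop is exactly the two-component decomposition from the slides: today's payoff plus β times the value of the remaining horizon.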
We can think of the decision-maker choosing c_0* at t = 0 taking into account that, given k_0, an increase in c_0 affects k_1, and thereby influences V_1(k_1).

Recursive problems

"Principle of optimality": an optimal policy has the property that, whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy w.r.t. the state resulting from the first decision.

⇒ We can reformulate our original problem as a "Bellman equation" (BE):
– V_0(k_0) = u(c_0*) + βV_1(k_1), k_1 = g(k_0, c_0*)
– V_0(k_0) = max_{c_0} {u(c_0) + βV_1(k_1)} s.t. k_1 = g(k_0, c_0), k_0 given

For T → ∞, the two problems in periods 0 and 1 are identical but for their initial conditions.
– Now the value function depends only on k. ⇒ It has the same functional form (evaluated at different states k) on both sides of the BE.
– V(k_0) = u(c_0*) + βV(k_1), k_1 = g(k_0, c_0*)
– V(k_0) = max_{c_0} {u(c_0) + βV(k_1)} s.t. k_1 = g(k_0, c_0), k_0 given

Recursive problems

Why is the value function independent of time for T → ∞?
– The decision-maker's perception of the future at t = 0 and t = 1 is the same because she lives forever.
– ⇒ She decides about c_t based only on k_t (the problem's unique state).
– Since the value function summarises the optimal choices c_t* from t onward, it depends only on k_t.
– [With T finite, the value function's form changes over time: t is a state.]
– This result also relies on the time separability of both U and g(k, c).

The value function's independence of time (i.e., not having time as an additional state variable) makes computation easier. However, the absence of a termination condition (associated with a finite horizon) makes finding a solution conceptually more difficult. (On the finite-horizon case, see Appendix A.)

Value function (T → ∞)

V(k_0) = max_{c_0} {u(c_0) + βV(k_1)} s.t. k_1 = g(k_0, c_0), k_0 given

The BE is a "functional equation"; solving it implies determining the form of V(k).
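The equivalence between choosing an entire path and the recursive reformulation can be illustrated with a minimal two-period example: maximising u(c_0) + βu(c_1) by brute force over pairs (c_0, c_1) gives the same answer as first solving tomorrow's problem V_1(k_1) and then today's max{u(c_0) + βV_1(k_1)}. A sketch under assumed log utility and a linear savings constraint (β, R, k_0 and the grid are illustrative, not from the notes):

```python
import math

# Two-period consumption-savings problem: choose c0 and c1 to maximise
# ln(c0) + beta*ln(c1), subject to c1 <= R*(k0 - c0).
beta, R, k0 = 0.95, 1.05, 10.0
grid = [0.1 * i for i in range(1, 100)]  # candidate consumption levels

def feasible(c0, c1):
    return c0 < k0 and c1 <= R * (k0 - c0)

# (a) Brute force over entire paths (c0, c1)
best_direct = max(
    math.log(c0) + beta * math.log(c1)
    for c0 in grid for c1 in grid if feasible(c0, c1)
)

# (b) Recursive approach: solve tomorrow's problem V1(k1) first,
#     then today's problem max{ln(c0) + beta*V1(k1)}
def V1(k1):
    return max(math.log(c1) for c1 in grid if c1 <= k1)

best_recursive = max(
    math.log(c0) + beta * V1(R * (k0 - c0))
    for c0 in grid if c0 < k0
)

print(abs(best_direct - best_recursive) < 1e-12)  # same maximised utility
```

With two periods the gain is small, but with many periods (or uncertainty) the recursive form avoids searching over whole paths, which is the point of the Bellman equation.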
The following system of equations determines c_0*, k_1, and the form of V(k):
– Assuming an interior solution, u′(c_0*) + βV′(k_1)g_c(k_0, c_0*) = 0 ⇒ c_0* = ϕ(k_0)
– V(k_0) = u(c_0*) + βV(k_1)
– k_1 = g(k_0, c_0*), k_0 given

Solution methods (applicable to problems under certainty and uncertainty):
– Guessing V(k) or ϕ(k) and verifying that the guess is a solution.
– Envelope Theorem: characterisation of the solution without finding V(k).
– Value function iteration (usually combined with numerical methods; see Appendix C).

Value function & transversality condition

How come we do not seem to need a transversality condition (TVC) here?
– Not just any function is allowed as a candidate solution for the BE.
– Uniqueness, for example, only follows by constraining V(·) to lie in a restricted space of functions.
– This or other, related, restrictions ensure that the TVC is met. (On existence and uniqueness, see Appendix B.)

Guess & verify: discrete-time Ramsey model

max_{c_t} ∑_{t=0}^{∞} β^t ln(c_t) s.t. k_{t+1} = g(k_t, c_t) = k_t^α − c_t, k_0 > 0 given

α, β ∈ (0,1); δ = 1 (full depreciation) is the key simplification that allows for an analytical solution.

V(k_0) = max_{c_0} {ln(c_0) + βV(k_1)} s.t. k_1 = g(k_0, c_0) = k_0^α − c_0, k_0 > 0 given

The following system of equations characterises the problem's solution:
– 1/c_0* − βV′(k_1) = 0 ⇒ c_0* = [βV′(k_1)]^(−1)
– V(k_0) = ln(c_0*) + βV(k_1) = −ln[βV′(k_1)] + βV(k_1)
– k_1 = k_0^α − c_0* = k_0^α − [βV′(k_1)]^(−1), k_0 given

Guess & verify: discrete-time Ramsey model

Guess: V(k_t) = a + b ln(k_t), where a and b are undetermined (unknown) coefficients.
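Before verifying this guess analytically, its plausibility can be checked numerically: iterating a discretised Bellman operator for this model produces values that line up with a straight line in ln(k), with slope close to α/(1 − αβ). A rough sketch (the parameter values, grid bounds, and tolerance are illustrative assumptions):

```python
import math

# Parameters (illustrative; any alpha, beta in (0,1) work)
alpha, beta = 0.3, 0.95
b_theory = alpha / (1 - alpha * beta)   # slope implied by the guess

# Capital grid around the steady state (alpha*beta)^(1/(1-alpha)) ~ 0.17
grid = [0.02 + i * (0.6 - 0.02) / 119 for i in range(120)]
V = [0.0] * len(grid)                   # initial guess V = 0

# Value function iteration: V(k) = max_{k'} ln(k^alpha - k') + beta*V(k')
for _ in range(150):
    V_new = []
    for k in grid:
        y = k ** alpha
        V_new.append(max(
            math.log(y - kp) + beta * V[j]
            for j, kp in enumerate(grid) if kp < y
        ))
    V = V_new

# Least-squares fit of V against ln(k): the slope should be near b_theory
x = [math.log(k) for k in grid]
xbar, vbar = sum(x) / len(x), sum(V) / len(V)
slope = sum((xi - xbar) * (vi - vbar) for xi, vi in zip(x, V)) \
        / sum((xi - xbar) ** 2 for xi in x)

print(abs(slope - b_theory) < 0.02)  # V is (approximately) affine in ln(k)
```

The near-perfect linear fit in ln(k) is what motivates the functional form of the guess; the algebra that follows pins down a and b exactly.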
– V(k_0) = ln(c_0*) + βV(k_1) ⇔ a + b ln(k_0) = −ln(βb/k_1) + β[a + b ln(k_1)]
– k_1 = k_0^α − (βb/k_1)^(−1) ⇒ k_1 = [βb/(1 + βb)] k_0^α   (1)
– k_1 = k_0^α − c_0* ⇒ c_0* = k_1/(βb) = [1/(1 + βb)] k_0^α   (2)

V(k_0) = ln(c_0*) + βV(k_1) can now be rewritten as:
a + b ln(k_0) = ln{[1/(1 + βb)] k_0^α} + β{a + b ln([βb/(1 + βb)] k_0^α)}
⇔ a + b ln(k_0) = −ln(1 + βb) + α ln(k_0) + βa + βb ln(βb) − βb ln(1 + βb) + αβb ln(k_0)

Guess & verify: discrete-time Ramsey model

The previous equation must hold for any k_0 and any admissible values of α, β:
– a = −ln(1 + βb) + βa + βb ln(βb) − βb ln(1 + βb),  b = α + αβb
– Solution: a = [1/(1 − β)]{ln(1 − αβ) + [αβ/(1 − αβ)] ln(αβ)},  b = α/(1 − αβ) > 0

Since a and b are independent of k, the guess is verified. Substituting the solution for b into (1) and (2) yields:
– c_0* = (1 − αβ)k_0^α: with log utility, the saving rate is constant.
– k_1 = αβk_0^α: in steady state, k = (αβ)^(1/(1−α)).
– V′(k_0) = b/k_0 > 0: welfare increases with k_0.

Promising guesses about the form of V(k) are based on the form of u(c).

Envelope Theorem

The guess-&-verify method is useful when a closed-form solution exists. However, only a few functional forms for the payoff function u(c_t) and the dynamic constraint k_{t+1} = g(k_t, c_t) allow for a closed-form solution for V(k).

Sometimes it is useful to identify the solution's characteristics without looking for the (perhaps non-existent) analytical formulation of V(k). A result based on the Envelope Theorem is often helpful in analysing the solution of a Bellman problem, and it applies even when the value function V_t(k) explicitly depends on time t.

Applying the Envelope Theorem takes for granted that the value function is differentiable, which may not always be the case. (For the theorem itself, see Appendix D.)

Envelope Theorem

Totally differentiate V(k_0) = u(c_0*) + βV(k_1) = u(c_0*) + βV[g(k_0, c_0*)].
⇒ V′(k_0)dk_0 = u′(c_0*)dc_0 + βV′(k_1)[g_k(k_0, c_0*)dk_0 + g_c(k_0, c_0*)dc_0]

By the Envelope Theorem, this simplifies to V′(k_0) = βV′(k_1)g_k(k_0, c_0*). (The dc_0 terms cancel out due to the F.O.C. u′(c_0*) + βV′(k_1)g_c(k_0, c_0*) = 0.)

Using the F.O.C. to substitute βV′(k_1) = −u′(c_0*)/g_c(k_0, c_0*):
– V′(k_0) = −[u′(c_0*)/g_c(k_0, c_0*)] g_k(k_0, c_0*)
– Shifting one period ahead: V′(k_1) = −[u′(c_1*)/g_c(k_1, c_1*)] g_k(k_1, c_1*)
– Substituting this into the F.O.C. u′(c_0*) + βV′(k_1)g_c(k_0, c_0*) = 0:
  u′(c_0*) = β [u′(c_1*)/g_c(k_1, c_1*)] g_k(k_1, c_1*) g_c(k_0, c_0*)

Envelope Theorem

E.g., consider k_{t+1} = g(k_t, c_t) = (1 + r)k_t + w − c_t, with r and w constant:
– g_k(k_1, c_1*) = 1 + r
– g_c(k_0, c_0*) = g_c(k_1, c_1*) = −1
⇒ u′(c_0*) = (1 + r)βu′(c_1*): the "Euler equation"

The Euler equation:
– can be interpreted in economic terms without referring to the unknown value function;
– is often easier to solve numerically than the BE.

Uncertainty

max_{c_t} 𝔼_0[U_0] = max_{c_t} 𝔼_0[∑_{t=0}^{T} β^t u(c_t)], β ∈ (0,1), subject to:
– k_{t+1} = g(k_t, A_t, c_t)
– A_0, k_0 given
– A_t follows a stochastic process (to be characterised).

𝔼_0[U_0]: expected utility; 𝔼_t[x_{t+1}]: mathematical expectation of x_{t+1} conditional on information available at time t

Rational expectations:
– Individuals use the mathematical expectation to evaluate their utility.
– They understand the model behind the uncertainty in the economy, and make use of all the available information in making their forecasts.

Uncertainty

When introducing random variables that depend on time, it is convenient to model them with stochastic processes that have the "Markov property". Stochastic process z_t is Markov if, for every z′ and every period t = 0, 1, 2, …,
Pr[z_{t+1} ≤ z′ | z_t, z_{t−1}, …, z_0] = Pr[z_{t+1} ≤ z′ | z_t].

The random variable's probabilities are determined only by its most recent realisation.
⇒ The most recent realisation of the random variable is the only additional state of the system (in comparison with a non-stochastic setting). This simplifies the dynamic optimisation problem substantially.

Uncertainty

Assume that the functional form of V(·) does not depend on t. (An example below discusses conditions under which this is possible.)

V(k_0, A_0) = max_{c_0} {u(c_0) + β𝔼_0[V(k_1, A_1)]} subject to:
– k_1 = g(k_0, A_0, c_0), A_0, k_0 given
– A_t follows a stochastic process.

The following system of equations determines c_0*, k_1, and the form of V(k):
– Assuming an interior solution, u′(c_0*) + β𝔼_0[V_k(k_1, A_1)] g_c(k_0, A_0, c_0*) = 0.
– V(k_0, A_0) = u(c_0*) + β𝔼_0[V(k_1, A_1)]
– k_1 = g(k_0, A_0, c_0*), A_0, k_0 given

Guess & verify: stochastic Ramsey model

V(k_0, A_0) = max_{c_0} {ln(c_0) + β𝔼_0[V(k_1, A_1)]}, k_0, A_0 > 0 given
k_1 = g(k_0, c_0) = A_0 k_0^α − c_0, ln A_1 = ρ ln A_0 + ϵ_1, α, β, ρ ∈ (0,1)
ϵ_t: i.i.d. random variable with 𝔼_{t−1}[ϵ_t] = 0 ⇒ ln A_t is a Markov process

The guess V(k_t, A_t) = a + b_k ln k_t + b_A ln A_t is verified for:
– a = [1/(1 − β)]{ln(1 − αβ) + [αβ/(1 − αβ)] ln(αβ)}
– b_k = α/(1 − αβ), b_A = [(1 − ρβ)(1 − αβ)]^(−1)

c_0* = (1 − αβ)A_0 k_0^α, k_1 = αβA_0 k_0^α

[Femminis (2016) discusses the solution of this problem in detail.]

Appendix A: Finite horizon

max_{c_t} ∑_{t=0}^{T} β^t u(c_t), β ∈ (0,1), s.t. k_{t+1} = f(k_t) − c_t, k_0 > 0 given

The problem can be expressed as max_{k_{t+1} ∈ [0, f(k_t)]} ∑_{t=0}^{T} β^t u[f(k_t) − k_{t+1}], k_0 > 0 given.

With finite T, the functional form of the value function need not be the same over time. We therefore index it by t: V_t(k_t). This problem can be solved by backward induction.

Appendix A: Finite horizon

Consider the problem at time T, with state k_T:
– The optimal choice implies k_{T+1} = 0.
– V_{T+1}(k_{T+1}) = 0
– V_T(k_T) = max_{k_{T+1} ∈ [0, f(k_T)]} {u[f(k_T) − k_{T+1}] + βV_{T+1}(k_{T+1})} = u[f(k_T)]

Let us move back to time T − 1, with state k_{T−1}:
– V_{T−1}(k_{T−1}) = max_{k_T ∈ [0, f(k_{T−1})]} {u[f(k_{T−1}) − k_T] + βV_T(k_T)}
– Here it is easy to solve for V_{T−1}(k_{T−1}), as V_T(k_T) = u[f(k_T)] is known.
– The F.O.C. u′[f(k_{T−1}) − k_T] = βu′[f(k_T)]f′(k_T) yields k_T and V_{T−1}(k_{T−1}) as functions of k_{T−1}.

Appendix A: Finite horizon

We can now move back to time T − 2, with state k_{T−2}:
– V_{T−2}(k_{T−2}) = max_{k_{T−1} ∈ [0, f(k_{T−2})]} {u[f(k_{T−2}) − k_{T−1}] + βV_{T−1}(k_{T−1})}
– And so on… This process can be iterated backward all the way to t = 0.

The behaviour of a decision-maker with a long but finite horizon is similar to the behaviour of the same decision-maker with an infinite horizon. This is due to discounting and the fact that the F.O.C.s of the two problems are quite similar.

Appendix B: Existence & uniqueness of value function (T → ∞)

Theorem I: If (i) β ∈ (0,1), (ii) u(c) is continuous, bounded and strictly concave, and (iii) g(k, c) is concave, then
– V(k) exists, and is unique and strictly concave;
– ϕ(k) is continuous.

The problem with Theorem I is that the boundedness of u(c) may be very restrictive.

Theorem II: Assume (i) β ∈ (0,1), and (ii) ∑_{t=0}^{∞} β^t u(c_t) exists and is finite for any feasible path k_t, t ∈ {0, 1, …, ∞}, given k_0. Then there is a unique solution to the dynamic optimisation problem.

Theorem II essentially restricts the admissible one-period payoffs to sequences that cannot grow too rapidly relative to the discount factor.

Appendix C: Value function iteration

Problems with no analytical solution are often solved with numerical value function iteration methods (beyond the scope of these notes). Here, we discuss an analytical example to illustrate the essence of this method.

max_{{c_t}_0^∞} ∑_{t=0}^{∞} β^t ln(c_t) s.t. k_{t+1} = k_t^α − c_t, k_t ≥ 0, k_0 > 0 given, α, β ∈ (0,1)

V(k_0) = max_{{k_{t+1}}_0^∞} ∑_{t=0}^{∞} β^t ln(k_t^α − k_{t+1}) s.t. k_{t+1} ∈ [0, k_t^α] ≡ Γ(k_t),
k_0 > 0 given

Associated BE: V(k_t) = max_{y ∈ Γ(k_t)} {ln(k_t^α − y) + βV(y)}

Define the "functional operator" T: (Tf)(k_t) = max_{y ∈ Γ(k_t)} {ln(k_t^α − y) + βf(y)}.
T takes a function f as its argument; upon operating on f with T, we obtain another function.

Appendix C: Value function iteration

Consider the initial guess V̂(k_t) = [α/(1 − αβ)] ln(k_t) for f.
Hence, (TV̂)(k_t) = max_{y ∈ Γ(k_t)} {ln(k_t^α − y) + [αβ/(1 − αβ)] ln(y)}.

Maximisation of the RHS yields y = αβk_t^α. Plugging this back,
(TV̂)(k_t) = ln(k_t^α − αβk_t^α) + [αβ/(1 − αβ)] ln(αβk_t^α)
= ln(1 − αβ) + [αβ/(1 − αβ)] ln(αβ) + [α/(1 − αβ)] ln(k_t)

Applying the functional operator T to the resulting expression,
T[(TV̂)](k_t) = max_{y ∈ Γ(k_t)} {ln(k_t^α − y) + β[ln(1 − αβ) + [αβ/(1 − αβ)] ln(αβ) + [α/(1 − αβ)] ln(y)]}

Appendix C: Value function iteration

Maximisation of the RHS in the previous equation yields y = αβk_t^α again.
(T²V̂)(k_t) = T[(TV̂)](k_t) = (1 + β){ln(1 − αβ) + [αβ/(1 − αβ)] ln(αβ)} + [α/(1 − αβ)] ln(k_t)
(T³V̂)(k_t) = (1 + β + β²){ln(1 − αβ) + [αβ/(1 − αβ)] ln(αβ)} + [α/(1 − αβ)] ln(k_t)

Since 1 + β + β² + … + β^(n−1) = (1 − β^n)/(1 − β),
(TⁿV̂)(k_t) = [(1 − β^n)/(1 − β)]{ln(1 − αβ) + [αβ/(1 − αβ)] ln(αβ)} + [α/(1 − αβ)] ln(k_t)
lim_{n→∞} (TⁿV̂)(k_t) = [1/(1 − β)]{ln(1 − αβ) + [αβ/(1 − αβ)] ln(αβ)} + [α/(1 − αβ)] ln(k_t) = V(k_t)

Appendix C: Value function iteration

It is easy to show that (TV)(k_t) = V(k_t): V(k) is a "fixed point" of the functional operator T, which confirms our solution.

T is a "contraction mapping" (with modulus β ∈ [0,1)) if operating with T on any two functions f and g moves them closer together: d(Tf, Tg) ≤ βd(f, g). The "distance" between two functions, d(f, g), could for example be the supremum of the point-wise gap between them.

Contraction Mapping Theorem: if T is a contraction mapping, then
– T has exactly one fixed point V;
– for any V̂, lim_{n→∞} (TⁿV̂) = V.

Blackwell's Theorem establishes sufficient conditions for functional operators to be contraction mappings.
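The contraction property can be observed directly on a grid: applying a discretised Bellman operator T to two arbitrary candidate value functions shrinks their sup-distance by at least the factor β. A sketch for the log-utility model above (parameter values, the grid, and the two starting functions are illustrative assumptions):

```python
import math

alpha, beta = 0.3, 0.95                        # illustrative parameters
grid = [0.05 + 0.005 * i for i in range(100)]  # capital grid

def T(f):
    """Discretised Bellman operator: (Tf)(k) = max_{k' in Γ(k)} ln(k^α - k') + β f(k')."""
    out = []
    for k in grid:
        y = k ** alpha
        out.append(max(
            math.log(y - kp) + beta * f[j]
            for j, kp in enumerate(grid) if kp < y
        ))
    return out

def sup_dist(f, g):
    """Sup-norm distance between two functions on the grid."""
    return max(abs(a - b) for a, b in zip(f, g))

# Two arbitrary initial guesses for the value function
f = [0.0 for k in grid]
g = [5.0 * math.sin(k) for k in grid]

d0 = sup_dist(f, g)
d1 = sup_dist(T(f), T(g))
print(d1 <= beta * d0 + 1e-12)  # distances contract at rate (at most) beta
```

Iterating T therefore drives any two starting guesses toward the same function, which is exactly why value function iteration converges to the unique fixed point V regardless of the initial guess.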
Appendix D: The Envelope Theorem in a nutshell

Consider max_x f(x, a), with ∂²f(x, a)/∂x² < 0 (so the F.O.C. identifies a maximum). Parameter a is given.

F.O.C.: ∂f(x*, a)/∂x = 0 ⇒ x* = ϕ(a), V(a) = f(x*, a) = f[ϕ(a), a]

Envelope Theorem:
V′(a) = [∂f(x*, a)/∂x] ϕ′(a) + ∂f(x*, a)/∂a = ∂f(x*, a)/∂a,
since ∂f(x*, a)/∂x = 0 by the F.O.C.

Changes in x* caused by changes in a do not affect the value function. The logic of this argument also applies to optimisation problems with constraints.

Food for thought (beyond this course)

D. Acemoglu (2009): Introduction to Modern Economic Growth, Princeton University Press, chapter 6
L. Ljungqvist & T.J. Sargent (2018): Recursive Macroeconomic Theory, MIT Press: an up-to-date macro textbook based on dynamic programming
G. Sorger (2015): Dynamic Economic Analysis: Deterministic Models in Discrete Time, Cambridge University Press
N. Stokey & R.E. Lucas Jr. (1989): Recursive Methods in Economic Dynamics, Harvard University Press: a classic in dynamic programming
