The value function of an optimization problem gives the value attained by the objective function at a solution, while only depending on the parameters of the problem.[1][2] In a controlled dynamical system, the value function represents the optimal payoff of the system over the interval $[t, t_{1}]$ when started at the time-$t$ state variable $x(t) = x$.[3] If the objective function represents some cost that is to be minimized, the value function can be interpreted as the cost to finish the optimal program, and is thus referred to as the "cost-to-go function".[4][5] In an economic context, where the objective function usually represents utility, the value function is conceptually equivalent to the indirect utility function.[6][7]
In a problem of optimal control, the value function is defined as the supremum of the objective function taken over the set of admissible controls. Given $(t_{0}, x_{0}) \in [0, t_{1}] \times \mathbb{R}^{d}$, a typical optimal control problem is to

maximize $\quad J(t_{0}, x_{0}; u) = \int_{t_{0}}^{t_{1}} I(t, x(t), u(t)) \, dt + \phi(x(t_{1}))$

subject to

$\dot{x}(t) = f(t, x(t), u(t))$

with initial state variable $x(t_{0}) = x_{0}$.[8] The objective function $J(t_{0}, x_{0}; u)$ is to be maximized over all admissible controls $u \in U[t_{0}, t_{1}]$, where $u$ is a Lebesgue measurable function from $[t_{0}, t_{1}]$ to some prescribed arbitrary set in $\mathbb{R}^{m}$. The value function is then defined as

$V(t_{0}, x_{0}) = \max_{u \in U} \left\{ \int_{t_{0}}^{t_{1}} I(t, x(t), u(t)) \, dt + \phi(x(t_{1})) \right\}$

with $V(t_{1}, x(t_{1})) = \phi(x(t_{1}))$, where $\phi(x(t_{1}))$ is the "scrap value". If the optimal pair of control and state trajectories is $(x^{\ast}, u^{\ast})$, then $V(t_{0}, x_{0}) = J(t_{0}, x_{0}; u^{\ast})$. The function $h$ that gives the optimal control $u^{\ast}$ based on the current state $x$ is called a feedback control policy,[4] or simply a policy function.[9]
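The definition above can be illustrated numerically by backward induction on a discretized problem. The sketch below is not from the text; the running payoff $I(x,u) = -(x^{2} + u^{2})$, the dynamics $f(x,u) = u$, the zero scrap value, and all grid sizes are illustrative choices made here.

```python
import numpy as np

# Illustrative finite-horizon problem (choices of I, f, phi are assumptions):
# maximize sum_t I(x_t, u_t) * dt + phi(x_T), with x_{t+1} = x_t + f(x_t, u_t) * dt,
# where I(x, u) = -(x**2 + u**2), f(x, u) = u, and scrap value phi = 0.

dt = 0.1
T = 20                           # number of time steps
xs = np.linspace(-2.0, 2.0, 81)  # state grid
us = np.linspace(-1.0, 1.0, 21)  # admissible control grid

V = np.zeros((T + 1, xs.size))   # V[T] = phi(x) = 0 (scrap value)
policy = np.zeros((T, xs.size))  # feedback control policy h(t, x)

# Backward induction: V(t, x) = max_u { I(x, u) dt + V(t+1, x + f(x, u) dt) }
for t in range(T - 1, -1, -1):
    for i, x in enumerate(xs):
        x_next = np.clip(x + us * dt, xs[0], xs[-1])      # one candidate per control
        reward = -(x**2 + us**2) * dt                     # running payoff for each u
        candidates = reward + np.interp(x_next, xs, V[t + 1])
        best = np.argmax(candidates)
        V[t, i] = candidates[best]
        policy[t, i] = us[best]
```

The array `policy` is the discrete analogue of the feedback control policy $h$: for each time step and grid state it records the payoff-maximizing admissible control, and `V[0]` approximates the value function at the initial time.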
Bellman's principle of optimality roughly states that any optimal policy at time $t$, $t_{0} \leq t \leq t_{1}$, taking the current state $x(t)$ as "new" initial condition must be optimal for the remaining problem. If the value function happens to be continuously differentiable, this gives rise to an important partial differential equation known as the Hamilton–Jacobi–Bellman equation,

$-\frac{\partial V(t,x)}{\partial t} = \max_{u} \left\{ I(t,x,u) + \frac{\partial V(t,x)}{\partial x} f(t,x,u) \right\}$

where the maximand on the right-hand side can also be re-written as the Hamiltonian, $H(t, x, u, \lambda) = I(t,x,u) + \lambda(t) f(t,x,u)$, as

$-\frac{\partial V(t,x)}{\partial t} = \max_{u} H(t,x,u,\lambda)$

with $\frac{\partial V(t,x)}{\partial x} = \lambda(t)$ playing the role of the costate variables.[11] Given this definition, we further have $\frac{d\lambda(t)}{dt} = \frac{\partial^{2} V(t,x)}{\partial x \partial t} + \frac{\partial^{2} V(t,x)}{\partial x^{2}} f(x)$, and after differentiating both sides of the HJB equation with respect to $x$,

$-\frac{\partial^{2} V(t,x)}{\partial t \partial x} = \frac{\partial I}{\partial x} + \frac{\partial^{2} V(t,x)}{\partial x^{2}} f(x) + \frac{\partial V(t,x)}{\partial x} \frac{\partial f(x)}{\partial x}$

which after replacing the appropriate terms recovers the costate equation

$-\frac{d\lambda(t)}{dt} = \frac{\partial I}{\partial x} + \lambda(t) \frac{\partial f(x)}{\partial x} = \frac{\partial H}{\partial x}$
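The HJB machinery can be made concrete with a standard scalar linear–quadratic example (chosen here for illustration, not taken from the passage above), where the stationary equation can be solved in closed form:

```latex
% Illustrative example: minimize \int_0^\infty (x^2 + u^2)\,dt
% subject to \dot{x} = u. The stationary HJB equation reads
\[
0 = \min_{u} \left\{ x^{2} + u^{2} + V'(x)\,u \right\}.
\]
% The inner minimization gives u = -V'(x)/2; substituting back,
\[
0 = x^{2} - \tfrac{1}{4} V'(x)^{2}
\quad\Longrightarrow\quad
V'(x) = 2x, \qquad V(x) = x^{2},
\]
% taking the root that makes V nonnegative. The optimal feedback
% control policy is therefore u^{*}(x) = -V'(x)/2 = -x.
```

Here the value function $V(x) = x^{2}$ and the policy function $u^{\ast}(x) = -x$ exhibit exactly the feedback structure described above: the optimal control depends only on the current state.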
The value function is the unique viscosity solution to the Hamilton–Jacobi–Bellman equation.[13] In online closed-loop approximate optimal control, the value function also serves as a Lyapunov function that establishes global asymptotic stability of the closed-loop system.[14]
^ Kamien, Morton I.; Schwartz, Nancy L. (1991). Dynamic Optimization: The Calculus of Variations and Optimal Control in Economics and Management (2nd ed.). Amsterdam: North-Holland. p. 259. ISBN 0-444-01609-0.
^ Benveniste and Scheinkman established sufficient conditions for the differentiability of the value function, which in turn allows an application of the envelope theorem; see Benveniste, L. M.; Scheinkman, J. A. (1979). "On the Differentiability of the Value Function in Dynamic Models of Economics". Econometrica. 47 (3): 727–732. doi:10.2307/1910417. JSTOR 1910417. Also see Seierstad, Atle (1982). "Differentiability Properties of the Optimal Value Function in Control Theory". Journal of Economic Dynamics and Control. 4: 303–310. doi:10.1016/0165-1889(82)90019-7.
^ Kirk, Donald E. (1970). Optimal Control Theory. Englewood Cliffs, NJ: Prentice-Hall. p. 88. ISBN 0-13-638098-0.
^ Zhou, X. Y. (1990). "Maximum Principle, Dynamic Programming, and their Connection in Deterministic Control". Journal of Optimization Theory and Applications. 65 (2): 363–373. doi:10.1007/BF01102352. S2CID 122333807.
^ Clarke, Frank H.; Loewen, Philip D. (1986). "The Value Function in Optimal Control: Sensitivity, Controllability, and Time-Optimality". SIAM Journal on Control and Optimization. 24 (2): 243–263. doi:10.1137/0324014.