Error Estimates for a Class of Elliptic Optimal Control Problems

ABSTRACT In this article, functional type a posteriori error estimates are presented for a certain class of optimal control problems with elliptic partial differential equation constraints. It is assumed that in the cost functional the state is measured in terms of the energy norm generated by the state equation. The functional a posteriori error estimates developed by Repin in the late 1990s are applied to estimate the cost functional value from both sides without requiring the exact solution of the state equation. Moreover, a lower bound for the minimal cost functional value is derived. A meaningful error quantity coinciding with the gap between the cost functional values of an arbitrary admissible control and the optimal control is introduced. This error quantity can be estimated from both sides using the estimates for the cost functional value. The theoretical results are confirmed by numerical tests.


Introduction
The optimal control problem with an elliptic partial differential equation (PDE) constraint has been fully discussed in [11, Chap. 2] from the viewpoint of the existence and uniqueness of the solution, necessary conditions, different types of controls and observations, etc. Since the 1970s, the main development has concerned numerical methods that generate approximate solutions to the respective problem (see, e.g., [6,7,20] and the references therein).
This article presents guaranteed two-sided estimates for the value of the cost functional (assuming that the state equation cannot be solved exactly) and shows how they can be used to generate estimates for a certain error quantity (cf. (3.13) and Theorem 3.4). In the case of unconstrained control, some estimates and numerical tests have been presented in [4], and the case of "box constraints" is treated in [19]. Here, these results are extended considerably to constraints of a more general type, a new error quantity is introduced, and the results are confirmed by numerical tests. The functional type approach has also been studied in [21], where a different type of cost functional is considered, and in [10], where similar techniques have been applied to a time-periodic parabolic optimal control problem.
In contrast to the typical residual-based error indicators associated with adaptive methods, these estimates are always valid, regardless of the accuracy of the approximation itself. In particular, the estimates are valid also outside the asymptotic regime and do not contain any unknown constants.
In Section 2, definitions and standard results related to optimal control problems with an elliptic state equation are recalled. The main assumption is that the cost functionals are of a certain type, where the state is measured in terms of the energy norm generated by the state equation. This is a special case of the general theory, which can be found, e.g., in the monographs [6,11,20]. This assumption is not very restrictive, since in the case of a less regular objective state, the cost functional can be shifted by a projection (cf. Remark 2.1). The goal of this article is to show how the principal relations recalled in Sect. 2 generate the natural error measure and the respective functional type a posteriori error estimates.
In Section 3, the functional a posteriori error estimates (see the monographs [13,16,19] and the references therein) for the state equation are applied to generate two-sided bounds for the value of the cost functional. The strong connections between the functional type a posteriori error estimates and the principal relations generating the optimal control problem are underlined. This is a joint property of all functional type estimates: since the estimates are derived on a functional level (not associated with any particular numerical method), their building blocks are the fundamental relations generating the respective problem. Theorem 3.4 (a generalization of [19, Ch. 9, Th. 9.14] to the case of constrained control) is the analog of the Mikhlin identity for the optimal control problem. The proposed error quantity differs from the one presented in [21, Eq. (9)], where some terms are omitted and one obtains an inequality instead of an identity.
Some examples of optimal control problems of the type described in Sect. 2 are discussed in Sect. 4.1. Numerical tests in Sect. 4.3 depict how the estimates can be combined with an arbitrary (conforming) numerical method. The objective of these tests is not to provide an efficient solver (Algorithm 2 possesses only poor linear convergence), but to demonstrate that the derived error estimates satisfy the presented theoretical properties, i.e., they are indeed computable (see Algorithm 1), guaranteed, two-sided, and always valid (in contrast to the merely asymptotic properties of error indicators).

Definitions
Let W, H, and U be Hilbert spaces. Their inner products and norms are denoted by subscripts, e.g., (·, ·)_W and ‖·‖_W. Moreover, V ⊂ W is a Hilbert space generated by the inner product (q, z)_V := (q, z)_W + (Λq, Λz)_H, where Λ : V → H is a linear, bounded operator. The injection from V to W is continuous and V is dense in W. The operator Λ is assumed to satisfy a Friedrichs type inequality
‖w‖_W ≤ c_F ‖Λw‖_H   for all w ∈ V_0.
Define linear bounded operators B : U → V_0*, A : H → H, and N : U → U, where A and N are symmetric and positive definite, i.e.,
c ‖q‖_H² ≤ (Aq, q)_H ≤ c̄ ‖q‖_H²   and   κ ‖v‖_U² ≤ (Nv, v)_U ≤ κ̄ ‖v‖_U²,
where c and c̄ (κ and κ̄) are positive constants. Thus, A and N generate inner products (q, z)_A := (Aq, z)_H and (v, w)_N := (Nv, w)_U and the respective norms ‖·‖_A and ‖·‖_N. The adjoint operators Λ* : H → V* and B* : V_0 → U* are defined by the identities ⟨Λ*τ, q⟩_V := (τ, Λq)_H and ⟨B*z, v⟩_U := ⟨Bv, z⟩_V, where ⟨·, ·⟩_V denotes the pairing of V and its dual space V*. By the Riesz representation theorem, there exists an isomorphism (denoted, e.g., by I_U : U → U*) from any Hilbert space onto the corresponding dual space. It is relevant to define a subspace of H in which Λ*τ has additional regularity,
Q := {τ ∈ H | Λ*τ ∈ W}.
The bilinear form a(q, z) := (AΛq, Λz)_H is V_0-elliptic and continuous. Moreover, in V_0, it generates an energy norm |||q||| := (a(q, q))^{1/2}.
In the following examples, we demonstrate the general framework on the paradigm of the Dirichlet optimal control problem with distributed control. Here, we present the spaces and operators which generate the Poisson problem, which will serve as our state equation.
Example 2.1. The selection of spaces and operators is W := L²(Ω), V := H¹(Ω), V_0 := H¹_0(Ω), H := L²(Ω)^d, Λ := ∇, and A := Id. Together with the linear right-hand side ℓ, ⟨ℓ, q⟩ := ∫_Ω f q dx (in general ℓ ∈ H⁻¹(Ω), but we assume that f ∈ L²(Ω)), it defines the weak form of the Poisson problem with homogeneous Dirichlet boundary conditions: Find y ∈ H¹_0(Ω) such that
∫_Ω ∇y · ∇q dx = ∫_Ω f q dx   for all q ∈ H¹_0(Ω).
The corresponding classical (strong) form of the problem is −Δy = f in Ω, y = 0 on ∂Ω. The flux of the solution, i.e., ∇y, belongs to the space H(div, Ω) := {τ ∈ L²(Ω)^d | div τ ∈ L²(Ω)}.

Optimal control problem
The state equation is
a(y(v), q) = ⟨ℓ + Bv, q⟩_V   for all q ∈ V_0,   (2.7)
where ℓ ∈ V_0*, v ∈ U_ad ⊂ U is the control, and y(v) ∈ V_0 is the corresponding state. Let U_ad ⊂ U be a non-empty, convex, and closed set. The cost functional is
J(v) := |||y(v) − y_d|||² + ‖v − u_d‖²_N,   (2.8)
where u_d ∈ U and y_d ∈ V_0. The optimal control problem is to find u ∈ U_ad such that
J(u) = min_{v ∈ U_ad} J(v).   (2.9)
Under the earlier assumptions, J is U-elliptic, coercive, and lower semi-continuous. Thus, the solution of the optimal control problem exists and is unique (see, e.g., [11]).
Example 2.2. The Dirichlet problem is obtained if, in addition to the selection of spaces and operators defined in Example 2.1, we select B := Id and N := α Id with α > 0.
The selection of a positive scalar multiple of the identity for N is typical. It leads to a Tikhonov type regularization.
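In the Dirichlet example, with B := Id and N := α Id, the cost functional of the assumed type specializes to the following form (a sketch consistent with the conventions of this section, where the state is measured in the energy norm):

```latex
J(v) \;=\; \int_\Omega \bigl|\nabla\bigl(y(v) - y_d\bigr)\bigr|^2 \, dx
      \;+\; \alpha \int_\Omega \bigl| v - u_d \bigr|^2 \, dx ,
      \qquad \alpha > 0 ,
```

so the second term is precisely a Tikhonov penalty keeping the control close to u_d, with α steering the trade-off between state matching and control cost.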
Remark 2.1. In elliptic optimal control problems, the cost functional is typically of the form
J(v) = ‖y(v) − y_d‖²_W + ‖v − u_d‖²_N.
However, in (2.8), the objective is not to match the primal, but the dual variable of the state equation. For example, in the stationary heat (or any diffusion type) problem, the objective would not be to generate a desired temperature field, but the respective flux. Similarly, in linear elasticity the objective would be to generate a certain stress field, not displacements. Such objective fields σ_d may be, e.g., measurement data. The likely lack of regularity can easily be handled as follows.
A cost functional of the type
J̃(v) := ‖Λy(v) − σ_d‖²_A + ‖v − u_d‖²_N,   σ_d ∈ H,
can be shifted using a projection: Find y_d ∈ V_0 such that
a(y_d, q) = (Aσ_d, Λq)_H   for all q ∈ V_0.
Example 2.3. The Dirichlet problem presented in Example 2.2 is not of the standard form. Typically, the cost functional is of the form
J(v) = ∫_Ω |∇(y(v) − y_d)|² dx + α ∫_Ω |v − u_d|² dx,
i.e., our aim is to match the flux of the state variable y(v) with the objective field ∇y_d. If the objective field σ_d is not of the form ∇y_d (it is generated by measurements or otherwise lacks regularity), then the respective cost functional would be
J̃(v) = ∫_Ω |∇y(v) − σ_d|² dx + α ∫_Ω |v − u_d|² dx.
It can easily be shifted to the desired form by solving (in practice, numerically) the projection problem: Find y_d ∈ H¹_0(Ω) such that
∫_Ω ∇y_d · ∇q dx = ∫_Ω σ_d · ∇q dx   for all q ∈ H¹_0(Ω).
Then the objective function values are simply shifted by a constant:
J̃(v) = J(v) + ‖σ_d − ∇y_d‖²_{L²(Ω)}.
The derivative of J at v is given by
⟨J′(v), w⟩_U = 2(N(v − u_d), w)_U + 2⟨Bw, y(v) − y_d⟩_V   for all w ∈ U.
The necessary conditions for the optimal control problem (2.9) are (2.7) and the variational inequality
⟨J′(u), v − u⟩_U ≥ 0   for all v ∈ U_ad.   (2.12)
Proposition 2.1 and (2.12) yield the well known projection condition
u = P_{U_ad}( u_d − N⁻¹ I_U⁻¹ B*(y(u) − y_d) ),   (2.14)
where P_{U_ad} denotes the projection onto U_ad (with respect to the metric generated by N). Requiring that u also satisfies the state equation, i.e., substituting u from (2.14) into (2.7) instead of v, yields the following problem: Find y(u) ∈ V_0 satisfying
a(y(u), q) = ⟨ℓ + B P_{U_ad}(u_d − N⁻¹ I_U⁻¹ B*(y(u) − y_d)), q⟩_V   for all q ∈ V_0.
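Once the state associated with a candidate control is known, the projection condition can be evaluated pointwise. A minimal sketch for the case N = α Id with box constraints, where the projection reduces to clipping (all discrete data below are hypothetical placeholders, not values from the paper):

```python
import numpy as np

def project_box(v, lo, hi):
    """Pointwise projection onto the box U_ad = {v : lo <= v <= hi}."""
    return np.clip(v, lo, hi)

# Hypothetical nodal data (names u_d, y_d, y_u are our own placeholders).
alpha, lo, hi = 0.1, -1.0, 1.0
u_d = np.array([0.0, 0.5, -0.3])
y_d = np.array([0.2, -0.1, 0.4])
y_u = np.array([0.3, 0.6, -0.2])          # state at the candidate control

# Projection condition: the optimal control is the projection of the
# unconstrained optimality relation onto the admissible set.
u = project_box(u_d - (y_u - y_d) / alpha, lo, hi)
print(u)
```

In this snippet the unconstrained update leaves the box in all three components but the middle one's sign, so the projection truncates the result to the interval bounds where needed.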
Theorem 3.1. Let y(v) be the solution of (3.1) and z ∈ V_0 be arbitrary. Then
|||y(v) − z|||² = (|||z|||² − 2⟨ℓ + Bv, z⟩_V) − (|||y(v)|||² − 2⟨ℓ + Bv, y(v)⟩_V).
Proof. By (2.7), a(y(v), q) = ⟨ℓ + Bv, q⟩_V for all q ∈ V_0. Hence
|||y(v) − z|||² = |||y(v)|||² − 2a(y(v), z) + |||z|||² = |||y(v)|||² − 2⟨ℓ + Bv, z⟩_V + |||z|||²,
and the assertion follows from ⟨ℓ + Bv, y(v)⟩_V = |||y(v)|||².
Theorem 3.2. Let y(v) be the solution of (3.1) and z ∈ V_0 be arbitrary. Then
|||y(v) − z|||² ≥ M²(z, q) := 2⟨ℓ + Bv, q − z⟩_V + |||z|||² − |||q|||²   for all q ∈ V_0,   (3.3)
and
|||y(v) − z|||² ≤ M²(z, τ, β) := (1 + β) ‖τ − AΛz‖²_{A⁻¹} + (1 + 1/β)(c_F²/c) ‖ℓ + Bv − Λ*τ‖²_W   for all τ ∈ Q, β > 0.   (3.4)
Proof. Since a(y(v), q − z) = ⟨ℓ + Bv, q − z⟩_V by (2.7), a direct computation shows
0 ≤ |||y(v) − q|||² = |||y(v) − z|||² − M²(z, q),
which proves the lower bound. In order to derive the upper bound, we subtract a(z, q) from both sides of (2.7) and add zero in the form (τ, Λq)_H − ⟨Λ*τ, q⟩_V = 0 for τ ∈ Q. Then, we recall (2.5) and reorganize:
a(y(v) − z, q) = (τ − AΛz, Λq)_H + ⟨ℓ + Bv − Λ*τ, q⟩_V.
To the first term, we apply a Cauchy–Schwarz type inequality, and to the second term (2.4), to obtain
a(y(v) − z, q) ≤ ‖τ − AΛz‖_{A⁻¹} |||q||| + (c_F/√c) ‖ℓ + Bv − Λ*τ‖_W |||q|||.
Next, we substitute q = y(v) − z, which yields
|||y(v) − z||| ≤ ‖τ − AΛz‖_{A⁻¹} + (c_F/√c) ‖ℓ + Bv − Λ*τ‖_W.
The form (3.4) follows from squaring both sides and using Young's inequality and (2.6).
Thus, the supremum of M²(z, q) over q ∈ V_0 is attained at q = y(v). Substituting τ = AΛy(v) into (3.4) causes the second term to vanish, since y(v) is the exact solution of the state equation, i.e., −Λ*AΛy(v) + ℓ + Bv = 0.
The functionals (3.3) and (3.4) are typically used to estimate the error between the exact solution (in our special case, related to the state equation) y(v) and an arbitrary conforming approximation z ∈ V_0. Note that they do not depend on the exact solution, and they do not assume any additional information about the particular numerical method generating z. Instead, for example, the upper bound is valid for any τ ∈ Q, but in order for it to be accurate, τ must be a good approximation of the flux AΛy(v). Thus, all algorithms generating approximations of the flux can be used to compute a posteriori error estimates. For a more detailed discussion of the performance and implementation of different algorithms, see, e.g., [13].
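To make the roles of the two functionals concrete, the following sketch evaluates an equivalent form of the minorant and the standard majorant for the one-dimensional model problem −y'' = f, y(0) = y(1) = 0, where the energy norm is ‖w'‖ and the Friedrichs constant is 1/π (the approximation z and the free choices q, τ, β below are ours, not from the paper):

```python
import numpy as np

# Model problem: -y'' = f on (0,1), y(0) = y(1) = 0, with energy norm
# |||w||| = ||w'|| and Friedrichs constant C_F = 1/pi.
x = np.linspace(0.0, 1.0, 2001)

def integ(g):
    # composite trapezoidal rule on the grid x
    return float(np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(x)))

f  = np.pi**2 * np.sin(np.pi * x)       # exact solution y = sin(pi x)
dy = np.pi * np.cos(np.pi * x)          # exact flux y'

# Flux of a conforming approximation z = sin(pi x) + 0.1 x (1 - x).
dz = dy + 0.1 * (1.0 - 2.0 * x)
err2 = integ((dy - dz)**2)              # exact (squared) energy error

# Minorant (an equivalent form of the lower bound): valid for any q in V_0.
q, dq = -0.08 * x * (1.0 - x), -0.08 * (1.0 - 2.0 * x)
minorant = 2*integ(f*q) - 2*integ(dz*dq) - integ(dq**2)

# Majorant: valid for any flux reconstruction tau in Q and any beta > 0.
tau, dtau, beta = dy, -np.pi**2 * np.sin(np.pi * x), 0.5
majorant = ((1 + beta) * integ((dz - tau)**2)
            + (1 + 1/beta) * (1/np.pi)**2 * integ((dtau + f)**2))

print(minorant, err2, majorant)         # minorant <= err2 <= majorant
```

Here τ happens to equilibrate the equation exactly, so the second term of the majorant vanishes and only the weight (1 + β) keeps the bound away from the exact error; with a worse flux reconstruction, both terms contribute.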

Estimates for the cost functional
Applying Theorem 3.2 to the first term of (2.8) leads to two-sided bounds for J(v). Indeed, let y_d play the role of the approximate solution of the state equation. Then, the first term of (2.8) represents the "approximation error," which can be estimated from both sides using (3.3) and (3.4). These bounds are guaranteed, have no gap, and do not depend on y(v), i.e., they do not require the solution of the state equation.
Theorem 3.3. Let v ∈ U_ad. Then, for any q ∈ V_0, τ ∈ Q, and β > 0,
J(v, q) ≤ J(v) ≤ J(v, τ, β),   (3.5)
where
J(v, q) := M²(y_d, q) + ‖v − u_d‖²_N   (3.6)
and
J(v, τ, β) := M²(y_d, τ, β) + ‖v − u_d‖²_N.   (3.7)
Theorem 3.3 can be used to estimate J(u). By (2.9) and (3.5),
min_{w ∈ U_ad} J(w, q) ≤ J(u) ≤ J(v, τ, β)   for all v ∈ U_ad,   (3.8)
where all inequalities hold as equalities if and only if v = u, q = y(u), τ = AΛy(u), and β → 0. The minimizer of J(·, q) over U_ad is denoted by v̂(q). In view of (3.8), it is very important that this minimizer can be explicitly computed. The computation of the minimizers of J requires further assumptions on the structure of the problem (cf. Propositions 4.1 and 4.2).
Remark 3.2. By (3.5) and (3.8), J(v, τ, β) is an upper bound of J(u) for all v ∈ U_ad, τ ∈ Q, and β > 0, and J(v, q) is a lower bound for J(v) for all q ∈ V_0, but it is a lower bound of J(u) only if v = v̂(q) (see (3.10)).

Estimates for an error quantity
The following identity can be viewed as an analog of (3.1) for the optimal control problem. It defines the proper error measure generated by the respective objective functional. We use it to measure the difference between v ∈ U_ad (an arbitrary admissible control) and u (the optimal control). Moreover, using the earlier results, we estimate this difference from below and above without knowing the optimal control.
Theorem 3.4. Let u be the optimal control and v ∈ U_ad. Then
J(v) − J(u) = |||y(v) − y(u)|||² + ‖v − u‖²_N + ⟨J′(u), v − u⟩_U.   (3.12)
Proof. We have
J(v) − J(u) = |||y(v) − y_d|||² − |||y(u) − y_d|||² + ‖v − u_d‖²_N − ‖u − u_d‖²_N.
Then, we add and subtract 2|||y(u)|||² − 2a(y(u), y(v)) = 2a(y(u) − y(v), y(u)) and the analogous control terms, and use the linearity of the state equation, a(y(v) − y(u), q) = ⟨B(v − u), q⟩_V for all q ∈ V_0, to arrive at (3.12).
Equality (3.12) shows that it is reasonable to include ⟨J′(u), v − u⟩_U in the applied error measure. Obviously, ⟨J′(u), v − u⟩_U is non-negative for any v ∈ U_ad by (2.11), it is convex, and it vanishes if v = u. Thus, the error measure is
err²(v) := |||y(v) − y(u)|||² + ‖v − u‖²_N + ⟨J′(u), v − u⟩_U = J(v) − J(u).   (3.13)
The "derivative weight" ⟨J′(u), v − u⟩_U guarantees that the sensitivity of the cost functional at the optimal control is taken into account. Most importantly, err(v) can be estimated from both sides by computable functionals which do not require the knowledge of the optimal control u, the respective state y(u), or the exact state y(v). Indeed, applying (3.5), (3.8), and (3.9) to the right-hand side of (3.12) yields the following theorem.
Theorem 3.5. Let u be the optimal control and v ∈ U_ad. Then, for any q₁, q₂ ∈ V_0, v₂ ∈ U_ad, τ₁, τ₂ ∈ Q, and β₁, β₂ > 0,
err²(v, q₁, v₂, τ₁, β₁) ≤ err²(v) ≤ err²(v, τ₂, β₂, q₂),   (3.14)
where
err²(v, q₁, v₂, τ₁, β₁) := J(v, q₁) − J(v₂, τ₁, β₁)
and
err²(v, τ₂, β₂, q₂) := J(v, τ₂, β₂) − J(v̂(q₂), q₂).
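For the reader's convenience, the identity (3.12) can also be verified directly; the following expansion is a sketch under the convention J(v) = |||y(v) − y_d|||² + ‖v − u_d‖²_N and uses only the definitions above:

```latex
\begin{aligned}
J(v) - J(u)
&= |||y(v)|||^2 - |||y(u)|||^2 - 2a\bigl(y(v) - y(u),\, y_d\bigr)
   + \|v\|_N^2 - \|u\|_N^2 - 2(v - u,\, u_d)_N \\
&= |||y(v) - y(u)|||^2 + 2a\bigl(y(v) - y(u),\, y(u) - y_d\bigr)
   + \|v - u\|_N^2 + 2(v - u,\, u - u_d)_N \\
&= |||y(v) - y(u)|||^2 + \|v - u\|_N^2 + \langle J'(u),\, v - u\rangle_U ,
\end{aligned}
```

where the last step uses the linearity of the state equation, a(y(v) − y(u), q) = ⟨B(v − u), q⟩_V for all q ∈ V_0, together with the representation of the derivative J′(u).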
Theorem 3.5 is of practical value. Assume that after some computations (or even guessing, as the estimates do not depend on the applied method) we have obtained v ∈ U_ad. Naturally, we want to find out how close it is to the optimal control u and select (3.13) as the error measure. The right-hand side of (3.14) allows us to estimate err²(v) from above. It is important that the estimate is guaranteed, i.e., it does not matter which τ₂, β₂, or q₂ we select. The bound is always an upper bound, but particular choices result in sharper bounds. One alternative is to minimize err²(v, τ₂, β₂, q₂) numerically w.r.t. τ₂, β₂, and q₂. Other options include, e.g., post-processing of v. The lower bound for the error can be treated using similar ideas. The resulting algorithms are discussed in more detail in Section 4.
Remark 3.5. By Remark 3.2, (3.6), (3.7), and (3.12), the equality in (3.14) is attained at err²(v, y(v), u, AΛy(u), 0) = err²(v) = err²(v, AΛy(v), 0, y(u)).
Remark 3.6. Obviously, J(v) and err²(v) are non-negative. However, e.g., the lower bound J(v̂(q₂), q₂) for J(u) may be negative if q₂ is not close enough to y(u), and err²(v, q, v₂, τ, β) may be negative if v₂ is not "good enough" in comparison with v or the upper bound J(v₂, τ, β) is not "sharp enough".

Examples
In this section, we present some optimal control problems to which the presented error estimates can be applied, i.e., problems belonging to the general class defined in Section 2.
The state equation (2.7) is
∫_Ω ∇y · ∇q dx = ∫_Ω (f + v) q dx   for all q ∈ H¹_0(Ω),
and has the classical form −Δy = f + v in Ω, y = 0 on ∂Ω. Proposition 4.1 shows how minimizers and maximizers for J(v, q) can be computed. Analogously, the following proposition characterizes the minimizers of J(v, τ, β). These identities are of practical importance. Assume that v ∈ U_ad is given. By Theorem 3.3, we can estimate the exact value of J(v) from above by J(v, τ, β), where τ ∈ H(div, Ω) and β > 0 are arbitrary. However, in order to obtain as accurate an upper bound as possible, we want to identify the particular τ and β for which the upper bound is minimized. Then, we can apply numerical methods to compute approximations of the minimizing τ and β (the result is still a guaranteed upper bound; the approximation error merely reduces its accuracy).
The relation (3.10) becomes
v̂(q) = P_ad( u_d − α⁻¹ (q − y_d) ),
where P_ad : L²(Ω) → U_ad is a projection. Moreover, in this (unlikely) special case, the difference between a given state z ∈ H¹_0(Ω) and the state y(u) related to the optimal control can be estimated as well. This result is mainly a curiosity, and the proof (which closely follows the lines of Theorem 3.2; see, e.g., [19, Ch. 4.2] and [13, Ch. 3.2]) is omitted.

Example 4.3. Let
U_ad := {v ∈ L²(Ω) : |v(x)| ≤ M for a.e. x ∈ Ω},
where M is a positive constant. Then the projection operator P_ad acts pointwise:
(P_ad v)(x) = v(x), if |v(x)| ≤ M, and (P_ad v)(x) = M sign(v(x)), else.
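For this admissible set, the projection is a pointwise truncation. A minimal sketch (the helper name project_ad is our own):

```python
import numpy as np

def project_ad(v, M):
    # Pointwise truncation onto U_ad = {v : |v(x)| <= M a.e.}.
    out = v.copy()
    out[v > M] = M
    out[v < -M] = -M
    return out

v = np.array([-2.0, -0.5, 0.0, 0.7, 3.1])
print(project_ad(v, M=1.0))             # values outside [-1, 1] are truncated
```

On nodal data this coincides with a componentwise clamp, so in practice a vectorized clip does the same job.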

Neumann problem and boundary control
Here, we briefly present the analogous definitions and results for the Neumann problem. The boundary Γ consists of two parts, Γ = Γ_N ∪ Γ_D, where Γ_D has a positive measure. By the trace theorem, there exists a bounded linear mapping γ : H¹(Ω) → L²(Γ) defining the traces of functions on the boundary. The state equation has the classical form of a Poisson problem in which the control acts through the Neumann boundary condition on Γ_N. The majorant (3.4) then takes an analogous form in which, in addition to the Friedrichs constant, a trace constant appears (see, e.g., [19, Sect. 4.1] for details).

Algorithms
The results of Section 3 give grounds for several error estimation algorithms. Note that the estimates in Theorems 3.3 and 3.5 are valid for any approximations from U_ad. There is no need for Galerkin orthogonality, extra regularity, or mesh-dependent data. Thus, they can be combined with any existing numerical scheme which generates approximations of the optimal control (and/or state). Computation of the derived estimates requires some finite dimensional subspaces. Hereafter, assume that U^h_ad ⊂ U_ad, V^h_0 ⊂ V_0, and Q^h ⊂ Q are given. They can be generated, e.g., by finite elements or Fourier series. The approximate solution of (2.7) is denoted by y^h. The generation of the estimates for the cost functional value J(v) for a given approximation v ∈ U_ad is depicted as Algorithm 1. It takes into account the fact that the approximation error from solving the state equation is encompassed in the computed bounds for J(v). As a practical "side product," Algorithm 1 generates a lower bound for the optimal cost functional value J(u).
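Within such algorithms, the scalar weight β in the majorant can be optimized in closed form for a fixed flux reconstruction: a bound of the form m(β) = (1 + β)D² + (1 + 1/β)R² is minimized at β = R/D, with minimal value (D + R)². A sketch with hypothetical values for the two residual terms:

```python
import numpy as np

# Majorant as a function of beta for a fixed flux reconstruction:
#   m(beta) = (1 + beta) * D2 + (1 + 1/beta) * R2,
# where D2 is the (squared) flux mismatch and R2 the weighted (squared)
# equilibrium residual (the values below are hypothetical placeholders).
D2, R2 = 0.004, 0.0009

beta_opt = np.sqrt(R2 / D2)               # stationary point of m(beta)
m_opt = (np.sqrt(D2) + np.sqrt(R2))**2    # closed-form minimum (D + R)^2

# Verify against a brute-force scan over beta.
betas = np.linspace(1e-3, 10.0, 100000)
m = (1 + betas) * D2 + (1 + 1 / betas) * R2
print(beta_opt, m_opt, m.min())
```

This is why only the minimization over the flux variable requires solving a linear problem; the β-update is a one-line formula.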
The first step in Algorithm 1 is related to the computation of lower bounds for J(v), which requires a numerical solution of the state equation (4.9). This means solving a single boundary value problem numerically, e.g., by the finite element method. The second step produces a lower bound for J(u) and requires the computation of the projection (3.10). The upper bound for J(v) is computed iteratively by minimizing J(v, τ, β) over τ while keeping β fixed, then switching the roles, and repeating until the maximum number of iteration steps is reached or J converges. The minimization over β can be done algebraically, while the minimization over τ leads to a linear problem. In order to test the presented error estimates, a projected gradient method (see, e.g., [5,8]) is applied to generate a sequence of approximations. The method consists of line searches along (anti)gradient directions, where all evaluated points are first projected onto the admissible set. This method was chosen for its simplicity, in order to highlight the use of the derived error estimates. A projected gradient method with error estimates is depicted as Algorithm 2. At the beginning of every projected gradient step, Algorithm 1 is used to generate estimates of the cost functional values of the respective approximation. After the execution of Algorithm 2 (N iteration steps taken), the cost estimates are recalled to generate two-sided estimates for err(v) (by (3.12) and (3.13), we effectively estimate the gap J(v_k) − J(u)) at each iteration step (k = 1, . . . , N). Note that the iterate of the last step (the N-th step) is used to generate the most accurate bounds possible for J(u).
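The overall structure of such a projected gradient iteration can be sketched on a small quadratic model with box constraints (the model, data, and step size rule below are our own illustration, not the paper's Algorithm 2):

```python
import numpy as np

# Projected gradient for min J(v) = 0.5 v^T A v - b^T v over the box
# [lo, hi]^n; every gradient step is followed by a projection (clip).
rng = np.random.default_rng(0)
n = 5
Q = rng.standard_normal((n, n))
A = Q.T @ Q + n * np.eye(n)               # symmetric positive definite
b = rng.standard_normal(n)
lo, hi = -0.1, 0.1

v = np.zeros(n)
step = 1.0 / np.linalg.norm(A, 2)         # safe step length (1 / largest eigenvalue)
for k in range(500):
    v = np.clip(v - step * (A @ v - b), lo, hi)

# Optimality check: the projected gradient residual vanishes at the solution.
res = np.linalg.norm(v - np.clip(v - step * (A @ v - b), lo, hi))
print(res)
```

The iterate v is always admissible by construction, so guaranteed two-sided bounds of the type above can be evaluated at every step of such an iteration, regardless of how far it is from convergence.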

Numerical tests
Finite dimensional subspaces are generated by the finite element method (see, e.g., [1]). In these tests, U = L²(Ω), V_0 = H¹_0(Ω), and Q = H(div, Ω). Subspaces DG^p_h ⊂ L²(Ω), V^p_h ⊂ H¹_0(Ω), and RT^p_h ⊂ H(div, Ω) are generated by discontinuous Galerkin elements, Lagrange elements, and Raviart–Thomas elements, respectively. The superscript p denotes the order of the basis functions. All the numerical tests were performed using FEniCS (see [12, Ch. 3] for detailed descriptions of the applied elements and for additional references).
In Example 4.4, select k₁ = 1, k₂ = 1, m₁ = 2, m₂ = 1, β = 0.5, and α = 0.05. A mesh of 50×50 cells divided into triangular elements is used. Considering first linear elements, i.e., p₁ = p₂ = p₃ = 1, the numbers of corresponding global degrees of freedom are dim(DG¹_h) = 15000, dim(V¹_h) = 2601, and dim(RT¹_h) = 7600. The bounds generated by Algorithm 2 (I^PG_max = 10) are depicted in Fig. 1. In Fig. 1(a), it is shown how the estimates for the cost functional values behave during the projected gradient iteration, and in Fig. 1(b), we can observe how the error quantity J(v_k) − J(u) decreases during the iteration and how the error estimates bound it from both sides.
If the order of approximation for the state and the flux is increased, i.e., the subspaces V_h and Q_h are enhanced, then the error bounds improve significantly. A comparison of Fig. 1(b) and Fig. 2(b) clearly shows how the accuracy of the error bounds improves. The approximation error is the same in both figures, since the approximation accuracy (the applied finite element space) for the control was not changed. Here, dim(V²_h) = 10201 and dim(RT²_h) = 25200. As expected, more computational effort (more degrees of freedom for the flux and the state) resulted in more accurate bounds. This methodology lets the numerical analyst balance the accuracy of the error estimates against the computational cost. Moreover, it may not be sensible to compute error estimates at every iteration step (in our case, of the projected gradient method), since the computation of the error estimates, as done here, requires additional numerical solutions of boundary value problems. However, this cost can be reduced by using (local) post-processing methods (see, e.g., [13]).

In the previous examples, J(v) and J(u) (i.e., evaluations of integrals involving the analytical solution, which provided the reference values for the exact error) were computed using a uniformly refined mesh and 121 integration points in each triangle. Obviously, the negative lower bound for the error could be rejected immediately. A sharp lower bound requires a very good approximation of the optimal control, v ≈ u, and of the corresponding flux of the respective state, τ ≈ ∇y(u). Then, the upper bound J(u) ≤ J(v) ≤ J(v, τ, β) would be very efficient. These numerical experiments demonstrate the main property of functional type estimates: they are guaranteed regardless of the accuracy of the approximate solution. This research provides grounds for rich further development, e.g., the combination of the estimates with subspaces generated by isogeometric elements (see [9]) or the use of the estimates as indicators to generate adaptive methods.