**Robust preconditioning and error estimates for optimal control of the convection-diffusion-reaction equation with limited observation in Isogeometric Analysis**

**K.-A. Mardal, J. Sogn, S. Takacs**

**RICAM-Report 2020-46**

OPTIMAL CONTROL OF THE CONVECTION-DIFFUSION-REACTION EQUATION WITH LIMITED OBSERVATION IN ISOGEOMETRIC ANALYSIS^{∗}

KENT-ANDRE MARDAL^{†}, JARLE SOGN^{‡}, AND STEFAN TAKACS^{§}

Abstract. In this paper, we analyze an optimization problem with limited observation governed by a convection–diffusion–reaction equation. Motivated by a Schur complement approach, we arrive at continuous norms that enable an analysis of well-posedness and the subsequent derivation of error estimates and of a preconditioner that is robust with respect to the parameters of the problem. We provide conditions for inf-sup stable discretizations and present one such discretization for box domains with constant convection. We also provide a priori error estimates for this discretization. The preconditioner requires the solution of a fourth-order problem. For this reason, we use Isogeometric Analysis as the method of discretization. To efficiently realize the preconditioner, we consider geometric multigrid with a standard Gauss–Seidel smoother as well as a new macro Gauss–Seidel smoother. The latter smoother provides good results with respect to both the geometry mapping and the polynomial degree.

Key words. PDE-constrained optimization, optimal control, robust preconditioning, error estimates

AMS subject classifications. 49K20, 65F08, 65N22, 65N15

1. Introduction. In this paper, we consider an optimal control problem involving a linear Convection–Diffusion–Reaction (CDR) problem, which reads as follows:

(1.1)  Minimize  J(u, q) := (1/2) ‖u − u_d‖^{2}_{L^2(O)} + (α/2) ‖q‖^{2}_{L^2(Ω)}  for u ∈ U, q ∈ L^{2}(Ω)

subject to

(1.2)  −ε∆u + β·∇u + σu = f − q in Ω,  u = 0 on ∂Ω.

Here and in what follows, Ω is a bounded open subset of R^{d} (d = 1, 2, 3) with Lipschitz boundary, f ∈ L^{2}(Ω), ε, σ ∈ R with ε > 0, σ ≥ 0, β ∈ L^{∞}(Ω)^{d} with ∇·β = 0, and O ⊆ Ω is measurable in R^{d}. For certain choices of the parameters ε, β, σ, convection–diffusion–reaction problems are singular perturbation problems exhibiting sharp gradients and a potential loss of regularity. To overcome the problems associated with the loss of regularity, significant effort has been put into the development of methods with low regularity, such as discontinuous Galerkin methods [2, 9]. We take the opposite approach and investigate to what extent higher regularity may be used in the setting of optimal control problems.

∗Funding: The first author acknowledges support from the Research Council of Norway, grants 300305 and 301013. The second and the third author are supported by the Austrian Science Fund (FWF): P31048.

†Department of Mathematics, University of Oslo, Oslo, Norway. Center for Biomedical Computing, Simula Research Laboratory, Lysaker, Norway ([email protected]).

‡Johann Radon Institute for Computational Mathematics (RICAM), Austrian Academy of Sciences, Linz, Austria ([email protected]).

§Johann Radon Institute for Computational Mathematics (RICAM), Austrian Academy of Sciences, Linz, Austria ([email protected]).


There are two main problems with (1.1)–(1.2), namely: 1) potential sharp gradients leading to non-physical oscillations in the numerical solution and 2) ill-posedness due to limited observation, that is, when O is a proper subset of Ω. Motivated by the fact that higher regularity has been exploited in cases with limited observation [17, 25, 4], we derive order-optimal preconditioners via a stability analysis in non-standard Sobolev spaces.

When solving the CDR problem, it is common to consider some stabilization method (e.g., the streamline upwind Petrov–Galerkin (SUPG) method) or adapted grids (e.g., Shishkin grids) to reduce the oscillatory behavior, cf. [10]. This is also the case in optimal control settings, see, e.g., [21, 3, 13, 8]. We do not use any such stabilization techniques, but we remark that the trial and test functions involved in the state equation differ. That means that the state equation, if considered in isolation, is discretized by a Petrov–Galerkin method, although the complete optimality system is discretized by a standard Galerkin method, that is, trial and test functions agree. In particular, in the continuous setting, the trial and test spaces are H^{2}(Ω) ∩ H_{0}^{1}(Ω) and L^{2}(Ω), respectively, with properly weighted norms.

By considering a Schur complement of the optimal control problem, we derive non-standard norms in which well-posedness is obtained, assuming extra regularity. From the well-posedness of the continuous system, we subsequently analyze the corresponding discrete systems to arrive at both error estimates and preconditioners that are robust with respect to the problem parameters α, ε, β and σ. In detail, we provide a condition on the discretization which ensures that the preconditioner is sparse and that the preconditioned system is stable. Further, we give an example of such a discretization based on Isogeometric Analysis (IgA) [14, 5]. For our approach, IgA provides useful discretization methods since the extra regularity leads to H^{2}(Ω)-conforming approximation spaces. Using these discretization methods, a priori error estimates are derived, where we detail the dependencies on the problem parameters. We remark that the error estimates required extending some approximation error estimates for tensor-product B-splines, which is done in Appendix B.

Similar Schur complement preconditioners were used, on the linear algebra level, in [19, 20] for optimal control problems of the CDR equation. [20] also considers mixed constraints. The preconditioners perform well for different values of the problem parameters. In [19], the Schur complement preconditioners were replaced with approximations based on a factorization approach. However, this approach does not work well for problems with limited observation, i.e., when O ⊊ Ω.

To solve the resulting linear system, we use preconditioned Krylov subspace methods. We consider two approaches to realize our preconditioner: sparse direct methods and multigrid methods. For mid-sized problems, sparse direct solvers work well since each component of the preconditioner is symmetric and positive definite. For large-sized problems, we use a multigrid method to realize the fourth-order operator.

Combining the results of this paper with those of [23, 24], it follows that the multigrid method we consider is robust in the grid size, but not necessarily in any of the other problem parameters. Finding a multigrid method which is robust in the grid size, the chosen spline degree, α, ε, β, σ and O remains an open problem.

The outline of the paper is as follows: in the next section, we perform the analysis of the continuous problem. In Section 3, we analyze the discrete problem and provide a condition for a stable discretization. IgA is then introduced in Section 4 along with the proposed discretization. In Section 5, a priori error estimates are derived. Section 6 contains a discussion of the solution of a one-dimensional problem, and in Section 7 we perform numerical experiments on two-dimensional problems and see how the preconditioner behaves.

2. Analysis of the continuous problem. To obtain a standard (weak) variational formulation of the state equation (1.2), one would choose the state variable u and the test function w̃ to be in H_{0}^{1}(Ω). Instead, we consider the strong variational formulation, that is, find u ∈ H^{2}(Ω) ∩ H_{0}^{1}(Ω) such that

(q, w̃)_{L^2(Ω)} + (−ε∆u + β·∇u + σu, w̃)_{L^2(Ω)} = (f, w̃)_{L^2(Ω)}  ∀ w̃ ∈ L^{2}(Ω).

The Lagrangian functional associated with (1.1)–(1.2) is

L(q, w, u) := (1/2) ‖u − u_d‖^{2}_{L^2(O)} + (α/2) ‖q‖^{2}_{L^2(Ω)} + (q, w)_{L^2(Ω)} + (−ε∆u + β·∇u + σu, w)_{L^2(Ω)} − (f, w)_{L^2(Ω)},

where u ∈ H^{2}(Ω) ∩ H_{0}^{1}(Ω), q ∈ L^{2}(Ω) and the Lagrangian multiplier w ∈ L^{2}(Ω).

From the first order necessary optimality conditions

∂L/∂q (q, w, u) = 0,  ∂L/∂w (q, w, u) = 0,  ∂L/∂u (q, w, u) = 0,

which are also sufficient here, we obtain the optimality system:

Problem 2.1. Find (q, w, u) ∈ L^{2}(Ω) × L^{2}(Ω) × H^{2}(Ω) ∩ H_{0}^{1}(Ω) such that

α(q, q̃)_{L^2(Ω)} + (w, q̃)_{L^2(Ω)} = 0  ∀ q̃ ∈ L^{2}(Ω),
(q, w̃)_{L^2(Ω)} + (−ε∆u + β·∇u + σu, w̃)_{L^2(Ω)} = (f, w̃)_{L^2(Ω)}  ∀ w̃ ∈ L^{2}(Ω),
(w, −ε∆ũ + β·∇ũ + σũ)_{L^2(Ω)} + (u, ũ)_{L^2(O)} = (u_d, ũ)_{L^2(O)}  ∀ ũ ∈ H^{2}(Ω) ∩ H_{0}^{1}(Ω).

Problem 2.1 can be written as

(2.1)  A (q, w, u)^{T} = (0, Mf, M̃_O u_d)^{T}  with  A :=
       [ αM  M   0   ]
       [ M   0   K   ]
       [ 0   K′  M_O ].

Here, M : L^{2}(Ω) → (L^{2}(Ω))′ represents the L^{2}(Ω)-inner product, that is, we have

⟨Mq, w⟩ = (q, w)_{L^2(Ω)},

where ⟨·,·⟩ denotes the duality product. The notation "′" is used to denote both dual spaces and dual operators. K : H^{2}(Ω) ∩ H_{0}^{1}(Ω) → (L^{2}(Ω))′ is the state operator:

⟨Ku, w⟩ = (−ε∆u + β·∇u + σu, w)_{L^2(Ω)}.

Finally, M_O : H^{2}(Ω) ∩ H_{0}^{1}(Ω) → (H^{2}(Ω) ∩ H_{0}^{1}(Ω))′ and M̃_O : L^{2}(O) → (H^{2}(Ω) ∩ H_{0}^{1}(Ω))′, and both represent the L^{2}(O)-inner product on the subdomain O.

We observe that the block operator A has a block tridiagonal form. Such tridiagonal operators are studied in [25, 4]. We use the Schur complement preconditioner proposed in [25]:

(2.2)  S(A) :=
       [ S_q  0    0   ]
       [ 0    S_w  0   ]
       [ 0    0    S_u ],

where the components are

(2.3)  S_q := αM,  S_w := (1/α) M,  S_u := M_O + α K′ M^{−1} K.

These Schur complements define weighted norms as follows:

(2.4)  ‖q‖^{2}_{S_q} := ⟨S_q q, q⟩ = α ‖q‖^{2}_{L^2(Ω)},
       ‖w‖^{2}_{S_w} := ⟨S_w w, w⟩ = (1/α) ‖w‖^{2}_{L^2(Ω)},
       ‖u‖^{2}_{S_u} := ⟨S_u u, u⟩ = ‖u‖^{2}_{L^2(O)} + α ‖−ε∆u + β·∇u + σu‖^{2}_{L^2(Ω)}.

The last norm follows from

⟨K′M^{−1}Ku, u⟩ = sup_{0≠w∈L^2(Ω)} ⟨Ku, w⟩^{2} / ⟨Mw, w⟩
              = sup_{0≠w∈L^2(Ω)} (−ε∆u + β·∇u + σu, w)^{2}_{L^2(Ω)} / ‖w‖^{2}_{L^2(Ω)}
              = ‖−ε∆u + β·∇u + σu‖^{4}_{L^2(Ω)} / ‖−ε∆u + β·∇u + σu‖^{2}_{L^2(Ω)}
              = ‖−ε∆u + β·∇u + σu‖^{2}_{L^2(Ω)},

where the supremum is attained for w = −ε∆u + β·∇u + σu.
We show well-posedness with respect to the norms (2.4) by showing that the operator

(2.5)  A : L^{2}(Ω) × L^{2}(Ω) × H^{2}(Ω) ∩ H_{0}^{1}(Ω) → (L^{2}(Ω))′ × (L^{2}(Ω))′ × (H^{2}(Ω) ∩ H_{0}^{1}(Ω))′

is an isomorphism with respect to the norms (2.4). This is done by using the main result in [25], which for our problem reads as follows.

Theorem 2.2. Assume the Schur complements in (2.3) are well defined and positive definite, that is,

(2.6)  ⟨S_q q, q⟩ ≥ σ_q ‖q‖^{2}_{L^2(Ω)},  ⟨S_w w, w⟩ ≥ σ_w ‖w‖^{2}_{L^2(Ω)},  ⟨S_u u, u⟩ ≥ σ_u ‖u‖^{2}_{H^2(Ω)}

for some positive constants σ_q, σ_w and σ_u, which can depend on the given parameters. Then, A in (2.5) is an isomorphism; moreover, the condition number of the preconditioned operator S^{−1}A is bounded:

(2.7)  κ(S^{−1}A) ≤ cos(π/7) / sin(π/14) ≈ 4.05.

The conditions in (2.6) ensure that the spaces L^{2}(Ω), L^{2}(Ω) and H^{2}(Ω) ∩ H_{0}^{1}(Ω), equipped with the norms ‖·‖_{S_q}, ‖·‖_{S_w} and ‖·‖_{S_u}, are complete. Before proving Condition (2.6), we provide a useful lemma which bounds the H^{2}-norm. The proof of this lemma is presented in Appendix A.

Lemma 2.3. If the domain Ω has a Lipschitz boundary and
• the boundary is a polygon (polyhedron) or
• the domain is the image of a geometric mapping G : Ω̂ := (0,1)^{d} → Ω, where both ‖∇^{r}G‖_{L^∞(Ω̂)} and ‖(∇^{r}G)^{−1}‖_{L^∞(Ω̂)} are bounded for r ∈ {1, 2, 3},
then the H^{2}-norm is bounded by the L^{2}-norm of the Laplacian, i.e.,

(2.8)  ‖u‖_{H^2(Ω)} ≤ C_Ω ‖∆u‖_{L^2(Ω)}  ∀ u ∈ H^{2}(Ω) ∩ H_{0}^{1}(Ω),

for a constant C_Ω depending only on Ω.

Theorem 2.4. If Ω is a domain such that the conditions of Lemma 2.3 hold, then the assumptions of Theorem 2.2 are satisfied for Problem 2.1.

Proof. For simplicity, we prove this theorem only for σ = 0. An extension to the case σ > 0 is straightforward.

The first two conditions are trivial since ⟨S_q q, q⟩ = α ‖q‖^{2}_{L^2(Ω)} and ⟨S_w w, w⟩ = (1/α) ‖w‖^{2}_{L^2(Ω)}. For the third condition, let

δ := ‖β‖ / (ε/c_P + ‖β‖),

where ‖β‖ = ‖β‖_{L^∞(Ω)} and c_P is the constant from the Poincaré inequality, that is, we have

‖u‖_{L^2(Ω)} ≤ c_P ‖∇u‖_{L^2(Ω)}.

Note that δ ∈ [0, 1). Let u ∈ H^{2}(Ω) ∩ H_{0}^{1}(Ω) be arbitrary but fixed. We consider the two cases

(2.9)  ‖β‖ ‖∇u‖_{L^2(Ω)} < δ ε ‖∆u‖_{L^2(Ω)}  and  ‖β‖ ‖∇u‖_{L^2(Ω)} ≥ δ ε ‖∆u‖_{L^2(Ω)}.
First case. From the definition, we have

⟨K′M^{−1}Ku, u⟩ = sup_{0≠w∈L^2(Ω)} (β·∇u − ε∆u, w)^{2}_{L^2(Ω)} / ‖w‖^{2}_{L^2(Ω)}.

By setting w = −ε∆u, we get using the Cauchy–Schwarz inequality that

⟨K′M^{−1}Ku, u⟩ ≥ ((β·∇u, −ε∆u)_{L^2(Ω)} + ε^{2} ‖∆u‖^{2}_{L^2(Ω)})^{2} / (ε^{2} ‖∆u‖^{2}_{L^2(Ω)})
              ≥ (−‖β‖ ‖∇u‖_{L^2(Ω)} ε ‖∆u‖_{L^2(Ω)} + ε^{2} ‖∆u‖^{2}_{L^2(Ω)})^{2} / (ε^{2} ‖∆u‖^{2}_{L^2(Ω)})
              = (−‖β‖ ‖∇u‖_{L^2(Ω)} + ε ‖∆u‖_{L^2(Ω)})^{2}.

Using the first inequality in (2.9), we obtain

⟨K′M^{−1}Ku, u⟩ ≥ (−‖β‖ ‖∇u‖_{L^2(Ω)} + ε ‖∆u‖_{L^2(Ω)})^{2} ≥ ((1 − δ) ε ‖∆u‖_{L^2(Ω)})^{2} = ε^{4} / (ε + c_P ‖β‖)^{2} ‖∆u‖^{2}_{L^2(Ω)}.
Second case. By setting w = u, we get

⟨K′M^{−1}Ku, u⟩ ≥ ((β·∇u, u)_{L^2(Ω)} + ε ‖∇u‖^{2}_{L^2(Ω)})^{2} / ‖u‖^{2}_{L^2(Ω)} = ε^{2} ‖∇u‖^{4}_{L^2(Ω)} / ‖u‖^{2}_{L^2(Ω)}

using integration by parts. Due to the homogeneous Dirichlet boundary conditions and ∇·β = 0, the term (β·∇u, w)_{L^2(Ω)} is skew symmetric and vanishes for w = u. Finally, we use ‖u‖_{L^2(Ω)} ≤ c_P ‖∇u‖_{L^2(Ω)} and the second inequality in (2.9), which gives

⟨K′M^{−1}Ku, u⟩ ≥ ε^{2} ‖∇u‖^{4}_{L^2(Ω)} / ‖u‖^{2}_{L^2(Ω)} ≥ (ε^{2} / c_P^{2}) ‖∇u‖^{2}_{L^2(Ω)} ≥ δ^{2} ε^{4} / (‖β‖^{2} c_P^{2}) ‖∆u‖^{2}_{L^2(Ω)} = ε^{4} / (ε + c_P ‖β‖)^{2} ‖∆u‖^{2}_{L^2(Ω)}.

To summarize, in both cases we get

⟨S_u u, u⟩ = ‖u‖^{2}_{L^2(O)} + α ⟨K′M^{−1}Ku, u⟩ ≥ α ⟨K′M^{−1}Ku, u⟩ ≥ α ε^{4} / (ε + c_P ‖β‖)^{2} ‖∆u‖^{2}_{L^2(Ω)} ≥ α ε^{4} / (C_Ω^{2} (ε + c_P ‖β‖)^{2}) ‖u‖^{2}_{H^2(Ω)}.

The last inequality follows from Lemma 2.3.

Theorem 2.2 and Theorem 2.4 show that Problem 2.1 is well-posed with respect to the norms in (2.4). The boundedness and coercivity constants are bounded independently of the regularization parameter α as well as of the problem parameters ε, β and σ. Consequently, the operator preconditioner (2.2) is a robust preconditioner for the optimality system, that is, the condition number is uniformly bounded independently of the above mentioned parameters. So far, we have only analyzed the problem on the continuous level. In the next section, we carry this analysis over to the discrete case and provide a computationally feasible preconditioner.

3. Analysis of the discrete problem. We consider conforming discretizations, that is, we choose the finite-dimensional spaces Q_h and U_h such that they satisfy

Q_h ⊂ L^{2}(Ω) and U_h ⊂ H^{2}(Ω) ∩ H_{0}^{1}(Ω).

Applying Galerkin's principle to (2.1) leads to the discrete variational problem for the functions (q_h, w_h, u_h) ∈ Q_h × Q_h × U_h, which we immediately write in matrix-vector notation. We denote the vector representation of functions in these spaces by underlined versions of the corresponding symbols, i.e., for q_h ∈ Q_h the corresponding coefficient vector is q_h ∈ R^{dim Q_h}. With a slight abuse of notation, we use the same notation also for the right-hand-side vectors f_h and u_{d,h}, which are obtained by testing the corresponding linear functionals with the basis functions in Q_h and U_h, respectively; see, e.g., [18, Section 6] for further details. Furthermore, operators with subscript h denote matrix representations of the operators.

Using this notation, the discrete problem reads as follows.

Problem 3.1. Find (q_h, w_h, u_h) ∈ R^{dim Q_h} × R^{dim Q_h} × R^{dim U_h} such that

(3.1)  A_h (q_h, w_h, u_h)^{T} = (0, f_h, u_{d,h})^{T}  with  A_h :=
       [ αM_h  M_h    0       ]
       [ M_h   0      K_h     ]
       [ 0     K_h^T  M_{O,h} ].

The exact Schur complement preconditioner (2.2) of the discretized system is

(3.2)  S(A_h) =
       [ αM_h  0          0                              ]
       [ 0     (1/α)M_h   0                              ]
       [ 0     0          M_{O,h} + α K_h^T M_h^{−1} K_h ].

Under the mild condition U_h ⊆ Q_h, this preconditioner is symmetric and positive definite. This is a straightforward extension of [25, Lemma 4.4], using the fact that (β·∇u_h, u_h)_{L^2(Ω)} = 0. In this case, Theorem 2.2 yields the following condition number bound:

κ((S(A_h))^{−1} A_h) ≤ cos(π/7) / sin(π/14) ≈ 4.05.
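This discrete bound can be illustrated on a small generic block system with the same structure. The sketch below is our own toy check, not the paper's implementation: M_h is replaced by an identity matrix, K_h by a nonsymmetric tridiagonal stand-in, and M_{O,h} by a singular 0/1 diagonal matrix mimicking limited observation (all hypothetical choices). It assembles A_h and the exact Schur complement preconditioner and computes the condition number of the symmetrically preconditioned matrix:

```python
import numpy as np

# Toy stand-ins (hypothetical, not the IgA matrices): M_h = I, a nonsymmetric
# "state" matrix K_h, and a singular observation matrix M_{O,h}.
n = 8
alpha = 1e-2
M = np.eye(n)
K = 2.0 * np.eye(n) - 1.3 * np.eye(n, k=1) - 0.7 * np.eye(n, k=-1)
MO = np.diag((np.arange(n) < n // 2).astype(float))
Z = np.zeros((n, n))

# Block system A_h and exact Schur complement preconditioner S(A_h), cf. (3.1)-(3.2)
A = np.block([[alpha * M, M, Z],
              [M, Z, K],
              [Z, K.T, MO]])
S = np.block([[alpha * M, Z, Z],
              [Z, M / alpha, Z],
              [Z, Z, MO + alpha * K.T @ np.linalg.solve(M, K)]])

# Symmetrically preconditioned matrix L^{-1} A L^{-T}, where S = L L^T
L = np.linalg.cholesky(S)
X = np.linalg.solve(L, A)          # L^{-1} A
C = np.linalg.solve(L, X.T).T      # L^{-1} A L^{-T}, symmetric indefinite

ev = np.linalg.eigvalsh(C)
kappa = np.abs(ev).max() / np.abs(ev).min()
bound = np.cos(np.pi / 7) / np.sin(np.pi / 14)
print(kappa, bound)  # kappa should stay below the bound of about 4.05
```

Since the exact Schur complements are used here, the computed condition number stays below cos(π/7)/sin(π/14), independently of the chosen α.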

This preconditioner cannot be efficiently realized since the matrix M_h^{−1} is dense. So, we use the following preconditioner, motivated by the norms (2.4) on the discretization spaces, instead:

(3.3)  S_h :=
       [ αM_h  0          0                ]
       [ 0     (1/α)M_h   0                ]
       [ 0     0          M_{O,h} + α B_h ],

where B_h is the matrix representation of the linear operator B : H^{2}(Ω) ∩ H_{0}^{1}(Ω) → (H^{2}(Ω) ∩ H_{0}^{1}(Ω))′,

(3.4)  ⟨Bu, ũ⟩ = (−ε∆u + β·∇u + σu, −ε∆ũ + β·∇ũ + σũ)_{L^2(Ω)}

on U_h. On the continuous level, the operators B and K′M^{−1}K coincide. In general, this does not carry over to the discrete case. The following lemma gives sufficient conditions that guarantee that B_h and K_h^T M_h^{−1} K_h coincide.

Lemma 3.2. If

(3.5)  (−ε∆ + β·∇ + σ) U_h ⊂ Q_h,

then K_h^T M_h^{−1} K_h = B_h and thus S(A_h) = S_h.

Proof. Let u_h ∈ U_h be arbitrary but fixed with coefficient vector u_h. The definitions yield

⟨K_h^T M_h^{−1} K_h u_h, u_h⟩ = sup_{w_h∈R^{dim Q_h}} ⟨K_h u_h, w_h⟩^{2} / ⟨M_h w_h, w_h⟩ = sup_{w_h∈Q_h} (−ε∆u_h + β·∇u_h + σu_h, w_h)^{2}_{L^2(Ω)} / ‖w_h‖^{2}_{L^2(Ω)}.

Since (−ε∆ + β·∇ + σ)U_h ⊂ Q_h, the supremum is attained for w_h = −ε∆u_h + β·∇u_h + σu_h, and we have

⟨K_h^T M_h^{−1} K_h u_h, u_h⟩ = ‖−ε∆u_h + β·∇u_h + σu_h‖^{2}_{L^2(Ω)} = ⟨B_h u_h, u_h⟩.

Therefore, K_h^T M_h^{−1} K_h = B_h and thus S(A_h) = S_h.

4. Isogeometric analysis. Due to the requirement U_h ⊂ H^{2}(Ω) ∩ H_{0}^{1}(Ω), we need a smooth discretization space. We achieve this by using IgA. We give a brief introduction to the approximation spaces in use. Let S_{p,k,ℓ}(0,1) be the space of B-spline functions on the unit interval (0,1) which are k-times continuously differentiable and piecewise polynomials of degree p on a uniform grid with grid size 2^{−ℓ}. For the space of B-splines with maximum continuity, that is, with k = p − 1, we only write S_{p,ℓ}.

On the parameter domain Ω̂ := (0,1)^{d}, we use a tensor-product B-spline space, denoted by

S^{d}_{p,k,ℓ} := ⊗_{i=1}^{d} S_{p,k,ℓ}(0,1).

For ease of notation, we assume to have the same spline degree p, the same continuity k and the same number of uniform refinement steps ℓ for each spatial dimension. We assume that the domain Ω can be parametrized by a geometry mapping G : Ω̂ → Ω = G(Ω̂) with the property

(4.1)  ‖∇^{r}G‖_{L^∞(Ω̂)} ≤ c_1  and  ‖(∇^{r}G)^{−1}‖_{L^∞(Ω̂)} ≤ c_2  for r = 1, 2, 3,

for some constants c_1 and c_2. The discretization space S_{p,k,ℓ}(Ω) on the domain Ω is defined using the pull-back principle as

S_{p,k,ℓ}(Ω) := { f ∘ G^{−1} : f ∈ S^{d}_{p,k,ℓ} }.

For more information on IgA, see the survey article [5] and the references therein. We use spline spaces with maximum smoothness as the discrete state space and reduce the smoothness for Q_h accordingly. More precisely, we use

(4.2)  Q_h := S_{p,p−3,ℓ}(Ω)  and  U_h := S_{p,ℓ}(Ω) ∩ H_{0}^{1}(Ω)  with p ≥ 2.

The following theorem shows that, if we consider the special case of box domains (trivially parametrized) and constant convection, the condition of Lemma 3.2 holds.

Theorem 4.1. If Ω = (0,1)^{d}, Q_h = S^{d}_{p,p−3,ℓ} and U_h = S^{d}_{p,ℓ} ∩ H_{0}^{1}(Ω), and if the convection β is constant, then

S(A_h) = S_h  and  κ(S_h^{−1} A_h) ≤ cos(π/7) / sin(π/14) ≈ 4.05.

Proof. For the sake of simplicity, we restrict the proof to the two-dimensional case. Clearly,

f′ ∈ S_{p−1,k−1,ℓ}(0,1)  ∀ f ∈ S_{p,k,ℓ}(0,1),

together with

S_{p−1,k−1,ℓ}(0,1) ⊂ S_{p,k−1,ℓ}(0,1)  and  S_{p,k,ℓ}(0,1) ⊂ S_{p,k−2,ℓ}(0,1),

see [5]. Let u ∈ S^{2}_{p,k,ℓ} = S_{p,k,ℓ}(0,1) ⊗ S_{p,k,ℓ}(0,1). Then,

∂^{2}u/∂x_1^{2} ∈ S_{p−2,k−2,ℓ}(0,1) ⊗ S_{p,k,ℓ}(0,1) ⊂ S^{2}_{p,k−2,ℓ},
∂^{2}u/∂x_2^{2} ∈ S_{p,k,ℓ}(0,1) ⊗ S_{p−2,k−2,ℓ}(0,1) ⊂ S^{2}_{p,k−2,ℓ},

and, by combining these results, also

∆u = ∂^{2}u/∂x_1^{2} + ∂^{2}u/∂x_2^{2} ∈ S^{2}_{p,k−2,ℓ}.

Since β = (β_1, β_2) is constant, we also have

β·∇u = β_1 ∂u/∂x_1 + β_2 ∂u/∂x_2 ∈ S^{2}_{p,k−2,ℓ}.

To summarize, for every u ∈ S^{2}_{p,k,ℓ}, we have

−ε∆u + β·∇u + σu ∈ S^{2}_{p,k−2,ℓ},

hence Condition (3.5) in Lemma 3.2 holds and we have S(A_h) = S_h. The condition number bound follows from Theorem 2.2.
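In the univariate case, the conclusion S(A_h) = S_h (equivalently, K_h^T M_h^{−1} K_h = B_h from Lemma 3.2) can be verified numerically. The following sketch is our own illustration with hypothetical coefficient values: it uses U_h = S_{3,ℓ}(0,1) ∩ H_0^1(0,1) (cubic C² B-splines) and Q_h = S_{3,0,ℓ}(0,1) (cubic C⁰ B-splines, obtained by tripling the interior knots) and assembles M_h, K_h and B_h with Gauss quadrature that is exact for the piecewise-polynomial integrands:

```python
import numpy as np
from scipy.interpolate import BSpline

eps, beta, sigma = 0.1, -1.0, 0.5   # hypothetical constant coefficients
p, nspans = 3, 8
xs = np.linspace(0.0, 1.0, nspans + 1)

# U_h: C^2 cubics (open knot vector, simple interior knots)
tU = np.r_[[0.0] * p, xs, [1.0] * p]
# Q_h: C^0 cubics (interior knots with multiplicity p - 0 = 3)
tQ = np.r_[[0.0] * (p + 1), np.repeat(xs[1:-1], 3), [1.0] * (p + 1)]

def basis(t, i):
    c = np.zeros(len(t) - p - 1); c[i] = 1.0
    return BSpline(t, c, p)

U = [basis(tU, i) for i in range(1, len(tU) - p - 2)]  # drop first/last: u(0)=u(1)=0
Q = [basis(tQ, i) for i in range(len(tQ) - p - 1)]

# Gauss-Legendre quadrature per knot span (integrands of degree <= 6, 5 points suffice)
gx, gw = np.polynomial.legendre.leggauss(5)
x = np.concatenate([(a + b) / 2 + (b - a) / 2 * gx for a, b in zip(xs[:-1], xs[1:])])
w = np.concatenate([(b - a) / 2 * gw for a, b in zip(xs[:-1], xs[1:])])

Lu = np.array([-eps * u.derivative(2)(x) + beta * u.derivative(1)(x) + sigma * u(x)
               for u in U])            # rows: (-eps u'' + beta u' + sigma u) per basis fn
Qv = np.array([q(x) for q in Q])

M = Qv * w @ Qv.T          # mass matrix on Q_h
K = Qv * w @ Lu.T          # state matrix, K[i, j] = (L u_j, q_i)
B = Lu * w @ Lu.T          # matrix of the operator B from (3.4)

err = np.abs(K.T @ np.linalg.solve(M, K) - B).max()
print(err)
```

Because (−ε∂_{xx} + β∂_x + σ)U_h ⊂ Q_h here, the discrepancy is at the level of rounding errors; shrinking the test space Q_h makes it clearly visible.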

If the domain is not a box domain or if the convection is not constant, then Theorem 4.1 cannot be applied. Nevertheless, the numerical results presented in Section 7 indicate that the spaces proposed in (4.2) work well even for more complex domains and variable convection.

5. Error estimates. In this section, we derive discretization error estimates for Problem 3.1. Let A(x, x̃) denote the bilinear form in Problem 2.1, where x, x̃ ∈ X denote triplets (q, w, u) ∈ L^{2}(Ω) × L^{2}(Ω) × H^{2}(Ω) ∩ H_{0}^{1}(Ω), with the norm

‖x‖^{2}_{S} = ‖q‖^{2}_{S_q} + ‖w‖^{2}_{S_w} + ‖u‖^{2}_{S_u}.

The discrete triplet (q_h, w_h, u_h) ∈ Q_h × Q_h × U_h is denoted by x_h ∈ X_h. Galerkin orthogonality reads as follows:

(5.1)  A(x − x_h, y_h) = 0  ∀ y_h ∈ X_h.

Since Problem 2.1 is well-posed (Theorem 2.2), we have boundedness

(5.2)  A(x, y) ≤ c̄ ‖x‖_{S} ‖y‖_{S}  ∀ x, y ∈ X

and inf-sup stability

(5.3)  sup_{0≠y∈X} A(x, y) / ‖y‖_{S} ≥ c̲ ‖x‖_{S}  ∀ x ∈ X,

where c̄/c̲ = κ(S^{−1}A). The boundedness also holds for a conforming discretization space X_h ⊂ X with the same constant c̄. If the condition (3.5) in Lemma 3.2 holds, then also the inf-sup condition holds with the same constant c̲. Using this and the ideas of [1], we derive the following discretization error estimate.

Lemma 5.1. If x_h ∈ X_h is the solution to (3.1) for a discretization space satisfying condition (3.5) and if x ∈ X is the solution of (2.1), then we have the estimate

(5.4)  ‖x − x_h‖_{S} ≤ (1 + κ(S^{−1}A)) inf_{y_h∈X_h} ‖x − y_h‖_{S}.

Proof. Let y_h and z_h be arbitrary functions in X_h. Due to Galerkin orthogonality (5.1), we have

A(x_h − y_h, z_h) = A(x_h − x + x − y_h, z_h) = A(x − y_h, z_h).

Combining this with the boundedness condition (5.2) gives

A(x_h − y_h, z_h) ≤ c̄ ‖x − y_h‖_{S} ‖z_h‖_{S}  ∀ y_h, z_h ∈ X_h.

Using the discrete inf-sup condition gives

c̲ ‖x_h − y_h‖_{S} ≤ sup_{0≠z_h∈X_h} A(x_h − y_h, z_h) / ‖z_h‖_{S} ≤ c̄ ‖x − y_h‖_{S}  ∀ y_h ∈ X_h.

Finally, we use the triangle inequality and c̄/c̲ = κ(S^{−1}A) to obtain the desired result

‖x − x_h‖_{S} ≤ ‖x − y_h‖_{S} + ‖x_h − y_h‖_{S} ≤ (1 + κ(S^{−1}A)) ‖x − y_h‖_{S}  ∀ y_h ∈ X_h.

Let Ω = Ω̂ := (0,1)^{d} and let the convection β be constant. Since we assume a trivial parametrization, we have the discretization spaces Q_h = S^{d}_{p,p−3,ℓ} and U_h = S^{d}_{p,ℓ} ∩ H_{0}^{1}(Ω̂).

To derive the error estimates, we assume that the solution of (2.1) satisfies the regularity assumption (q, w, u) ∈ H^{1}(Ω̂) × H^{1}(Ω̂) × H^{3}(Ω̂) ∩ H_{0}^{1}(Ω̂).

Now, we estimate the approximation error term in (5.4) from above. First, we observe that

inf_{y_h∈X_h} ‖x − y_h‖^{2}_{S} ≤ inf_{q_h∈S^{d}_{p,p−3,ℓ}} α ‖q − q_h‖^{2}_{L^2(Ω̂)} + inf_{w_h∈S^{d}_{p,p−3,ℓ}} (1/α) ‖w − w_h‖^{2}_{L^2(Ω̂)} + inf_{u_h∈S^{d}_{p,ℓ}∩H_{0}^{1}(Ω̂)} ‖u − u_h‖^{2}_{S_u}.

The first two terms can be bounded by using the following approximation error estimate

(5.5)  inf_{q_h∈S^{d}_{p,p−3,ℓ}} ‖q − q_h‖_{L^2(Ω̂)} ≤ (h / (4√3)) ‖∇q‖_{L^2(Ω̂)},

see [22, Corollary 1]. The estimate for the last term

inf_{u_h∈S^{d}_{p,ℓ}∩H_{0}^{1}(Ω̂)} ‖u − u_h‖^{2}_{S_u}

is more involved. Before handling this term, we need a convenient notation and some auxiliary approximation error estimates.

Notation 5.2. In what follows, c is a generic positive constant independent of α, σ, β, ε, h and p, but it may depend on the spatial dimension d and the observation domain O.

In Appendix B, we extend some of the results of [22, 23, 24]. The main result is summarized in the following theorem.

Theorem 5.3. Let Π_p : H^{3}(Ω̂) ∩ H_{0}^{1}(Ω̂) → S^{d}_{p,ℓ} ∩ H_{0}^{1}(Ω̂) be the H^{2}-orthogonal projector, where p ≥ 3 and ℓ ≥ 1. Then,

(5.6)  ‖∇^{2}(I − Π_p)u‖_{L^2(Ω̂)} ≤ c h ‖∇^{3}u‖_{L^2(Ω̂)},
(5.7)  ‖∇(I − Π_p)u‖_{L^2(Ω̂)} ≤ c h^{2} ‖∇^{3}u‖_{L^2(Ω̂)},
(5.8)  ‖(I − Π_p)u‖_{L^2(Ω̂)} ≤ c h^{3} ‖∇^{3}u‖_{L^2(Ω̂)}  ∀ u ∈ H^{3}(Ω̂) ∩ H_{0}^{1}(Ω̂).

Remark 5.4. In the statement of Theorem 5.3, the parameter domain Ω̂ is considered. This result can be extended to physical domains if the corresponding geometry function G is sufficiently smooth, cf. [24].

With Theorem 5.3, we can derive an error estimate for our problem.

Theorem 5.5. Let Ω = Ω̂ := (0,1)^{d} with d ∈ N, let β ∈ R^{d} be constant and let (q_h, w_h, u_h) ∈ S^{d}_{p,p−3,ℓ} × S^{d}_{p,p−3,ℓ} × S^{d}_{p,ℓ} ∩ H_{0}^{1}(Ω̂), with p ≥ 3 and ℓ ≥ 1, be the solution to (3.1). If (q, w, u) ∈ H^{1}(Ω̂) × H^{1}(Ω̂) × H^{3}(Ω̂) ∩ H_{0}^{1}(Ω̂) is the solution of (2.1), then we have the following estimate:

‖q − q_h‖_{S_q} + ‖w − w_h‖_{S_w} + ‖u − u_h‖_{S_u}
  ≤ c h ( √α ‖∇q‖_{L^2(Ω̂)} + (1/√α) ‖∇w‖_{L^2(Ω̂)} + √α max{ε, ‖β‖h, (σ + 1/√α) h^{2}} ‖∇^{3}u‖_{L^2(Ω̂)} ).

Proof. Let ũ_h := Π_p u. Then, we have for all u ∈ H^{3}(Ω̂) ∩ H_{0}^{1}(Ω̂)

‖u − ũ_h‖^{2}_{S_u} = ‖u − ũ_h‖^{2}_{L^2(O)} + α ‖(−ε∆ + β·∇ + σ)(u − ũ_h)‖^{2}_{L^2(Ω̂)}.

For the first term, we simply extend the norm from O to Ω̂ and obtain

‖u − ũ_h‖^{2}_{L^2(O)} ≤ ‖u − ũ_h‖^{2}_{L^2(Ω̂)} ≤ c h^{6} ‖∇^{3}u‖^{2}_{L^2(Ω̂)}.

For the second term, we use the triangle inequality and Theorem 5.3, and we get

α ‖(−ε∆ + β·∇ + σ)(u − ũ_h)‖^{2}_{L^2(Ω̂)}
  ≤ 3α ( ε^{2} ‖∆(u − ũ_h)‖^{2}_{L^2(Ω̂)} + ‖β‖^{2} ‖∇(u − ũ_h)‖^{2}_{L^2(Ω̂)} + σ^{2} ‖u − ũ_h‖^{2}_{L^2(Ω̂)} )
  ≤ c α ( ε^{2} h^{2} + ‖β‖^{2} h^{4} + σ^{2} h^{6} ) ‖∇^{3}u‖^{2}_{L^2(Ω̂)}.

Combining these results gives

(5.9)  inf_{ũ_h∈S^{d}_{p,ℓ}∩H_{0}^{1}(Ω̂)} ‖u − ũ_h‖_{S_u} ≤ c √α h max{ε, ‖β‖h, (σ + 1/√α) h^{2}} ‖∇^{3}u‖_{L^2(Ω̂)}.

The theorem follows from combining Lemma 5.1 with Equation (5.5) and Equation (5.9).

Remark 5.6. Theorem 5.5 is somewhat restrictive since it requires the domain to be a box domain and the convection to be constant. These requirements are needed for the proof of the discrete inf-sup stability. The numerical results presented in Section 7 indicate that the restrictions are not needed.

6. Numerical experiments: accuracy of the solution. In this section, we compare the solution of the forward problem to the (state) solution of the optimal control problem to investigate the fact that the optimal control problem naturally introduces a non-standard Petrov–Galerkin method for the state equation. We consider a well-known one-dimensional problem [7]:

−∂_x u(x) − ε ∂_{xx} u(x) = 0 in (0,1),  u(0) = 0,  u(1) = 1,

whose exact solution is

u(x) = (e^{−x/ε} − 1) / (e^{−1/ε} − 1).
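As a quick sanity check (ours, not part of the paper), the closed-form solution can be verified numerically against the boundary conditions and the differential equation, here with the milder value ε = 0.1 and central finite differences:

```python
import numpy as np

eps = 0.1  # hypothetical value, milder than the experiment's 0.01

def u(x):
    # exact solution u(x) = (e^{-x/eps} - 1) / (e^{-1/eps} - 1)
    return np.expm1(-x / eps) / np.expm1(-1.0 / eps)

assert abs(u(0.0)) < 1e-14 and abs(u(1.0) - 1.0) < 1e-14  # boundary conditions

# residual of -u' - eps u'' = 0 via central differences at interior points
x = np.linspace(0.05, 0.95, 19)
h = 1e-4
du = (u(x + h) - u(x - h)) / (2 * h)
d2u = (u(x + h) - 2 * u(x) + u(x - h)) / h**2
res = -du - eps * d2u
print(np.abs(res).max())  # small: the closed form solves the ODE up to FD error
```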

This problem is used as the state equation in our optimal control problem (1.1)–(1.2), where we choose β = −1, σ = 0, f = 0, u_d to be the exact solution of the forward problem, and the boundary conditions on u to be as for the forward problem. The analytical solution of the optimal control problem is

q(x) = 0,  w(x) = 0,  u(x) = (e^{−x/ε} − 1) / (e^{−1/ε} − 1).

We use the discretization spaces

Q_h = S_{p,p−3,ℓ}(0,1)  and  U_h = {u_h ∈ S_{p,ℓ}(0,1) : u_h(0) = 0, u_h(1) = 1}

for the optimal control problem, which satisfy the condition (3.5). We compare the numerical solution for the state with the numerical solution of the forward problem, where we use U_h as trial and test space. No stabilization techniques are used. The diffusion is set to ε = 0.01 and α = 0.001 for the optimal control problem. Three observation domains are considered: full observation, that is, O = Ω = (0,1), and partial observation on O = (0, 1/4) and on O = (3/4, 1). The numerical solutions are displayed in Figures 1 to 4. The plots indicate that the forward solution is unstable for coarse discretizations. The non-physical oscillations start in the boundary layer and propagate into the remainder of the computational domain. These kinds of instabilities are often remedied by upwind and/or Petrov–Galerkin schemes [7]. We remark, though, that our Petrov–Galerkin-like approach for the state equation does not resemble any of the common Petrov–Galerkin schemes for this equation, as far as we know.

The state solution (of the optimal control problem) does not have these instabilities. In fact, the state operator K_h is discretized with a Petrov–Galerkin method, as the trial space is U_h and the test space is Q_h.

Fig. 1. Full observation on O = (0,1) with p = 2 (both), ℓ = 4 (left), ℓ = 6 (right).

Fig. 2. Partial observation on O = (0, 1/4) with p = 2 (both), ℓ = 4 (left), ℓ = 6 (right).

In Figure 2, we consider the optimal control problem with observation on (0, 1/4). This is only a quarter of the whole domain, but it is located at the boundary layer. The solutions for the state variable are almost identical to those obtained in the case of full observation.

Next, we look at the solution where the observation domain is (3/4, 1). Here the solution is almost constant (u(x) ≈ 1). From Figures 3 and 4, we see that the approximation is not good. In the left plot of Figure 3, we see that the boundary layer is not captured. However, the error does not propagate into the observation domain. For h-refinement, see Figure 3 (right) and Figure 4 (left), we observe that the approximation

Fig. 3. Partial observation on O = (3/4, 1) with p = 2 (both), ℓ = 4 (left), ℓ = 6 (right).

Fig. 4. Partial observation on O = (3/4, 1) with p = 2, ℓ = 8 (left) and p = 4, ℓ = 6 (right).

improves slowly. For p-refinement, see Figure 4 (right), the approximation improves significantly. Since we use splines of maximum smoothness, increasing the spline degree by one increases the number of degrees of freedom only by one, while each h-refinement step doubles the number of degrees of freedom.

Remark 6.1. The effect of increasing the spline degree compared to h-refinement, as shown in Figure 4, is somewhat surprising. We are not completely sure why larger spline degrees are so effective. Unfortunately, the error estimate in Theorem 5.5 does not provide any explanation for this behavior. Further analysis is needed to explain this properly.

7. Numerical experiments for exact and inexact preconditioners. In this section, we analyze the convergence of Krylov subspace solvers when the proposed preconditioner is used. In the first subsection, we consider an exact realization of the preconditioner. A multigrid approximation is then considered in the second subsection.

We have done the numerical experiments for two model domains, in both cases for d = 2. The first domain is a box domain; more precisely, Ω is the unit square, see Figure 5 (left), which is parameterized with the identity function. For this domain, the conditions of Theorem 4.1 are satisfied. In Model problem 7.1, we have full observation and in Model problem 7.2, the observation is restricted to the subdomain represented by the smaller area in Figure 5 (left). In all model problems, the desired state u_d is a step function with value u_d = 1 inside a circle and with value u_d = 0 outside the circle. The support of u_d is shown by the dashed lines in Figure 5.

Fig. 5. Computational domains Ω, partial observation domains O (dark blue), and support of the desired state (inside of the dashed circles).

Model problem 7.1 (unit square and constant convection with full observation). Let Ω := O := (0,1)^{2} be the computational domain, which is also the observation domain. The convection is β = (−2, 1) and there is no reaction (σ = 0) or source term (f = 0). The desired state is

u_d(x, y) = 1 if (x − 3/8)^{2} + (y − 5/8)^{2} ≤ 1/16, and u_d(x, y) = 0 otherwise.

The diffusion ε and the regularization parameter α will vary.

Model problem 7.2 (unit square and constant convection with limited observation). Let Ω = (0,1)^{2} be the computational domain and O = (1/4, 3/4)^{2} be the observation domain. The remainder of this problem is the same as for Model problem 7.1.

Furthermore, we consider a non-trivial geometry Ω, which is an approximation of a quarter annulus by means of a B-spline parameterization, see Figure 5 (right). Again, Model problem 7.3 is a problem with full observation, and the observation domain in Model problem 7.4 is the smaller area in Figure 5 (right). We observe that for this domain, the conditions of Theorem 4.1 are not satisfied.

Model problem 7.3 (quarter annulus and varying convection with full observation). Let Ω = O = G((0,1)^{2}) with G : (0,1)^{2} → R^{2},

(7.1)  G(x̂) = ( (1 + x̂_1)(1 − x̂_2)(1 + 2(√2 − 1) x̂_2),
               (1 + x̂_1) x̂_2 (2√2 − 1 − 2(√2 − 1) x̂_2) ),

be the computational domain, which is also the observation domain. The convection is β = (y, 1 + x^{2}) and there is no reaction (σ = 0) or source term (f = 0). The desired state is

u_d(x, y) = 1 if (x − x_0)^{2} + (y − y_0)^{2} ≤ 1/16, and u_d(x, y) = 0 otherwise,

where (x_0, y_0) = G(3/8, 5/8). The diffusion ε and the regularization parameter α will vary.
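The mapping (7.1) can be sanity-checked numerically: the four corners of the parameter domain should land on the endpoints of the inner (radius 1) and outer (radius 2) arcs of the quarter annulus. A small check of ours:

```python
import numpy as np

SQ2 = np.sqrt(2.0)

def G(x1, x2):
    # geometry mapping (7.1) for the quarter annulus
    return np.array([(1 + x1) * (1 - x2) * (1 + 2 * (SQ2 - 1) * x2),
                     (1 + x1) * x2 * (2 * SQ2 - 1 - 2 * (SQ2 - 1) * x2)])

# corners of (0,1)^2 map to the ends of the inner (radius 1) and outer (radius 2) arcs
assert np.allclose(G(0, 0), [1, 0]) and np.allclose(G(1, 0), [2, 0])
assert np.allclose(G(0, 1), [0, 1]) and np.allclose(G(1, 1), [0, 2])

# along x2 the arc only approximates a circle (hence "approximation of a quarter
# annulus"); at the parameter midpoint the image happens to lie on the circle exactly
print(np.linalg.norm(G(0, 0.5)))  # = 1 up to rounding
```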

Model problem 7.4 (quarter annulus and varying convection with limited observation). Let Ω = G((0,1)^{2}) be the computational domain and O = G((1/4, 3/4)^{2}) be the observation domain, where G is as in (7.1). The remainder of this problem is the same as for Model problem 7.3.

For all model problems, we consider a discretization of the optimality system using the spaces given in (4.2), as outlined in Section 4. The resulting linear system of equations

A_h x_h = b_h

is solved using the MINRES method, preconditioned with the proposed Schur complement preconditioner (3.3). We use a random initial guess x_{h,0}. The stopping criterion is

‖r_k‖ ≤ 10^{−8} ‖r_0‖,

where r_k := b_h − A_h x_{h,k} denotes the residual and ‖·‖ is the Euclidean norm.

7.1. Results with exact preconditioner. In this section, we present the results for the Schur complement preconditioner (3.3) when realized using a sparse Cholesky decomposition.

Table 1 shows the iteration numbers needed to reach the stopping criterion for full and partial observation (Model problems 7.1 and 7.2). In these tables, α and ε are varied, while p = 2 and ℓ = 6 are fixed. In Table 2, we set ε = 10^{−3} and vary the refinement level ℓ and α. From the tables, we observe that for the partial observation problem, we need a few more iterations for small values of α. This is probably because M_{O,h} is singular in the case of partial observation. The iteration numbers are relatively small for all considered values of α, ε and ℓ. This is predicted by the theory, as Model problems 7.1 and 7.2 satisfy the conditions of Theorem 4.1.

Table 1
Iteration numbers: Model problem 7.1 (left) and 7.2 (right), p = 2, ℓ = 6.

Model problem 7.1                        Model problem 7.2
ε\α      10^0   10^−3  10^−6  10^−9      ε\α      10^0   10^−3  10^−6  10^−9
10^0      12     26     60     72        10^0      12     20     57     78
10^−3     15     47     26     11        10^−3     15     41     54     19
10^−6     15     46     26     11        10^−6     14     41     53     19
10^−9     15     46     26     11        10^−9     14     41     53     19

Table 2
Iteration numbers: Model problem 7.1 (left) and 7.2 (right), p = 2, ε = 10^{−3}.

Model problem 7.1                        Model problem 7.2
ℓ\α      10^0   10^−3  10^−6  10^−9      ℓ\α      10^0   10^−3  10^−6  10^−9
4         15     46     14      7        4         15     46     35     15
5         15     47     19      8        5         15     43     44     17
6         15     47     26     11        6         15     41     54     19
7         15     46     38     11        7         15     39     55     22

Next, we consider Model problems 7.3 and 7.4, which are the problems where the computational domain is a quarter annulus. The iteration numbers are shown in Tables 3 and 4. Note that the conditions of Theorem 4.1 are not satisfied. Nevertheless, the iteration numbers are comparable with those of Tables 1 and 2.

Remark 7.5. For ε = 1, the iteration numbers in Table 1 (and Table 3) grow as α becomes smaller. This may appear strange since we have proven that the condition number (for Table 1) is less than 4.05. However, describing convergence estimates for Krylov subspace methods in terms of only the condition number can be misleading and/or insufficient, cf. [16]. The different iteration numbers for various values of ε and α can be explained by the distribution of the eigenvalues: for small iteration numbers, the eigenvalues are more clustered. For ε = 1, the iteration numbers start decreasing when α < 10^{−9}.