% trunk/doc/inversion/CostFunctions.tex, revision 6681
% Mon Jun 11 05:40:04 2018 UTC, by jfenwick
% Log: Deal with overfull boxes. Scaled some of the compound figures to 95%

1 \chapter{Cost Function}\label{chapter:ref:inversion cost function}
The cost function minimized in the inversion has the general form (see also Chapter~\ref{chapter:ref:Drivers})
3 \begin{equation}\label{REF:EQU:DRIVE:10}
4 J(m) = J^{reg}(m) + \sum_{f} \mu^{data}_{f} \cdot J^{f}(p^f)
5 \end{equation}
where $m$ represents the level set function, $J^{reg}$ is the regularization term (see Chapter~\ref{Chp:ref:regularization}),
and the $J^{f}$ are cost functions for the forward models (see Chapter~\ref{Chp:ref:forward models}), each depending on
physical parameters $p^f$. The physical parameters $p^f$ are known functions
of the level set function $m$, which is the unknown to be calculated by the optimization process.
The $\mu^{data}_{f}$ are trade-off factors. Note that the regularization term includes additional trade-off factors.
The \class{InversionCostFunction} class is used to define the cost function of an inversion.
It implements the \class{CostFunction} template class, see Chapter~\ref{chapter:ref:Minimization}.
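
The structure of equation~\ref{REF:EQU:DRIVE:10} can be illustrated with a small plain Python sketch. It does not use \escript; the function \verb|total_cost| and all numbers are hypothetical and only show how the trade-off factors weight the data misfit terms against the regularization term.

```python
def total_cost(j_reg, data_terms):
    """Return J = J_reg + sum_f mu_f * J_f.

    j_reg      -- value of the regularization term J^reg(m)
    data_terms -- list of (mu_f, J_f) pairs, one per forward model
    """
    return j_reg + sum(mu * j_f for mu, j_f in data_terms)


# Hypothetical values: one regularization term and two data misfit terms.
# The second misfit is small but is emphasized by its trade-off factor.
J = total_cost(0.5, [(1.0, 2.0), (10.0, 0.25)])
print(J)
```

Increasing a factor $\mu^{data}_{f}$ makes the optimizer favor fitting the data of forward model $f$ over keeping the regularization term small.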
13
In the simplest case there is a single forward model using a single physical parameter which is
derived from a single-valued level set function. The following script snippet shows the creation of the
cost function for a gravity inversion:
17 \begin{verbatim}
p=DensityMapping(...)
f=GravityModel(...)
J=InversionCostFunction(Regularization(...),
                        mappings=p,
                        forward_models=f)
23 \end{verbatim}
24 The argument \verb|...| refers to an appropriate argument list.
25
If two forward models with two different physical parameters are used,
the \member{mappings} and \member{forward_models} arguments are given as lists in the following form:
28 \begin{verbatim}
p_rho=DensityMapping(...)
p_k=SusceptibilityMapping(...)
f_mag=MagneticModel(...)
f_grav=GravityModel(...)

J=InversionCostFunction(Regularization(...),
                        mappings=[p_rho, p_k],
                        forward_models=[(f_mag, 1), (f_grav, 0)])
37 \end{verbatim}
Here we define a joint inversion of gravity and magnetic data. \member{forward_models} is given as a list of
tuples, each consisting of a forward model and an index which refers to the parameter in the \member{mappings} list to be used as input.
The magnetic forward model \member{f_mag} uses the second parameter (=\member{p_k}) in the \member{mappings} list,
while the gravity forward model \member{f_grav} uses the first parameter (=\member{p_rho}).
In this case the physical parameters are defined by a single-valued level set function. It is also possible
to link physical parameters to components of a level set function:
43 \begin{verbatim}
p_rho=DensityMapping(...)
p_k=SusceptibilityMapping(...)
f_mag=MagneticModel(...)
f_grav=GravityModel(...)

J=InversionCostFunction(Regularization(numLevelSets=2,...),
                        mappings=[(p_rho, 0), (p_k, 1)],
                        forward_models=[(f_mag, 1), (f_grav, 0)])
52 \end{verbatim}
53 The \member{mappings} argument is now a list of pairs where the first pair entry specifies the parameter mapping and
54 the second pair entry specifies the index of the component of the level set function to be used to evaluate the parameter.
55 In this case the level set function has two components, where the density mapping uses the first component of the level set function
56 while the susceptibility mapping uses the second component.
57
58 \section{\class{InversionCostFunction} API}\label{chapter:ref:inversion cost function:api}
59
60 The \class{InversionCostFunction} implements a \class{CostFunction} class used
61 to run optimization solvers, see Section~\ref{chapter:ref:Minimization: costfunction class}.
62 Its API is defined as follows:
63
64 \begin{classdesc}{InversionCostFunction}{regularization, mappings, forward_models}
Constructor for the inversion cost function. \member{regularization} sets the regularization to be used, see Chapter~\ref{Chp:ref:regularization}.
\member{mappings} is a list of pairs where each pair consists of a
physical parameter mapping (see Chapter~\ref{Chp:ref:mapping}) and an index which refers to the component of the level set function
defined by the \member{regularization} to be used to calculate the corresponding physical parameter.
If the level set function has a single component the index can be omitted.
If in addition there is a single physical parameter the mapping can be given instead of a list.
\member{forward_models} is a list of pairs where the first pair component is a
forward model (see Chapter~\ref{Chp:ref:forward models}) and the second pair
component is the index of the entry in the \member{mappings} list
which provides the physical parameter for the model.
If a single physical parameter is present the index can be omitted.
If in addition a single forward model is used this forward model can be
assigned to \member{forward_models} directly instead of a list.
78 The \member{regularization} and all \member{forward_models} must use the same
79 \class{ReferenceSystem}, see Section~\ref{sec:ref:reference systems}.
80 \end{classdesc}
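
The shorthand rules for \member{mappings} and \member{forward_models} can be summarized by the following plain Python sketch. It is not the \escript implementation: the helpers \verb|as_pairs| and \verb|wiring| are hypothetical, and strings stand in for mapping and forward model objects. It only shows how omitted indices default to zero and how each forward model is linked to a mapping and a level set component.

```python
def as_pairs(arg):
    """Expand the shorthand forms into a list of (object, index) pairs.

    A single object becomes a one-element list, and an entry given
    without an index gets index 0.
    """
    if not isinstance(arg, list):
        arg = [arg]
    return [entry if isinstance(entry, tuple) else (entry, 0) for entry in arg]


def wiring(mappings, forward_models):
    """Return (forward model, mapping, level set component) triples."""
    maps = as_pairs(mappings)
    result = []
    for model, mapping_index in as_pairs(forward_models):
        mapping, component = maps[mapping_index]
        result.append((model, mapping, component))
    return result


# Simplest case: one mapping, one forward model, all indices omitted.
print(wiring("p", "f"))
# Joint inversion with a two-component level set function.
print(wiring([("p_rho", 0), ("p_k", 1)], [("f_mag", 1), ("f_grav", 0)]))
```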
81
82 \begin{methoddesc}[InversionCostFunction]{getDomain}{}
83 returns the \escript domain of the inversion, see~\cite{ESCRIPT}.
84 \end{methoddesc}
85
86 \begin{methoddesc}[InversionCostFunction]{getNumTradeOffFactors}{}
87 returns the total number of trade-off factors.
88 The count includes the trade-off factors $\mu^{data}_{f}$ for the forward
89 models and (hidden) trade-off factors in the regularization term,
90 see Definition~\ref{REF:EQU:DRIVE:10}.
91 \end{methoddesc}
92
93 \begin{methoddesc}[InversionCostFunction]{getForwardModel}{\optional{idx=\None}}
returns the forward model with index \member{idx}.
If the cost function contains only one model, the argument \member{idx} can be omitted.
96 \end{methoddesc}
97
98 \begin{methoddesc}[InversionCostFunction]{getRegularization}{}
returns the regularization component of the cost function, see \class{Regularization} in Chapter~\ref{Chp:ref:regularization}.
100 \end{methoddesc}
101
102 \begin{methoddesc}[InversionCostFunction]{setTradeOffFactorsModels}{\optional{mu=\None}}
103 sets the trade-off factors $\mu^{data}_{f}$ for the forward model components.
104 If a single model is present \member{mu} must be a floating point number.
105 Otherwise \member{mu} must be a list of floating point numbers.
106 It is assumed that all numbers are positive.
107 The default value for all trade-off factors is one.
108 \end{methoddesc}
109
110 \begin{methoddesc}[InversionCostFunction]{getTradeOffFactorsModels}{}
111 returns the values of the trade-off factors $\mu^{data}_{f}$ for the forward model components.
112 \end{methoddesc}
113
114 \begin{methoddesc}[InversionCostFunction]{setTradeOffFactorsRegularization}{\optional{mu=\None}, \optional{mu_c=\None}}
115 sets the trade-off factors for the regularization component of the cost function.
116 \member{mu} defines the trade-off factors for the level-set variation part and
117 \member{mu_c} sets the trade-off factors for the cross-gradient variation part.
This method is a shortcut for calling \member{setTradeOffFactorsForVariation}
and \member{setTradeOffFactorsForCrossGradient} for the underlying
regularization.
121 Please see \class{Regularization} in Chapter~\ref{Chp:ref:regularization} for
122 more details on the arguments \member{mu} and \member{mu_c}.
123 \end{methoddesc}
124
125 \begin{methoddesc}[InversionCostFunction]{setTradeOffFactors}{\optional{mu=\None}}
126 sets the trade-off factors for the forward model and regularization terms.
127 \member{mu} is a list of positive floats. The length of the list is the total
128 number of trade-off factors given by the method \method{getNumTradeOffFactors}.
129 The first part of \member{mu} defines the trade-off factors $\mu^{data}_{f}$
130 for the forward model components while the remaining entries define the
131 trade-off factors for the regularization components of the cost function.
132 By default all values are set to one.
133 \end{methoddesc}
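
The layout of the combined list passed to \method{setTradeOffFactors} can be sketched in plain Python as follows. The helper \verb|split_trade_off_factors| is hypothetical and not part of \escript; it assumes that the number of forward models is known and only shows which entries belong to the forward models and which to the regularization.

```python
def split_trade_off_factors(mu, num_models):
    """Split a combined list into (model factors, regularization factors).

    The first num_models entries weight the forward models, the
    remaining entries belong to the regularization term.
    """
    if any(x <= 0 for x in mu):
        raise ValueError("trade-off factors must be positive")
    return list(mu[:num_models]), list(mu[num_models:])


# Hypothetical case: two forward models and one regularization factor,
# so the total number of trade-off factors is three.
mu_models, mu_reg = split_trade_off_factors([1.0, 5.0, 0.1], num_models=2)
print(mu_models, mu_reg)
```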
134
135 \begin{methoddesc}[InversionCostFunction]{getProperties}{m}
136 returns the physical properties from a given level set function \member{m}
137 using the mappings of the cost function. The physical properties are
138 returned in the order in which they are given in the \member{mappings} argument
139 in the class constructor.
140 \end{methoddesc}
141
142 \begin{methoddesc}[InversionCostFunction]{createLevelSetFunction}{*props}
returns the level set function corresponding to a set of given physical properties.
144 This method is the inverse of the \method{getProperties} method.
145 The arguments \member{props} define a tuple of values for the physical
146 properties where the order needs to correspond to the order in which the
147 physical property mappings are given in the \member{mappings} argument in the
148 class constructor. If a value for a physical property is given as \None the
149 corresponding component of the returned level set function is set to zero.
150 If no physical properties are given all components of the level set function
151 are set to zero.
152 \end{methoddesc}
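
The relation between \method{getProperties} and \method{createLevelSetFunction} can be sketched with a toy model in plain Python. Everything here is an assumption made for illustration: the class \verb|LinearMapping| with $p = p_0 + \Delta p \cdot m$, the helper functions, and the use of plain floats in place of \escript data objects.

```python
class LinearMapping:
    """Hypothetical mapping p = p0 + dp * m with inverse m = (p - p0) / dp."""

    def __init__(self, p0, dp):
        self.p0 = p0
        self.dp = dp

    def get_value(self, m):
        return self.p0 + self.dp * m

    def get_inverse(self, p):
        return (p - self.p0) / self.dp


def get_properties(mappings, m):
    """Evaluate every mapping on its level set component."""
    return [mapping.get_value(m[k]) for mapping, k in mappings]


def create_level_set(mappings, num_components, *props):
    """Invert the mappings; a property given as None leaves its component zero."""
    m = [0.0] * num_components
    for (mapping, k), p in zip(mappings, props):
        if p is not None:
            m[k] = mapping.get_inverse(p)
    return m


# Two mappings, each tied to one component of the level set function.
mappings = [(LinearMapping(0.0, 100.0), 0), (LinearMapping(0.0, 0.5), 1)]
m = create_level_set(mappings, 2, 50.0, None)
print(m, get_properties(mappings, m))
```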
153
154 \begin{methoddesc}[InversionCostFunction]{getNorm}{m}
155 returns the norm of a level set function \member{m} as a floating point number.
156 \end{methoddesc}
157
158 \begin{methoddesc}[InversionCostFunction]{getArguments}{m}
159 returns pre-computed values for the evaluation of the cost function and its
160 gradient for a given value \member{m} of the level set function.
161 In essence the method collects pre-computed values for the underlying
162 regularization and forward models\footnote{Using pre-computed values can
163 significantly speed up the optimization process when the value of the cost
164 function and its gradient are needed for the same level set function.}.
165 \end{methoddesc}
166
167 \begin{methoddesc}[InversionCostFunction]{getValue}{m\optional{, *args}}
168 returns the value of the cost function for a given level set function \member{m}
169 and corresponding pre-computed values \member{args}.
170 If the pre-computed values are not supplied \member{getArguments} is called.
171 \end{methoddesc}
172
173 \begin{methoddesc}[InversionCostFunction]{getGradient}{m\optional{, *args}}
174 returns the gradient of the cost function at level set function \member{m}
175 using the corresponding pre-computed values \member{args}.
176 If the pre-computed values are not supplied \member{getArguments} is called.
177 The gradient is represented as a tuple $(Y,X)$ where in essence $Y$ represents
178 the derivative of the cost function kernel with respect to the level set
179 function and $X$ represents the derivative of the cost function kernel with
180 respect to the gradient of the level set function, see
181 Section~\ref{chapter:ref:inversion cost function:gradient} for more details.
182 \end{methoddesc}
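
The calling pattern of \method{getArguments}, \method{getValue} and \method{getGradient} can be sketched with a toy class in plain Python. The class \verb|ToyCostFunction| is hypothetical: it mimics only the method names, uses a scalar $m$, and returns the gradient as a single float rather than the tuple $(Y,X)$. The counter shows that the shared, potentially expensive, state evaluation is done once when the pre-computed values are passed on.

```python
class ToyCostFunction:
    """J(m) = 0.5 * (u(m) - d)**2 with the 'expensive' state u(m) = m**3."""

    def __init__(self, d):
        self.d = d
        self.state_evaluations = 0  # counts how often u(m) is computed

    def getArguments(self, m):
        self.state_evaluations += 1
        u = m ** 3
        return (u,)

    def getValue(self, m, *args):
        if not args:  # no pre-computed values supplied
            args = self.getArguments(m)
        (u,) = args
        return 0.5 * (u - self.d) ** 2

    def getGradient(self, m, *args):
        if not args:
            args = self.getArguments(m)
        (u,) = args
        return (u - self.d) * 3 * m ** 2  # chain rule: dJ/du * du/dm


J = ToyCostFunction(d=1.0)
m = 2.0
args = J.getArguments(m)        # state computed once ...
value = J.getValue(m, *args)    # ... and reused for the value
grad = J.getGradient(m, *args)  # ... and for the gradient
print(value, grad, J.state_evaluations)
```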
183
184 \begin{methoddesc}[InversionCostFunction]{getDualProduct}{m, g}
185 returns the dual product of a level set function \member{m} with a gradient
186 \member{g}, see Section~\ref{chapter:ref:inversion cost function:gradient} for more details.
187 This method uses the dual product of the regularization.
188 \end{methoddesc}
189
190 \begin{methoddesc}[InversionCostFunction]{getInverseHessianApproximation}{m, g \optional{, *args}}
returns an approximate evaluation of the inverse of the Hessian operator of
192 the cost function for a given gradient \member{g} at a given level set function
193 \member{m} using the corresponding pre-computed values \member{args}.
194 If no pre-computed values are present \member{getArguments} is called.
195 In the current implementation contributions to the Hessian operator from the
196 forward models are ignored and only contributions from the regularization and
197 cross-gradient term are used.
198 \end{methoddesc}
199
200
201 \section{Gradient calculation}\label{chapter:ref:inversion cost function:gradient}
202 In this section we briefly discuss the calculation of the gradient and the Hessian operator.
203 If $\nabla$ denotes the gradient operator (with respect to the level set function $m$)
204 the gradient of $J$ is given as
205 \begin{equation}\label{REF:EQU:DRIVE:10b}
206 \nabla J(m) = \nabla J^{reg}(m) + \sum_{f} \mu^{data}_{f} \cdot \nabla J^{f}(p^f) \; .
207 \end{equation}
208 We first focus on the calculation of $\nabla J^{reg}$. In fact the
209 regularization cost function $J^{reg}$ is given through a cost function
210 kernel\index{cost function!kernel} $K^{reg}$ in the form
211 \begin{equation}\label{REF:EQU:INTRO 2a}
212 J^{reg}(m) = \int_{\Omega} K^{reg} \; dx
213 \end{equation}
214 where $K^{reg}$ is a given function of the
215 level set function $m_k$ and its spatial derivative $m_{k,i}$. If $n$ is an increment to the level set function
then the directional derivative of $J^{reg}$ in the direction of $n$ is given as
217 \begin{equation}\label{REF:EQU:INTRO 2aa}
218 <n, \nabla J^{reg}(m)> = \int_{\Omega} \frac{ \partial K^{reg}}{\partial m_k} n_k + \frac{ \partial K^{reg}}{\partial m_{k,i}} n_{k,i} \; dx
219 \end{equation}
220 where $<.,.>$ denotes the dual product, see Chapter~\ref{chapter:ref:Minimization}. Consequently, the gradient $\nabla J^{reg}$
221 can be represented by a pair of values $Y$ and $X$
222 \begin{equation}\label{ref:EQU:CS:101}
223 \begin{array}{rcl}
224 Y_k & = & \displaystyle{\frac{\partial K^{reg}}{\partial m_k}} \\
225 X_{ki} & = & \displaystyle{\frac{\partial K^{reg}}{\partial m_{k,i}}}
226 \end{array}
227 \end{equation}
228 while the dual product $<.,.>$ of a level set increment $n$ and a gradient increment $g=(Y,X)$ is given as
229 \begin{equation}\label{REF:EQU:INTRO 2aaa}
230 <n,g> = \int_{\Omega} Y_k n_k + X_{ki} n_{k,i} \; dx
231 \end{equation}
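As an illustration, and purely as an assumed example rather than the kernel actually used by the \class{Regularization} class, consider a single-component level set function $m$ with
\begin{equation}
K^{reg} = \frac{1}{2} \left( \omega \, m^2 + m_{,i} m_{,i} \right)
\end{equation}
where $\omega \ge 0$ is a hypothetical weighting constant. Equation~\ref{ref:EQU:CS:101} then gives
\begin{equation}
Y = \omega \, m \; , \qquad X_{i} = m_{,i}
\end{equation}
and the directional derivative in the direction of an increment $n$ becomes
\begin{equation}
<n, \nabla J^{reg}(m)> = \int_{\Omega} \omega \, m \, n + m_{,i} n_{,i} \; dx \; .
\end{equation}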
232 We also need to provide (an approximation of) the value $p$ of the inverse of the Hessian operator $\nabla \nabla J$
for a given gradient increment $g=(Y,X)$. This means we need to (approximately) solve the variational problem
234 \begin{equation}\label{REF:EQU:INTRO 2b}
235 <n,\nabla \nabla J p > = \int_{\Omega} Y_k n_k + X_{ki} n_{k,i} \; dx
236 \end{equation}
237 for all increments $n$ of the level set function. If we ignore contributions
238 from the forward models the left hand side takes the form
239 \begin{equation}\label{REF:EQU:INTRO 2c}
240 <n,\nabla \nabla J^{reg} p > = \int_{\Omega}
241 \displaystyle{\frac{\partial Y_k}{\partial m_l}} p_l n_k +
242 \displaystyle{\frac{\partial Y_k}{\partial m_{l,j}}} p_{l,j} n_k +
243 \displaystyle{\frac{\partial X_{ki}}{\partial m_l}} p_l n_{k,i} +
244 \displaystyle{\frac{\partial X_{ki}}{\partial m_{l,j}}} p_{l,j} n_{k,i}
245 \; dx
\end{equation}
We follow the concept as outlined in Section~\ref{chapter:ref:inversion cost function:gradient}.
Notice that equation~\ref{REF:EQU:INTRO 2b} defines a system of linear PDEs
which is solved using the \escript \class{LinearPDE} class. In the \escript notation we need to provide
249 \begin{equation}\label{ref:EQU:REG:600}
250 \begin{array}{rcl}
251 A_{kilj} & = & \displaystyle{\frac{\partial X_{ki}}{\partial m_{l,j}}} \\
252 B_{kil} & = & \displaystyle{\frac{\partial X_{ki}}{\partial m_l}} \\
253 C_{klj} & = & \displaystyle{\frac{\partial Y_k}{\partial m_{l,j}}} \\
254 D_{kl} & = & \displaystyle{\frac{\partial Y_k}{\partial m_l}} \\
255 \end{array}
256 \end{equation}
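For example, assume (hypothetically, not as the kernel used by \class{Regularization}) a single-component level set function with the kernel $K^{reg} = \frac{1}{2} ( \omega \, m^2 + m_{,i} m_{,i} )$ and a constant $\omega \ge 0$, so that $Y = \omega \, m$ and $X_i = m_{,i}$. The coefficients in equation~\ref{ref:EQU:REG:600} then reduce to
\begin{equation}
A_{ij} = \delta_{ij} \; , \qquad B_{i} = 0 \; , \qquad C_{j} = 0 \; , \qquad D = \omega
\end{equation}
and the variational problem for $p$ with a given gradient increment $g=(Y,X)$ reads
\begin{equation}
\int_{\Omega} p_{,i} n_{,i} + \omega \, p \, n \; dx = \int_{\Omega} Y n + X_{i} n_{,i} \; dx
\end{equation}
for all increments $n$.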
257 The calculation of the gradient of the forward model component is more complicated:
258 the data defect $J^{f}$ for forward model $f$ is expressed using a cost function kernel $K^{f}$
259 \begin{equation}\label{REF:EQU:INTRO 2bb}
260 J^{f}(p^f) = \int_{\Omega} K^{f} \; dx
261 \end{equation}
262 In this case the cost function kernel $K^{f}$ is a function of the
263 physical parameter $p^f$, which again is a function of the level-set function,
264 and the state variable $u^f_{k}$ and its gradient $u^f_{k,i}$. For the sake of a simpler
265 presentation the upper index $f$ is dropped.
266
The gradient $\nabla_{p} J$ of $J$ with respect to
the physical property $p$ is given as
269 \begin{equation}\label{REF:EQU:costfunction 100b}
270 <q, \nabla_{p} J(p)> = \int_{\Omega}
271 \displaystyle{\frac{\partial K }{\partial u_k } } \displaystyle{\frac{\partial u_k }{\partial q } } +
272 \displaystyle{\frac{\partial K }{\partial u_{k,i} } } \left( \displaystyle{\frac{\partial u_k }{\partial q } } \right)_{,i}+
273 \displaystyle{\frac{\partial K }{\partial p } } q \; dx
274 \end{equation}
for any $q$ as an increment to the physical parameter $p$. If the change
of the state variable
$u$ for physical parameter $p$ in the direction of $q$ is denoted as
278 \begin{equation}\label{REF:EQU:costfunction 100c}
279 d_k =\displaystyle{\frac{\partial u_k }{\partial q } }
280 \end{equation}
281 equation~\ref{REF:EQU:costfunction 100b} can be written as
282 \begin{equation}\label{REF:EQU:costfunction 100d}
283 <q, \nabla_{p} J(p)> = \int_{\Omega}
284 \displaystyle{\frac{\partial K }{\partial u_k } } d_k +
285 \displaystyle{\frac{\partial K }{\partial u_{k,i} } } d_{k,i}+
286 \displaystyle{\frac{\partial K }{\partial p } } q \; dx
287 \end{equation}
The state variables are the solution of a PDE which in variational form is given as
\begin{equation}\label{REF:EQU:costfunction 100}
\int_{\Omega} F_k \cdot r_k + G_{ki} \cdot r_{k,i} \; dx = 0
\end{equation}
for all increments $r$ to the state $u$. The functions $F$ and $G$ are given and describe the physical
model. They depend on the state variable $u_{k}$, its gradient $u_{k,i}$ and the physical parameter $p$. The change
$d_k$ of the state
$u$ for physical parameter $p$ in the direction of $q$ is given by the equation
296 \begin{equation}\label{REF:EQU:costfunction 100bb}
297 \int_{\Omega}
298 \displaystyle{\frac{\partial F_k }{\partial u_l } } d_l r_k +
299 \displaystyle{\frac{\partial F_k }{\partial u_{l,j}} } d_{l,j} r_k +
300 \displaystyle{\frac{\partial F_k }{\partial p} }q r_k +
301 \displaystyle{\frac{\partial G_{ki}}{\partial u_l} } d_l r_{k,i} +
302 \displaystyle{\frac{\partial G_{ki}}{\partial u_{l,j}} } d_{l,j} r_{k,i}+
303 \displaystyle{\frac{\partial G_{ki}}{\partial p} } q r_{k,i}
304 \; dx = 0
305 \end{equation}
306 to be fulfilled for all functions $r$. Now let $d^*_k$ be the solution of the
307 variational equation
308 \begin{equation}\label{REF:EQU:costfunction 100dd}
309 \int_{\Omega}
310 \displaystyle{\frac{\partial F_k }{\partial u_l } } h_l d^*_k +
311 \displaystyle{\frac{\partial F_k }{\partial u_{l,j}} } h_{l,j} d^*_k +
312 \displaystyle{\frac{\partial G_{ki}}{\partial u_l} } h_l d^*_{k,i} +
313 \displaystyle{\frac{\partial G_{ki}}{\partial u_{l,j}} } h_{l,j} d^*_{k,i}
314 \; dx
315 = \int_{\Omega}
316 \displaystyle{\frac{\partial K }{\partial u_k } } h_k +
317 \displaystyle{\frac{\partial K }{\partial u_{k,i} } } h_{k,i} \; dx
318 \end{equation}
for all increments $h_k$ to the state $u$. This problem
is solved using the \escript \class{LinearPDE} class. In the \escript notation we need to provide
321 \begin{equation}\label{ref:EQU:REG:600b}
322 \begin{array}{rcl}
323 A_{kilj} & = & \displaystyle{\frac{\partial G_{lj}}{\partial u_{k,i}} } \\
324 B_{kil} & = & \displaystyle{\frac{\partial F_l }{\partial u_{k,i}} } \\
325 C_{klj} & = & \displaystyle{\frac{\partial G_{lj}}{\partial u_k} } \\
326 D_{kl} & = & \displaystyle{\frac{\partial F_l }{\partial u_k } } \\
327 Y_{k} & = & \displaystyle{\frac{\partial K }{\partial u_k } } \\
328 X_{ki} & = & \displaystyle{\frac{\partial K }{\partial u_{k,i} } } \\
329 \end{array}
330 \end{equation}
Notice that these coefficients are the transpose of the coefficients used to solve for the
state variables in equation~\ref{REF:EQU:costfunction 100}.
333
Setting $h_l=d_l$ in equation~\ref{REF:EQU:costfunction 100dd} and
$r_k=d^*_k$ in equation~\ref{REF:EQU:costfunction 100bb} one gets
336 \begin{equation}\label{ref:EQU:costfunction:601}
337 \int_{\Omega}
338 \displaystyle{\frac{\partial K }{\partial u_k } } d_k +
339 \displaystyle{\frac{\partial K }{\partial u_{k,i} } } d_{k,i}+
340 \displaystyle{\frac{\partial F_k }{\partial p} } q d^*_k +
341 \displaystyle{\frac{\partial G_{ki}}{\partial p} } q d^*_{k,i}
342 \; dx = 0
343 \end{equation}
344 which is inserted into equation~\ref{REF:EQU:costfunction 100d} to get
345 \begin{equation}\label{REF:EQU:costfunction 602}
346 <q, \nabla_{p} J(p)> = \int_{\Omega} \left(
347 \displaystyle{\frac{\partial K }{\partial p } } - \displaystyle{\frac{\partial F_k }{\partial p} } d^*_k
348 - \displaystyle{\frac{\partial G_{ki}}{\partial p} } d^*_{k,i} \right) q \; dx
349 \end{equation}
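
The adjoint construction above can be checked on a small discrete analogue in plain Python. This is a sketch under stated assumptions, not \escript code: the state equation is replaced by a hypothetical $2 \times 2$ linear system $R(u,p) = A(p) u - b = 0$, the kernel by $K = \frac{1}{2} \sum_k (u_k - u^{obs}_k)^2$, and all names and numbers are made up. The adjoint state solves the transposed system with right hand side $\partial K / \partial u$, and the gradient is $\partial K / \partial p - d^* \cdot \partial R / \partial p$, mirroring the signs in equation~\ref{REF:EQU:costfunction 602}.

```python
def solve2(a, b):
    """Solve the 2x2 linear system a x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - b[0] * a[1][0]) / det
    return [x0, x1]


def system_matrix(p):
    """Hypothetical non-symmetric state operator A(p)."""
    return [[2.0 + p, 1.0], [0.0, 3.0]]


B = [1.0, 2.0]      # right hand side of the state equation
U_OBS = [0.5, 0.5]  # made-up observations


def solve_state(p):
    return solve2(system_matrix(p), B)


def cost(p):
    u = solve_state(p)
    return 0.5 * sum((ui - oi) ** 2 for ui, oi in zip(u, U_OBS))


def adjoint_gradient(p):
    a = system_matrix(p)
    u = solve_state(p)
    dK_du = [ui - oi for ui, oi in zip(u, U_OBS)]
    # Adjoint problem: transposed operator, right hand side dK/du.
    a_t = [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]
    d_star = solve2(a_t, dK_du)
    # dR/dp = (dA/dp) u, and only the entry A[0][0] depends on p.
    dR_dp = [u[0], 0.0]
    dK_dp = 0.0  # the kernel does not depend on p explicitly
    return dK_dp - sum(ds * r for ds, r in zip(d_star, dR_dp))


p = 1.0
h = 1e-6
finite_difference = (cost(p + h) - cost(p - h)) / (2 * h)
print(adjoint_gradient(p), finite_difference)
```

The adjoint gradient needs one state solve and one adjoint solve regardless of the number of parameters, whereas the finite difference check needs two additional state solves per parameter.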
350 We need in fact the gradient of $J^f$ with respect to the level set function which is given as
351 \begin{equation}\label{REF:EQU:costfunction 603}
352 <n, \nabla J^f> = \int_{\Omega} \left(
353 \displaystyle{\frac{\partial K^f}{\partial p^f} } - \displaystyle{\frac{\partial F^f_k }{\partial p^f} } d^{f*}_k
354 - \displaystyle{\frac{\partial G^f_{ki}}{\partial p^f} } d^{f*}_{k,i} \right)
355 \cdot \displaystyle{\frac{\partial p^f }{\partial m_l} } n_l \; dx
356 \end{equation}
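As a simple illustration, assume a linear mapping $p^f = p_0 + \Delta p \cdot m_l$ for one fixed component $l$ of the level set function, with a reference value $p_0$ and a scale $\Delta p$ (both hypothetical here; the mappings described in Chapter~\ref{Chp:ref:mapping} may differ). Then $\frac{\partial p^f}{\partial m_l} = \Delta p$ and equation~\ref{REF:EQU:costfunction 603} reduces to
\begin{equation}
<n, \nabla J^f> = \int_{\Omega} \Delta p \left(
\displaystyle{\frac{\partial K^f}{\partial p^f} } - \displaystyle{\frac{\partial F^f_k }{\partial p^f} } d^{f*}_k
- \displaystyle{\frac{\partial G^f_{ki}}{\partial p^f} } d^{f*}_{k,i} \right) n_l \; dx
\end{equation}
where no summation over the fixed index $l$ is applied.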
357 for any increment $n$ to the level set function. So in summary we get
358 \begin{equation}\label{ref:EQU:CS:101b}
359 \begin{array}{rcl}
360 Y_k & = & \displaystyle{\frac{\partial K^{reg}}{\partial m_k}} +
361 \sum_{f} \mu^{data}_{f} \left(
362 \displaystyle{\frac{\partial K^f}{\partial p^f} } - \displaystyle{\frac{\partial F^f_l }{\partial p^f} } d^{f*}_l
363 - \displaystyle{\frac{\partial G^f_{li}}{\partial p^f} } d^{f*}_{l,i} \right)
\cdot \displaystyle{\frac{\partial p^f }{\partial m_k} }
\\
367 X_{ki} & = & \displaystyle{\frac{\partial K^{reg}}{\partial m_{k,i}}}
368 \end{array}
369 \end{equation}
370 to represent $\nabla J$ as the tuple $(Y,X)$. Contributions of the forward model to the
371 Hessian operator are ignored.
372
