\chapter{Cost Function}\label{chapter:ref:inversion cost function}
The general form of the cost function minimized in the inversion is (see also Chapter~\ref{chapter:ref:Drivers})
\begin{equation}\label{REF:EQU:DRIVE:10}
J(m) = J^{reg}(m) + \sum_{f} \mu^{data}_{f} \cdot J^{f}(p^f)
\end{equation}
where $m$ represents the level set function, $J^{reg}$ is the regularization term, see Chapter~\ref{Chp:ref:regularization},
and the $J^{f}$ are a set of forward problems, see Chapter~\ref{Chp:ref:forward models}, depending on
physical parameters $p^f$. The physical parameters $p^f$ are known functions
of the level set function $m$, which is the unknown to be calculated by the optimization process.
$\mu^{data}_{f}$ are trade-off factors. Note that the regularization term includes additional trade-off factors.
The \class{InversionCostFunction} class defines the cost function of an inversion. It
implements the \class{CostFunction} template class, see Chapter~\ref{chapter:ref:Minimization}.

In the simplest case there is a single forward model using a single physical parameter which is
derived from a single-valued level set function. The following script snippet shows the creation of the
\class{InversionCostFunction} for the case of a gravity inversion:
\begin{verbatim}
p=DensityMapping(...)
f=GravityModel(...)
J=InversionCostFunction(Regularization(...), \
                        mappings=p, \
                        forward_models=f)
\end{verbatim}
The argument \verb|...| refers to an appropriate argument list.
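
Once the cost function has been set up it is handed to one of the solvers described in
Chapter~\ref{chapter:ref:Minimization}. The following lines give a minimal sketch of this step,
assuming the L-BFGS solver is used and \member{m0} is a suitable initial level set function
(for instance created with \method{createLevelSetFunction}):
\begin{verbatim}
solver=MinimizerLBFGS()
solver.setCostFunction(J)
solver.run(m0)
m=solver.getResult()
\end{verbatim}
The physical parameter, here the density, can then be recovered from the result \member{m} via
\method{getProperties}.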

If two forward models come into play, using two different physical parameters,
\member{mappings} and \member{forward_models} are defined as lists in the following form:
\begin{verbatim}
p_rho=DensityMapping(...)
p_k=SusceptibilityMapping(...)
f_mag=MagneticModel(...)
f_grav=GravityModel(...)

J=InversionCostFunction(Regularization(...), \
                        mappings=[p_rho, p_k], \
                        forward_models=[(f_mag, 1), (f_grav, 0)])
\end{verbatim}
Here we define a joint inversion of gravity and magnetic data. \member{forward_models} is given as a list of
tuples of a forward model and an index referring to the parameter in the \member{mappings} list to be used as its input.
The magnetic forward model \member{f_mag} uses the second parameter (=\member{p_k}) in the \member{mappings} list.
In this case the physical parameters are defined by a single-valued level set function. It is also possible
to link physical parameters to components of a level set function:
\begin{verbatim}
p_rho=DensityMapping(...)
p_k=SusceptibilityMapping(...)
f_mag=MagneticModel(...)
f_grav=GravityModel(...)

J=InversionCostFunction(Regularization(numLevelSets=2, ...), \
                        mappings=[(p_rho, 0), (p_k, 1)], \
                        forward_models=[(f_mag, 1), (f_grav, 0)])
\end{verbatim}
The \member{mappings} argument is now a list of pairs where the first pair entry specifies the parameter mapping and
the second pair entry specifies the index of the component of the level set function to be used to evaluate the parameter.
In this case the level set function has two components, where the density mapping uses the first component of the level set function
while the susceptibility mapping uses the second component.

\section{\class{InversionCostFunction} API}\label{chapter:ref:inversion cost function:api}

The \class{InversionCostFunction} implements a \class{CostFunction} class used to run optimization solvers,
see Section~\ref{chapter:ref:Minimization: costfunction class}. Its API is defined as follows:

\begin{classdesc}{InversionCostFunction}{regularization, mappings, forward_models}
Constructor for the inversion cost function. \member{regularization} sets the regularization to be used, see Chapter~\ref{Chp:ref:regularization}.
\member{mappings} is a list of pairs where each pair comprises a
physical parameter mapping (see Chapter~\ref{Chp:ref:mapping}) and an index which refers to the component of the level set function
defined by the \member{regularization} to be used to calculate the corresponding physical parameter. If
the level set function has a single component the index can be omitted. If in addition there is a single physical parameter
the mapping can be given instead of a list. \member{forward_models} is a list of pairs where the
first pair component is a forward model (see Chapter~\ref{Chp:ref:forward models}) and the second
pair component is an index referring to the physical parameter in the \member{mappings} list providing the physical parameter for the model.
If a single physical parameter is present the index can be omitted. If in addition a single forward model is used this
forward model can be assigned to \member{forward_models} in place of a list.
\end{classdesc}


\begin{methoddesc}[InversionCostFunction]{getDomain}{}
returns the \escript domain of the inversion, see~\cite{ESCRIPT}.
\end{methoddesc}

\begin{methoddesc}[InversionCostFunction]{getNumTradeOffFactors}{}
returns the total number of trade-off factors. The count includes the trade-off factors $\mu^{data}_{f}$ for
the forward models and the (hidden) trade-off factors in the regularization term, see equation~\ref{REF:EQU:DRIVE:10}.
\end{methoddesc}

\begin{methoddesc}[InversionCostFunction]{getForwardModel}{\optional{idx=\None}}
returns the forward model with index \member{idx}. If the cost function contains
one model only, the argument \member{idx} can be omitted.
\end{methoddesc}

\begin{methoddesc}[InversionCostFunction]{getRegularization}{}
returns the regularization component of the cost function, see \class{Regularization} in Chapter~\ref{Chp:ref:regularization}.
\end{methoddesc}


\begin{methoddesc}[InversionCostFunction]{setTradeOffFactorsModels}{\optional{mu=\None}}
sets the trade-off factors $\mu^{data}_{f}$ for the forward model components.
If a single model is present \member{mu} must be a floating point number. Otherwise
\member{mu} must be a list of floating point numbers. It is assumed that all
numbers are positive. The default value for all trade-off factors is one.
\end{methoddesc}

\begin{methoddesc}[InversionCostFunction]{getTradeOffFactorsModels}{}
returns the values of the trade-off factors $\mu^{data}_{f}$ for the forward model components.
\end{methoddesc}
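
For example, in the joint inversion set up above, the magnetic data misfit could be weighted more
strongly than the gravity misfit. The values below are purely illustrative:
\begin{verbatim}
J.setTradeOffFactorsModels([2., 0.5])
print(J.getTradeOffFactorsModels())
\end{verbatim}
The first entry corresponds to \member{f_mag} and the second to \member{f_grav}, following the order
of the \member{forward_models} list.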

\begin{methoddesc}[InversionCostFunction]{setTradeOffFactorsRegularization}{\optional{mu=\None}, \optional{mu_c=\None}}
sets the trade-off factors for the regularization component of the cost function.
\member{mu} defines the trade-off factors for the level-set variation part and
\member{mu_c} sets the trade-off factors for the cross-gradient variation part.
This method is a short-cut for calling \member{setTradeOffFactorsForVariation} and
\member{setTradeOffFactorsForCrossGradient} on the underlying regularization.
Please see \class{Regularization} in Chapter~\ref{Chp:ref:regularization} for more details
on the arguments \member{mu} and \member{mu_c}.
\end{methoddesc}

\begin{methoddesc}[InversionCostFunction]{setTradeOffFactors}{\optional{mu=\None}}
sets the trade-off factors for the forward model and regularization
terms. \member{mu} is a list of positive floats. The length of the list
is the total number of trade-off factors given by the method \method{getNumTradeOffFactors}. The
first part of \member{mu} defines the trade-off factors $\mu^{data}_{f}$ for the forward model components
while the remaining entries define the trade-off factors for the regularization components of the
cost function. By default all values are set to one.
\end{methoddesc}
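
As an illustration, the following sketch resets all trade-off factors of the cost function to their
default value of one:
\begin{verbatim}
mu=[1.]*J.getNumTradeOffFactors()
J.setTradeOffFactors(mu)
\end{verbatim}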

\begin{methoddesc}[InversionCostFunction]{getProperties}{m}
returns the physical properties from a given level set function \member{m}
using the mappings of the cost function. The physical properties are
returned in the order in which they are given in the \member{mappings} argument
of the class constructor.
\end{methoddesc}

\begin{methoddesc}[InversionCostFunction]{createLevelSetFunction}{*props}
returns the level set function corresponding to a set of given physical properties.
This method is the inverse of the \method{getProperties} method.
The arguments \member{props} define a tuple of values for the physical properties
where the order needs to correspond to the order in which the physical property mappings
are given in the \member{mappings} argument of the class constructor. If a value for a physical property
is given as \None the corresponding component of the returned level set function is set to zero.
If no physical properties are given all components of the level set function are set to zero.
\end{methoddesc}
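
For the joint inversion set up above, an initial level set function can be constructed from reference
property values. In the sketch below \member{rho0} and \member{k0} are hypothetical reference values for
density and susceptibility:
\begin{verbatim}
m0=J.createLevelSetFunction(rho0, k0)
rho, k=J.getProperties(m0)
\end{verbatim}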

\begin{methoddesc}[InversionCostFunction]{getNorm}{m}
returns the norm of a level set function \member{m} as a floating point number.
\end{methoddesc}

\begin{methoddesc}[InversionCostFunction]{getArguments}{m}
returns pre-computed values for the evaluation of the
cost function and its gradient for a given value \member{m}
of the level set function. In essence the method collects
pre-computed values for the underlying regularization and forward models\footnote{Using pre-computed
values can significantly speed up the optimization process when the value
of the cost function and its gradient for the same level set function
are needed.}.
\end{methoddesc}

\begin{methoddesc}[InversionCostFunction]{getValue}{m \optional{, *args}}
returns the value of the cost function for a given level set function \member{m}
and corresponding pre-computed values \member{args}. If no pre-computed values are present
\member{getArguments} is called.
\end{methoddesc}

\begin{methoddesc}[InversionCostFunction]{getGradient}{m \optional{, *args}}
returns the gradient of the cost function at level set function \member{m}
using the corresponding pre-computed values \member{args}. If no pre-computed values are present
\member{getArguments} is called. The gradient
is represented as a tuple $(Y,X)$ where in essence
$Y$ represents the derivative of the cost function kernel with respect to the
level set function and $X$ represents the derivative of the cost function kernel with respect to the gradient of the
level set function, see Section~\ref{chapter:ref:inversion cost function:gradient} for more details.
\end{methoddesc}
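
The following sketch illustrates how a solver evaluates the cost function and its gradient for the
same level set function \member{m}, re-using the pre-computed values returned by \method{getArguments}:
\begin{verbatim}
args=J.getArguments(m)
v=J.getValue(m, *args)
Y, X=J.getGradient(m, *args)
\end{verbatim}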

\begin{methoddesc}[InversionCostFunction]{getDualProduct}{m, g}
returns the dual product of a level set function \member{m}
with a gradient \member{g}, see Section~\ref{chapter:ref:inversion cost function:gradient} for more details.
This method uses the dual product of the regularization.
\end{methoddesc}

\begin{methoddesc}[InversionCostFunction]{getInverseHessianApproximation}{m, g \optional{, *args}}
returns an approximative evaluation of the inverse of the Hessian operator of the cost function
for a given gradient \member{g} at a given level set function \member{m}
using the corresponding pre-computed values \member{args}. If no pre-computed values are present
\member{getArguments} is called. In the current implementation
contributions to the Hessian operator from the forward models are ignored and only contributions
from the regularization and cross-gradient terms are considered.
\end{methoddesc}


\section{Gradient calculation}\label{chapter:ref:inversion cost function:gradient}
In this section we briefly discuss the calculation of the gradient and the Hessian operator.
If $\nabla$ denotes the gradient operator (with respect to the level set function $m$)
the gradient of $J$ is given as
\begin{equation}\label{REF:EQU:DRIVE:10b}
\nabla J(m) = \nabla J^{reg}(m) + \sum_{f} \mu^{data}_{f} \cdot \nabla J^{f}(p^f) \; .
\end{equation}
We first focus on the calculation of $\nabla J^{reg}$. In fact the
regularization cost function $J^{reg}$ is given through a cost function
kernel\index{cost function!kernel} $K^{reg}$ in the form
\begin{equation}\label{REF:EQU:INTRO 2a}
J^{reg}(m) = \int_{\Omega} K^{reg} \; dx
\end{equation}
where $K^{reg}$ is a given function of the
level set function $m_k$ and its spatial derivative $m_{k,i}$. If $n$ is an increment to the level set function
then the directional derivative of $J^{reg}$ in the direction of $n$ is given as
\begin{equation}\label{REF:EQU:INTRO 2aa}
<n, \nabla J^{reg}(m)> = \int_{\Omega} \frac{ \partial K^{reg}}{\partial m_k} n_k + \frac{ \partial K^{reg}}{\partial m_{k,i}} n_{k,i} \; dx
\end{equation}
where $<.,.>$ denotes the dual product, see Chapter~\ref{chapter:ref:Minimization}. Consequently, the gradient $\nabla J^{reg}$
can be represented by a pair of values $Y$ and $X$
\begin{equation}\label{ref:EQU:CS:101}
\begin{array}{rcl}
Y_k & = & \displaystyle{\frac{\partial K^{reg}}{\partial m_k}} \\
X_{ki} & = & \displaystyle{\frac{\partial K^{reg}}{\partial m_{k,i}}}
\end{array}
\end{equation}
while the dual product $<.,.>$ of a level set increment $n$ and a gradient increment $g=(Y,X)$ is given as
\begin{equation}\label{REF:EQU:INTRO 2aaa}
<n,g> = \int_{\Omega} Y_k n_k + X_{ki} n_{k,i} \; dx
\end{equation}
We also need to provide (an approximation of) the value $p$ of the inverse of the Hessian operator $\nabla \nabla J$
for a given gradient increment $g=(Y,X)$. This means we need to (approximately) solve the variational problem
\begin{equation}\label{REF:EQU:INTRO 2b}
<n,\nabla \nabla J p > = \int_{\Omega} Y_k n_k + X_{ki} n_{k,i} \; dx
\end{equation}
for all increments $n$ of the level set function. If we ignore contributions
from the forward models, the left hand side takes the form
\begin{equation}\label{REF:EQU:INTRO 2c}
<n,\nabla \nabla J^{reg} p > = \int_{\Omega}
\displaystyle{\frac{\partial Y_k}{\partial m_l}} p_l n_k +
\displaystyle{\frac{\partial Y_k}{\partial m_{l,j}}} p_{l,j} n_k +
\displaystyle{\frac{\partial X_{ki}}{\partial m_l}} p_l n_{k,i} +
\displaystyle{\frac{\partial X_{ki}}{\partial m_{l,j}}} p_{l,j} n_{k,i}
\; dx
\end{equation}
Notice that equation~\ref{REF:EQU:INTRO 2b} defines a system of linear PDEs
which is solved using the \escript \class{LinearPDE} class. In the \escript notation we need to provide
\begin{equation}\label{ref:EQU:REG:600}
\begin{array}{rcl}
A_{kilj} & = & \displaystyle{\frac{\partial X_{ki}}{\partial m_{l,j}}} \\
B_{kil} & = & \displaystyle{\frac{\partial X_{ki}}{\partial m_l}} \\
C_{klj} & = & \displaystyle{\frac{\partial Y_k}{\partial m_{l,j}}} \\
D_{kl} & = & \displaystyle{\frac{\partial Y_k}{\partial m_l}} \\
\end{array}
\end{equation}
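
As a simple illustration (ignoring the weighting factors and the cross-gradient term of the actual
\class{Regularization} class), assume a single level set function $m$ with the smoothing kernel
\begin{equation}
K^{reg} = \frac{1}{2} \mu \, m_{,i} m_{,i} \; .
\end{equation}
Then $Y=0$ and $X_{i} = \mu \, m_{,i}$, and the only non-zero Hessian coefficient is
$A_{ij} = \displaystyle{\frac{\partial X_{i}}{\partial m_{,j}}} = \mu \, \delta_{ij}$ with the
Kronecker symbol $\delta_{ij}$, so applying the approximate inverse Hessian amounts to solving a
simple second order PDE for $p$.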
The calculation of the gradient of the forward model component is more complicated:
the data defect $J^{f}$ for forward model $f$ is expressed using a cost function kernel $K^{f}$
\begin{equation}\label{REF:EQU:INTRO 2bb}
J^{f}(p^f) = \int_{\Omega} K^{f} \; dx
\end{equation}
In this case the cost function kernel $K^{f}$ is a function of the
physical parameter $p^f$, which again is a function of the level set function,
and of the state variable $u^f_{k}$ and its gradient $u^f_{k,i}$. For the sake of a simpler
presentation the upper index $f$ is dropped in the following.

The gradient $\nabla_{p} J$ of $J$ with respect to
the physical parameter $p$ is given as
\begin{equation}\label{REF:EQU:costfunction 100b}
<q, \nabla_{p} J(p)> = \int_{\Omega}
\displaystyle{\frac{\partial K }{\partial u_k } } \displaystyle{\frac{\partial u_k }{\partial q } } +
\displaystyle{\frac{\partial K }{\partial u_{k,i} } } \left( \displaystyle{\frac{\partial u_k }{\partial q } } \right)_{,i}+
\displaystyle{\frac{\partial K }{\partial p } } q \; dx
\end{equation}
for any increment $q$ to the physical parameter $p$. If the change
of the state variable
$u$ with respect to the physical parameter $p$ in the direction of $q$ is denoted as
\begin{equation}\label{REF:EQU:costfunction 100c}
d_k =\displaystyle{\frac{\partial u_k }{\partial q } }
\end{equation}
equation~\ref{REF:EQU:costfunction 100b} can be written as
\begin{equation}\label{REF:EQU:costfunction 100d}
<q, \nabla_{p} J(p)> = \int_{\Omega}
\displaystyle{\frac{\partial K }{\partial u_k } } d_k +
\displaystyle{\frac{\partial K }{\partial u_{k,i} } } d_{k,i}+
\displaystyle{\frac{\partial K }{\partial p } } q \; dx
\end{equation}
The state variables are the solution of a PDE which in variational form is given as
\begin{equation}\label{REF:EQU:costfunction 100}
\int_{\Omega} F_k \cdot r_k + G_{ki} \cdot r_{k,i} \; dx = 0
\end{equation}
for all increments $r$ to the state $u$. The functions $F$ and $G$ are given and describe the physical
model. They depend on the state variable $u_{k}$ and its gradient $u_{k,i}$ and the physical parameter $p$. The change
$d_k$ of the state
$u$ with respect to the physical parameter $p$ in the direction of $q$ is given by the equation
\begin{equation}\label{REF:EQU:costfunction 100bb}
\int_{\Omega}
\displaystyle{\frac{\partial F_k }{\partial u_l } } d_l r_k +
\displaystyle{\frac{\partial F_k }{\partial u_{l,j}} } d_{l,j} r_k +
\displaystyle{\frac{\partial F_k }{\partial p} }q r_k +
\displaystyle{\frac{\partial G_{ki}}{\partial u_l} } d_l r_{k,i} +
\displaystyle{\frac{\partial G_{ki}}{\partial u_{l,j}} } d_{l,j} r_{k,i}+
\displaystyle{\frac{\partial G_{ki}}{\partial p} } q r_{k,i}
\; dx = 0
\end{equation}
to be fulfilled for all functions $r$. Now let $d^*_k$ be the solution of the
variational equation
\begin{equation}\label{REF:EQU:costfunction 100dd}
\int_{\Omega}
\displaystyle{\frac{\partial F_k }{\partial u_l } } h_l d^*_k +
\displaystyle{\frac{\partial F_k }{\partial u_{l,j}} } h_{l,j} d^*_k +
\displaystyle{\frac{\partial G_{ki}}{\partial u_l} } h_l d^*_{k,i} +
\displaystyle{\frac{\partial G_{ki}}{\partial u_{l,j}} } h_{l,j} d^*_{k,i}
\; dx
= \int_{\Omega}
\displaystyle{\frac{\partial K }{\partial u_k } } h_k +
\displaystyle{\frac{\partial K }{\partial u_{k,i} } } h_{k,i} \; dx
\end{equation}
for all increments $h$ to the state variable $u$. This problem
is solved using the \escript \class{LinearPDE} class. In the \escript notation we need to provide
\begin{equation}\label{ref:EQU:REG:600b}
\begin{array}{rcl}
A_{kilj} & = & \displaystyle{\frac{\partial G_{lj}}{\partial u_{k,i}} } \\
B_{kil} & = & \displaystyle{\frac{\partial F_l }{\partial u_{k,i}} } \\
C_{klj} & = & \displaystyle{\frac{\partial G_{lj}}{\partial u_k} } \\
D_{kl} & = & \displaystyle{\frac{\partial F_l }{\partial u_k } } \\
Y_{k} & = & \displaystyle{\frac{\partial K }{\partial u_k } } \\
X_{ki} & = & \displaystyle{\frac{\partial K }{\partial u_{k,i} } } \\
\end{array}
\end{equation}
Notice that these coefficients are transposed with respect to the coefficients used to solve for the
state variables in equation~\ref{REF:EQU:costfunction 100}.
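
A sketch of how such a system could be assembled with the \escript \class{LinearPDE} class is given
below. It assumes that the coefficients of equation~\ref{ref:EQU:REG:600b} are already available as
\escript data objects \member{A}, \member{B}, \member{C}, \member{D}, \member{X} and \member{Y} on the
inversion domain; in an actual inversion this step is performed inside the forward model classes and
does not need to be coded by the user.
\begin{verbatim}
from esys.escript.linearPDEs import LinearPDE

adjoint_pde=LinearPDE(J.getDomain())
adjoint_pde.setValue(A=A, B=B, C=C, D=D, X=X, Y=Y)
d_star=adjoint_pde.getSolution()
\end{verbatim}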

Setting $h_l=d_l$ in equation~\ref{REF:EQU:costfunction 100dd} and
$r_k=d^*_k$ in equation~\ref{REF:EQU:costfunction 100bb} one gets
\begin{equation}\label{ref:EQU:costfunction:601}
\int_{\Omega}
\displaystyle{\frac{\partial K }{\partial u_k } } d_k +
\displaystyle{\frac{\partial K }{\partial u_{k,i} } } d_{k,i}+
\displaystyle{\frac{\partial F_k }{\partial p} } q d^*_k +
\displaystyle{\frac{\partial G_{ki}}{\partial p} } q d^*_{k,i}
\; dx = 0
\end{equation}
which is inserted into equation~\ref{REF:EQU:costfunction 100d} to get
\begin{equation}\label{REF:EQU:costfunction 602}
<q, \nabla_{p} J(p)> = \int_{\Omega} \left(
\displaystyle{\frac{\partial K }{\partial p } } - \displaystyle{\frac{\partial F_k }{\partial p} } d^*_k
- \displaystyle{\frac{\partial G_{ki}}{\partial p} } d^*_{k,i} \right) q \; dx
\end{equation}
In fact we need the gradient of $J^f$ with respect to the level set function, which is given as
\begin{equation}\label{REF:EQU:costfunction 603}
<n, \nabla J^f> = \int_{\Omega} \left(
\displaystyle{\frac{\partial K^f}{\partial p^f} } - \displaystyle{\frac{\partial F^f_k }{\partial p^f} } d^{f*}_k
- \displaystyle{\frac{\partial G^f_{ki}}{\partial p^f} } d^{f*}_{k,i} \right)
\cdot \displaystyle{\frac{\partial p^f }{\partial m_l} } n_l \; dx
\end{equation}
for any increment $n$ to the level set function. So in summary we get
\begin{equation}\label{ref:EQU:CS:101b}
\begin{array}{rcl}
Y_k & = & \displaystyle{\frac{\partial K^{reg}}{\partial m_k}} +
\sum_{f} \mu^{data}_{f} \left(
\displaystyle{\frac{\partial K^f}{\partial p^f} } - \displaystyle{\frac{\partial F^f_l }{\partial p^f} } d^{f*}_l
- \displaystyle{\frac{\partial G^f_{li}}{\partial p^f} } d^{f*}_{l,i} \right)
\cdot \displaystyle{\frac{\partial p^f }{\partial m_k} }
\\
X_{ki} & = & \displaystyle{\frac{\partial K^{reg}}{\partial m_{k,i}}}
\end{array}
\end{equation}
to represent $\nabla J$ as the tuple $(Y,X)$. Contributions of the forward models to the
Hessian operator are ignored.
