% Revision 4178, Thu Jan 31 03:18:56 2013 UTC, by caltinay: Some more doco.

\chapter{Data Sources}\label{Chp:ref:data sources}

At the source of every inversion is data in the form of gravity anomaly or
magnetic flux density values for at least a part of the region of interest.
These usually come from surveys and are preprocessed to correct for various
factors and distortions.
This chapter provides an overview of the classes related to data input for
inversions.

\section{Overview}
The inversion module comes with a number of classes that can read gridded
(raster) data on a 2-dimensional plane from file or provide artificial values
for testing purposes. These classes all derive from the abstract
\class{DataSource} class and override methods that return information about
the data and the values themselves.
The \class{DomainBuilder} class is responsible for creating an \escript domain
with a suitable grid spacing and spatial extents that include all data sources
attached to it (see Figure~\ref{fig:domainBuilder}).
%
\begin{figure}[ht]
\centering\includegraphics{domainbuilder}
\caption{\class{DataSource} instances are added to a \class{DomainBuilder}
which creates a suitable domain and \Data objects for the inversion}
\label{fig:domainBuilder}
\end{figure}
%
Notice that in the figure there are cells in the region of interest that are
not covered by any data source instance.
Ideally, all data sources used for an inversion have the same spatial resolution
and are spatially adjacent so that all cells have a value, but this is not a
requirement.


\section{Domain Builder}\label{Chp:ref:domain builder}
Every inversion requires one \class{DomainBuilder} instance which creates and
holds a reference to the \escript domain as well as associated \Data objects for
the input data used for the inversion.
The class has the following public methods:

\begin{classdesc}{DomainBuilder}{\optional{dim=3}}
Constructor for the domain builder. \member{dim} sets the dimensionality of the
target domain and must be 2 or 3. By default a 3-dimensional domain is created.
\end{classdesc}

\begin{methoddesc}[DomainBuilder]{addSource}{source}
adds survey data \member{source} (a \class{DataSource} object) to the domain
builder. The dimensionality of the data must be less than or equal to the
domain dimensionality.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{setVerticalExtents}{%
\optional{depth=40000.}%
\optional{, air_layer=10000.}%
\optional{, num_cells=25}}
sets the parameters for the vertical dimension of the domain. The parameter
\member{depth} specifies the thickness in meters of the subsurface layer
($-x_2^{min}$ in Figure~\ref{fig:cartesianDomain}).
The default value of $40$ km is usually appropriate. Similarly, the
\member{air_layer} parameter defines the buffer zone thickness above the surface
($x_2^{max}$ in Figure~\ref{fig:cartesianDomain}) which should be a few
kilometres to avoid artefacts in the inversion.
The number of elements (or cells) in the vertical dimension is set with the
\member{num_cells} parameter. Consider the size and resolution of your datasets,
the total vertical length (=\member{depth}+\member{air_layer}) and available
compute resources when setting this value.
\end{methoddesc}
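
To get a feeling for how the three parameters of
\member{setVerticalExtents()} interact, the following sketch computes the
vertical cell size implied by a given choice. It is plain \python arithmetic
and not part of the inversion module: the helper name
\texttt{vertical\_cell\_size} is hypothetical, and the calculation assumes
that all cells have the same height.

```python
def vertical_cell_size(depth=40000., air_layer=10000., num_cells=25):
    """Return the height in metres of one cell.

    Assumption (not taken from the inversion module): the vertical
    extent depth + air_layer is divided into num_cells equal cells.
    """
    if num_cells < 1:
        raise ValueError("num_cells must be at least 1")
    total_length = depth + air_layer
    return total_length / num_cells


# With the documented defaults each cell is 2 km high.
print(vertical_cell_size())  # 2000.0
```

Under this assumption, halving \member{num_cells} doubles the cell height,
which is the trade-off between vertical resolution and compute cost mentioned
above.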

\begin{methoddesc}[DomainBuilder]{setFractionalPadding}{%
\optional{pad_x=\None}%
\optional{, pad_y=\None}}
sets the amount of padding around the dataset as a fraction of the dataset side
lengths.
For example, calling \member{setFractionalPadding(0.2, 0.1)} with a data source
of size $10 \times 20$ will result in the padded data set size $14 \times 24$
(that is $10 \times (1+2 \times 0.2)$ and $20 \times (1+2 \times 0.1)$).
By default no padding is applied. \member{pad_y} is ignored for 2-dimensional
domains.
\end{methoddesc}
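
The arithmetic in the example above can be reproduced with a few lines of
\python. This is only a sketch of the formula stated in the method
description; the function name \texttt{fractionally\_padded\_size} is
hypothetical and the code does not call the inversion module.

```python
def fractionally_padded_size(size_x, size_y, pad_x=None, pad_y=None):
    """Return the padded (x, y) side lengths of a data set.

    The fraction is applied on both sides of each dimension, hence the
    factor (1 + 2*pad). None means no padding, as in the documented
    defaults.
    """
    frac_x = 0.0 if pad_x is None else pad_x
    frac_y = 0.0 if pad_y is None else pad_y
    if frac_x < 0 or frac_y < 0:
        raise ValueError("padding fractions must be non-negative")
    return (size_x * (1 + 2 * frac_x), size_y * (1 + 2 * frac_y))


# The documented example: a 10 x 20 data source padded by 0.2 and 0.1.
padded_x, padded_y = fractionally_padded_size(10, 20, 0.2, 0.1)
print(round(padded_x, 6), round(padded_y, 6))
```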

\begin{methoddesc}[DomainBuilder]{setPadding}{%
\optional{pad_x=\None}%
\optional{, pad_y=\None}}
sets the amount of padding around the dataset in absolute length units.
The final domain size will be the length in x (in y) of the dataset plus twice
the value of \member{pad_x} (\member{pad_y}). The arguments must be non-negative.
By default no padding is applied. \member{pad_y} is ignored for 2-dimensional
domains.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{setElementPadding}{%
\optional{pad_x=\None}%
\optional{, pad_y=\None}}
sets the amount of padding around the dataset in number of elements (cells).
When the domain is constructed \member{pad_x} (\member{pad_y}) elements are
added on each side of the x- (y-) dimension. The arguments must be non-negative.
By default no padding is applied. \member{pad_y} is ignored for 2-dimensional
domains.
\end{methoddesc}
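
Absolute and element padding follow the same pattern as fractional padding:
the padding is added on both sides of a dimension. The sketch below spells
this out for a single dimension. Both helper names are hypothetical and the
code is independent of the inversion module.

```python
def padded_length(data_length, pad):
    """Domain length after absolute padding: data length plus twice pad."""
    if pad < 0:
        raise ValueError("padding must be non-negative")
    return data_length + 2 * pad


def padded_cell_count(num_cells, pad_cells):
    """Number of cells after element padding: pad_cells on each side."""
    if pad_cells < 0:
        raise ValueError("padding must be non-negative")
    return num_cells + 2 * pad_cells


# A 10 km wide data set padded by 2.5 km on each side spans 15 km.
print(padded_length(10000., 2500.))  # 15000.0
# 100 cells padded by 5 cells on each side give 110 cells.
print(padded_cell_count(100, 5))     # 110
```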

\begin{methoddesc}[DomainBuilder]{fixDensityBelow}{%
\optional{depth=\None}}
defines the depth below which the density anomaly is fixed to zero.
This method is only useful for inversions that involve gravity data.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{fixSusceptibilityBelow}{%
\optional{depth=\None}}
defines the depth below which the susceptibility anomaly is fixed to zero.
This method is only useful for inversions that involve magnetic data.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{getGravitySurveys}{}
returns a list of all gravity surveys added to the domain builder. See
\member{getSurveys()} for more details.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{getMagneticSurveys}{}
returns a list of all magnetic surveys added to the domain builder. See
\member{getSurveys()} for more details.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{getSurveys}{datatype}
returns a list of surveys of type \member{datatype} available to this domain
builder. In the current implementation each survey is a tuple of two \Data
objects, the first containing anomaly values and the second standard error
values for the survey.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{getDomain}{}
returns an \escript domain (see~\cite{ESCRIPT}) suitable for running inversions
on the attached data sources.
The first time this method is called the target parameters (such as resolution,
extents and number of elements) are computed, and the domain is created.
Subsequent calls return the same domain instance, so calls to
\member{setPadding()}, \member{addSource()} and other methods that influence
the domain will fail once \member{getDomain()} has been called.
\end{methoddesc}
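
The behaviour of \member{getDomain()} (build once, then reject further
changes) can be illustrated with a small stand-in class. The class
\texttt{TinyBuilder} below is hypothetical and is not the real
\class{DomainBuilder}: the domain is replaced by a plain tuple, and raising
\texttt{RuntimeError} is an assumption, since the text above only states that
such calls fail.

```python
class TinyBuilder:
    """Hypothetical stand-in that mimics the build-once behaviour."""

    def __init__(self):
        self._sources = []
        self._domain = None

    def addSource(self, source):
        # Once the domain exists its parameters can no longer change.
        if self._domain is not None:
            raise RuntimeError("domain already built, cannot add sources")
        self._sources.append(source)

    def getDomain(self):
        # The first call builds the "domain"; later calls return it as is.
        if self._domain is None:
            self._domain = ("domain", tuple(self._sources))
        return self._domain


builder = TinyBuilder()
builder.addSource("gravity survey")
first = builder.getDomain()
print(builder.getDomain() is first)  # True
```

In practice this means that all data sources and padding settings should be
supplied before the domain is requested for the first time.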

\begin{methoddesc}[DomainBuilder]{setBackgroundMagneticFluxDensity}{B}
sets the background magnetic flux density $B=(B_r,B_\theta,B_\phi)$ which is
required for magnetic inversions.
An implementation of the dipole approximation as described in
Equation~\ref{ref:MAG:EQU:5} is provided through the function
\member{simpleGeoMagneticFluxDensity} (see Section~\ref{sec:ref:DataSource}).
$B_\theta$ is ignored for 2-dimensional magnetic inversions.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{getBackgroundMagneticFluxDensity}{}
returns the background magnetic flux density $B$ set via
\member{setBackgroundMagneticFluxDensity()} in a form suitable for the inversion.
There should be no need to call this method directly.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{getSetDensityMask}{}
returns the density mask \Data object which is non-zero for cells that have a
fixed density value, zero otherwise.
There should be no need to call this method directly.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{getSetSusceptibilityMask}{}
returns the susceptibility mask \Data object which is non-zero for cells that
have a fixed susceptibility value, zero otherwise.
There should be no need to call this method directly.
\end{methoddesc}

\section{\class{DataSource} Class}\label{sec:ref:DataSource}

Data sources added to a \class{DomainBuilder} must provide an implementation for
a few methods as described in the class template \class{DataSource} from
the \module{esys.downunder.datasources} module:

\begin{classdesc}{DataSource}{}
Base constructor which initializes members and should therefore be invoked by
subclasses. Subclasses may then use the member \member{logger} to print any
output.
\end{classdesc}

\begin{methoddesc}[DataSource]{getDataExtents}{}
This method should be implemented to return a tuple of tuples
( (x0, y0), (nx, ny), (dx, dy) ), where (x0, y0) denote the UTM coordinates of
the data origin, (nx, ny) the number of data points, and (dx, dy) the spacing
of data points.
\end{methoddesc}
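
The nested tuple returned by \member{getDataExtents()} can be unpacked in one
statement. The sketch below lists the coordinates of the data points along
each axis. It is a stand-alone illustration with a hypothetical function name
and made-up numbers, and it assumes that the origin coincides with the first
data point, which the description above does not state.

```python
def point_coordinates(extents):
    """Return (xs, ys), the point coordinates along each axis.

    extents has the form ((x0, y0), (nx, ny), (dx, dy)).
    Assumption: the first data point is located at the origin (x0, y0).
    """
    (x0, y0), (nx, ny), (dx, dy) = extents
    xs = [x0 + i * dx for i in range(nx)]
    ys = [y0 + j * dy for j in range(ny)]
    return xs, ys


# Made-up example: 3 x 2 points, spaced 100 m in x and 50 m in y.
xs, ys = point_coordinates(((500000., 6000000.), (3, 2), (100., 50.)))
print(xs)  # [500000.0, 500100.0, 500200.0]
print(ys)  # [6000000.0, 6000050.0]
```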

\begin{methoddesc}[DataSource]{getDataType}{}
Subclasses must return \class{DataSource}.\member{GRAVITY} or
\class{DataSource}.\member{MAGNETIC} depending on the type of data they provide.
\end{methoddesc}

\begin{methoddesc}[DataSource]{getSurveyData}{domain, origin, NE, spacing}
This method is called by the \class{DomainBuilder} to retrieve the actual survey
data in the form of \Data objects on the \member{domain}.
Data sources are responsible for mapping or interpolating their data onto the
domain which has been constructed to fit the data.
The domain \member{origin}, number of elements \member{NE} and element
\member{spacing} are provided as tuples or lists to aid with interpolation.
\end{methoddesc}

\begin{methoddesc}[DataSource]{setSubsamplingFactor}{factor}
Notifies the data source that data should be subsampled by \member{factor}.
This method does not need to be overridden.
See \member{getSubsamplingFactor()} for an explanation.
\end{methoddesc}

\begin{methoddesc}[DataSource]{getSubsamplingFactor}{}
Returns the subsampling factor which was set by \member{setSubsamplingFactor()}
or $1$ which indicates that no subsampling is requested.
Data sources that support subsampling (or interleaving) of their data should use
this method to query the subsampling factor before returning surveys via
\member{getSurveyData}. If supported, the factor should be applied in all
dimensions. For example, a 2-dimensional dataset with $300 \times 150$ data
points should be reduced to $150 \times 75$ data points when the subsampling
factor equals $2$.
Subsampling becomes important when the survey data resolution is too fine or
when using data with varying resolution in one inversion.
Note that data sources may choose to ignore the subsampling factor if they
don't support it.
\end{methoddesc}
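
One simple way to honour a subsampling factor is to keep every
\member{factor}-th data point in each dimension. The sketch below does this
for a grid stored as a list of rows. It is only one possible interpretation
(a data source could also average neighbouring values), and the function name
\texttt{subsample} is hypothetical.

```python
def subsample(grid, factor=1):
    """Keep every factor-th row and every factor-th value in each row.

    grid is a list of rows, each row being a list of values.
    A factor of 1 returns a copy of the grid at full resolution.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return [row[::factor] for row in grid[::factor]]


# 300 x 150 data points (150 rows of 300 values) reduced by a factor of 2.
grid = [[0.0] * 300 for _ in range(150)]
reduced = subsample(grid, 2)
print(len(reduced[0]), len(reduced))  # 150 75
```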

\vspace{1em}\noindent The \module{esys.downunder.datasources} module contains the following helper
functions:

\begin{funcdesc}{simpleGeoMagneticFluxDensity}{latitude%
\optional{, longitude=0.}}
returns an approximation of the geomagnetic flux density $B$ as described in
Equation~\ref{ref:MAG:EQU:5} for the given \member{latitude}.
The \member{longitude} parameter is currently ignored and the return value is
the tuple $(B_r, B_{\theta}, 0)$.
\end{funcdesc}

\begin{funcdesc}{LatLonToUTM}{longitude, latitude%
\optional{, wkt_string=\None}}
converts one or more (longitude,latitude) pairs to the corresponding (x,y)
coordinates in the \emph{Universal Transverse Mercator} (UTM) projection.
This function requires the \module{pyproj} module for conversion and the
\module{gdal} module to parse the \member{wkt_string} parameter if supplied.
\end{funcdesc}
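
UTM coordinates are always relative to a zone, so it can help to know which
zone a data set falls into before converting. The sketch below applies the
standard rule of 60 zones of 6 degrees each, counted eastwards from 180
degrees west. It is not how \member{LatLonToUTM} is implemented; the function
name \texttt{utm\_zone} is hypothetical, and the special zones around Norway
and Svalbard are ignored.

```python
import math


def utm_zone(longitude):
    """Return the standard UTM zone number (1 to 60) for a longitude.

    Longitude is given in degrees east, between -180 and 180.
    The irregular zones near Norway and Svalbard are not handled.
    """
    if not -180.0 <= longitude <= 180.0:
        raise ValueError("longitude must lie between -180 and 180 degrees")
    zone = int(math.floor((longitude + 180.0) / 6.0)) + 1
    # A longitude of exactly 180 degrees belongs to zone 60, not 61.
    return min(zone, 60)


print(utm_zone(153.0))  # 56
```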

\subsection{ER Mapper Raster Data}
\emph{ER Mapper} files that contain 2-dimensional raster data may be used for
inversions through the \class{ErMapperData} class which is derived from
\class{DataSource}.
Generally, these datasets contain two files, a header file and a data file.
The former usually has the \texttt{.ers} file extension and is a text file that
describes the data format, size, coordinate system used etc.
The data file usually has the same file name but no extension.
Note that the current implementation may not work with all \emph{ER Mapper}
datasets. For example, \emph{IEEE4ByteReal} is currently the only cell type
understood.
To run inversions on an \emph{ER Mapper} dataset use the following constructor:
\begin{classdesc}{ErMapperData}{datatype, headerfile%
\optional{, datafile=\None}%
\optional{, altitude=0.}}
Creates a new data source from \emph{ER Mapper} data.
The parameter \member{datatype} must be one of
\class{DataSource}.\member{GRAVITY} or \class{DataSource}.\member{MAGNETIC}
depending on the type of data, \member{headerfile} is the name of the header
file, \member{datafile} specifies the name of the data file and
\member{altitude} specifies the altitude in meters of the measurements.
The parameter \member{datafile} can be omitted if the name is identical to
the header file except for the file extension. The \member{altitude} parameter
is only used with 3-dimensional domains and determines the vertical location
of the 2-dimensional slice of data within the domain.
\end{classdesc}
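
The cell type \emph{IEEE4ByteReal} denotes 4-byte IEEE floating point values.
To illustrate what reading such a data file involves, the following sketch
decodes a block of raw bytes into rows of numbers using the standard
\module{struct} module. It is not the code used by \class{ErMapperData}: the
function name is hypothetical, and the row-by-row layout and the little-endian
default are assumptions (a real header file specifies the byte order).

```python
import struct


def read_float32_raster(raw_bytes, nx, ny, byte_order="<"):
    """Decode ny rows of nx 4-byte IEEE floats from raw_bytes.

    byte_order is "<" for little-endian or ">" for big-endian.
    Assumption: values are stored row by row without any padding.
    """
    expected = 4 * nx * ny
    if len(raw_bytes) != expected:
        raise ValueError("expected %d bytes, got %d" % (expected, len(raw_bytes)))
    values = struct.unpack("%s%df" % (byte_order, nx * ny), raw_bytes)
    return [list(values[row * nx:(row + 1) * nx]) for row in range(ny)]


# Round trip with six made-up values arranged as 2 rows of 3 columns.
raw = struct.pack("<6f", 1.0, 2.0, 3.0, 4.0, 5.0, 6.5)
print(read_float32_raster(raw, nx=3, ny=2))
```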

\subsection{NetCDF Data}
The \class{NetCdfData} class from the \module{esys.downunder.datasources} module
provides the means to use data from \netcdf files~\cite{netcdf} for inversion.
Currently, files that follow the \emph{Climate and Forecast (CF)}\footnote{%
\url{http://cf-pcmdi.llnl.gov/documents/cf-conventions/latest-cf-conventions-document-1}}
and/or the \emph{Cooperative Ocean/Atmosphere Research Data Service (COARDS)}\footnote{%
\url{http://ferret.wrc.noaa.gov/noaa_coop/coop_cdf_profile.html}} metadata
conventions are supported.
The example script \examplefile{create_netcdf.py} demonstrates how a compatible
file can be generated from within \python (provided the \module{scipy} module
is available).
To plot such an input file including coordinates and legend using
\emph{matplotlib}~\cite{matplotlib} see the script \examplefile{show_netcdf.py}.
The interface to \class{NetCdfData} looks as follows:

\begin{classdesc}{NetCdfData}{datatype, filename%
\optional{, altitude=0.}}
Creates a new data source from compatible \netcdf data.
The parameter \member{datatype} must be one of
\class{DataSource}.\member{GRAVITY} or \class{DataSource}.\member{MAGNETIC}
depending on the type of data, \member{filename} is the name of the file and
\member{altitude} specifies the altitude in meters of the measurements.
The \member{altitude} parameter is only used with 3-dimensional domains and
determines the vertical location of the 2-dimensional slice of data within the
domain.
\end{classdesc}

\subsection{Synthetic Data}
As a special case the \module{esys.downunder.datasources} module contains
classes to generate input data for inversions by solving a forward model with
user-defined reference data.
The main purpose of using synthetic data is to test the capabilities of the
inversion module or to track down problems.

The base class for synthetic data which is derived from \class{DataSource}
has the following interface:

\begin{classdesc}{SyntheticDataBase}{datatype%
\optional{, DIM=2}%
\optional{, number_of_elements=10}%
\optional{, length=1*U.km}%
\optional{, B_b=\None}%
\optional{, data_offset=0}%
\optional{, full_knowledge=\False}%
\optional{, spherical=\False}}
Base class to define reference data based on a given property distribution
(density or susceptibility). Data are collected from a square region of
vertical extent \member{length} on a grid with \member{number_of_elements}
cells in each direction.
The synthetic data are constructed by solving the appropriate forward problem.
Data can be sampled with an offset from the surface at $z=0$ or using the
entire subsurface region.
\end{classdesc}

\begin{methoddesc}[SyntheticDataBase]{getReferenceProperty}{%
\optional{domain=\None}}
Returns the reference \Data object that was used to generate the gravity or
magnetic anomaly data. The \member{domain} argument must be present
when this method is called for the first time but not necessarily in
subsequent calls.
\end{methoddesc}

