% Revision 4178, Thu Jan 31 03:18:56 2013 UTC, by caltinay: Some more doco.

\chapter{Data Sources}\label{Chp:ref:data sources}

Every inversion starts from data in the form of gravity anomaly or
magnetic flux density values for at least a part of the region of interest.
These values usually come from surveys and are preprocessed to correct for
various factors and distortions.
This chapter provides an overview of the classes related to data input for
inversions.

\section{Overview}
The inversion module comes with a number of classes that can read gridded
(raster) data on a 2-dimensional plane from file or provide artificial values
for testing purposes. These classes all derive from the abstract
\class{DataSource} class and override methods that return information about
the data and the values themselves.
The \class{DomainBuilder} class is responsible for creating an \escript domain
with a suitable grid spacing and spatial extents that include all data sources
attached to it (see Figure~\ref{fig:domainBuilder}).
%
\begin{figure}[ht]
\centering\includegraphics{domainbuilder}
\caption{\class{DataSource} instances are added to a \class{DomainBuilder},
which creates a suitable domain and \Data objects for the inversion}
\label{fig:domainBuilder}
\end{figure}
%
Notice that in the figure there are cells in the region of interest that are
not covered by any data source instance.
Ideally, all data sources used for an inversion have the same spatial resolution
and are spatially adjacent so that all cells have a value, but this is not a
requirement.


\section{Domain Builder}\label{Chp:ref:domain builder}
Every inversion requires one \class{DomainBuilder} instance which creates and
holds a reference to the \escript domain as well as associated \Data objects for
the input data used for the inversion.
The class has the following public methods:

\begin{classdesc}{DomainBuilder}{\optional{dim=3}}
Constructor for the domain builder. \member{dim} sets the dimensionality of the
target domain and must be 2 or 3. By default a 3-dimensional domain is created.
\end{classdesc}

\begin{methoddesc}[DomainBuilder]{addSource}{source}
adds survey data \member{source} (a \class{DataSource} object) to the domain
builder. The dimensionality of the data must be less than or equal to the
domain dimensionality.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{setVerticalExtents}{%
\optional{depth=40000.}%
\optional{, air_layer=10000.}%
\optional{, num_cells=25}}
sets the parameters for the vertical dimension of the domain. The parameter
\member{depth} specifies the thickness in meters of the subsurface layer
($-x_2^{min}$ in Figure~\ref{fig:cartesianDomain}).
The default value of $40$ km is usually appropriate. Similarly, the
\member{air_layer} parameter defines the thickness of the buffer zone above the
surface ($x_2^{max}$ in Figure~\ref{fig:cartesianDomain}), which should be a few
kilometres to avoid artefacts in the inversion.
The number of elements (or cells) in the vertical dimension is set with the
\member{num_cells} parameter. Consider the size and resolution of your datasets,
the total vertical length (=\member{depth}+\member{air_layer}) and available
compute resources when setting this value.
\end{methoddesc}
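
To get a feeling for the vertical resolution implied by a particular choice of
\member{num_cells}, the following sketch divides the total vertical length by
the number of cells. The helper \texttt{vertical\_cell\_size} is purely
illustrative and not part of the inversion module, and it assumes that all
cells in the vertical dimension have the same thickness.

```python
def vertical_cell_size(depth=40000., air_layer=10000., num_cells=25):
    """Thickness in meters of one cell in the vertical dimension.

    Illustrative helper only (not part of the inversion module).
    Assumes that the total vertical length depth + air_layer is divided
    into num_cells cells of equal thickness.
    """
    if num_cells <= 0:
        raise ValueError("num_cells must be positive")
    if depth < 0 or air_layer < 0:
        raise ValueError("depth and air_layer must be non-negative")
    return (depth + air_layer) / num_cells


# Cell thickness for the default arguments of setVerticalExtents()
print(vertical_cell_size())
```

Under this assumption the default arguments give cells that are two kilometres
thick, which may be too coarse for shallow targets.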

\begin{methoddesc}[DomainBuilder]{setFractionalPadding}{%
\optional{pad_x=\None}%
\optional{, pad_y=\None}}
sets the amount of padding around the dataset as a fraction of the dataset side
lengths.
For example, calling \member{setFractionalPadding(0.2, 0.1)} with a data source
of size $10 \times 20$ will result in the padded data set size $14 \times 24$
(that is $10 \times (1+2 \times 0.2)$ and $20 \times (1+2 \times 0.1)$).
By default no padding is applied. \member{pad_y} is ignored for 2-dimensional
domains.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{setPadding}{%
\optional{pad_x=\None}%
\optional{, pad_y=\None}}
sets the amount of padding around the dataset in absolute length units.
The final domain size will be the length in x (in y) of the dataset plus twice
the value of \member{pad_x} (\member{pad_y}). The arguments must be non-negative.
By default no padding is applied. \member{pad_y} is ignored for 2-dimensional
domains.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{setElementPadding}{%
\optional{pad_x=\None}%
\optional{, pad_y=\None}}
sets the amount of padding around the dataset in number of elements (cells).
When the domain is constructed, \member{pad_x} (\member{pad_y}) elements are
added on each side of the x- (y-) dimension. The arguments must be non-negative.
By default no padding is applied. \member{pad_y} is ignored for 2-dimensional
domains.
\end{methoddesc}
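
The three padding methods differ only in the unit in which the padding is
given. The following sketch reproduces the arithmetic described above for one
dimension. The function names are hypothetical helpers chosen for this
illustration and are not part of \class{DomainBuilder}.

```python
def fractional_padding(length, pad):
    """Padded side length when pad is a fraction of the side length."""
    if pad < 0:
        raise ValueError("padding must be non-negative")
    return length * (1 + 2 * pad)


def absolute_padding(length, pad):
    """Padded side length when pad is given in length units."""
    if pad < 0:
        raise ValueError("padding must be non-negative")
    return length + 2 * pad


def element_padding(num_elements, pad):
    """Padded element count when pad is a number of elements per side."""
    if pad < 0:
        raise ValueError("padding must be non-negative")
    return num_elements + 2 * pad


# The 10 x 20 example from setFractionalPadding(0.2, 0.1)
print(fractional_padding(10, 0.2), fractional_padding(20, 0.1))
# 2.5 km of padding on each side of a 10 km wide dataset
print(absolute_padding(10000., 2500.))
# 5 extra cells on each side of a dataset that is 100 cells wide
print(element_padding(100, 5))
```

In all three cases the padding is applied on both sides of the dataset, which
is why the padding value enters twice.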

\begin{methoddesc}[DomainBuilder]{fixDensityBelow}{%
\optional{depth=\None}}
defines the depth below which the density anomaly is fixed to zero.
This method is only useful for inversions that involve gravity data.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{fixSusceptibilityBelow}{%
\optional{depth=\None}}
defines the depth below which the susceptibility anomaly is fixed to zero.
This method is only useful for inversions that involve magnetic data.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{getGravitySurveys}{}
returns a list of all gravity surveys added to the domain builder. See
\member{getSurveys()} for more details.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{getMagneticSurveys}{}
returns a list of all magnetic surveys added to the domain builder. See
\member{getSurveys()} for more details.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{getSurveys}{datatype}
returns a list of surveys of type \member{datatype} available to this domain
builder. In the current implementation each survey is a tuple of two \Data
objects, the first containing anomaly values and the second standard error
values for the survey.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{getDomain}{}
returns an \escript domain (see~\cite{ESCRIPT}) suitable for running inversions
on the attached data sources.
The first time this method is called, the target parameters (such as resolution,
extents and number of elements) are computed and the domain is created.
Subsequent calls return the same domain instance. Consequently, calls to
\member{setPadding()}, \member{addSource()} and other methods that influence
the domain will fail once \member{getDomain()} has been called.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{setBackgroundMagneticFluxDensity}{B}
sets the background magnetic flux density $B=(B_r,B_\theta,B_\phi)$ which is
required for magnetic inversions.
An implementation of the dipole approximation as described in
Equation~\ref{ref:MAG:EQU:5} is provided through the function
\member{simpleGeoMagneticFluxDensity} (see Section~\ref{sec:ref:DataSource}).
$B_\theta$ is ignored for 2-dimensional magnetic inversions.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{getBackgroundMagneticFluxDensity}{}
returns the background magnetic flux density $B$ set via
\member{setBackgroundMagneticFluxDensity()} in a form suitable for the inversion.
There should be no need to call this method directly.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{getSetDensityMask}{}
returns the density mask \Data object which is non-zero for cells that have a
fixed density value and zero otherwise.
There should be no need to call this method directly.
\end{methoddesc}

\begin{methoddesc}[DomainBuilder]{getSetSusceptibilityMask}{}
returns the susceptibility mask \Data object which is non-zero for cells that
have a fixed susceptibility value and zero otherwise.
There should be no need to call this method directly.
\end{methoddesc}

\section{\class{DataSource} Class}\label{sec:ref:DataSource}

Data sources added to a \class{DomainBuilder} must provide an implementation for
a few methods as described in the class template \class{DataSource} from
the \module{esys.downunder.datasources} module:

\begin{classdesc}{DataSource}{}
Base constructor which initializes members and should therefore be invoked by
subclasses. Subclasses may then use the member \member{logger} to print any
output.
\end{classdesc}

\begin{methoddesc}[DataSource]{getDataExtents}{}
This method should be implemented to return a tuple of tuples
\texttt{((x0, y0), (nx, ny), (dx, dy))}, where \texttt{(x0, y0)} denotes the UTM
coordinates of the data origin, \texttt{(nx, ny)} the number of data points, and
\texttt{(dx, dy)} the spacing of data points.
\end{methoddesc}
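
As an illustration of how the three inner tuples fit together, the sketch below
expands an extents tuple into the coordinates of the individual data points.
The helper \texttt{data\_coordinates} is hypothetical, and the sketch assumes
that the origin coincides with the first data point and that the points are
equally spaced. A concrete data source may use a different convention.

```python
def data_coordinates(extents):
    """Expand ((x0, y0), (nx, ny), (dx, dy)) into coordinate lists.

    Hypothetical helper for illustration only. Assumes that (x0, y0) is
    the location of the first data point and that the points are
    equally spaced in each direction.
    """
    (x0, y0), (nx, ny), (dx, dy) = extents
    xs = [x0 + i * dx for i in range(nx)]
    ys = [y0 + j * dy for j in range(ny)]
    return xs, ys


# A made-up grid of 4 x 3 points with 250 m spacing in UTM coordinates
extents = ((500000., 6000000.), (4, 3), (250., 250.))
xs, ys = data_coordinates(extents)
print(xs)
print(ys)
```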

\begin{methoddesc}[DataSource]{getDataType}{}
Subclasses must return \class{DataSource}.\member{GRAVITY} or
\class{DataSource}.\member{MAGNETIC} depending on the type of data they provide.
\end{methoddesc}

\begin{methoddesc}[DataSource]{getSurveyData}{domain, origin, NE, spacing}
This method is called by the \class{DomainBuilder} to retrieve the actual survey
data in the form of \Data objects on the \member{domain}.
Data sources are responsible for mapping or interpolating their data onto the
domain, which has been constructed to fit the data.
The domain \member{origin}, number of elements \member{NE} and element
\member{spacing} are provided as tuples or lists to aid with interpolation.
\end{methoddesc}

\begin{methoddesc}[DataSource]{setSubsamplingFactor}{factor}
Notifies the data source that data should be subsampled by \member{factor}.
This method does not need to be overridden.
See \member{getSubsamplingFactor()} for an explanation.
\end{methoddesc}

\begin{methoddesc}[DataSource]{getSubsamplingFactor}{}
Returns the subsampling factor which was set by \member{setSubsamplingFactor()},
or $1$ if no subsampling has been requested.
Data sources that support subsampling (or interleaving) of their data should use
this method to query the subsampling factor before returning surveys via
\member{getSurveyData()}. If supported, the factor should be applied in all
dimensions. For example, a 2-dimensional dataset with $300 \times 150$ data
points should be reduced to $150 \times 75$ data points when the subsampling
factor equals $2$.
Subsampling becomes important when the survey data resolution is too fine or
when using data with varying resolution in one inversion.
Note that data sources may choose to ignore the subsampling factor if they
do not support it.
\end{methoddesc}
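
The effect of the subsampling factor on the number of data points can be
sketched with plain \python lists. The function \texttt{subsample} below is a
hypothetical helper, and keeping every \member{factor}-th point is only one
possible strategy assumed here for illustration. A data source is free to
reduce its data differently, for instance by averaging.

```python
def subsample(grid, factor):
    """Keep every factor-th point of a row-major 2D grid in both dimensions.

    Hypothetical helper for illustration. Simple decimation is an
    assumption; real data sources may reduce their data differently.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return [row[::factor] for row in grid[::factor]]


# A made-up dataset with 300 points in x and 150 points in y
nx, ny = 300, 150
grid = [[float(j * nx + i) for i in range(nx)] for j in range(ny)]

reduced = subsample(grid, 2)
# number of points in x and y after subsampling
print(len(reduced[0]), len(reduced))
```

A factor of $1$ leaves the data unchanged, which matches the default behaviour
described above.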

\vspace{1em}\noindent The \module{esys.downunder.datasources} module contains
the following helper functions:

\begin{funcdesc}{simpleGeoMagneticFluxDensity}{latitude%
\optional{, longitude=0.}}
returns an approximation of the geomagnetic flux density $B$ as described in
Equation~\ref{ref:MAG:EQU:5} for the given \member{latitude}.
The \member{longitude} parameter is currently ignored and the return value is
the tuple $(B_r, B_{\theta}, 0)$.
\end{funcdesc}

\begin{funcdesc}{LatLonToUTM}{longitude, latitude%
\optional{, wkt_string=\None}}
converts one or more (longitude,latitude) pairs to the corresponding (x,y)
coordinates in the \emph{Universal Transverse Mercator} (UTM) projection.
This function requires the \module{pyproj} module for conversion and the
\module{gdal} module to parse the \member{wkt_string} parameter if supplied.
\end{funcdesc}

\subsection{ER Mapper Raster Data}
\emph{ER Mapper} files that contain 2-dimensional raster data may be used for
inversions through the \class{ErMapperData} class which is derived from
\class{DataSource}.
Generally, these datasets consist of two files, a header file and a data file.
The former usually has the \texttt{.ers} file extension and is a text file that
describes the data format, size, coordinate system used etc.
The data file usually has the same file name but no extension.
Note that the current implementation may not work with all \emph{ER Mapper}
datasets. For example, the only cell type understood at the moment is
\emph{IEEE4ByteReal}.
To run inversions on an \emph{ER Mapper} dataset use the following constructor:
\begin{classdesc}{ErMapperData}{datatype, headerfile%
\optional{, datafile=\None}%
\optional{, altitude=0.}}
Creates a new data source from \emph{ER Mapper} data.
The parameter \member{datatype} must be one of
\class{DataSource}.\member{GRAVITY} or \class{DataSource}.\member{MAGNETIC}
depending on the type of data, \member{headerfile} is the name of the header
file, \member{datafile} specifies the name of the data file and
\member{altitude} specifies the altitude in meters of the measurements.
The parameter \member{datafile} can be omitted if the name is identical to
the header file except for the file extension. The \member{altitude} parameter
is only used with 3-dimensional domains and determines the vertical location
of the 2-dimensional slice of data within the domain.
\end{classdesc}

\subsection{NetCDF Data}
The \class{NetCdfData} class from the \module{esys.downunder.datasources} module
provides the means to use data from \netcdf files~\cite{netcdf} for inversion.
Currently, files that follow the \emph{Climate and Forecast (CF)}\footnote{%
\url{http://cf-pcmdi.llnl.gov/documents/cf-conventions/latest-cf-conventions-document-1}}
and/or the \emph{Cooperative Ocean/Atmosphere Research Data Service (COARDS)}\footnote{%
\url{http://ferret.wrc.noaa.gov/noaa_coop/coop_cdf_profile.html}} metadata
conventions are supported.
The example script \examplefile{create_netcdf.py} demonstrates how a compatible
file can be generated from within \python (provided the \module{scipy} module
is available).
To plot such an input file including coordinates and legend using
\emph{matplotlib}~\cite{matplotlib} see the script \examplefile{show_netcdf.py}.
The interface to \class{NetCdfData} looks as follows:

\begin{classdesc}{NetCdfData}{datatype, filename%
\optional{, altitude=0.}}
Creates a new data source from compatible \netcdf data.
The parameter \member{datatype} must be one of
\class{DataSource}.\member{GRAVITY} or \class{DataSource}.\member{MAGNETIC}
depending on the type of data, \member{filename} is the name of the file and
\member{altitude} specifies the altitude in meters of the measurements.
The \member{altitude} parameter is only used with 3-dimensional domains and
determines the vertical location of the 2-dimensional slice of data within the
domain.
\end{classdesc}

\subsection{Synthetic Data}
As a special case the \module{esys.downunder.datasources} module contains
classes to generate input data for inversions by solving a forward model with
user-defined reference data.
The main purpose of synthetic data is to test the capabilities of the
inversion module or to track down problems.

The base class for synthetic data, which is derived from \class{DataSource},
has the following interface:

\begin{classdesc}{SyntheticDataBase}{datatype%
\optional{, DIM=2}%
\optional{, number_of_elements=10}%
\optional{, length=1*U.km}%
\optional{, B_b=\None}%
\optional{, data_offset=0}%
\optional{, full_knowledge=\False}%
\optional{, spherical=\False}}
Base class to define reference data based on a given property distribution
(density or susceptibility). Data are collected from a square region of
vertical extent \member{length} on a grid with \member{number_of_elements}
cells in each direction.
The synthetic data are constructed by solving the appropriate forward problem.
Data can be sampled with an offset from the surface at $z=0$ or using the
entire subsurface region.
\end{classdesc}

\begin{methoddesc}[SyntheticDataBase]{getReferenceProperty}{%
\optional{domain=\None}}
Returns the reference \Data object that was used to generate the gravity or
magnetic anomaly data. The \member{domain} argument must be present
when this method is called for the first time but not necessarily in
subsequent calls.
\end{methoddesc}

