
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Copyright (c) 2003-2018 by The University of Queensland
% http://www.uq.edu.au
%
% Primary Business: Queensland, Australia
% Licensed under the Apache License, version 2.0
% http://www.apache.org/licenses/LICENSE-2.0
%
% Development until 2012 by Earth Systems Science Computational Center (ESSCC)
% Development 2012-2013 by School of Earth Sciences
% Development from 2014 by Centre for Geoscience Computing (GeoComp)
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\chapter{Execution of an {\it escript} Script}
\label{EXECUTION}

\section{Overview}
A typical way of starting your {\it escript} script \file{myscript.py} is with
the \program{run-escript} command\index{run-escript}\footnote{The
\program{run-escript} launcher is not supported under \WINDOWS.}.
This command was renamed from \program{escript} (used in previous releases) to
avoid clashing with an unrelated program installed by default on some systems.
To run your script, issue\footnote{For this discussion, it is assumed that
\program{run-escript} is included in your \env{PATH} environment variable. See the
installation guide for details.}
\begin{verbatim}
run-escript myscript.py
\end{verbatim}
as already shown in \Sec{FirstSteps}.
In some cases it can be useful to work interactively, e.g. when debugging a
script, with the command
\begin{verbatim}
run-escript -i myscript.py
\end{verbatim}
This will execute \file{myscript.py} and, when it completes (or an error occurs),
a \PYTHON prompt will be provided.
To leave the prompt press \kbd{Control-d} (\kbd{Control-z} on \WINDOWS).

To run the script using four threads (e.g. if you have a multi-core processor)
you can use
\begin{verbatim}
run-escript -t 4 myscript.py
\end{verbatim}
This requires {\it escript} to be compiled with \OPENMP\cite{OPENMP} support.
To run the script using \MPI\cite{MPI} with 8 processes use
\begin{verbatim}
run-escript -p 8 myscript.py
\end{verbatim}
If the processors used are multi-core processors, or you are working
on a multi-processor shared memory architecture, you can use threading in
addition to \MPI.
For instance, to run 8 \MPI processes with 4 threads each, use the command
\begin{verbatim}
run-escript -p 8 -t 4 myscript.py
\end{verbatim}
In the case of a supercomputer or a cluster, you may wish to distribute the
workload over a number of nodes\footnote{For simplicity, we will use the term
\emph{node} to refer to either a node in a supercomputer or an individual
machine in a cluster.}.
For example, to use 8 nodes with 4 \MPI processes per node, write
\begin{verbatim}
run-escript -n 8 -p 4 myscript.py
\end{verbatim}
Since threading has some performance advantages over processes, you may
specify a number of threads as well:
\begin{verbatim}
run-escript -n 8 -p 2 -t 4 myscript.py
\end{verbatim}
This runs the script on 8 nodes, with 2 processes per node and 4 threads per
process.
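
As a quick check of the arithmetic, the following sketch computes how many
\MPI processes and how many threads in total a given combination of
\programopt{-n}, \programopt{-p} and \programopt{-t} requests.
The helper function is purely illustrative and is not part of {\it escript}.

```python
def job_size(nn=1, np=1, nt=1):
    """Return (processes, threads) requested by run-escript -n nn -p np -t nt.

    Hypothetical helper for illustration only; not an escript function.
    """
    if min(nn, np, nt) < 1:
        raise ValueError("nn, np and nt must all be at least 1")
    processes = nn * np        # np MPI processes on each of the nn nodes
    threads = processes * nt   # nt threads inside every process
    return processes, threads


# Corresponds to: run-escript -n 8 -p 2 -t 4 myscript.py
print(job_size(nn=8, np=2, nt=4))  # (16, 64)
```

For the last command above this gives 16 processes and 64 threads, so the
machine should offer at least 8 cores per node if every thread is to have a
core of its own.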

\section{Options}
The general form of the \program{run-escript} launcher is as follows:

%%%%
% If you are thinking about changing this please remember to update the man page as well
%%%%

\program{run-escript}
\optional{\programopt{-n \var{nn}}}
\optional{\programopt{-p \var{np}}}
\optional{\programopt{-t \var{nt}}}
\optional{\programopt{-f \var{hostfile}}}
\optional{\programopt{-x}}
\optional{\programopt{-V}}
\optional{\programopt{-e}}
\optional{\programopt{-h}}
\optional{\programopt{-v}}
\optional{\programopt{-o}}
\optional{\programopt{-c}}
\optional{\programopt{-i}}
\optional{\programopt{-b}}
\optional{\programopt{-m \var{tool}}}
\optional{\var{file}}
\optional{\var{ARGS}}

where \var{file} is the name of a script and \var{ARGS} are the arguments to
be passed to the script.
The \program{run-escript} program will import your current environment variables.
If no \var{file} is given, then you will be presented with a regular \PYTHON
prompt (see \programopt{-i} for restrictions).

The options have the following meaning:
\begin{itemize}
\item[\programopt{-n} \var{nn}] the number of compute nodes \var{nn} to be used.
    The total number of processes being used is $\var{nn} \cdot \var{np}$.
    This option overrides the value of the \env{ESCRIPT_NUM_NODES}
    environment variable.
    If a \var{hostfile} is given (see below), the number of nodes needs to
    match the number of hosts given in that file.
    If $\var{nn}>1$ but {\it escript} is not compiled for \MPI, a warning is
    printed but execution is continued with $\var{nn}=1$.
    If \programopt{-n} is not set, the number of hosts in the host file is
    used. Otherwise the default value is 1.

\item[\programopt{-p} \var{np}] the number of \MPI processes (per node).
    The total number of processes to be used is $\var{nn} \cdot \var{np}$.
    This option overrides the value of the \env{ESCRIPT_NUM_PROCS}
    environment variable.
    If $\var{np}>1$ but {\it escript} is not compiled for \MPI, a warning is
    printed but execution is continued with $\var{np}=1$.
    The default value is 1.

\item[\programopt{-t} \var{nt}] the number of threads used per process.
    This option overrides the value of the \env{ESCRIPT_NUM_THREADS}
    environment variable.
    If $\var{nt}>1$ but {\it escript} is not compiled for \OPENMP, a warning
    is printed but execution is continued with $\var{nt}=1$.
    The default value is 1.

\item[\programopt{-f} \var{hostfile}] the name of a file with a list of host names.
    Some systems require the addresses or names of the compute nodes on which
    \MPI processes should be spawned to be specified explicitly.
    These addresses or names are listed in the file with
    the name \var{hostfile}.
    If \programopt{-n} is set, the number of different hosts defined in \var{hostfile}
    must be equal to the number of requested compute nodes \var{nn}.
    This option overrides the value of the \env{ESCRIPT_HOSTFILE} environment
    variable. By default no host file is used.

\item[\programopt{-c}] prints information about the settings used to compile {\it escript} and stops execution.

\item[\programopt{-V}] prints the version of {\it escript} and stops execution.

\item[\programopt{-h}] prints a help message and stops execution.

\item[\programopt{-i}] executes the script \var{file} and switches to
    interactive mode after the execution is finished or an exception has occurred.
    This option is useful for debugging a script.
    The option cannot be used if more than one process ($\var{nn} \cdot \var{np}>1$) is used.

\item[\programopt{-b}] does not invoke \PYTHON. This is used to run non-\PYTHON
    programs within an environment set up for {\it escript}.

\item[\programopt{-e}] shows additional environment variables and commands
    used to set up the {\it escript} environment.
    This option is useful if users wish to execute scripts without using
    the \program{run-escript} command.

\item[\programopt{-o}] enables the redirection of messages printed by
    processes with \MPI rank greater than zero to the files
    \file{stdout_\var{r}.out} and \file{stderr_\var{r}.out} where \var{r} is
    the rank of the process.
    This option overrides the value of the \env{ESCRIPT_STDFILES} environment
    variable.

\item[\programopt{-x}] runs everything within a new \emph{xterm} instance.

\item[\programopt{-v}] prints some diagnostic information.

\item[\programopt{-m} \var{tool}] runs under \emph{valgrind}. The argument
    \var{tool} must be one of \var{m} (for memcheck), \var{c} (for callgrind),
    or \var{h} (for cachegrind). Valgrind output is written to a file under
    \file{valgrind_logs} as reported when {\it escript} terminates.
\end{itemize}
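
Two of the restrictions listed above can be checked before a job is started:
\programopt{-i} cannot be combined with more than one process, and the number
of distinct hosts in the host file must equal \var{nn} when \programopt{-n} is
set.
The following is a minimal sketch of such a check.
The function is hypothetical (it is not part of \program{run-escript}), and it
assumes that the host names have already been read from the host file into a
list.

```python
def check_options(nn=None, np=1, interactive=False, hosts=None):
    """Return a list of problems with a planned run-escript invocation.

    Hypothetical helper, not part of run-escript.
    nn is None when -n is not given; hosts is the list of host names taken
    from the host file, or None when no host file is used.
    """
    problems = []
    distinct = len(set(hosts)) if hosts is not None else None
    if nn is None:
        # -n not set: use the number of hosts, or the default of 1
        nn = distinct if distinct else 1
    elif distinct is not None and distinct != nn:
        problems.append("host file lists %d distinct hosts but -n is %d"
                        % (distinct, nn))
    if interactive and nn * np > 1:
        problems.append("-i cannot be used with more than one process")
    return problems


print(check_options(nn=2, np=4, interactive=True, hosts=["node1", "node2"]))
# ['-i cannot be used with more than one process']
```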

\subsection{Notes}
The \program{run-escript} script is now generated at build time taking into
account the \var{prelaunch}, \var{launcher}, and \var{postlaunch} settings
passed to \program{scons}. This makes it possible to easily customize the
script for different environments, such as batch systems (PBS, SLURM) and
different implementations of \MPI (Intel, SGI, OpenMPI, etc.).

\section{Input and Output}
When \MPI is used on more than one process ($\var{nn} \cdot \var{np} >1$) no
input from the standard input is accepted.
Standard output on any process other than the master process (\var{rank}=0)
will be silently discarded by default.
Error output from any process will be redirected to the node where \program{run-escript} has been invoked.
If the \programopt{-o} option or \env{ESCRIPT_STDFILES} is set\footnote{That is, it has a non-empty value.},
then the standard and error output from any process other than the master
process will be written to files named \file{stdout_R.out}
and \file{stderr_R.out} (where \var{R} is the rank of the process).
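
To make the naming scheme concrete, the sketch below lists the files that
would be expected for a given job size.
It assumes that ranks are numbered consecutively from 0 to
$\var{nn} \cdot \var{np} - 1$ and that the rank appears in the file name as a
plain decimal number; the function itself is illustrative only.

```python
def redirect_files(nn, np):
    """Map every non-master MPI rank to its (stdout, stderr) file names.

    Illustrative only: assumes ranks 0 .. nn*np-1 and decimal rank numbers.
    """
    total = nn * np
    # Rank 0 is the master process; its output is not redirected.
    return {rank: ("stdout_%d.out" % rank, "stderr_%d.out" % rank)
            for rank in range(1, total)}


print(redirect_files(nn=1, np=3))
# {1: ('stdout_1.out', 'stderr_1.out'), 2: ('stdout_2.out', 'stderr_2.out')}
```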

If files are created or read by individual \MPI processes with information
local to the process (e.g. in the \function{dump} function) and more than one
process is used ($\var{nn} \cdot \var{np} >1$), the \MPI process rank is
appended to the file names.
This is to avoid problems if processes are using a shared file system.
Files which collect data that are global for all \MPI processes are created
by the process with \MPI rank 0 only.
Users should keep in mind that if the file system is not shared among the
processes, then a file containing global information which is read by all
processes needs to be copied to the local file system(s) before \program{run-escript} is invoked.

\section{Hints for MPI Programming}
In general a script based on the \escript module does not require
modifications to run under \MPI.
However, one needs to be careful if other modules are used.

When \MPI is used on more than one process ($\var{nn} \cdot \var{np} >1$) the
user needs to keep in mind that several copies of the script are executed at
the same time\footnote{In the case of \OPENMP only one copy is running
but {\it escript} temporarily spawns threads.} while data exchange is
performed through the \escript module.

This has three main implications:
\begin{enumerate}
\item most arguments (\var{Data} excluded) should have the same values on all
    processors, e.g. \var{int}, \var{float}, \var{str} and \numpy parameters.
\item the same operations will be called on all processors.
\item different processors may store different amounts of information.
\end{enumerate}

With a few exceptions\footnote{\var{getTupleForDataPoint}}, values of
types \var{int}, \var{float}, \var{str} and \numpy returned by \escript will
have the same value on all processors.
If values produced by other modules are used as arguments, the user has to
make sure that the argument values are identical on all processors.
For instance, using a random number generator to create argument values
carries the risk that the value may differ from processor to processor.
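
One way to reduce this risk is to seed the generator explicitly with the same
constant on every process, rather than relying on the default seeding.
The sketch below uses \PYTHON's standard \var{random} module; the seed value
and the function name are arbitrary choices made for illustration, and the
approach assumes that all processes run the same \PYTHON version.

```python
import random

SEED = 42  # arbitrary, but must be the same constant on every process


def make_parameters(count, seed=SEED):
    """Return `count` pseudo-random floats that depend only on the seed."""
    # A private generator avoids interference from other users of the
    # module-level random state.
    rng = random.Random(seed)
    return [rng.uniform(0.0, 1.0) for _ in range(count)]


# Two independent calls stand in for two MPI processes running the same script.
first = make_parameters(3)
second = make_parameters(3)
print(first == second)  # True
```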

Some operations in \escript require communication with all processors
executing the job. It is not always obvious which operations these are.
For example, \var{Lsup} returns the largest value across all processors, and
\var{getValue} on \var{Locator} may refer to a value stored on another processor.
For this reason it is better if scripts do not have conditional operations
(which manipulate data) based on which processor the script is on.
Crashing or hanging scripts can be an indication that this has happened.

It is not always possible to divide data evenly amongst processors.
In fact some processors might not have any data at all.
Try to avoid writing scripts which iterate over data points; instead, try to
describe the operation you wish to perform as a whole.

Special attention is required when using files on more than one processor as
several processors access the file at the same time. Opening a file for
reading is safe; however, the user has to make sure that the variables which
are set from reading data from files are identical on all processors.

When writing data to a file it is important that only one processor is writing
to the file at any time. As all values in \escript are global, it is sufficient
to write values on the processor with \MPI rank $0$ only.
The \class{FileWriter} class provides a convenient way to write global data
to a simple file. The following script writes to the file \file{test.txt} on
the processor with rank 0 only:
\begin{python}
  from esys.escript import FileWriter
  # the actual write is performed on the processor with MPI rank 0 only
  f = FileWriter('test.txt')
  f.write('test message')
  f.close()
\end{python}
We strongly recommend using this class rather than \PYTHON's built-in \function{open}
function as it will guarantee a script which will run in single processor mode
as well as under \MPI.

If one of the processors throws an exception, for
instance when opening a file for writing fails, the other processors are not
automatically made aware of this since \MPI does not handle exceptions.
However, \MPI will still terminate the other processes but may not inform the
user of the reason in an obvious way.
The user needs to inspect the error output files to identify the exception.
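
When the job was run with \programopt{-o}, so that the error output of the
non-master processes has been written to \file{stderr_R.out} files, this
inspection can be partly automated.
The following sketch scans such files in a directory for \PYTHON tracebacks.
The function name is hypothetical, and the sketch assumes that the traceback
is the last thing written to each file, so that the final non-empty line names
the exception.

```python
import glob
import os


def find_exceptions(directory="."):
    """Return {file name: last line} for stderr files containing a traceback.

    Hypothetical helper; assumes files named stderr_R.out as written when
    output redirection is enabled.
    """
    found = {}
    pattern = os.path.join(directory, "stderr_*.out")
    for path in sorted(glob.glob(pattern)):
        with open(path, errors="replace") as handle:
            lines = [line.rstrip() for line in handle if line.strip()]
        if any(line.startswith("Traceback") for line in lines):
            # The final non-empty line of a Python traceback names the exception.
            found[os.path.basename(path)] = lines[-1]
    return found


# Report tracebacks found in the current working directory, if any.
print(find_exceptions("."))
```

Error output from the master process is not covered by this sketch, as it is
not redirected to a file.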
