# Diff of /trunk/doc/user/execute.tex

revision 2923 by jfenwick, Thu Feb 4 04:05:36 2010 UTC revision 3291 by caltinay, Thu Oct 21 00:37:42 2010 UTC
# Line 2  Line 2
2  \label{EXECUTION}  \label{EXECUTION}
3
4  \section{Overview}  \section{Overview}
5  A typical way of starting your {\it escript} script \file{myscript.py} is with the \program{run-escript} command\index{run-escript}\footnote{The \program{run-escript} launcher is not supported under \WINDOWS yet.}  A typical way of starting your {\it escript} script \file{myscript.py} is with the \program{run-escript} command\index{run-escript}\footnote{The \program{run-escript} launcher is not supported under \WINDOWS yet.}.
6  This command was renamed from \program{escript} (used in previous releases) to avoid clashing with an unrelated program installed by default on  This command was renamed from \program{escript} (used in previous releases) to
7  some systems.  avoid clashing with an unrelated program installed by default on some systems.
8  Most 3.1~releases\footnote{ie. not \WINDOWS or Ubuntu~9.10} of \escript allow either \program{run-escript} or \program{escript} to be used but the latter name will be removed in future releases:  Most 3.1 releases\footnote{i.e. not \WINDOWS or Ubuntu 9.10} of \escript allow
9  \begin{verbatim}  either \program{run-escript} or \program{escript} to be used but the latter
10  escript myscript.py  name will be removed in future releases. To run your script, issue\footnote{For
11  \end{verbatim}  this discussion, it is assumed that \program{run-escript} is included in
12  as already shown in section~\ref{FirstSteps}\footnote{For this discussion, it is assumed that \program{run-escript} is included in your \env{PATH} environment. See installation guide for details.}  your \env{PATH} environment. See the installation guide for details.}
13  . In some cases  \begin{verbatim}
14  it can be useful to work interactively e.g. when debugging a script, with the command  run-escript myscript.py
15    \end{verbatim}
16    as already shown in \Sec{FirstSteps}.
17    In some cases it can be useful to work interactively, e.g. when debugging a
18    script, with the command
19  \begin{verbatim}  \begin{verbatim}
20  run-escript -i myscript.py  run-escript -i myscript.py
21  \end{verbatim}  \end{verbatim}
22  This will execute \var{myscript.py} and when it completes (or an error occurs), a \PYTHON prompt will be provided.  This will execute \var{myscript.py} and when it completes (or an error occurs),
23  To leave the prompt press \kbd{Control-d}.  a \PYTHON prompt will be provided.
24    To leave the prompt press \kbd{Control-d} (\kbd{Control-z} on \WINDOWS).
25
26  To start  To run the script using four threads (e.g. if you have a multi-core processor)
27  \program{run-escript} using four threads (eg. if you use a multi-core processor) you can use  you can use
28  \begin{verbatim}  \begin{verbatim}
29  run-escript -t 4 myscript.py  run-escript -t 4 myscript.py
30  \end{verbatim}  \end{verbatim}
31  This will require {\it escript} to be compiled for \OPENMP\cite{OPENMP}.  This requires {\it escript} to be compiled with \OPENMP\cite{OPENMP} support.
32  To start \program{run-escript} using \MPI\cite{MPI} with $8$ processes you use  To run the script using \MPI\cite{MPI} with 8 processes use
33  \begin{verbatim}  \begin{verbatim}
34  run-escript -p 8 myscript.py  run-escript -p 8 myscript.py
35  \end{verbatim}  \end{verbatim}
36  If the processors which are used are multi--core processors or multi--processor shared memory architectures you can use threading in addition to \MPI. For instance to run $8$ \MPI processes with using $4$ threads each, you use the command  If the processors which are used are multi-core processors or you are working
37    on a multi-processor shared memory architecture you can use threading in
38    addition to \MPI.
39    For instance to run 8 \MPI processes with 4 threads each, use the command
40  \begin{verbatim}  \begin{verbatim}
41  run-escript -p 8 -t 4 myscript.py  run-escript -p 8 -t 4 myscript.py
42  \end{verbatim}  \end{verbatim}
43  In the case of a super computer or a cluster, you may wish to distribute the workload over a number of nodes\footnote{For simplicity, we will use the term node to refer to either a node in a super computer or an individual machine in a cluster}.  In the case of a supercomputer or a cluster, you may wish to distribute the
44  For example, to use $8$ nodes, with $4$ \MPI processes per node, write  workload over a number of nodes\footnote{For simplicity, we will use the term
45    \emph{node} to refer to either a node in a supercomputer or an individual
46    machine in a cluster}.
47    For example, to use 8 nodes with 4 \MPI processes per node, write
48  \begin{verbatim}  \begin{verbatim}
49  run-escript -n 8 -p 4 myscript.py  run-escript -n 8 -p 4 myscript.py
50  \end{verbatim}  \end{verbatim}
51  Since threading has some performance advantages over processes, you may specify a number of threads as well.  Since threading has some performance advantages over processes, you may
52    specify a number of threads as well:
53  \begin{verbatim}  \begin{verbatim}
54  run-escript -n 8 -p 4 -t 2 myscript.py  run-escript -n 8 -p 4 -t 2 myscript.py
55  \end{verbatim}  \end{verbatim}
56  This runs the script on $8$ nodes, with $4$ processes per node and $2$ threads per process.  This runs the script on 8 nodes, with 4 processes per node and 2 threads per process.
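The arithmetic behind this last command can be checked with a short sketch. The helper name job_size is hypothetical and not part of escript; its parameters follow the nn, np and nt used for the launcher options.

```python
def job_size(nn, np, nt):
    """Return (total MPI processes, total threads) for a run-escript job.

    nn: number of nodes (-n)
    np: MPI processes per node (-p)
    nt: threads per process (-t)
    Hypothetical helper, for illustration only.
    """
    total_processes = nn * np
    total_threads = total_processes * nt
    return total_processes, total_threads


# Corresponds to: run-escript -n 8 -p 4 -t 2 myscript.py
processes, threads = job_size(8, 4, 2)
print(processes, threads)
```

For the command above this gives 32 MPI processes and 64 threads in total, which is a quick way to check that a request matches the hardware actually available.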
57
58  \section{Options}  \section{Options}
59  The general form of the \program{run-escript} launcher is as follows:  The general form of the \program{run-escript} launcher is as follows:
# Line 68  The general form of the \program{run-esc Line 79  The general form of the \program{run-esc
79  \optional{\var{file}}  \optional{\var{file}}
80  \optional{\var{ARGS}}  \optional{\var{ARGS}}
81
82  where \var{file} is the name of a script, \var{ARGS} are arguments for the script.  where \var{file} is the name of a script and \var{ARGS} are the arguments to
83    be passed to the script.
84  The \program{run-escript} program will import your current environment variables.  The \program{run-escript} program will import your current environment variables.
85  If no \var{file} is given, then you will be given a \PYTHON prompt (see \programopt{-i} for restrictions).  If no \var{file} is given, then you will be presented with a regular \PYTHON
86    prompt (see \programopt{-i} for restrictions).
87
88  The options are used as follows:  The options have the following meaning:
89  \begin{itemize}  \begin{itemize}
90    \item[\programopt{-n} \var{nn}] the number of compute nodes \var{nn} to be used.
91   \item[\programopt{-n} \var{nn}] the number of compute nodes \var{nn} to be used. The total number of process being used is      The total number of processes being used is $\var{nn} \cdot \var{np}$.
92  $\var{nn} \cdot \var{ns}$. This option overwrites the value of the \env{ESCRIPT_NUM_NODES} environment variable.      This option overwrites the value of the \env{ESCRIPT_NUM_NODES}
93  If a hostfile is given, the number of nodes needs to match the number hosts given in the host file.      environment variable.
94  If $\var{nn}>1$ but {\it escript}  is not compiled for \MPI a warning is printed but execution is continued with $\var{nn}=1$. If \programopt{-n} is not set the      If a \var{hostfile} is given (see below), the number of nodes needs to
95  number of hosts in the host file is used. The default value is 1.      match the number of hosts given in that file.
96        If $\var{nn}>1$ but {\it escript} is not compiled for \MPI, a warning is
97        printed but execution is continued with $\var{nn}=1$.
98        If \programopt{-n} is not set the number of hosts in the host file is
99        used. The default value is 1.
100
101  \item[\programopt{-p} \var{np}] the number of MPI processes per node.  The total number of processes to be used is  \item[\programopt{-p} \var{np}] the number of \MPI processes per node.
102  $\var{nn} \cdot \var{np}$. This option overwrites the value of the \env{ESCRIPT_NUM_PROCS} environment variable. If $\var{np}>1$ but {\it escript}  is not compiled for \MPI a warning is printed but execution is continued with $\var{np}=1$. The default value is 1.      The total number of processes to be used is $\var{nn} \cdot \var{np}$.
103        This option overwrites the value of the \env{ESCRIPT_NUM_PROCS}
104   \item[\programopt{-t} \var{nt}] the number of threads used per processes.      environment variable.
105  The option overwrites the value of the \env{ESCRIPT_NUM_THREADS} environment variable.      If $\var{np}>1$ but {\it escript} is not compiled for \MPI, a warning is
106  If $\var{nt}>1$ but {\it escript} is not compiled for \OPENMP a warning is printed but execution is continued with $\var{nt}=1$. The default value is 1.      printed but execution is continued with $\var{np}=1$.
107        The default value is 1.
108   \item[\programopt{-f} \var{hostfile}] the name of a file with a list of host names. Some systems require to specify the addresses or names of the compute nodes where \MPI process should be spawned. The list of addresses or names of the compute nodes is listed in the file with the name \var{hostfile}. If \programopt{-n} is set the
109  the number of different  \item[\programopt{-t} \var{nt}] the number of threads used per process.
110  hosts defined in \var{hostfile} must be equal to the number of requested compute nodes \var{nn}. The option overwrites the value of the \env{ESCRIPT_HOSTFILE} environment variable. By default value no host file is used.      The option overwrites the value of the \env{ESCRIPT_NUM_THREADS}
111   \item[\programopt{-c}] prints the information about the settings used to compile {\it escript} and stops execution..      environment variable.
112   \item[\programopt{-V}] prints the version of {\it escript} and stops execution.      If $\var{nt}>1$ but {\it escript} is not compiled for \OPENMP, a warning
113   \item[\programopt{-h}] prints a help message and stops execution.      is printed but execution is continued with $\var{nt}=1$.
114   \item[\programopt{-i}] executes the script \var{file} and switches to interactive mode after the execution is finished or an exception has occurred. This option is useful for debugging a script. The option cannot be used if more than one process ($\var{nn} \cdot \var{np}>1$) is used.      The default value is 1.
115  \item[\programopt{-b}] do not invoke python. This is used to run non-python programs.
116    \item[\programopt{-f} \var{hostfile}] the name of a file with a list of host names.
117   \item[\programopt{-e}] shows additional environment variables and commands used to set up the \escript environment.      Some systems require you to specify the addresses or names of the compute
118  This option is useful if users wish to execute scripts without using the \program{run-escript} command.      nodes where \MPI processes should be spawned.
119        These addresses or names of the compute nodes are listed in the file with
120   \item[\programopt{-o}] switches on the redirection of output of processors with \MPI rank greater than zero to the files \file{stdout_\var{r}.out} and \file{stderr_\var{r}.out} where \var{r} is the rank of the processor. The option overwrites the value of the \env{ESCRIPT_STDFILES} environment variable      the name \var{hostfile}.
121        If \programopt{-n} is set, the number of different hosts defined in \var{hostfile}
122        must be equal to the number of requested compute nodes \var{nn}.
123        The option overwrites the value of the \env{ESCRIPT_HOSTFILE} environment
124        variable. By default no host file is used.
125
126    \item[\programopt{-c}] prints information about the settings used to compile {\it escript} and stops execution.
127
128    \item[\programopt{-V}] prints the version of {\it escript} and stops execution.
129
130    \item[\programopt{-h}] prints a help message and stops execution.
131
132    \item[\programopt{-i}] executes the script \var{file} and switches to
133        interactive mode after the execution is finished or an exception has occurred.
134        This option is useful for debugging a script.
135        The option cannot be used if more than one process ($\var{nn} \cdot \var{np}>1$) is used.
136
137    \item[\programopt{-b}] do not invoke python. This is used to run non-python
138        programs within an environment set for {\it escript}.
139
140    \item[\programopt{-e}] shows additional environment variables and commands
141        used to set up the {\it escript} environment.
142        This option is useful if users wish to execute scripts without using
143        the \program{run-escript} command.
144
145    \item[\programopt{-o}] enables the redirection of messages printed by
146        processors with \MPI rank greater than zero to the files
147        \file{stdout_\var{r}.out} and \file{stderr_\var{r}.out} where \var{r} is
148        the rank of the processor.
149        The option overwrites the value of the \env{ESCRIPT_STDFILES} environment
150        variable.
151
152  %  \item[\programopt{-x}] interpret \var{file} as an \esysxml \footnote{{\it esysxml} has not been released yet.} task.  %\item[\programopt{-x}] interpret \var{file} as an \esysxml \footnote{{\it esysxml} has not been released yet.} task.
153  % This option is still experimental.  %    This option is still experimental.
154
155   \item[\programopt{-v}] prints some diagnostic information.  \item[\programopt{-v}] prints some diagnostic information.
156  \end{itemize}  \end{itemize}
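The precedence described for -n, -p and -t (a command-line option overwrites the corresponding environment variable, otherwise the default of 1 applies) can be sketched as follows. This is not the launcher's own code: the function effective_setting is hypothetical, treating an empty variable as unset is an assumption, and the host file fallback for -n is deliberately left out.

```python
import os


def effective_setting(option_value, env_name, environ=None, default=1):
    """Pick a setting: command-line option, else environment variable, else default.

    option_value: value given on the command line, or None if the option is absent
    env_name:     e.g. 'ESCRIPT_NUM_THREADS' (name taken from the text above)
    environ:      mapping to read from; defaults to the real environment
    """
    environ = os.environ if environ is None else environ
    if option_value is not None:
        return int(option_value)  # the option overwrites the variable
    if environ.get(env_name):  # assumption: an empty value counts as unset
        return int(environ[env_name])
    return default


env = {"ESCRIPT_NUM_THREADS": "2"}
print(effective_setting(4, "ESCRIPT_NUM_THREADS", env))     # option wins
print(effective_setting(None, "ESCRIPT_NUM_THREADS", env))  # variable is used
print(effective_setting(None, "ESCRIPT_NUM_PROCS", env))    # default applies
```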
157
158  \subsection{Notes}  \subsection{Notes}
159  \begin{itemize}  \begin{itemize}
160   \item Make sure that \program{mpiexec} is in your \env{PATH}.   \item Make sure that \program{mpiexec} is in your \env{PATH} if applicable.
161   \item For MPICH and INTELMPI and for the case a hostfile is present   \item For MPICH and INTELMPI and for the case a hostfile is present
162  \program{run-escript} will start the \program{mpd} demon before execution.  \program{run-escript} will start the \program{mpd} daemon before execution.
163  \end{itemize}  \end{itemize}
164
165  \section{Input and Output}  \section{Input and Output}
166  When \MPI is used on more than one process ($\var{nn} \cdot \var{np} >1$) no input from the standard input is accepted. Standard output on any process other than the master process (\var{rank}=0) will not be available.  When \MPI is used on more than one process ($\var{nn} \cdot \var{np} >1$) no
167    input from the standard input is accepted.
168    Standard output on any process other than the master process (\var{rank}=0)
169    will also not be available.
170  Error output from any processor will be redirected to the node where \program{run-escript} has been invoked.  Error output from any processor will be redirected to the node where \program{run-escript} has been invoked.
171  If the \programopt{-o} or \env{ESCRIPT_STDFILES} is set\footnote{That is, it has a non-empty value.}, then the standard and error output from any process other than the master process will be written to files of the names \file{stdout_\var{r}.out} and \file{stderr_\var{r}.out} (where  If the \programopt{-o} option or \env{ESCRIPT_STDFILES} is set\footnote{That is, it has a non-empty value.},
172  \var{r} is the rank of the process).  then the standard and error output from any process other than the master
173    process will be written to files of the names \file{stdout_\var{r}.out}
174  If files are created or read by individual \MPI processes with information local to the process (e.g in the \function{dump} function)  and more than one process is used ($\var{nn} \cdot \var{np} >1$), the \MPI process rank is appended to the file names.  and \file{stderr_\var{r}.out} (where \var{r} is the rank of the process).
175  This will avoid problems if processes are using a shared file system.
176  Files which collect data which are global for all \MPI processors will be created by the process with \MPI rank 0 only.  If files are created or read by individual \MPI processes with information
177  Users should keep in mind that if the file system is not shared, then a file containing global information  local to the process (e.g. in the \function{dump} function)  and more than one
178  which is read by all processors needs to be copied to the local file system before \program{run-escript} is invoked.  process is used ($\var{nn} \cdot \var{np} >1$), the \MPI process rank is
179    appended to the file names.
180    This is to avoid problems if processes are using a shared file system.
181    Files which collect data that are global for all \MPI processors are created
182    by the process with \MPI rank 0 only.
183    Users should keep in mind that if the file system is not shared among the
184    processes, then a file containing global information which is read by all
185    processors needs to be copied to the local file system(s) before \program{run-escript} is invoked.
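As an illustration of the redirection naming scheme, the following sketch lists the files one would expect after a run with output redirection enabled. The function name is hypothetical, and writing the rank as a plain decimal number without padding is an assumption; ranks are taken to run from 0 to nn*np-1.

```python
def redirected_output_files(nn, np):
    """List (stdout, stderr) file names for every MPI rank greater than zero.

    nn: number of nodes, np: MPI processes per node.
    Rank 0 is the master process and is not redirected.
    """
    total_processes = nn * np
    return [
        (f"stdout_{r}.out", f"stderr_{r}.out")
        for r in range(1, total_processes)
    ]


# Two nodes with two processes each: ranks 1, 2 and 3 are redirected.
for out_name, err_name in redirected_output_files(2, 2):
    print(out_name, err_name)
```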
186

187  \section{Hints for MPI Programming}  \section{Hints for MPI Programming}
188  In general a script based on the \escript module does not require modifications to run under \MPI.  In general a script based on the \escript module does not require
189    modifications to run under \MPI.
190  However, one needs to be careful if other modules are used.  However, one needs to be careful if other modules are used.
191
192  When \MPI is used on more than one process ($\var{nn} \cdot \var{np} >1$) the user needs to keep in mind that several copies of his script are executed at the same time  When \MPI is used on more than one process ($\var{nn} \cdot \var{np} >1$) the
193  \footnote{In case of OpenMP only one copy is running but \escript temporarily spawns threads.} while data exchange is performed through the \escript module.  user needs to keep in mind that several copies of his script are executed at
194    the same time\footnote{In the case of \OPENMP only one copy is running
195    but {\it escript} temporarily spawns threads.} while data exchange is
196    performed through the \escript module.
197
198  This has three main implications:  This has three main implications:
199  \begin{enumerate}  \begin{enumerate}
200   \item most arguments (\var{Data} excluded) should the same values on all processors. eg \var{int}, \var{float}, \var{str}   \item most arguments (\var{Data} excluded) should have the same values on all
201  and \numpy parameters.       processors, e.g. \var{int}, \var{float}, \var{str} and \numpy parameters.
202  \item the same operations will be called on all processors.   \item the same operations will be called on all processors.
203  \item different processors may store different amounts of information.   \item different processors may store different amounts of information.
204  \end{enumerate}  \end{enumerate}
205
206  With a few exceptions\footnote{getTupleForDataPoint}, values of types \var{int}, \var{float}, \var{str}  With a few exceptions\footnote{\var{getTupleForDataPoint}}, values of
207  and \numpy returned by \escript will have the same value on all processors.  types \var{int}, \var{float}, \var{str} and \numpy returned by \escript will
208  If values produced by other modules are used as arguments, the user has to make sure that the argument values are identical  have the same value on all processors.
209   on all processors. For instance, the usage of a random number generator to create argument values bears the risk that  If values produced by other modules are used as arguments, the user has to
210  the value may depend on the processor.  make sure that the argument values are identical on all processors.
211    For instance, the usage of a random number generator to create argument values
212    bears the risk that the value may depend on the processor.
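One common way to reduce the random number risk is to seed the generator with the same constant in every copy of the script. The sketch below uses only the Python standard library; the two calls merely stand in for two \MPI processes, and the names SEED and make_argument are hypothetical.

```python
import random

SEED = 12345  # the same constant appears in every copy of the script


def make_argument(seed=SEED):
    """Draw one pseudo-random argument value from a freshly seeded generator."""
    rng = random.Random(seed)
    return rng.random()


# Each call builds its own generator, as a separate process would.
value_on_rank_0 = make_argument()
value_on_rank_1 = make_argument()
print(value_on_rank_0 == value_on_rank_1)
```

Seeding from the clock or from the process rank would defeat the purpose, since the values could then differ between processors.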
213
214  Some operations in \escript require communication with all processors executing the job.  Some operations in \escript require communication with all processors
215  It is not always obvious which operations these are.  executing the job. It is not always obvious which operations these are.
216  For example, \var{Lsup} returns the largest value on all processors.  For example, \var{Lsup} returns the largest value on all processors.
217  \var{getValue} on \var{Locator} may refer to a value stored on another processor.  \var{getValue} on \var{Locator} may refer to a value stored on another processor.
218  For this reason it is better if scripts do not have conditional operations (which manipulate data) based on which processor the script is on.  For this reason it is better if scripts do not have conditional operations
219    (which manipulate data) based on which processor the script is on.
220  Crashing or hanging scripts can be an indication that this has happened.  Crashing or hanging scripts can be an indication that this has happened.
221
222  It is not always possible to divide data evenly amongst processors.  It is not always possible to divide data evenly amongst processors.
223  In fact some processors might not have any data at all.  In fact some processors might not have any data at all.
224  Try to avoid writing scripts which iterate over data points,  Try to avoid writing scripts which iterate over data points, instead try to
225  instead try to describe the operation you wish to perform as a whole.  describe the operation you wish to perform as a whole.
226
227  Special attention is required when using files on more than one processor as  Special attention is required when using files on more than one processor as
228  several processors access the file at the same time. Opening a file for  several processors access the file at the same time. Opening a file for
229  reading is safe, however the user has to make sure that the variables which are  reading is safe, however the user has to make sure that the variables which
230  set from reading data from files are identical on all processors.  are set from reading data from files are identical on all processors.
231
232  When writing data to a file it is important that only one processor is writing to  When writing data to a file it is important that only one processor is writing
233  the file at any time. As all values in \escript are global it is sufficient  to the file at any time. As all values in \escript are global it is sufficient
234  to write values on the processor with \MPI rank $0$ only.  to write values on the processor with \MPI rank $0$ only.
235  The \class{FileWriter} class provides a convenient way to write global data  The \class{FileWriter} class provides a convenient way to write global data
236  to a simple file.  The following script writes to the file  to a simple file.  The following script writes to the file \file{test.txt} on
237  \var{'test.txt'} on the processor with id $0$ only:  the processor with rank 0 only:
238  \begin{python}  \begin{python}
239  from esys.escript import *    from esys.escript import FileWriter
240  f = FileWriter('test.txt')    f = FileWriter('test.txt')
241  f.write('test message')    f.write('test message')
242  f.close()    f.close()
243  \end{python}  \end{python}
244  We strongly recommend using this class rather than the built-in \function{open}  We strongly recommend using this class rather than \PYTHON's built-in \function{open}
245  function as it will guarantee a script which will run in single processor mode as well as under \MPI.  function as it will guarantee a script which will run in single processor mode
246    as well as under \MPI.
247  If there is the situation that one of the processors is throwing an exception,
248  for instance as opening a file for writing fails, the other processors  If the situation occurs that one of the processors throws an exception, for
249  are not automatically made aware of this since \MPI  instance when opening a file for writing fails, the other processors are not
250  dioes not handle exceptions.  automatically made aware of this since \MPI does not handle exceptions.
251  However, \MPI will terminate the other processes but  However, \MPI will still terminate the other processes but may not inform the
252  may not inform the user of the reason in an obvious way. The user needs to inspect the  user of the reason in an obvious way.
253  error output files to identify the exception.  The user needs to inspect the error output files to identify the exception.
254
255  \section{Lazy Evaluation}  \section{Lazy Evaluation}
256  \label{sec:lazy}  \label{sec:lazy}
257  Escript now supports lazy evaluation~\cite{lazyauspdc}.  Escript now supports lazy evaluation~\cite{lazyauspdc}.
258  Lazy evaluation is when expressions are not evaluated until it is actually needed.  Lazy evaluation is when expressions are not evaluated until they are actually
259  When applied to suitable problems, it can reduce both the memory and cpu time required to  needed.
260  perform a simulation.  When applied to suitable problems, it can reduce both the memory and CPU time
261  This implementation is designed to be as transparent as possible; so significant  required to perform a simulation.
262  alterations to scripts are not required.  This implementation is designed to be as transparent as possible; so
263    significant alterations to scripts are not required.
264
265  \subsection*{How to use it}  \subsection*{How to use it}
266  To have lazy evaluation applied automatically, put the following command in your script  To have lazy evaluation applied automatically, put the following command in
267  after the imports.  your script after the imports.
268
269  \begin{python}  \begin{python}
270  from esys.escript import setEscriptParamInt    from esys.escript import setEscriptParamInt
271  setEscriptParamInt('AUTOLAZY',1)    setEscriptParamInt('AUTOLAZY', 1)
272  \end{python}  \end{python}
273
274  To get greater benefit, some fine tuning may be required.  To get greater benefit, some fine tuning may be required.
275  If your simulation involves iterating for a number of timesteps,  If your simulation involves iterating for a number of time steps,
276  you will probably have some state variables which are updated in  you will probably have some state variables which are updated in
277  each iteration based on their value in the previous iteration.  each iteration based on their value in the previous iteration.
278  For example:  For example,
279
280  \begin{python}  \begin{python}
281  x=f(x_previous)    x=f(x_previous)
282  y=g(x)    y=g(x)
283  z=h(y,x, ...)    z=h(y, x, ...)
284  \end{python}  \end{python}
285
286  Could be modified to:  could be modified to:
287
288  \begin{python}  \begin{python}
289  x=f(x_previous)    x=f(x_previous)
290  resolve(x)    resolve(x)
291  y=g(x)    y=g(x)
292  z=h(y,x, ...)    z=h(y, x, ...)
293  \end{python}  \end{python}
294
295  The resolve command forces x to be evaluated immediately.  The \code{resolve} command forces x to be evaluated immediately.
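The idea of deferring work until a value is needed, and of forcing it early with a resolve step, can be illustrated with a toy class. This is only a sketch of the general technique and not how escript implements lazy evaluation; the class Lazy and the helper f below are hypothetical.

```python
class Lazy:
    """Toy deferred value: stores a function and its inputs until resolved."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args
        self.value = None
        self.resolved = False

    def resolve(self):
        """Evaluate once, resolving any lazy inputs first, and cache the result."""
        if not self.resolved:
            inputs = [a.resolve() if isinstance(a, Lazy) else a for a in self.args]
            self.value = self.func(*inputs)
            self.resolved = True
            self.args = ()  # the expression tree behind this value can be freed
        return self.value


calls = []  # records when f actually runs


def f(v):
    calls.append("f")
    return v + 1


x = Lazy(f, 1)                # nothing is computed yet
y = Lazy(lambda a: a * 2, x)  # y depends on the unevaluated x
x.resolve()                   # forces x now, like resolve(x) in the text
print(y.resolve(), calls)
```

Resolving x early means that later expressions built on x reuse the stored value instead of carrying the whole expression for x along with them.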
296
297  \subsection*{When to use it}  \subsection*{When to use it}
298  We believe that problems involving large domains and complicated expressions  We believe that problems involving large domains and complicated expressions
299  will benefit most from lazy evaluation.  will benefit most from lazy evaluation.
300  In cases where lazy does provide a benefit, larger domains should provide  In cases where lazy evaluation does provide a benefit, larger domains should give a
301  larger benefit.  greater benefit.
302  If you are uncertain, try running a test on a smaller domain.  If you are uncertain, try running a test on a smaller domain first.

303
