/[escript]/branches/clazy/escriptcore/src/DataLazy.cpp
ViewVC logotype

Annotation of /branches/clazy/escriptcore/src/DataLazy.cpp

Parent Directory Parent Directory | Revision Log Revision Log


Revision 6174 - (hide annotations)
Fri Apr 15 03:41:03 2016 UTC (2 years, 11 months ago) by caltinay
Original Path: trunk/escriptcore/src/DataLazy.cpp
File size: 62471 byte(s)
Some reorganisation of EscriptParams and related changes.
We now distinguish between modifiable parameters and build features.
To interrogate the latter use:
escript::hasFeature()  [ python: escript.hasFeature() ]
and
escript::listFeatures() [ python: escript.listFeatures() ]

I have decided to add specific getters for the remaining few parameters to
avoid the penalty of string comparisons and to remove the wealth of friend
declarations with Data* classes.

Also made some changes in SConstruct to make sure we set *all* preprocessor
directives *before* building anything.

Still to do is dealing with checks for direct solver and gmsh. The latter needs
to be a runtime check rather than a compile time check.

1 jfenwick 1865
2 jfenwick 3981 /*****************************************************************************
3 jfenwick 1865 *
4 jfenwick 5863 * Copyright (c) 2003-2016 by The University of Queensland
5 jfenwick 3981 * http://www.uq.edu.au
6 jfenwick 1865 *
7     * Primary Business: Queensland, Australia
8 jfenwick 6112 * Licensed under the Apache License, version 2.0
9     * http://www.apache.org/licenses/LICENSE-2.0
10 jfenwick 1865 *
11 jfenwick 3981 * Development until 2012 by Earth Systems Science Computational Center (ESSCC)
12 jfenwick 4657 * Development 2012-2013 by School of Earth Sciences
13     * Development from 2014 by Centre for Geoscience Computing (GeoComp)
14 jfenwick 3981 *
15     *****************************************************************************/
16 jfenwick 1865
17     #include "DataLazy.h"
18 caltinay 5972 #include "Data.h"
19     #include "DataTypes.h"
20     #include "EscriptParams.h"
21     #include "FunctionSpace.h"
22     #include "Utils.h"
23 jfenwick 6084 #include "DataVectorOps.h"
24 caltinay 5972
25 caltinay 5997 #include <iomanip> // for some fancy formatting in debug
26 jfenwick 2199
27 jfenwick 5938 using namespace escript::DataTypes;
28    
29 caltinay 5972 #define NO_ARG
30    
31 jfenwick 2157 // #define LAZYDEBUG(X) if (privdebug){X;}
32 jfenwick 2092 #define LAZYDEBUG(X)
33 jfenwick 2157 namespace
34     {
35     bool privdebug=false;
36 jfenwick 2092
37 jfenwick 2157 #define ENABLEDEBUG privdebug=true;
38     #define DISABLEDEBUG privdebug=false;
39     }
40    
41 caltinay 6174 //#define SIZELIMIT if ((m_height>escript::escriptParams.getInt("TOO_MANY_LEVELS")) || (m_children>escript::escriptParams.getInt("TOO_MANY_NODES"))) {cerr << "\n!!!!!!! SIZE LIMIT EXCEEDED " << m_children << ";" << m_height << endl << toString() << endl;resolveToIdentity();}
42 jfenwick 2177
43 caltinay 6174 //#define SIZELIMIT if ((m_height>escript::escriptParams.getInt("TOO_MANY_LEVELS")) || (m_children>escript::escriptParams.getInt("TOO_MANY_NODES"))) {cerr << "SIZE LIMIT EXCEEDED " << m_height << endl;resolveToIdentity();}
44 jfenwick 2472
45 jfenwick 2501
46 caltinay 6174 #define SIZELIMIT \
47     if (m_height > escript::escriptParams.getTooManyLevels()) {\
48     if (escript::escriptParams.getLazyVerbose()) {\
49     cerr << "SIZE LIMIT EXCEEDED height=" << m_height << endl;\
50     }\
51     resolveToIdentity();\
52     }
53 jfenwick 2795
54 jfenwick 1899 /*
55     How does DataLazy work?
56     ~~~~~~~~~~~~~~~~~~~~~~~
57    
58     Each instance represents a single operation on one or two other DataLazy instances. These arguments are normally
59     denoted left and right.
60    
61     A special operation, IDENTITY, stores an instance of DataReady in the m_id member.
62     This means that all "internal" nodes in the structure are instances of DataLazy.
63    
64     Each operation has a string representation as well as an opgroup - eg G_IDENTITY, G_BINARY, ...
65 caltinay 4286 Note that IDENTITY is not considered a unary operation.
66 jfenwick 1899
67     I am avoiding calling the structure formed a tree because it is not guaranteed to be one (eg c=a+a).
68     It must however form a DAG (directed acyclic graph).
69     I will refer to individual DataLazy objects with the structure as nodes.
70    
71     Each node also stores:
72     - m_readytype \in {'E','T','C','?'} ~ indicates what sort of DataReady would be produced if the expression was
73 caltinay 5972 evaluated.
74 caltinay 4286 - m_buffsrequired ~ the large number of samples which would need to be kept simultaneously in order to
75 caltinay 5972 evaluate the expression.
76 jfenwick 1899 - m_samplesize ~ the number of doubles stored in a sample.
77    
78     When a new node is created, the above values are computed based on the values in the child nodes.
79     Eg: if left requires 4 samples and right requires 6 then left+right requires 7 samples.
80    
81     The resolve method, which produces a DataReady from a DataLazy, does the following:
82     1) Create a DataReady to hold the new result.
83     2) Allocate a vector (v) big enough to hold m_buffsrequired samples.
84     3) For each sample, call resolveSample with v, to get its values and copy them into the result object.
85    
86     (In the case of OMP, multiple samples are resolved in parallel so the vector needs to be larger.)
87    
88     resolveSample returns a Vector* and an offset within that vector where the result is stored.
89     Normally, this would be v, but for identity nodes their internal vector is returned instead.
90    
91     The convention that I use, is that the resolve methods should store their results starting at the offset they are passed.
92    
93     For expressions which evaluate to Constant or Tagged, there is a different evaluation method.
94     The collapse method invokes the (non-lazy) operations on the Data class to evaluate the expression.
95 jfenwick 2037
96 jfenwick 2147 To add a new operator you need to do the following (plus anything I might have forgotten - adding a new group for example):
97 jfenwick 2037 1) Add to the ES_optype.
98     2) determine what opgroup your operation belongs to (X)
99     3) add a string for the op to the end of ES_opstrings
100     4) increase ES_opcount
101     5) add an entry (X) to opgroups
102     6) add an entry to the switch in collapseToReady
103     7) add an entry to resolveX
104 jfenwick 1899 */
105    
106    
107 jfenwick 1865 using namespace std;
108 jfenwick 1868 using namespace boost;
109 jfenwick 1865
110     namespace escript
111     {
112    
113     namespace
114     {
115    
116 jfenwick 2777
117     // enabling this will print out when ever the maximum stacksize used by resolve increases
118     // it assumes _OPENMP is also in use
119     //#define LAZY_STACK_PROF
120    
121    
122    
123     #ifndef _OPENMP
124     #ifdef LAZY_STACK_PROF
125     #undef LAZY_STACK_PROF
126     #endif
127     #endif
128    
129    
130     #ifdef LAZY_STACK_PROF
131     std::vector<void*> stackstart(getNumberOfThreads());
132     std::vector<void*> stackend(getNumberOfThreads());
133     size_t maxstackuse=0;
134     #endif
135    
136 jfenwick 1886
137 jfenwick 1865 // return the FunctionSpace of the result of "left op right"
138     FunctionSpace
139     resultFS(DataAbstract_ptr left, DataAbstract_ptr right, ES_optype op)
140     {
141 caltinay 5972 // perhaps this should call interpolate and throw or something?
142     // maybe we need an interpolate node -
143     // that way, if interpolate is required in any other op we can just throw a
144     // programming error exception.
145 jfenwick 1879
146 jfenwick 1943 FunctionSpace l=left->getFunctionSpace();
147     FunctionSpace r=right->getFunctionSpace();
148     if (l!=r)
149     {
150 jfenwick 4264 signed char res=r.getDomain()->preferredInterpolationOnDomain(r.getTypeCode(), l.getTypeCode());
151     if (res==1)
152 jfenwick 1943 {
153 caltinay 5972 return l;
154 jfenwick 1943 }
155 jfenwick 4264 if (res==-1)
156 jfenwick 1943 {
157 caltinay 5972 return r;
158 jfenwick 1943 }
159     throw DataException("Cannot interpolate between the FunctionSpaces given for operation "+opToString(op)+".");
160     }
161     return l;
162 jfenwick 1865 }
163    
164     // return the shape of the result of "left op right"
165 jfenwick 2066 // the shapes resulting from tensor product are more complex to compute so are worked out elsewhere
166 jfenwick 1865 DataTypes::ShapeType
167     resultShape(DataAbstract_ptr left, DataAbstract_ptr right, ES_optype op)
168     {
169 caltinay 5972 if (left->getShape()!=right->getShape())
170     {
171     if ((getOpgroup(op)!=G_BINARY) && (getOpgroup(op)!=G_NP1OUT))
172     {
173     throw DataException("Shapes not the name - shapes must match for (point)binary operations.");
174     }
175 jfenwick 2721
176 caltinay 5972 if (left->getRank()==0) // we need to allow scalar * anything
177     {
178     return right->getShape();
179     }
180     if (right->getRank()==0)
181     {
182     return left->getShape();
183     }
184     throw DataException("Shapes not the same - arguments must have matching shapes (or be scalars) for (point)binary operations on lazy data.");
185     }
186     return left->getShape();
187 jfenwick 1865 }
188    
189 jfenwick 2084 // return the shape for "op left"
190    
191     DataTypes::ShapeType
192 jfenwick 2199 resultShape(DataAbstract_ptr left, ES_optype op, int axis_offset)
193 jfenwick 2084 {
194 caltinay 5972 switch(op)
195     {
196     case TRANS:
197     { // for the scoping of variables
198     const DataTypes::ShapeType& s=left->getShape();
199     DataTypes::ShapeType sh;
200     int rank=left->getRank();
201     if (axis_offset<0 || axis_offset>rank)
202     {
203 caltinay 2779 stringstream e;
204     e << "Error - Data::transpose must have 0 <= axis_offset <= rank=" << rank;
205     throw DataException(e.str());
206 caltinay 5972 }
207     for (int i=0; i<rank; i++)
208     {
209     int index = (axis_offset+i)%rank;
210     sh.push_back(s[index]); // Append to new shape
211     }
212     return sh;
213     }
214     break;
215     case TRACE:
216     {
217     int rank=left->getRank();
218     if (rank<2)
219     {
220     throw DataException("Trace can only be computed for objects with rank 2 or greater.");
221     }
222     if ((axis_offset>rank-2) || (axis_offset<0))
223     {
224     throw DataException("Trace: axis offset must lie between 0 and rank-2 inclusive.");
225     }
226     if (rank==2)
227     {
228     return DataTypes::scalarShape;
229     }
230     else if (rank==3)
231     {
232     DataTypes::ShapeType sh;
233     if (axis_offset==0)
234     {
235     sh.push_back(left->getShape()[2]);
236     }
237     else // offset==1
238     {
239     sh.push_back(left->getShape()[0]);
240     }
241     return sh;
242     }
243     else if (rank==4)
244     {
245     DataTypes::ShapeType sh;
246     const DataTypes::ShapeType& s=left->getShape();
247     if (axis_offset==0)
248     {
249     sh.push_back(s[2]);
250     sh.push_back(s[3]);
251     }
252     else if (axis_offset==1)
253     {
254     sh.push_back(s[0]);
255     sh.push_back(s[3]);
256     }
257     else // offset==2
258     {
259     sh.push_back(s[0]);
260     sh.push_back(s[1]);
261     }
262     return sh;
263     }
264     else // unknown rank
265     {
266     throw DataException("Error - Data::trace can only be calculated for rank 2, 3 or 4 object.");
267     }
268     }
269     break;
270     default:
271     throw DataException("Programmer error - resultShape(left,op) can't compute shapes for operator "+opToString(op)+".");
272     }
273 jfenwick 2084 }
274    
275 jfenwick 2496 DataTypes::ShapeType
276     SwapShape(DataAbstract_ptr left, const int axis0, const int axis1)
277     {
278     // This code taken from the Data.cpp swapaxes() method
279     // Some of the checks are probably redundant here
280     int axis0_tmp,axis1_tmp;
281     const DataTypes::ShapeType& s=left->getShape();
282     DataTypes::ShapeType out_shape;
283     // Here's the equivalent of python s_out=s[axis_offset:]+s[:axis_offset]
284     // which goes thru all shape vector elements starting with axis_offset (at index=rank wrap around to 0)
285     int rank=left->getRank();
286     if (rank<2) {
287     throw DataException("Error - Data::swapaxes argument must have at least rank 2.");
288     }
289     if (axis0<0 || axis0>rank-1) {
290 caltinay 2779 stringstream e;
291     e << "Error - Data::swapaxes: axis0 must be between 0 and rank-1=" << (rank-1);
292     throw DataException(e.str());
293 jfenwick 2496 }
294     if (axis1<0 || axis1>rank-1) {
295 caltinay 2782 stringstream e;
296 caltinay 2779 e << "Error - Data::swapaxes: axis1 must be between 0 and rank-1=" << (rank-1);
297     throw DataException(e.str());
298 jfenwick 2496 }
299     if (axis0 == axis1) {
300     throw DataException("Error - Data::swapaxes: axis indices must be different.");
301     }
302     if (axis0 > axis1) {
303     axis0_tmp=axis1;
304     axis1_tmp=axis0;
305     } else {
306     axis0_tmp=axis0;
307     axis1_tmp=axis1;
308     }
309     for (int i=0; i<rank; i++) {
310     if (i == axis0_tmp) {
311     out_shape.push_back(s[axis1_tmp]);
312     } else if (i == axis1_tmp) {
313     out_shape.push_back(s[axis0_tmp]);
314     } else {
315     out_shape.push_back(s[i]);
316     }
317     }
318     return out_shape;
319     }
320    
321    
322 jfenwick 2066 // determine the output shape for the general tensor product operation
323     // the additional parameters return information required later for the product
324     // the majority of this code is copy pasted from C_General_Tensor_Product
325     DataTypes::ShapeType
326     GTPShape(DataAbstract_ptr left, DataAbstract_ptr right, int axis_offset, int transpose, int& SL, int& SM, int& SR)
327 jfenwick 1865 {
328 caltinay 5972
329 jfenwick 2066 // Get rank and shape of inputs
330     int rank0 = left->getRank();
331     int rank1 = right->getRank();
332     const DataTypes::ShapeType& shape0 = left->getShape();
333     const DataTypes::ShapeType& shape1 = right->getShape();
334    
335     // Prepare for the loops of the product and verify compatibility of shapes
336     int start0=0, start1=0;
337 caltinay 5972 if (transpose == 0) {}
338     else if (transpose == 1) { start0 = axis_offset; }
339     else if (transpose == 2) { start1 = rank1-axis_offset; }
340     else { throw DataException("DataLazy GeneralTensorProduct Constructor: Error - transpose should be 0, 1 or 2"); }
341 jfenwick 2066
342 jfenwick 2085 if (rank0<axis_offset)
343     {
344 caltinay 5972 throw DataException("DataLazy GeneralTensorProduct Constructor: Error - rank of left < axisoffset");
345 jfenwick 2085 }
346 jfenwick 2066
347     // Adjust the shapes for transpose
348 caltinay 5972 DataTypes::ShapeType tmpShape0(rank0); // pre-sizing the vectors rather
349     DataTypes::ShapeType tmpShape1(rank1); // than using push_back
350     for (int i=0; i<rank0; i++) { tmpShape0[i]=shape0[(i+start0)%rank0]; }
351     for (int i=0; i<rank1; i++) { tmpShape1[i]=shape1[(i+start1)%rank1]; }
352 jfenwick 2066
353     // Prepare for the loops of the product
354     SL=1, SM=1, SR=1;
355 caltinay 5972 for (int i=0; i<rank0-axis_offset; i++) {
356 jfenwick 2066 SL *= tmpShape0[i];
357     }
358 caltinay 5972 for (int i=rank0-axis_offset; i<rank0; i++) {
359 jfenwick 2066 if (tmpShape0[i] != tmpShape1[i-(rank0-axis_offset)]) {
360     throw DataException("C_GeneralTensorProduct: Error - incompatible shapes");
361     }
362     SM *= tmpShape0[i];
363     }
364 caltinay 5972 for (int i=axis_offset; i<rank1; i++) {
365 jfenwick 2066 SR *= tmpShape1[i];
366     }
367    
368     // Define the shape of the output (rank of shape is the sum of the loop ranges below)
369 caltinay 5972 DataTypes::ShapeType shape2(rank0+rank1-2*axis_offset);
370     { // block to limit the scope of out_index
371 jfenwick 2066 int out_index=0;
372     for (int i=0; i<rank0-axis_offset; i++, ++out_index) { shape2[out_index]=tmpShape0[i]; } // First part of arg_0_Z
373     for (int i=axis_offset; i<rank1; i++, ++out_index) { shape2[out_index]=tmpShape1[i]; } // Last part of arg_1_Z
374     }
375 jfenwick 2086
376     if (shape2.size()>ESCRIPT_MAX_DATA_RANK)
377     {
378     ostringstream os;
379     os << "C_GeneralTensorProduct: Error - Attempt to create a rank " << shape2.size() << " object. The maximum rank is " << ESCRIPT_MAX_DATA_RANK << ".";
380     throw DataException(os.str());
381     }
382    
383 jfenwick 2066 return shape2;
384 jfenwick 1865 }
385    
386 caltinay 5972 } // end anonymous namespace
387 jfenwick 1865
388 jfenwick 2500 void DataLazy::LazyNodeSetup()
389     {
390     #ifdef _OPENMP
391     int numthreads=omp_get_max_threads();
392     m_samples.resize(numthreads*m_samplesize);
393     m_sampleids=new int[numthreads];
394     for (int i=0;i<numthreads;++i)
395     {
396     m_sampleids[i]=-1;
397     }
398     #else
399     m_samples.resize(m_samplesize);
400     m_sampleids=new int[1];
401     m_sampleids[0]=-1;
402     #endif // _OPENMP
403     }
404 jfenwick 1865
405 jfenwick 2500
406     // Creates an identity node
407 jfenwick 1865 DataLazy::DataLazy(DataAbstract_ptr p)
408 caltinay 5972 : parent(p->getFunctionSpace(),p->getShape())
409     ,m_sampleids(0),
410     m_samples(1)
411 jfenwick 1865 {
412 jfenwick 1879 if (p->isLazy())
413     {
414 caltinay 5972 // I don't want identity of Lazy.
415     // Question: Why would that be so bad?
416     // Answer: We assume that the child of ID is something we can call getVector on
417     throw DataException("Programmer error - attempt to create identity from a DataLazy.");
418 jfenwick 1879 }
419     else
420     {
421 caltinay 5972 DataReady_ptr dr=dynamic_pointer_cast<DataReady>(p);
422     makeIdentity(dr);
423 jfenwick 2199 LAZYDEBUG(cout << "Wrapping " << dr.get() << " id=" << m_id.get() << endl;)
424 jfenwick 1879 }
425 jfenwick 2092 LAZYDEBUG(cout << "(1)Lazy created with " << m_samplesize << endl;)
426 jfenwick 1865 }
427    
428 jfenwick 1886 DataLazy::DataLazy(DataAbstract_ptr left, ES_optype op)
429 caltinay 5972 : parent(left->getFunctionSpace(),(getOpgroup(op)!=G_REDUCTION)?left->getShape():DataTypes::scalarShape),
430     m_op(op),
431     m_axis_offset(0),
432     m_transpose(0),
433     m_SL(0), m_SM(0), m_SR(0)
434 jfenwick 1886 {
435 jfenwick 2721 if ((getOpgroup(op)!=G_UNARY) && (getOpgroup(op)!=G_NP1OUT) && (getOpgroup(op)!=G_REDUCTION))
436 jfenwick 1886 {
437 caltinay 5972 throw DataException("Programmer error - constructor DataLazy(left, op) will only process UNARY operations.");
438 jfenwick 1886 }
439 jfenwick 2066
440 jfenwick 1886 DataLazy_ptr lleft;
441     if (!left->isLazy())
442     {
443 caltinay 5972 lleft=DataLazy_ptr(new DataLazy(left));
444 jfenwick 1886 }
445     else
446     {
447 caltinay 5972 lleft=dynamic_pointer_cast<DataLazy>(left);
448 jfenwick 1886 }
449 jfenwick 1889 m_readytype=lleft->m_readytype;
450 jfenwick 1886 m_left=lleft;
451     m_samplesize=getNumDPPSample()*getNoValues();
452 jfenwick 2177 m_children=m_left->m_children+1;
453     m_height=m_left->m_height+1;
454 jfenwick 2500 LazyNodeSetup();
455 jfenwick 2177 SIZELIMIT
456 jfenwick 1886 }
457    
458    
459 jfenwick 1943 // In this constructor we need to consider interpolation
460 jfenwick 1879 DataLazy::DataLazy(DataAbstract_ptr left, DataAbstract_ptr right, ES_optype op)
461 caltinay 5972 : parent(resultFS(left,right,op), resultShape(left,right,op)),
462     m_op(op),
463     m_SL(0), m_SM(0), m_SR(0)
464 jfenwick 1879 {
465 jfenwick 2199 LAZYDEBUG(cout << "Forming operator with " << left.get() << " " << right.get() << endl;)
466 jfenwick 2037 if ((getOpgroup(op)!=G_BINARY))
467 jfenwick 1886 {
468 caltinay 5972 throw DataException("Programmer error - constructor DataLazy(left, right, op) will only process BINARY operations.");
469 jfenwick 1886 }
470 jfenwick 1943
471 caltinay 5972 if (getFunctionSpace()!=left->getFunctionSpace()) // left needs to be interpolated
472 jfenwick 1943 {
473 caltinay 5972 FunctionSpace fs=getFunctionSpace();
474     Data ltemp(left);
475     Data tmp(ltemp,fs);
476     left=tmp.borrowDataPtr();
477 jfenwick 1943 }
478 caltinay 5972 if (getFunctionSpace()!=right->getFunctionSpace()) // right needs to be interpolated
479 jfenwick 1943 {
480 caltinay 5972 Data tmp(Data(right),getFunctionSpace());
481     right=tmp.borrowDataPtr();
482 jfenwick 2199 LAZYDEBUG(cout << "Right interpolation required " << right.get() << endl;)
483 jfenwick 1943 }
484     left->operandCheck(*right);
485    
486 caltinay 5972 if (left->isLazy()) // the children need to be DataLazy. Wrap them in IDENTITY if required
487 jfenwick 1879 {
488 caltinay 5972 m_left=dynamic_pointer_cast<DataLazy>(left);
489 jfenwick 2199 LAZYDEBUG(cout << "Left is " << m_left->toString() << endl;)
490 jfenwick 1879 }
491     else
492     {
493 caltinay 5972 m_left=DataLazy_ptr(new DataLazy(left));
494 jfenwick 2199 LAZYDEBUG(cout << "Left " << left.get() << " wrapped " << m_left->m_id.get() << endl;)
495 jfenwick 1879 }
496     if (right->isLazy())
497     {
498 caltinay 5972 m_right=dynamic_pointer_cast<DataLazy>(right);
499 jfenwick 2199 LAZYDEBUG(cout << "Right is " << m_right->toString() << endl;)
500 jfenwick 1879 }
501     else
502     {
503 caltinay 5972 m_right=DataLazy_ptr(new DataLazy(right));
504 jfenwick 2199 LAZYDEBUG(cout << "Right " << right.get() << " wrapped " << m_right->m_id.get() << endl;)
505 jfenwick 1879 }
506 jfenwick 1889 char lt=m_left->m_readytype;
507     char rt=m_right->m_readytype;
508     if (lt=='E' || rt=='E')
509     {
510 caltinay 5972 m_readytype='E';
511 jfenwick 1889 }
512     else if (lt=='T' || rt=='T')
513     {
514 caltinay 5972 m_readytype='T';
515 jfenwick 1889 }
516     else
517     {
518 caltinay 5972 m_readytype='C';
519 jfenwick 1889 }
520 jfenwick 2066 m_samplesize=getNumDPPSample()*getNoValues();
521 jfenwick 2177 m_children=m_left->m_children+m_right->m_children+2;
522     m_height=max(m_left->m_height,m_right->m_height)+1;
523 jfenwick 2500 LazyNodeSetup();
524 jfenwick 2177 SIZELIMIT
525 jfenwick 2092 LAZYDEBUG(cout << "(3)Lazy created with " << m_samplesize << endl;)
526 jfenwick 1879 }
527    
528 jfenwick 2066 DataLazy::DataLazy(DataAbstract_ptr left, DataAbstract_ptr right, ES_optype op, int axis_offset, int transpose)
529 caltinay 5972 : parent(resultFS(left,right,op), GTPShape(left,right, axis_offset, transpose, m_SL,m_SM, m_SR)),
530     m_op(op),
531     m_axis_offset(axis_offset),
532     m_transpose(transpose)
533 jfenwick 2066 {
534     if ((getOpgroup(op)!=G_TENSORPROD))
535     {
536 caltinay 5972 throw DataException("Programmer error - constructor DataLazy(left, right, op, ax, tr) will only process BINARY operations which require parameters.");
537 jfenwick 2066 }
538     if ((transpose>2) || (transpose<0))
539     {
540 caltinay 5972 throw DataException("DataLazy GeneralTensorProduct constructor: Error - transpose should be 0, 1 or 2");
541 jfenwick 2066 }
542 caltinay 5972 if (getFunctionSpace()!=left->getFunctionSpace()) // left needs to be interpolated
543 jfenwick 2066 {
544 caltinay 5972 FunctionSpace fs=getFunctionSpace();
545     Data ltemp(left);
546     Data tmp(ltemp,fs);
547     left=tmp.borrowDataPtr();
548 jfenwick 2066 }
549 caltinay 5972 if (getFunctionSpace()!=right->getFunctionSpace()) // right needs to be interpolated
550 jfenwick 2066 {
551 caltinay 5972 Data tmp(Data(right),getFunctionSpace());
552     right=tmp.borrowDataPtr();
553 jfenwick 2066 }
554 jfenwick 2195 // left->operandCheck(*right);
555 jfenwick 1879
556 caltinay 5972 if (left->isLazy()) // the children need to be DataLazy. Wrap them in IDENTITY if required
557 jfenwick 2066 {
558 caltinay 5972 m_left=dynamic_pointer_cast<DataLazy>(left);
559 jfenwick 2066 }
560     else
561     {
562 caltinay 5972 m_left=DataLazy_ptr(new DataLazy(left));
563 jfenwick 2066 }
564     if (right->isLazy())
565     {
566 caltinay 5972 m_right=dynamic_pointer_cast<DataLazy>(right);
567 jfenwick 2066 }
568     else
569     {
570 caltinay 5972 m_right=DataLazy_ptr(new DataLazy(right));
571 jfenwick 2066 }
572     char lt=m_left->m_readytype;
573     char rt=m_right->m_readytype;
574     if (lt=='E' || rt=='E')
575     {
576 caltinay 5972 m_readytype='E';
577 jfenwick 2066 }
578     else if (lt=='T' || rt=='T')
579     {
580 caltinay 5972 m_readytype='T';
581 jfenwick 2066 }
582     else
583     {
584 caltinay 5972 m_readytype='C';
585 jfenwick 2066 }
586     m_samplesize=getNumDPPSample()*getNoValues();
587 jfenwick 2177 m_children=m_left->m_children+m_right->m_children+2;
588     m_height=max(m_left->m_height,m_right->m_height)+1;
589 jfenwick 2500 LazyNodeSetup();
590 jfenwick 2177 SIZELIMIT
591 jfenwick 2092 LAZYDEBUG(cout << "(4)Lazy created with " << m_samplesize << endl;)
592 jfenwick 2066 }
593    
594    
595 jfenwick 2084 DataLazy::DataLazy(DataAbstract_ptr left, ES_optype op, int axis_offset)
596 caltinay 5972 : parent(left->getFunctionSpace(), resultShape(left,op, axis_offset)),
597     m_op(op),
598     m_axis_offset(axis_offset),
599     m_transpose(0),
600     m_tol(0)
601 jfenwick 2084 {
602     if ((getOpgroup(op)!=G_NP1OUT_P))
603     {
604 caltinay 5972 throw DataException("Programmer error - constructor DataLazy(left, op, ax) will only process UNARY operations which require parameters.");
605 jfenwick 2084 }
606     DataLazy_ptr lleft;
607     if (!left->isLazy())
608     {
609 caltinay 5972 lleft=DataLazy_ptr(new DataLazy(left));
610 jfenwick 2084 }
611     else
612     {
613 caltinay 5972 lleft=dynamic_pointer_cast<DataLazy>(left);
614 jfenwick 2084 }
615     m_readytype=lleft->m_readytype;
616     m_left=lleft;
617     m_samplesize=getNumDPPSample()*getNoValues();
618 jfenwick 2177 m_children=m_left->m_children+1;
619     m_height=m_left->m_height+1;
620 jfenwick 2500 LazyNodeSetup();
621 jfenwick 2177 SIZELIMIT
622 jfenwick 2092 LAZYDEBUG(cout << "(5)Lazy created with " << m_samplesize << endl;)
623 jfenwick 2084 }
624    
625 jfenwick 2147 DataLazy::DataLazy(DataAbstract_ptr left, ES_optype op, double tol)
626 caltinay 5972 : parent(left->getFunctionSpace(), left->getShape()),
627     m_op(op),
628     m_axis_offset(0),
629     m_transpose(0),
630     m_tol(tol)
631 jfenwick 2147 {
632     if ((getOpgroup(op)!=G_UNARY_P))
633     {
634 caltinay 5972 throw DataException("Programmer error - constructor DataLazy(left, op, tol) will only process UNARY operations which require parameters.");
635 jfenwick 2147 }
636     DataLazy_ptr lleft;
637     if (!left->isLazy())
638     {
639 caltinay 5972 lleft=DataLazy_ptr(new DataLazy(left));
640 jfenwick 2147 }
641     else
642     {
643 caltinay 5972 lleft=dynamic_pointer_cast<DataLazy>(left);
644 jfenwick 2147 }
645     m_readytype=lleft->m_readytype;
646     m_left=lleft;
647     m_samplesize=getNumDPPSample()*getNoValues();
648 jfenwick 2177 m_children=m_left->m_children+1;
649     m_height=m_left->m_height+1;
650 jfenwick 2500 LazyNodeSetup();
651 jfenwick 2177 SIZELIMIT
652 jfenwick 2147 LAZYDEBUG(cout << "(6)Lazy created with " << m_samplesize << endl;)
653     }
654 jfenwick 2084
655 jfenwick 2496
656     DataLazy::DataLazy(DataAbstract_ptr left, ES_optype op, const int axis0, const int axis1)
657 caltinay 5972 : parent(left->getFunctionSpace(), SwapShape(left,axis0,axis1)),
658     m_op(op),
659     m_axis_offset(axis0),
660     m_transpose(axis1),
661     m_tol(0)
662 jfenwick 2496 {
663     if ((getOpgroup(op)!=G_NP1OUT_2P))
664     {
665 caltinay 5972 throw DataException("Programmer error - constructor DataLazy(left, op, tol) will only process UNARY operations which require two integer parameters.");
666 jfenwick 2496 }
667     DataLazy_ptr lleft;
668     if (!left->isLazy())
669     {
670 caltinay 5972 lleft=DataLazy_ptr(new DataLazy(left));
671 jfenwick 2496 }
672     else
673     {
674 caltinay 5972 lleft=dynamic_pointer_cast<DataLazy>(left);
675 jfenwick 2496 }
676     m_readytype=lleft->m_readytype;
677     m_left=lleft;
678     m_samplesize=getNumDPPSample()*getNoValues();
679     m_children=m_left->m_children+1;
680     m_height=m_left->m_height+1;
681 jfenwick 2500 LazyNodeSetup();
682 jfenwick 2496 SIZELIMIT
683     LAZYDEBUG(cout << "(7)Lazy created with " << m_samplesize << endl;)
684     }
685    
686 jfenwick 3035
687     namespace
688     {
689    
690     inline int max3(int a, int b, int c)
691     {
692 caltinay 5972 int t=(a>b?a:b);
693     return (t>c?t:c);
694 jfenwick 3035
695     }
696     }
697    
698     DataLazy::DataLazy(DataAbstract_ptr mask, DataAbstract_ptr left, DataAbstract_ptr right/*, double tol*/)
699 caltinay 5972 : parent(left->getFunctionSpace(), left->getShape()),
700     m_op(CONDEVAL),
701     m_axis_offset(0),
702     m_transpose(0),
703     m_tol(0)
704 jfenwick 3035 {
705    
706     DataLazy_ptr lmask;
707     DataLazy_ptr lleft;
708     DataLazy_ptr lright;
709     if (!mask->isLazy())
710     {
711 caltinay 5972 lmask=DataLazy_ptr(new DataLazy(mask));
712 jfenwick 3035 }
713     else
714     {
715 caltinay 5972 lmask=dynamic_pointer_cast<DataLazy>(mask);
716 jfenwick 3035 }
717     if (!left->isLazy())
718     {
719 caltinay 5972 lleft=DataLazy_ptr(new DataLazy(left));
720 jfenwick 3035 }
721     else
722     {
723 caltinay 5972 lleft=dynamic_pointer_cast<DataLazy>(left);
724 jfenwick 3035 }
725     if (!right->isLazy())
726     {
727 caltinay 5972 lright=DataLazy_ptr(new DataLazy(right));
728 jfenwick 3035 }
729     else
730     {
731 caltinay 5972 lright=dynamic_pointer_cast<DataLazy>(right);
732 jfenwick 3035 }
733     m_readytype=lmask->m_readytype;
734     if ((lleft->m_readytype!=lright->m_readytype) || (lmask->m_readytype!=lleft->m_readytype))
735     {
736 caltinay 5972 throw DataException("Programmer Error - condEval arguments must have the same readytype");
737 jfenwick 3035 }
738     m_left=lleft;
739     m_right=lright;
740     m_mask=lmask;
741     m_samplesize=getNumDPPSample()*getNoValues();
742     m_children=m_left->m_children+m_right->m_children+m_mask->m_children+1;
743     m_height=max3(m_left->m_height,m_right->m_height,m_mask->m_height)+1;
744     LazyNodeSetup();
745     SIZELIMIT
746     LAZYDEBUG(cout << "(8)Lazy created with " << m_samplesize << endl;)
747     }
748    
749    
750    
751 jfenwick 1865 DataLazy::~DataLazy()
752     {
753 jfenwick 2500 delete[] m_sampleids;
754 jfenwick 1865 }
755    
756 jfenwick 1879
757 jfenwick 1899 /*
758     \brief Evaluates the expression using methods on Data.
759     This does the work for the collapse method.
760     For reasons of efficiency do not call this method on DataExpanded nodes.
761     */
762 jfenwick 1889 DataReady_ptr
763 jfenwick 4621 DataLazy::collapseToReady() const
764 jfenwick 1889 {
765     if (m_readytype=='E')
766 caltinay 5972 { // this is more an efficiency concern than anything else
767 jfenwick 1889 throw DataException("Programmer Error - do not use collapse on Expanded data.");
768     }
769     if (m_op==IDENTITY)
770     {
771     return m_id;
772     }
773     DataReady_ptr pleft=m_left->collapseToReady();
774     Data left(pleft);
775     Data right;
776 jfenwick 2066 if ((getOpgroup(m_op)==G_BINARY) || (getOpgroup(m_op)==G_TENSORPROD))
777 jfenwick 1889 {
778     right=Data(m_right->collapseToReady());
779     }
780     Data result;
781     switch(m_op)
782     {
783     case ADD:
784 caltinay 5972 result=left+right;
785     break;
786     case SUB:
787     result=left-right;
788     break;
789     case MUL:
790     result=left*right;
791     break;
792     case DIV:
793     result=left/right;
794     break;
795 jfenwick 1889 case SIN:
796 caltinay 5972 result=left.sin();
797     break;
798 jfenwick 1889 case COS:
799 caltinay 5972 result=left.cos();
800     break;
801 jfenwick 1889 case TAN:
802 caltinay 5972 result=left.tan();
803     break;
804 jfenwick 1889 case ASIN:
805 caltinay 5972 result=left.asin();
806     break;
807 jfenwick 1889 case ACOS:
808 caltinay 5972 result=left.acos();
809     break;
810 jfenwick 1889 case ATAN:
811 caltinay 5972 result=left.atan();
812     break;
813 jfenwick 1889 case SINH:
814 caltinay 5972 result=left.sinh();
815     break;
816 jfenwick 1889 case COSH:
817 caltinay 5972 result=left.cosh();
818     break;
819 jfenwick 1889 case TANH:
820 caltinay 5972 result=left.tanh();
821     break;
822 jfenwick 1889 case ERF:
823 caltinay 5972 result=left.erf();
824     break;
825 jfenwick 1889 case ASINH:
826 caltinay 5972 result=left.asinh();
827     break;
828 jfenwick 1889 case ACOSH:
829 caltinay 5972 result=left.acosh();
830     break;
831 jfenwick 1889 case ATANH:
832 caltinay 5972 result=left.atanh();
833     break;
834 jfenwick 1889 case LOG10:
835 caltinay 5972 result=left.log10();
836     break;
837 jfenwick 1889 case LOG:
838 caltinay 5972 result=left.log();
839     break;
840 jfenwick 1889 case SIGN:
841 caltinay 5972 result=left.sign();
842     break;
843 jfenwick 1889 case ABS:
844 caltinay 5972 result=left.abs();
845     break;
846 jfenwick 1889 case NEG:
847 caltinay 5972 result=left.neg();
848     break;
849 jfenwick 1889 case POS:
850 caltinay 5972 // it doesn't mean anything for delayed.
851     // it will just trigger a deep copy of the lazy object
852     throw DataException("Programmer error - POS not supported for lazy data.");
853     break;
854 jfenwick 1889 case EXP:
855 caltinay 5972 result=left.exp();
856     break;
857 jfenwick 1889 case SQRT:
858 caltinay 5972 result=left.sqrt();
859     break;
860 jfenwick 1889 case RECIP:
861 caltinay 5972 result=left.oneOver();
862     break;
863 jfenwick 1889 case GZ:
864 caltinay 5972 result=left.wherePositive();
865     break;
866 jfenwick 1889 case LZ:
867 caltinay 5972 result=left.whereNegative();
868     break;
869 jfenwick 1889 case GEZ:
870 caltinay 5972 result=left.whereNonNegative();
871     break;
872 jfenwick 1889 case LEZ:
873 caltinay 5972 result=left.whereNonPositive();
874     break;
875 jfenwick 2147 case NEZ:
876 caltinay 5972 result=left.whereNonZero(m_tol);
877     break;
878 jfenwick 2147 case EZ:
879 caltinay 5972 result=left.whereZero(m_tol);
880     break;
881 jfenwick 2037 case SYM:
882 caltinay 5972 result=left.symmetric();
883     break;
884 jfenwick 2037 case NSYM:
885 jfenwick 6110 result=left.antisymmetric();
886 caltinay 5972 break;
887 jfenwick 2066 case PROD:
888 caltinay 5972 result=C_GeneralTensorProduct(left,right,m_axis_offset, m_transpose);
889     break;
890 jfenwick 2084 case TRANS:
891 caltinay 5972 result=left.transpose(m_axis_offset);
892     break;
893 jfenwick 2084 case TRACE:
894 caltinay 5972 result=left.trace(m_axis_offset);
895     break;
896 jfenwick 2496 case SWAP:
897 caltinay 5972 result=left.swapaxes(m_axis_offset, m_transpose);
898     break;
899 jfenwick 2721 case MINVAL:
900 caltinay 5972 result=left.minval();
901     break;
902 jfenwick 2721 case MAXVAL:
903 caltinay 5972 result=left.minval();
904     break;
905 jfenwick 6125 case HER:
906     result=left.hermitian();
907     break;
908 jfenwick 1889 default:
909 caltinay 5972 throw DataException("Programmer error - collapseToReady does not know how to resolve operator "+opToString(m_op)+".");
910 jfenwick 1889 }
911     return result.borrowReadyPtr();
912     }
913    
914 jfenwick 1899 /*
915     \brief Converts the DataLazy into an IDENTITY storing the value of the expression.
916     This method uses the original methods on the Data class to evaluate the expressions.
917     For this reason, it should not be used on DataExpanded instances. (To do so would defeat
918     the purpose of using DataLazy in the first place).
919     */
920 jfenwick 1889 void
921 jfenwick 4621 DataLazy::collapse() const
922 jfenwick 1889 {
923     if (m_op==IDENTITY)
924     {
925 caltinay 5972 return;
926 jfenwick 1889 }
927     if (m_readytype=='E')
928 caltinay 5972 { // this is more an efficiency concern than anything else
929 jfenwick 1889 throw DataException("Programmer Error - do not use collapse on Expanded data.");
930     }
931     m_id=collapseToReady();
932     m_op=IDENTITY;
933     }
934    
935 jfenwick 2500 // The result will be stored in m_samples
936     // The return value is a pointer to the DataVector, offset is the offset within the return value
937 jfenwick 5938 const DataTypes::RealVectorType*
938 jfenwick 4621 DataLazy::resolveNodeSample(int tid, int sampleNo, size_t& roffset) const
939 jfenwick 2500 {
940     LAZYDEBUG(cout << "Resolve sample " << toString() << endl;)
941 caltinay 5972 // collapse so we have a 'E' node or an IDENTITY for some other type
942 jfenwick 2500 if (m_readytype!='E' && m_op!=IDENTITY)
943     {
944 caltinay 5972 collapse();
945 jfenwick 2500 }
946 caltinay 5972 if (m_op==IDENTITY)
947 jfenwick 2500 {
948 jfenwick 6042 const RealVectorType& vec=m_id->getVectorRO();
949 jfenwick 2500 roffset=m_id->getPointOffset(sampleNo, 0);
950 jfenwick 2777 #ifdef LAZY_STACK_PROF
951     int x;
952     if (&x<stackend[omp_get_thread_num()])
953     {
954     stackend[omp_get_thread_num()]=&x;
955     }
956     #endif
957 jfenwick 2500 return &(vec);
958     }
959     if (m_readytype!='E')
960     {
961     throw DataException("Programmer Error - Collapse did not produce an expanded node.");
962     }
963 jfenwick 2501 if (m_sampleids[tid]==sampleNo)
964     {
965 caltinay 5972 roffset=tid*m_samplesize;
966     return &(m_samples); // sample is already resolved
967 jfenwick 2501 }
968     m_sampleids[tid]=sampleNo;
969 jfenwick 3035
970 jfenwick 2500 switch (getOpgroup(m_op))
971     {
972     case G_UNARY:
973     case G_UNARY_P: return resolveNodeUnary(tid, sampleNo, roffset);
974     case G_BINARY: return resolveNodeBinary(tid, sampleNo, roffset);
975     case G_NP1OUT: return resolveNodeNP1OUT(tid, sampleNo, roffset);
976     case G_NP1OUT_P: return resolveNodeNP1OUT_P(tid, sampleNo, roffset);
977     case G_TENSORPROD: return resolveNodeTProd(tid, sampleNo, roffset);
978     case G_NP1OUT_2P: return resolveNodeNP1OUT_2P(tid, sampleNo, roffset);
979 jfenwick 2721 case G_REDUCTION: return resolveNodeReduction(tid, sampleNo, roffset);
980 jfenwick 3035 case G_CONDEVAL: return resolveNodeCondEval(tid, sampleNo, roffset);
981 jfenwick 2500 default:
982     throw DataException("Programmer Error - resolveSample does not know how to process "+opToString(m_op)+".");
983     }
984     }
985    
986 jfenwick 5938 const DataTypes::RealVectorType*
987 jfenwick 4621 DataLazy::resolveNodeUnary(int tid, int sampleNo, size_t& roffset) const
988 jfenwick 2500 {
989 caltinay 5972 // we assume that any collapsing has been done before we get here
990     // since we only have one argument we don't need to think about only
991     // processing single points.
992     // we will also know we won't get identity nodes
993 jfenwick 2500 if (m_readytype!='E')
994     {
995     throw DataException("Programmer error - resolveUnary should only be called on expanded Data.");
996     }
997     if (m_op==IDENTITY)
998     {
999     throw DataException("Programmer error - resolveNodeUnary should not be called on identity nodes.");
1000     }
1001 jfenwick 5938 const DataTypes::RealVectorType* leftres=m_left->resolveNodeSample(tid, sampleNo, roffset);
1002 jfenwick 2500 const double* left=&((*leftres)[roffset]);
1003     roffset=m_samplesize*tid;
1004     double* result=&(m_samples[roffset]);
1005 jfenwick 6168 if (m_op==POS)
1006 jfenwick 2500 {
1007 jfenwick 6168 // this should be prevented earlier
1008     // operation is meaningless for lazy
1009     throw DataException("Programmer error - POS not supported for lazy data.");
1010 jfenwick 2500 }
1011 jfenwick 6066 tensor_unary_array_operation(m_samplesize,
1012     left,
1013     result,
1014 jfenwick 6168 m_op,
1015 jfenwick 6066 m_tol);
1016 jfenwick 2500 return &(m_samples);
1017     }
1018    
1019    
1020 jfenwick 5938 const DataTypes::RealVectorType*
1021 jfenwick 4621 DataLazy::resolveNodeReduction(int tid, int sampleNo, size_t& roffset) const
1022 jfenwick 2721 {
1023 caltinay 5972 // we assume that any collapsing has been done before we get here
1024     // since we only have one argument we don't need to think about only
1025     // processing single points.
1026     // we will also know we won't get identity nodes
1027 jfenwick 2721 if (m_readytype!='E')
1028     {
1029     throw DataException("Programmer error - resolveUnary should only be called on expanded Data.");
1030     }
1031     if (m_op==IDENTITY)
1032     {
1033     throw DataException("Programmer error - resolveNodeUnary should not be called on identity nodes.");
1034     }
1035     size_t loffset=0;
1036 jfenwick 5938 const DataTypes::RealVectorType* leftres=m_left->resolveNodeSample(tid, sampleNo, loffset);
1037 jfenwick 2721
1038     roffset=m_samplesize*tid;
1039 jfenwick 2734 unsigned int ndpps=getNumDPPSample();
1040 jfenwick 3917 unsigned int psize=DataTypes::noValues(m_left->getShape());
1041 jfenwick 2721 double* result=&(m_samples[roffset]);
1042     switch (m_op)
1043     {
1044     case MINVAL:
1045 caltinay 5972 {
1046     for (unsigned int z=0;z<ndpps;++z)
1047     {
1048     FMin op;
1049 jfenwick 6083 *result=escript::reductionOpVector(*leftres, m_left->getShape(), loffset, op, numeric_limits<double>::max());
1050 caltinay 5972 loffset+=psize;
1051     result++;
1052     }
1053     }
1054     break;
1055 jfenwick 2721 case MAXVAL:
1056 caltinay 5972 {
1057     for (unsigned int z=0;z<ndpps;++z)
1058     {
1059     FMax op;
1060 jfenwick 6083 *result=escript::reductionOpVector(*leftres, m_left->getShape(), loffset, op, numeric_limits<double>::max()*-1);
1061 caltinay 5972 loffset+=psize;
1062     result++;
1063     }
1064     }
1065     break;
1066 jfenwick 2721 default:
1067 caltinay 5972 throw DataException("Programmer error - resolveUnary can not resolve operator "+opToString(m_op)+".");
1068 jfenwick 2721 }
1069     return &(m_samples);
1070     }
1071    
1072 jfenwick 5938 const DataTypes::RealVectorType*
1073 jfenwick 4621 DataLazy::resolveNodeNP1OUT(int tid, int sampleNo, size_t& roffset) const
1074 jfenwick 2500 {
1075 caltinay 5972 // we assume that any collapsing has been done before we get here
1076     // since we only have one argument we don't need to think about only
1077     // processing single points.
1078 jfenwick 2500 if (m_readytype!='E')
1079     {
1080     throw DataException("Programmer error - resolveNodeNP1OUT should only be called on expanded Data.");
1081     }
1082     if (m_op==IDENTITY)
1083     {
1084     throw DataException("Programmer error - resolveNodeNP1OUT should not be called on identity nodes.");
1085     }
1086     size_t subroffset;
1087 jfenwick 6042 const RealVectorType* leftres=m_left->resolveNodeSample(tid, sampleNo, subroffset);
1088 jfenwick 2500 roffset=m_samplesize*tid;
1089     size_t loop=0;
1090     size_t numsteps=(m_readytype=='E')?getNumDPPSample():1;
1091     size_t step=getNoValues();
1092     size_t offset=roffset;
1093     switch (m_op)
1094     {
1095     case SYM:
1096 caltinay 5972 for (loop=0;loop<numsteps;++loop)
1097     {
1098 jfenwick 6083 escript::symmetric(*leftres,m_left->getShape(),subroffset, m_samples, getShape(), offset);
1099 caltinay 5972 subroffset+=step;
1100     offset+=step;
1101     }
1102     break;
1103 jfenwick 2500 case NSYM:
1104 caltinay 5972 for (loop=0;loop<numsteps;++loop)
1105     {
1106 jfenwick 6114 escript::antisymmetric(*leftres,m_left->getShape(),subroffset, m_samples, getShape(), offset);
1107 caltinay 5972 subroffset+=step;
1108     offset+=step;
1109     }
1110     break;
1111 jfenwick 2500 default:
1112 caltinay 5972 throw DataException("Programmer error - resolveNP1OUT can not resolve operator "+opToString(m_op)+".");
1113 jfenwick 2500 }
1114     return &m_samples;
1115     }
1116    
1117 jfenwick 5938 const DataTypes::RealVectorType*
1118 jfenwick 4621 DataLazy::resolveNodeNP1OUT_P(int tid, int sampleNo, size_t& roffset) const
1119 jfenwick 2500 {
1120 caltinay 5972 // we assume that any collapsing has been done before we get here
1121     // since we only have one argument we don't need to think about only
1122     // processing single points.
1123 jfenwick 2500 if (m_readytype!='E')
1124     {
1125     throw DataException("Programmer error - resolveNodeNP1OUT_P should only be called on expanded Data.");
1126     }
1127     if (m_op==IDENTITY)
1128     {
1129     throw DataException("Programmer error - resolveNodeNP1OUT_P should not be called on identity nodes.");
1130     }
1131     size_t subroffset;
1132     size_t offset;
1133 jfenwick 6042 const RealVectorType* leftres=m_left->resolveNodeSample(tid, sampleNo, subroffset);
1134 jfenwick 2500 roffset=m_samplesize*tid;
1135     offset=roffset;
1136     size_t loop=0;
1137     size_t numsteps=(m_readytype=='E')?getNumDPPSample():1;
1138     size_t outstep=getNoValues();
1139     size_t instep=m_left->getNoValues();
1140     switch (m_op)
1141     {
1142     case TRACE:
1143 caltinay 5972 for (loop=0;loop<numsteps;++loop)
1144     {
1145 jfenwick 6083 escript::trace(*leftres,m_left->getShape(),subroffset, m_samples ,getShape(),offset,m_axis_offset);
1146 caltinay 5972 subroffset+=instep;
1147     offset+=outstep;
1148     }
1149     break;
1150 jfenwick 2500 case TRANS:
1151 caltinay 5972 for (loop=0;loop<numsteps;++loop)
1152     {
1153 jfenwick 6083 escript::transpose(*leftres,m_left->getShape(),subroffset, m_samples, getShape(),offset,m_axis_offset);
1154 caltinay 5972 subroffset+=instep;
1155     offset+=outstep;
1156     }
1157     break;
1158 jfenwick 2500 default:
1159 caltinay 5972 throw DataException("Programmer error - resolveNP1OUTP can not resolve operator "+opToString(m_op)+".");
1160 jfenwick 2500 }
1161     return &m_samples;
1162     }
1163    
1164    
1165 jfenwick 5938 const DataTypes::RealVectorType*
1166 jfenwick 4621 DataLazy::resolveNodeNP1OUT_2P(int tid, int sampleNo, size_t& roffset) const
1167 jfenwick 2500 {
1168     if (m_readytype!='E')
1169     {
1170     throw DataException("Programmer error - resolveNodeNP1OUT_2P should only be called on expanded Data.");
1171     }
1172     if (m_op==IDENTITY)
1173     {
1174     throw DataException("Programmer error - resolveNodeNP1OUT_2P should not be called on identity nodes.");
1175     }
1176     size_t subroffset;
1177     size_t offset;
1178 jfenwick 6042 const RealVectorType* leftres=m_left->resolveNodeSample(tid, sampleNo, subroffset);
1179 jfenwick 2500 roffset=m_samplesize*tid;
1180     offset=roffset;
1181     size_t loop=0;
1182     size_t numsteps=(m_readytype=='E')?getNumDPPSample():1;
1183     size_t outstep=getNoValues();
1184     size_t instep=m_left->getNoValues();
1185     switch (m_op)
1186     {
1187     case SWAP:
1188 caltinay 5972 for (loop=0;loop<numsteps;++loop)
1189     {
1190 jfenwick 6083 escript::swapaxes(*leftres,m_left->getShape(),subroffset, m_samples, getShape(),offset, m_axis_offset, m_transpose);
1191 caltinay 5972 subroffset+=instep;
1192     offset+=outstep;
1193     }
1194     break;
1195 jfenwick 2500 default:
1196 caltinay 5972 throw DataException("Programmer error - resolveNodeNP1OUT2P can not resolve operator "+opToString(m_op)+".");
1197 jfenwick 2500 }
1198     return &m_samples;
1199     }
1200    
1201 jfenwick 5938 const DataTypes::RealVectorType*
1202 jfenwick 4621 DataLazy::resolveNodeCondEval(int tid, int sampleNo, size_t& roffset) const
1203 jfenwick 3035 {
1204     if (m_readytype!='E')
1205     {
1206     throw DataException("Programmer error - resolveNodeCondEval should only be called on expanded Data.");
1207     }
1208     if (m_op!=CONDEVAL)
1209     {
1210     throw DataException("Programmer error - resolveNodeCondEval should only be called on CONDEVAL nodes.");
1211     }
1212     size_t subroffset;
1213 jfenwick 2500
1214 jfenwick 6042 const RealVectorType* maskres=m_mask->resolveNodeSample(tid, sampleNo, subroffset);
1215     const RealVectorType* srcres=0;
1216 jfenwick 3035 if ((*maskres)[subroffset]>0)
1217     {
1218 caltinay 5972 srcres=m_left->resolveNodeSample(tid, sampleNo, subroffset);
1219 jfenwick 3035 }
1220     else
1221     {
1222 caltinay 5972 srcres=m_right->resolveNodeSample(tid, sampleNo, subroffset);
1223 jfenwick 3035 }
1224    
1225     // Now we need to copy the result
1226    
1227     roffset=m_samplesize*tid;
1228     for (int i=0;i<m_samplesize;++i)
1229     {
1230 caltinay 5972 m_samples[roffset+i]=(*srcres)[subroffset+i];
1231 jfenwick 3035 }
1232    
1233     return &m_samples;
1234     }
1235    
1236 jfenwick 2500 // This method assumes that any subexpressions which evaluate to Constant or Tagged Data
1237     // have already been collapsed to IDENTITY. So we must have at least one expanded child.
1238     // If both children are expanded, then we can process them in a single operation (we treat
1239     // the whole sample as one big datapoint.
1240     // If one of the children is not expanded, then we need to treat each point in the sample
1241     // individually.
1242     // There is an additional complication when scalar operations are considered.
1243     // For example, 2+Vector.
1244     // In this case each double within the point is treated individually
1245 jfenwick 5938 const DataTypes::RealVectorType*
1246 jfenwick 4621 DataLazy::resolveNodeBinary(int tid, int sampleNo, size_t& roffset) const
1247 jfenwick 2500 {
1248     LAZYDEBUG(cout << "Resolve binary: " << toString() << endl;)
1249    
1250 caltinay 5972 size_t lroffset=0, rroffset=0; // offsets in the left and right result vectors
1251     // first work out which of the children are expanded
1252 jfenwick 2500 bool leftExp=(m_left->m_readytype=='E');
1253     bool rightExp=(m_right->m_readytype=='E');
1254     if (!leftExp && !rightExp)
1255     {
1256 caltinay 5972 throw DataException("Programmer Error - please use collapse if neither argument has type 'E'.");
1257 jfenwick 2500 }
1258     bool leftScalar=(m_left->getRank()==0);
1259     bool rightScalar=(m_right->getRank()==0);
1260     if ((m_left->getRank()!=m_right->getRank()) && (!leftScalar && !rightScalar))
1261     {
1262 caltinay 5972 throw DataException("resolveBinary - ranks of arguments must match unless one of them is scalar.");
1263 jfenwick 2500 }
1264     size_t leftsize=m_left->getNoValues();
1265     size_t rightsize=m_right->getNoValues();
1266 caltinay 5972 size_t chunksize=1; // how many doubles will be processed in one go
1267     int leftstep=0; // how far should the left offset advance after each step
1268 jfenwick 2500 int rightstep=0;
1269 caltinay 5972 int numsteps=0; // total number of steps for the inner loop
1270     int oleftstep=0; // the o variables refer to the outer loop
1271     int orightstep=0; // The outer loop is only required in cases where there is an extended scalar
1272 jfenwick 2500 int onumsteps=1;
1273    
1274 caltinay 5972 bool LES=(leftExp && leftScalar); // Left is an expanded scalar
1275 jfenwick 2500 bool RES=(rightExp && rightScalar);
1276 caltinay 5972 bool LS=(!leftExp && leftScalar); // left is a single scalar
1277 jfenwick 2500 bool RS=(!rightExp && rightScalar);
1278 caltinay 5972 bool LN=(!leftExp && !leftScalar); // left is a single non-scalar
1279 jfenwick 2500 bool RN=(!rightExp && !rightScalar);
1280 caltinay 5972 bool LEN=(leftExp && !leftScalar); // left is an expanded non-scalar
1281 jfenwick 2500 bool REN=(rightExp && !rightScalar);
1282    
1283 caltinay 5972 if ((LES && RES) || (LEN && REN)) // both are Expanded scalars or both are expanded non-scalars
1284 jfenwick 2500 {
1285 caltinay 5972 chunksize=m_left->getNumDPPSample()*leftsize;
1286     leftstep=0;
1287     rightstep=0;
1288     numsteps=1;
1289 jfenwick 2500 }
1290     else if (LES || RES)
1291     {
1292 caltinay 5972 chunksize=1;
1293     if (LES) // left is an expanded scalar
1294     {
1295     if (RS)
1296     {
1297     leftstep=1;
1298     rightstep=0;
1299     numsteps=m_left->getNumDPPSample();
1300     }
1301     else // RN or REN
1302     {
1303     leftstep=0;
1304     oleftstep=1;
1305     rightstep=1;
1306     orightstep=(RN ? -(int)rightsize : 0);
1307     numsteps=rightsize;
1308     onumsteps=m_left->getNumDPPSample();
1309     }
1310     }
1311     else // right is an expanded scalar
1312     {
1313     if (LS)
1314     {
1315     rightstep=1;
1316     leftstep=0;
1317     numsteps=m_right->getNumDPPSample();
1318     }
1319     else
1320     {
1321     rightstep=0;
1322     orightstep=1;
1323     leftstep=1;
1324     oleftstep=(LN ? -(int)leftsize : 0);
1325     numsteps=leftsize;
1326     onumsteps=m_right->getNumDPPSample();
1327     }
1328     }
1329 jfenwick 2500 }
1330 caltinay 5972 else // this leaves (LEN, RS), (LEN, RN) and their transposes
1331 jfenwick 2500 {
1332 caltinay 5972 if (LEN) // and Right will be a single value
1333     {
1334     chunksize=rightsize;
1335     leftstep=rightsize;
1336     rightstep=0;
1337     numsteps=m_left->getNumDPPSample();
1338     if (RS)
1339     {
1340     numsteps*=leftsize;
1341     }
1342     }
1343     else // REN
1344     {
1345     chunksize=leftsize;
1346     rightstep=leftsize;
1347     leftstep=0;
1348     numsteps=m_right->getNumDPPSample();
1349     if (LS)
1350     {
1351     numsteps*=rightsize;
1352     }
1353     }
1354 jfenwick 2500 }
1355    
1356 caltinay 5972 int resultStep=max(leftstep,rightstep); // only one (at most) should be !=0
1357     // Get the values of sub-expressions
1358 jfenwick 6042 const RealVectorType* left=m_left->resolveNodeSample(tid,sampleNo,lroffset);
1359     const RealVectorType* right=m_right->resolveNodeSample(tid,sampleNo,rroffset);
1360 jfenwick 2500 LAZYDEBUG(cout << "Post sub calls in " << toString() << endl;)
1361     LAZYDEBUG(cout << "shapes=" << DataTypes::shapeToString(m_left->getShape()) << "," << DataTypes::shapeToString(m_right->getShape()) << endl;)
1362     LAZYDEBUG(cout << "chunksize=" << chunksize << endl << "leftstep=" << leftstep << " rightstep=" << rightstep;)
1363     LAZYDEBUG(cout << " numsteps=" << numsteps << endl << "oleftstep=" << oleftstep << " orightstep=" << orightstep;)
1364     LAZYDEBUG(cout << "onumsteps=" << onumsteps << endl;)
1365     LAZYDEBUG(cout << " DPPS=" << m_left->getNumDPPSample() << "," <<m_right->getNumDPPSample() << endl;)
1366     LAZYDEBUG(cout << "" << LS << RS << LN << RN << LES << RES <<LEN << REN << endl;)
1367    
1368     LAZYDEBUG(cout << "Left res["<< lroffset<< "]=" << (*left)[lroffset] << endl;)
1369     LAZYDEBUG(cout << "Right res["<< rroffset<< "]=" << (*right)[rroffset] << endl;)
1370    
1371    
1372     roffset=m_samplesize*tid;
1373 caltinay 5972 double* resultp=&(m_samples[roffset]); // results are stored at the vector offset we received
1374 jfenwick 2500 switch(m_op)
1375     {
1376     case ADD:
1377 jfenwick 6042 //PROC_OP(NO_ARG,plus<double>());
1378 jfenwick 6083 escript::binaryOpVectorLazyHelper<real_t, real_t, real_t>(resultp,
1379 jfenwick 6042 &(*left)[0],
1380     &(*right)[0],
1381     chunksize,
1382     onumsteps,
1383     numsteps,
1384     resultStep,
1385     leftstep,
1386     rightstep,
1387     oleftstep,
1388     orightstep,
1389 jfenwick 6056 lroffset,
1390     rroffset,
1391 jfenwick 6168 escript::ES_optype::ADD);
1392 caltinay 5972 break;
1393 jfenwick 2500 case SUB:
1394 jfenwick 6083 escript::binaryOpVectorLazyHelper<real_t, real_t, real_t>(resultp,
1395 jfenwick 6057 &(*left)[0],
1396     &(*right)[0],
1397     chunksize,
1398     onumsteps,
1399     numsteps,
1400     resultStep,
1401     leftstep,
1402     rightstep,
1403     oleftstep,
1404     orightstep,
1405     lroffset,
1406     rroffset,
1407 jfenwick 6168 escript::ES_optype::SUB);
1408 jfenwick 6057 //PROC_OP(NO_ARG,minus<double>());
1409 caltinay 5972 break;
1410 jfenwick 2500 case MUL:
1411 jfenwick 6057 //PROC_OP(NO_ARG,multiplies<double>());
1412 jfenwick 6083 escript::binaryOpVectorLazyHelper<real_t, real_t, real_t>(resultp,
1413 jfenwick 6057 &(*left)[0],
1414     &(*right)[0],
1415     chunksize,
1416     onumsteps,
1417     numsteps,
1418     resultStep,
1419     leftstep,
1420     rightstep,
1421     oleftstep,
1422     orightstep,
1423     lroffset,
1424     rroffset,
1425 jfenwick 6168 escript::ES_optype::MUL);
1426 caltinay 5972 break;
1427 jfenwick 2500 case DIV:
1428 jfenwick 6057 //PROC_OP(NO_ARG,divides<double>());
1429 jfenwick 6083 escript::binaryOpVectorLazyHelper<real_t, real_t, real_t>(resultp,
1430 jfenwick 6057 &(*left)[0],
1431     &(*right)[0],
1432     chunksize,
1433     onumsteps,
1434     numsteps,
1435     resultStep,
1436     leftstep,
1437     rightstep,
1438     oleftstep,
1439     orightstep,
1440     lroffset,
1441     rroffset,
1442 jfenwick 6168 escript::ES_optype::DIV);
1443 caltinay 5972 break;
1444 jfenwick 2500 case POW:
1445 jfenwick 6057 //PROC_OP(double (double,double),::pow);
1446 jfenwick 6083 escript::binaryOpVectorLazyHelper<real_t, real_t, real_t>(resultp,
1447 jfenwick 6057 &(*left)[0],
1448     &(*right)[0],
1449     chunksize,
1450     onumsteps,
1451     numsteps,
1452     resultStep,
1453     leftstep,
1454     rightstep,
1455     oleftstep,
1456     orightstep,
1457     lroffset,
1458     rroffset,
1459 jfenwick 6168 escript::ES_optype::POW);
1460 caltinay 5972 break;
1461 jfenwick 2500 default:
1462 caltinay 5972 throw DataException("Programmer error - resolveBinary can not resolve operator "+opToString(m_op)+".");
1463 jfenwick 2500 }
1464     LAZYDEBUG(cout << "Result res[" << roffset<< "]" << m_samples[roffset] << endl;)
1465     return &m_samples;
1466     }
1467    
1468    
1469     // This method assumes that any subexpressions which evaluate to Constant or Tagged Data
1470     // have already been collapsed to IDENTITY. So we must have at least one expanded child.
1471     // unlike the other resolve helpers, we must treat these datapoints separately.
1472 jfenwick 5938 const DataTypes::RealVectorType*
1473 jfenwick 4621 DataLazy::resolveNodeTProd(int tid, int sampleNo, size_t& roffset) const
1474 jfenwick 2500 {
1475     LAZYDEBUG(cout << "Resolve TensorProduct: " << toString() << endl;)
1476    
1477 caltinay 5972 size_t lroffset=0, rroffset=0; // offsets in the left and right result vectors
1478     // first work out which of the children are expanded
1479 jfenwick 2500 bool leftExp=(m_left->m_readytype=='E');
1480     bool rightExp=(m_right->m_readytype=='E');
1481     int steps=getNumDPPSample();
1482 caltinay 5972 int leftStep=(leftExp? m_left->getNoValues() : 0); // do not have scalars as input to this method
1483 jfenwick 2500 int rightStep=(rightExp?m_right->getNoValues() : 0);
1484    
1485     int resultStep=getNoValues();
1486     roffset=m_samplesize*tid;
1487     size_t offset=roffset;
1488    
1489 jfenwick 6042 const RealVectorType* left=m_left->resolveNodeSample(tid, sampleNo, lroffset);
1490 jfenwick 2500
1491 jfenwick 6042 const RealVectorType* right=m_right->resolveNodeSample(tid, sampleNo, rroffset);
1492 jfenwick 2500
1493     LAZYDEBUG(cerr << "[Left shape]=" << DataTypes::shapeToString(m_left->getShape()) << "\n[Right shape]=" << DataTypes::shapeToString(m_right->getShape()) << " result=" <<DataTypes::shapeToString(getShape()) << endl;
1494     cout << getNoValues() << endl;)
1495    
1496    
1497     LAZYDEBUG(cerr << "Post sub calls: " << toString() << endl;)
1498     LAZYDEBUG(cout << "LeftExp=" << leftExp << " rightExp=" << rightExp << endl;)
1499     LAZYDEBUG(cout << "LeftR=" << m_left->getRank() << " rightExp=" << m_right->getRank() << endl;)
1500     LAZYDEBUG(cout << "LeftSize=" << m_left->getNoValues() << " RightSize=" << m_right->getNoValues() << endl;)
1501     LAZYDEBUG(cout << "m_samplesize=" << m_samplesize << endl;)
1502     LAZYDEBUG(cout << "outputshape=" << DataTypes::shapeToString(getShape()) << endl;)
1503     LAZYDEBUG(cout << "DPPS=" << m_right->getNumDPPSample() <<"."<<endl;)
1504    
1505 caltinay 5972 double* resultp=&(m_samples[offset]); // results are stored at the vector offset we received
1506 jfenwick 2500 switch(m_op)
1507     {
1508     case PROD:
1509 caltinay 5972 for (int i=0;i<steps;++i,resultp+=resultStep)
1510     {
1511     const double *ptr_0 = &((*left)[lroffset]);
1512     const double *ptr_1 = &((*right)[rroffset]);
1513 jfenwick 2500
1514     LAZYDEBUG(cout << DataTypes::pointToString(*left, m_left->getShape(),lroffset,"LEFT") << endl;)
1515     LAZYDEBUG(cout << DataTypes::pointToString(*right,m_right->getShape(),rroffset, "RIGHT") << endl;)
1516    
1517 caltinay 5972 matrix_matrix_product(m_SL, m_SM, m_SR, ptr_0, ptr_1, resultp, m_transpose);
1518 jfenwick 2500
1519 caltinay 5972 lroffset+=leftStep;
1520     rroffset+=rightStep;
1521     }
1522     break;
1523 jfenwick 2500 default:
1524 caltinay 5972 throw DataException("Programmer error - resolveTProduct can not resolve operator "+opToString(m_op)+".");
1525 jfenwick 2500 }
1526     roffset=offset;
1527     return &m_samples;
1528     }
1529    
1530 jfenwick 1898
1531 jfenwick 5938 const DataTypes::RealVectorType*
1532 jfenwick 4621 DataLazy::resolveSample(int sampleNo, size_t& roffset) const
1533 jfenwick 1879 {
1534 jfenwick 2271 #ifdef _OPENMP
1535 caltinay 5972 int tid=omp_get_thread_num();
1536 jfenwick 2271 #else
1537 caltinay 5972 int tid=0;
1538 jfenwick 2271 #endif
1539 jfenwick 2777
1540     #ifdef LAZY_STACK_PROF
1541 caltinay 5972 stackstart[tid]=&tid;
1542     stackend[tid]=&tid;
1543     const DataTypes::RealVectorType* r=resolveNodeSample(tid, sampleNo, roffset);
1544     size_t d=(size_t)stackstart[tid]-(size_t)stackend[tid];
1545     #pragma omp critical
1546     if (d>maxstackuse)
1547     {
1548 jfenwick 2777 cout << "Max resolve Stack use " << d << endl;
1549 caltinay 5972 maxstackuse=d;
1550     }
1551     return r;
1552 jfenwick 2777 #else
1553 caltinay 5972 return resolveNodeSample(tid, sampleNo, roffset);
1554 jfenwick 2777 #endif
1555 jfenwick 2271 }
1556    
1557    
1558 jfenwick 2497 // This needs to do the work of the identity constructor
1559 jfenwick 2177 void
1560     DataLazy::resolveToIdentity()
1561     {
1562     if (m_op==IDENTITY)
1563 caltinay 5972 return;
1564 jfenwick 2500 DataReady_ptr p=resolveNodeWorker();
1565 jfenwick 2177 makeIdentity(p);
1566     }
1567 jfenwick 1889
1568 jfenwick 2177 void DataLazy::makeIdentity(const DataReady_ptr& p)
1569     {
1570     m_op=IDENTITY;
1571     m_axis_offset=0;
1572     m_transpose=0;
1573     m_SL=m_SM=m_SR=0;
1574     m_children=m_height=0;
1575     m_id=p;
1576     if(p->isConstant()) {m_readytype='C';}
1577     else if(p->isExpanded()) {m_readytype='E';}
1578     else if (p->isTagged()) {m_readytype='T';}
1579     else {throw DataException("Unknown DataReady instance in convertToIdentity constructor.");}
1580     m_samplesize=p->getNumDPPSample()*p->getNoValues();
1581     m_left.reset();
1582     m_right.reset();
1583     }
1584    
1585 jfenwick 2497
1586     DataReady_ptr
1587     DataLazy::resolve()
1588     {
1589     resolveToIdentity();
1590     return m_id;
1591     }
1592    
1593 jfenwick 2799
1594     /* This is really a static method but I think that caused problems in windows */
1595     void
1596     DataLazy::resolveGroupWorker(std::vector<DataLazy*>& dats)
1597     {
1598     if (dats.empty())
1599     {
1600 caltinay 5972 return;
1601 jfenwick 2799 }
1602     vector<DataLazy*> work;
1603     FunctionSpace fs=dats[0]->getFunctionSpace();
1604     bool match=true;
1605 jfenwick 2824 for (int i=dats.size()-1;i>=0;--i)
1606 jfenwick 2799 {
1607 caltinay 5972 if (dats[i]->m_readytype!='E')
1608     {
1609     dats[i]->collapse();
1610     }
1611     if (dats[i]->m_op!=IDENTITY)
1612     {
1613     work.push_back(dats[i]);
1614     if (fs!=dats[i]->getFunctionSpace())
1615     {
1616     match=false;
1617     }
1618     }
1619 jfenwick 2799 }
1620     if (work.empty())
1621     {
1622 caltinay 5972 return; // no work to do
1623 jfenwick 2799 }
1624 caltinay 5972 if (match) // all functionspaces match. Yes I realise this is overly strict
1625     { // it is possible that dats[0] is one of the objects which we discarded and
1626     // all the other functionspaces match.
1627     vector<DataExpanded*> dep;
1628 jfenwick 6042 vector<RealVectorType*> vecs;
1629 caltinay 5972 for (int i=0;i<work.size();++i)
1630     {
1631 jfenwick 6042 dep.push_back(new DataExpanded(fs,work[i]->getShape(), RealVectorType(work[i]->getNoValues())));
1632 caltinay 5972 vecs.push_back(&(dep[i]->getVectorRW()));
1633     }
1634     int totalsamples=work[0]->getNumSamples();
1635 jfenwick 6042 const RealVectorType* res=0; // Storage for answer
1636 caltinay 5972 int sample;
1637     #pragma omp parallel private(sample, res)
1638     {
1639     size_t roffset=0;
1640     #pragma omp for schedule(static)
1641     for (sample=0;sample<totalsamples;++sample)
1642     {
1643     roffset=0;
1644     int j;
1645     for (j=work.size()-1;j>=0;--j)
1646     {
1647 jfenwick 2799 #ifdef _OPENMP
1648 caltinay 5972 res=work[j]->resolveNodeSample(omp_get_thread_num(),sample,roffset);
1649 jfenwick 2799 #else
1650 caltinay 5972 res=work[j]->resolveNodeSample(0,sample,roffset);
1651 jfenwick 2799 #endif
1652 caltinay 5972 RealVectorType::size_type outoffset=dep[j]->getPointOffset(sample,0);
1653     memcpy(&((*vecs[j])[outoffset]),&((*res)[roffset]),work[j]->m_samplesize*sizeof(RealVectorType::ElementType));
1654     }
1655     }
1656     }
1657     // Now we need to load the new results as identity ops into the lazy nodes
1658     for (int i=work.size()-1;i>=0;--i)
1659     {
1660 jfenwick 5985 work[i]->makeIdentity(REFCOUNTNS::dynamic_pointer_cast<DataReady>(dep[i]->getPtr()));
1661 caltinay 5972 }
1662 jfenwick 2799 }
1663 caltinay 5972 else // functionspaces do not match
1664 jfenwick 2799 {
1665 caltinay 5972 for (int i=0;i<work.size();++i)
1666     {
1667     work[i]->resolveToIdentity();
1668     }
1669 jfenwick 2799 }
1670     }
1671    
1672    
1673    
1674 jfenwick 2500 // This version of resolve uses storage in each node to hold results
1675     DataReady_ptr
1676     DataLazy::resolveNodeWorker()
1677     {
1678 caltinay 5972 if (m_readytype!='E') // if the whole sub-expression is Constant or Tagged, then evaluate it normally
1679 jfenwick 2500 {
1680     collapse();
1681     }
1682 caltinay 5972 if (m_op==IDENTITY) // So a lazy expression of Constant or Tagged data will be returned here.
1683 jfenwick 2500 {
1684     return m_id;
1685     }
1686 caltinay 5972 // from this point on we must have m_op!=IDENTITY and m_readytype=='E'
1687 jfenwick 6042 DataExpanded* result=new DataExpanded(getFunctionSpace(),getShape(), RealVectorType(getNoValues()));
1688     RealVectorType& resvec=result->getVectorRW();
1689 jfenwick 2500 DataReady_ptr resptr=DataReady_ptr(result);
1690    
1691     int sample;
1692     int totalsamples=getNumSamples();
1693 jfenwick 6042 const RealVectorType* res=0; // Storage for answer
1694 jfenwick 2500 LAZYDEBUG(cout << "Total number of samples=" <<totalsamples << endl;)
1695 jfenwick 2777 #pragma omp parallel private(sample,res)
1696 jfenwick 2500 {
1697 caltinay 5972 size_t roffset=0;
1698 jfenwick 2777 #ifdef LAZY_STACK_PROF
1699 caltinay 5972 stackstart[omp_get_thread_num()]=&roffset;
1700     stackend[omp_get_thread_num()]=&roffset;
1701 jfenwick 2777 #endif
1702 caltinay 5972 #pragma omp for schedule(static)
1703     for (sample=0;sample<totalsamples;++sample)
1704     {
1705     roffset=0;
1706 jfenwick 2500 #ifdef _OPENMP
1707 caltinay 5972 res=resolveNodeSample(omp_get_thread_num(),sample,roffset);
1708 jfenwick 2500 #else
1709 caltinay 5972 res=resolveNodeSample(0,sample,roffset);
1710 jfenwick 2500 #endif
1711     LAZYDEBUG(cout << "Sample #" << sample << endl;)
1712     LAZYDEBUG(cout << "Final res[" << roffset<< "]=" << (*res)[roffset] << (*res)[roffset]<< endl; )
1713 caltinay 5972 RealVectorType::size_type outoffset=result->getPointOffset(sample,0);
1714     memcpy(&(resvec[outoffset]),&((*res)[roffset]),m_samplesize*sizeof(RealVectorType::ElementType));
1715     }
1716 jfenwick 2500 }
1717 jfenwick 2777 #ifdef LAZY_STACK_PROF
1718     for (int i=0;i<getNumberOfThreads();++i)
1719     {
1720 caltinay 5972 size_t r=((size_t)stackstart[i] - (size_t)stackend[i]);
1721     // cout << i << " " << stackstart[i] << " .. " << stackend[i] << " = " << r << endl;
1722     if (r>maxstackuse)
1723     {
1724     maxstackuse=r;
1725     }
1726 jfenwick 2777 }
1727     cout << "Max resolve Stack use=" << maxstackuse << endl;
1728     #endif
1729 jfenwick 2500 return resptr;
1730     }
1731    
1732 jfenwick 1865 std::string
1733     DataLazy::toString() const
1734     {
1735 jfenwick 1886 ostringstream oss;
1736 jfenwick 2791 oss << "Lazy Data: [depth=" << m_height<< "] ";
1737 caltinay 6174 switch (escriptParams.getLazyStrFmt())
1738 jfenwick 2737 {
1739 caltinay 5972 case 1: // tree format
1740     oss << endl;
1741     intoTreeString(oss,"");
1742     break;
1743     case 2: // just the depth
1744     break;
1745 jfenwick 2795 default:
1746 caltinay 5972 intoString(oss);
1747     break;
1748 jfenwick 2737 }
1749 jfenwick 1886 return oss.str();
1750 jfenwick 1865 }
1751    
1752 jfenwick 1899
1753 jfenwick 1886 void
1754     DataLazy::intoString(ostringstream& oss) const
1755     {
1756 jfenwick 2271 // oss << "[" << m_children <<";"<<m_height <<"]";
1757 jfenwick 1886 switch (getOpgroup(m_op))
1758     {
1759 jfenwick 1889 case G_IDENTITY:
1760 caltinay 5972 if (m_id->isExpanded())
1761     {
1762     oss << "E";
1763     }
1764     else if (m_id->isTagged())
1765     {
1766     oss << "T";
1767     }
1768     else if (m_id->isConstant())
1769     {
1770     oss << "C";
1771     }
1772     else
1773     {
1774     oss << "?";
1775     }
1776     oss << '@' << m_id.get();
1777     break;
1778 jfenwick 1886 case G_BINARY:
1779 caltinay 5972 oss << '(';
1780     m_left->intoString(oss);
1781     oss << ' ' << opToString(m_op) << ' ';
1782     m_right->intoString(oss);
1783     oss << ')';
1784     break;
1785 jfenwick 1886 case G_UNARY:
1786 jfenwick 2147 case G_UNARY_P:
1787 jfenwick 2037 case G_NP1OUT:
1788 jfenwick 2084 case G_NP1OUT_P:
1789 jfenwick 2721 case G_REDUCTION:
1790 caltinay 5972 oss << opToString(m_op) << '(';
1791     m_left->intoString(oss);
1792     oss << ')';
1793     break;
1794 jfenwick 2066 case G_TENSORPROD:
1795 caltinay 5972 oss << opToString(m_op) << '(';
1796     m_left->intoString(oss);
1797     oss << ", ";
1798     m_right->intoString(oss);
1799     oss << ')';
1800     break;
1801 jfenwick 2496 case G_NP1OUT_2P:
1802 caltinay 5972 oss << opToString(m_op) << '(';
1803     m_left->intoString(oss);
1804     oss << ", " << m_axis_offset << ", " << m_transpose;
1805     oss << ')';
1806     break;
1807 jfenwick 3035 case G_CONDEVAL:
1808 caltinay 5972 oss << opToString(m_op)<< '(' ;
1809     m_mask->intoString(oss);
1810     oss << " ? ";
1811     m_left->intoString(oss);
1812     oss << " : ";
1813     m_right->intoString(oss);
1814     oss << ')';
1815     break;
1816 jfenwick 1886 default:
1817 caltinay 5972 oss << "UNKNOWN";
1818 jfenwick 1886 }
1819     }
1820    
1821 jfenwick 2737
1822     void
1823     DataLazy::intoTreeString(ostringstream& oss, string indent) const
1824     {
1825     oss << '[' << m_rank << ':' << setw(3) << m_samplesize << "] " << indent;
1826     switch (getOpgroup(m_op))
1827     {
1828     case G_IDENTITY:
1829 caltinay 5972 if (m_id->isExpanded())
1830     {
1831     oss << "E";
1832     }
1833     else if (m_id->isTagged())
1834     {
1835     oss << "T";
1836     }
1837     else if (m_id->isConstant())
1838     {
1839     oss << "C";
1840     }
1841     else
1842     {
1843     oss << "?";
1844     }
1845     oss << '@' << m_id.get() << endl;
1846     break;
1847 jfenwick 2737 case G_BINARY:
1848 caltinay 5972 oss << opToString(m_op) << endl;
1849     indent+='.';
1850     m_left->intoTreeString(oss, indent);
1851     m_right->intoTreeString(oss, indent);
1852     break;
1853 jfenwick 2737 case G_UNARY:
1854     case G_UNARY_P:
1855     case G_NP1OUT:
1856     case G_NP1OUT_P:
1857     case G_REDUCTION:
1858 caltinay 5972 oss << opToString(m_op) << endl;
1859     indent+='.';
1860     m_left->intoTreeString(oss, indent);
1861     break;
1862 jfenwick 2737 case G_TENSORPROD:
1863 caltinay 5972 oss << opToString(m_op) << endl;
1864     indent+='.';
1865     m_left->intoTreeString(oss, indent);
1866     m_right->intoTreeString(oss, indent);
1867     break;
1868 jfenwick 2737 case G_NP1OUT_2P:
1869 caltinay 5972 oss << opToString(m_op) << ", " << m_axis_offset << ", " << m_transpose<< endl;
1870     indent+='.';
1871     m_left->intoTreeString(oss, indent);
1872     break;
1873 jfenwick 2737 default:
1874 caltinay 5972 oss << "UNKNOWN";
1875 jfenwick 2737 }
1876     }
1877    
1878    
1879 jfenwick 1865 DataAbstract*
1880 jfenwick 5938 DataLazy::deepCopy() const
1881 jfenwick 1865 {
1882 jfenwick 1935 switch (getOpgroup(m_op))
1883 jfenwick 1865 {
1884 jfenwick 1935 case G_IDENTITY: return new DataLazy(m_id->deepCopy()->getPtr());
1885 caltinay 5972 case G_UNARY:
1886 jfenwick 2721 case G_REDUCTION: return new DataLazy(m_left->deepCopy()->getPtr(),m_op);
1887 caltinay 5972 case G_UNARY_P: return new DataLazy(m_left->deepCopy()->getPtr(), m_op, m_tol);
1888     case G_BINARY: return new DataLazy(m_left->deepCopy()->getPtr(),m_right->deepCopy()->getPtr(),m_op);
1889 jfenwick 2066 case G_NP1OUT: return new DataLazy(m_left->deepCopy()->getPtr(), m_right->deepCopy()->getPtr(),m_op);
1890     case G_TENSORPROD: return new DataLazy(m_left->deepCopy()->getPtr(), m_right->deepCopy()->getPtr(), m_op, m_axis_offset, m_transpose);
1891 jfenwick 2721 case G_NP1OUT_P: return new DataLazy(m_left->deepCopy()->getPtr(),m_op, m_axis_offset);
1892     case G_NP1OUT_2P: return new DataLazy(m_left->deepCopy()->getPtr(), m_op, m_axis_offset, m_transpose);
1893 jfenwick 1935 default:
1894 caltinay 5972 throw DataException("Programmer error - do not know how to deepcopy operator "+opToString(m_op)+".");
1895 jfenwick 1865 }
1896     }
1897    
1898    
1899 jfenwick 2721
1900 jfenwick 2066 // There is no single, natural interpretation of getLength on DataLazy.
1901     // Instances of DataReady can look at the size of their vectors.
1902     // For lazy though, it could be the size the data would be if it were resolved;
1903     // or it could be some function of the lengths of the DataReady instances which
1904     // form part of the expression.
1905     // Rather than have people making assumptions, I have disabled the method.
1906 jfenwick 5938 DataTypes::RealVectorType::size_type
1907 jfenwick 1865 DataLazy::getLength() const
1908     {
1909 jfenwick 2066 throw DataException("getLength() does not make sense for lazy data.");
1910 jfenwick 1865 }
1911    
1912    
1913     DataAbstract*
1914     DataLazy::getSlice(const DataTypes::RegionType& region) const
1915     {
1916 jfenwick 1879 throw DataException("getSlice - not implemented for Lazy objects.");
1917 jfenwick 1865 }
1918    
1919 jfenwick 1935
1920     // To do this we need to rely on our child nodes
1921 jfenwick 5938 DataTypes::RealVectorType::size_type
1922 jfenwick 1865 DataLazy::getPointOffset(int sampleNo,
1923 jfenwick 1935 int dataPointNo)
1924     {
1925     if (m_op==IDENTITY)
1926     {
1927 caltinay 5972 return m_id->getPointOffset(sampleNo,dataPointNo);
1928 jfenwick 1935 }
1929     if (m_readytype!='E')
1930     {
1931 caltinay 5972 collapse();
1932     return m_id->getPointOffset(sampleNo,dataPointNo);
1933 jfenwick 1935 }
1934     // at this point we do not have an identity node and the expression will be Expanded
1935     // so we only need to know which child to ask
1936     if (m_left->m_readytype=='E')
1937     {
1938 caltinay 5972 return m_left->getPointOffset(sampleNo,dataPointNo);
1939 jfenwick 1935 }
1940     else
1941     {
1942 caltinay 5972 return m_right->getPointOffset(sampleNo,dataPointNo);
1943 jfenwick 1935 }
1944     }
1945    
1946     // To do this we need to rely on our child nodes
1947 jfenwick 5938 DataTypes::RealVectorType::size_type
1948 jfenwick 1935 DataLazy::getPointOffset(int sampleNo,
1949 jfenwick 1865 int dataPointNo) const
1950     {
1951 jfenwick 1935 if (m_op==IDENTITY)
1952     {
1953 caltinay 5972 return m_id->getPointOffset(sampleNo,dataPointNo);
1954 jfenwick 1935 }
1955     if (m_readytype=='E')
1956     {
1957     // at this point we do not have an identity node and the expression will be Expanded
1958     // so we only need to know which child to ask
1959     if (m_left->m_readytype=='E')
1960     {
1961 caltinay 5972 return m_left->getPointOffset(sampleNo,dataPointNo);
1962 jfenwick 1935 }
1963     else
1964     {
1965 caltinay 5972 return m_right->getPointOffset(sampleNo,dataPointNo);
1966 jfenwick 1935 }
1967     }
1968     if (m_readytype=='C')
1969     {
1970 caltinay 5972 return m_left->getPointOffset(sampleNo,dataPointNo); // which child doesn't matter
1971 jfenwick 1935 }
1972     throw DataException("Programmer error - getPointOffset on lazy data may require collapsing (but this object is marked const).");
1973 jfenwick 1865 }
1974    
1975 jfenwick 2271
1976     // I have decided to let Data:: handle this issue.
1977 jfenwick 1901 void
1978     DataLazy::setToZero()
1979     {
1980 jfenwick 5938 // DataTypes::RealVectorType v(getNoValues(),0);
1981 jfenwick 2271 // m_id=DataReady_ptr(new DataConstant(getFunctionSpace(),getShape(),v));
1982     // m_op=IDENTITY;
1983     // m_right.reset();
1984     // m_left.reset();
1985     // m_readytype='C';
1986     // m_buffsRequired=1;
1987    
1988 jfenwick 4634 (void)privdebug; // to stop the compiler complaining about unused privdebug
1989 jfenwick 2271 throw DataException("Programmer error - setToZero not supported for DataLazy (DataLazy objects should be read only).");
1990 jfenwick 1901 }
1991    
1992 jfenwick 2271 bool
1993     DataLazy::actsExpanded() const
1994     {
1995 caltinay 5972 return (m_readytype=='E');
1996 jfenwick 2271 }
1997    
1998 caltinay 5997 } // end namespace
1999 caltinay 5972

  ViewVC Help
Powered by ViewVC 1.1.26