Collective Array Operations =========================== Global Arrays provide functions for collective array operations, targeting both whole arrays and patches (portions of global arrays). Collective operations require all the processes to make the call. In the underlying implementation, each process deals with its local data. These functions include - basic array operations, - linear algebra operations, and - interfaces to third party software packages. Basic Array Operations ---------------------- Global Arrays provide several mechanisms to manipulate contents of the arrays. One can set all the elements in an array/patch to a specific value, or as a special case set to zero. Since GA does not explicitly initialize newly created arrays, these calls are useful for initialization of an array/patch. (To fill the array with different values for each element, one can choose the one sided operation putor each process can initialize its local portion of an array/patch like ordinary local memory). One can also scale the array/patch by a certain factor, or copy the contents of one array/patch to another. Whole Arrays ~~~~~~~~~~~~ These functions apply to the entire array. The function - Fortran subroutine: `ga_zero `__\ (g_a) - C: void `GA_Zero `__\ (int g_a) - C++: void GA::GlobalArray::zero() sets all the elements in the array to zero. To assign a single value to all the elements in an array, use the function - Fortran subroutine: `ga_fill `__\ (g_a, val) - C: void `GA_Fill `__\ (int g_a, void \*val) - C++: void GA::GlobalArray::fill(void \*val) It sets all the elements in the array to the value *val*. The *val* must have the same data type as that of the array. The function - Fortran subroutine: `ga_scale `__\ (g_a, val) - C: void `GA_scale `__\ (int g_a, void \*val) - C++: void GA::GlobalArray::scale(void \*val) scales all the elements in the array by factor *val*. Again the val must be the same data type as that of the array itself. The above three functions are dealing with one global array, to set values or change all the elements together. The following functions are for copying data between two arrays. The function - Fortran subroutine: `ga_copy `__\ (g_a, g_b) - C: void `GA_copy `__\ (int g_a, int g_b) - - C++: void GA::GlobalArray::copy(const GA::GlobalArray \* g_a) copies the contents of one array to another. The arrays must be of the same data type and have the same number of elements. For global arrays containing ghost cells, the ghost cell data can be filled in with the corresponding data from neighboring processors using the function - n-D Fortran subroutine: `ga_copy `__\ (g_a, g_b) - C: void `GA_copy `__\ (int g_a, int g_b) - C++: void GA::GlobalArray::copy(const GA::GlobalArray \* g_a) - n-D Fortran subroutine: `ga_update_ghosts `__\ (g_a) - C: void `ga_update_ghosts `__\ (int g_a) - C++: void GA::GlobalArray::updateGhosts() This operation updates the ghost cell data by assuming periodic, or wrap-around, boundary conditions similar to those described for the\ ``nga_periodic_get`` operations described above. The wrap-around conditions are always applied, it is up to the individual application to decide whether or not the data in the ghost cells should be used. The update operation is illustrated below for a simple 4x2 global array distributed across two processors. The ghost cells are one element wide in each dimension. .. figure:: images/ghost012.png :width: 60% .. figure:: images/ghost015.png :width: 60% - n-d Fortran logical function: `nga_update_ghosts_dir `__\ (g_a, dimension, idir, flag) - C: int `NGA_Update_ghosts_dir `__\ (int g_a, int dimension, int idir, int cflag) - C++: int GA::GlobalArray::updateGhostsDir(int dimension, int idir, int cflag) This function can be used to update the ghost cells along individual directions. It is designed for algorithms that can overlap updates with computation. The variable dimension indicates which coordinate direction is to be updated (e.g. dimension = 1 would correspond to the y axis in a two or three dimensional system), the variable idir can take the values +/-1 and indicates whether the side that is to be updated lies in the positive or negative direction, and cflag indicates whether or not the corners on the side being updated are to be included in the update. The following calls would be equivalent to a call to ``GA_Update_ghosts`` for a 2-dimensional system: :: status = NGA_Update_ghost_dir(g_a,0,-1,1); status = NGA_Update_ghost_dir(g_a,0,1,1); status = NGA_Update_ghost_dir(g_a,1,-1,0); status = NGA_Update_ghost_dir(g_a,1,1,0); The variable *cflag* is set equal to 1 (or non-zero) in the first two calls so that the corner ghost cells are update, it is set equal to 0 in the second two calls to avoid redundant updates of the corners. Note that updating the ghosts cells using several independent calls to the ``nga_update_ghost_dir`` functions is generally not as efficient as using ``GA_Update_ghosts`` unless the individual calls can be effectively overlapped with computation. This is a collective operation. Patches ~~~~~~~ GA provides a set of operations on segments of the global arrays, namely patch operations. These functions are more general, in a sense they can apply to the entire array(s). As a matter of fact, many of the Global Array collective operations are based on the patch operations, for instance, the ``GA_Printis`` only a special case of ``NGA_Print_patch``, called by setting the bounds of the patch to the entire global array. There are two interfaces for Fortran, one for two dimensional and the other for n-dimensional (one to seven). The (n-dimensional) interface can surely handle the two dimensional case as well. It is available for backward compatibility purposes. The functions dealing with n-dimensional patches use the ``nga``\ prefix and those dealing with two dimensional patches start with the ``ga`` prefix. The function - Fortran subroutine: `nga_zero_patch `__\ nga_zero_patch(g_a, alo, ahi) - C: void `NGA_Zero_patch `__\ (int g_a, int lo[] int hi[]) - C++: void GA::GlobalArray::zeroPatch(int lo[] int hi[]) is similar to *ga_zero*, except that instead of applying to entire array, it sets only the region defined by *lo* and *hi* to zero. One can assign a single value to all the elements in a patch with the function: - n-D Fortran subroutine: `nga_fill_patch `__\ (g_a, lo, hi, val) - 2-D Fortran subroutine: `ga_fill_patch `__\ (g_a, ilo, ihi, jlo, jhi, val) - C: void `NGA_Fill_patch `__\ (int g_a, int lo[], int hi[], void \*val) - C++: void GA::GlobalArray::fillPatch(int lo[], int hi[], void \*val) The\ ``lo`` and ``hi`` defines the patch and the ``val`` is the value to set. The function - n-D Fortran subroutine: `nga_scale_patch `__\ (g_a, lo, hi, val) - 2-D Fortran subroutine: `ga_scale_patch `__\ (g_a, ilo, ihi, jlo, jhi, val) - C void `NGA_Scale_patch `__\ (int g_a, int lo[] int hi[], void \*val) - C++: void GA::GlobalArray::scalePatch(int lo[], int hi[], void \*val) scales the patch defined by ``lo``\ and ``hi`` by the factor ``val``. The copy patch operation is one of the fundamental and frequently used functions. The function - n-D Fortran subroutine: `nga_copy_patch `__\ (trans, g_a, alo, ahi, g_b, blo, bhi) - 2-D Fortran subroutine: `ga_copy_patch `__\ (trans, g_a, ailo, aihi, ajlo, ajhi, g_b, bilo, bihi, bjlo, bjhi) - C: void `NGA_Copy_patch `__\ (char trans, int g_a, int alo[], int ahi[], int g_b, int blo[], int bhi[]) - C++: voidGA::GlobalArray::copyPatch(char trans, const GA::GlobalArray\* g_a, int alo[], int ahi[], int blo[], int bhi[]) copies one patch defined by ``alo`` and ``ahi`` in one global array ``g_ato`` another patch defined by ``blo`` and ``bhi`` in another global array ``g_b``. The current implementation requires that the source patch and destination patch must be on different global arrays. They must also be the same data type. The patches may be of different shapes, but the number of elements must be the same. During the process of copying, the transpose operation can be performed by specifying trans. *Example*: Assume that there two 8x6 Global Arrays, ``g_a`` and ``g_b``,distributed on three processes. The operation of ``nag_copy_patch``\ (Fortran notation), from :: g_a:alo={2,2},ahi={4,5} to g_b:blo={3,4},bhi={6,6} and trans = 0 involves reshaping. It is illustrated in the following figure. .. figure:: images/copy1.png :width: 80% One step further, if one also want to perform the transpose operation during the copying, i.e. set ``trans = 1``, it will look like: .. figure:: images/copy2.png :width: 80% If there is no reshaping or transpose, the operation can be fast (internally calling ``nga_put``). Otherwise, it would be slow (internally calling ``nga_scatter``, where extra time is spent on preparing the indices). Also note that extra memory is required to hold the indices if the operation involves reshaping or transpose. Linear Algebra -------------- Global arrays provide three linear algebra operations: addition, multiplication, and dot product. There are two sets of functions, one for the whole array and the other for the patches. .. _whole-arrays-1: Whole Arrays ~~~~~~~~~~~~ The function - Fortran subroutine: `ga_add `__\ (alpha, g_a, beta, g_b, g_c) - C: void `GA_Add `__\ (void \*alpha, int g_a, void \*beta,int g_b, int g_c) - C++ void GA::GlobalArray::add(void \*alpha, const GA::GlobalArray\* g_a, void \*beta, const GA::GlobalArray\* g_b) adds two arrays, ``g_a`` and ``g_b``, and saves the results to ``g_c``. The two source arrays can be scaled by certain factors. This operation requires the two source arrays have the same number of elements and the same data types, but the arrays can have different shapes or distributions.\ ``g_c`` can also be ``g_a`` or ``g_b``. It is encouraged to use this function when the two source arrays are identical in distributions and shapes, because of its efficiency. It would be less efficient if the two source arrays are different in distributions or shapes. Matrix multiplication operates on two matrices, therefore the array must be two dimensional. The function - Fortran subroutine: `ga_dgemm `__\ (transa, transb, m, n, k, alpha, g_a, g_b, beta, g_c ) - C: void `GA_Dgemm `__\ (char ta, char tb, int m, int n, int k, double alpha, int g_a, int g_b, double beta, int g_c ) - C++: void GA::GlobalArray::dgemm(char ta, char tb, int m, int n, int k, double alpha, const GA::GlobalArray\* g_a, const GA::GlobalArray\* g_b, double beta) Performs one of the matrix-matrix operations: *C := alpha\*op( A )\*op( B ) + beta\*C,* where op( X ) is one of *op( X ) = X or op( X ) = X',* alpha and beta are scalars, and *A*, *B,* and *C* are matrices, with *op( A )* an *m* by *k* matrix, *op( B )* a *k* by *n* matrix and *C* an *m* by *n* matrix. On entry, transa specifies the form of *op( A )* to be used in the matrix multiplication as follows: *ta = 'N'* or *'n', op( A ) = A*. *ta = 'T'* or *'t', op( A ) = A'*. The function - Fortran integer function: ga_idot(g_a, g_b) - double precision function `ga_ddot `__\ (g_a, g_b) - double complex function `ga_zdot `__\ (g_a, g_b) - C: long `GA_Idot `__\ (int g_a, int g_b) - double G\ `GA_Ddot `__\ A_Ddot(int g_a, int g_b) - DoubleComplex `GA_Zdot `__\ GA_Zdot(int g_a, int g_b) - C++: long GA::GlobalArray::idot(const GA::GlobalArray\* g_a) - double GA::GlobalArray::ddot(const GA::GlobalArray\* g_a) - DoubleComplex GA::GlobalArray::zdot(const GA::GlobalArray\* g_a) computes the element-wise dot product of two arrays. It is available as three separate functions, corresponding to *integer*, *double precision* and *double complex* data types. The following functions apply to the 2-dimensional whole arrays only. There are no corresponding functions for patch operations. The function - Fortran subroutine: `ga_symmetrize `__\ (g_a) - C: void `GA_Symmetrize `__\ (int g_a) - C++: void GA::GlobalArray::symmetrize() symmetrizes matrix A represented with handle ``g_a``:*A = .5 \* (A+A')*. The function - Fortran subroutine: `ga_transpose `__\ (g_a, g_b) - C: void `GA_Transpose `__\ (int g_a, int g_b) - C++: void GA::GlobalArray::transpose(const GA::GlobalArray\* g_a) transposes a matrix: B = A'. .. _patches-1: Patches ~~~~~~~ The functions - n-D Fortran subroutine: `nga_add_patch `__\ (alpha, g_a, alo, ahi, beta, g_b, blo, bhi, g_c, clo, chi) - 2-D Fortran subroutine: `ga_add_patch `__\ (alpha, g_a, ailo, aihi, ajlo, ajhi, beta, g_b, bilo, bihi, bjlo, bjhi, g_c, cilo, cihi, cjlo, cjhi) - C: void `NGA_Add_patch `__\ (void \*alpha, int g_a, int alo[], int ahi[], void \*beta, int g_b, int blo[], int bhi[], int g_c, int clo[], int chi[]) - C++: void GA::GlobalArray::addPatch(void \*alpha, const GA::GlobalArray\* g_a, int alo[], int ahi[], void \*beta, const GA::GlobalArray\* g_b, int blo[], int bhi[], int clo[], int chi[]) add element-wise two patches and save the results into another patch. Even though it supports the addition of two patches with different distributions or different shapes (the number of elements must be the same), the operation can be expensive, because there can be extra copies which effect memory consumption. The two source patches can be scaled by a factor for the addition. The function is smart enough to detect the case that the patches are exactly the same but the global arrays are different in shapes. It handles the case as if for the arrays were identically distributed, thus the performance will not suffer. The matrix multiplication is the only operation on array patches that is restricted to the two dimensional domain, because of its nature. It works for *double* and *double complex* data types. The prototype is - Fortran subroutine: `ga_matmul_patch `__\ (transa, transb, alpha, beta, g_a, ailo, aihi, ajlo, ajhi, g_b, bilo, bihi, bjlo, bjhi, g_c, cilo, cihi, cjlo, cjhi) - C: void `GA_Matmul_patch `__\ (char \*transa, char\* transb, void\* alpha, void \*beta, int g_a, int ailo, int aihi, int ajlo, int ajhi, int g_b, int bilo, int bihi, int bjlo, int bjhi, int g_c, int cilo, int cihi, int cjlo, int cjhi) - C++: void GA::GlobalArray::matmulPatch(char \*transa, char\* transb, void\* alpha, void \*beta, const GlobalArray \* g_a, int ailo, int aihi, int ajlo, int ajhi, const GlobalArray \* g_b, int bilo, int bihi, int bjlo, int bjhi, int cilo, int cihi, int cjlo, int cjhi) It performs :: C[cilo:cihi,cjlo:cjhi] := alpha* AA[ailo:aihi,ajlo:ajhi] * BB[bilo:bihi,bjlo:bjhi] + beta*C[cilo:cihi,cjlo:cjhi] where *AA = op(A), BB = op(B),* and *op( X )* is one of *X or X',* Valid values for transpose argument: *'n', 'N', 't', 'T'*. The dot operation computes the element-wise dot product of two (possibly transposed) patches. It is implemented as three separate functions, corresponding to integer, double precision and double complex data types. They are: - n-D Fortran integer function: nga_idot_patch(g_a, ta, alo, ahi, g_b, tb, blo, bhi) - double precision function `ga_ddot_patch `__ (g_a, ta, alo, ahi, g_b, tb, blo, bhi) - double complex function `ga_zdot_patch `__ (g_a, ta, alo, ahi, g_b, tb, blo, bhi) - 2-D Fortran integer function: ga_idot_patch(g_a, ta, ailo, aihi, ajlo, ailo, g_b, tb, bilo, bihi, bjlo, bjhi) - double precision function `ga_ddot_patch `_ (g_a, ta, ailo, aihi, ajlo, ailo, g_b, tb, bilo, bihi, bjlo, bjhi) - double complex function `ga_zdot_patch `__\ (g_a, ta, ailo, aihi, ajlo, ailo, g_b, tb, bilo, bihi, bjlo, bjhi) - C: Integer `NGA_Idot_patch `__\ (int g_a, char\* ta, int alo[], int ahi[], int g_b, char\* tb, int blo[], int bhi[]) - double `NGA_Ddot_patch `__\ (int g_a, char\* ta, int alo[], int ahi[], int g_b, char\* tb, int blo[], int bhi[]) - DoubleComplex `NGA_Zdot_patch `__\ (int g_a, char\* ta, int alo[], int ahi[], int g_b, char\* tb, int blo[], int bhi[]) - C++: IntegerGA::GlobalArray::idotPatch(const GA::GlobalArray\* g_a, char\* ta, int alo[], int ahi[], char\* tb, int blo[], int bhi[]) - double GA::GlobalArray::ddotPatch(const GA::GlobalArray\* g_a, char\* ta, int alo[], int ahi[], char\* tb, int blo[], int bhi[]) - DoubleComplex GA::GlobalArray::zdotPatch(const GA::GlobalArray\* g_a, char\* ta, int alo[], int ahi[], char\* tb, int blo[], int bhi[]) The patches should be of the same data types and have the same number of elements. Like the array addition, if the source patches have different distributions/shapes, or it requires transpose, the operation would be less efficient, because there could be extra copies and/or memory consumption. Element-wise operations ~~~~~~~~~~~~~~~~~~~~~~~ These operations work on individual array elements rather than arrays as matrices in the sense of linear algebra operations. For example multiplication of elements stored in arrays is a completely different operation than matrix multiplication. - Fortran subroutine: `ga_abs_value `__\ (g_a) - C: void `GA_Abs_value `__\ (int g_a) - C++: void GA::GlobalArray::absValue(int g_a) Take element-wise absolute value of the array. - Fortran subroutine: `ga_abs_value_patch `__\ (g_a, lo, hi) - C: void `GA_Abs_value_patch `__\ (int g_a, int lo[], int hi[]) - C++: void GA::GlobalArray::absValuePatch(int lo[], int hi[]) Take element-wise absolute value of the patch. - Fortran subroutine: `ga_add_constant `__\ (g_a, alpha) - C: void `GA_Add_constant `__\ (int g_a, void\* alpha) - C++: void GA::GlobalArray::addConstant(void\* alpha) Add the contant pointed by alpha to each element of the array. - Fortran subroutine: `ga_add_constant_patch `__\ (g_a, lo, hi, alpha) - C: void `GA_Add_constant_patch `__\ (int g_a, int lo[], int hi[], void\*alpha) - C++: void GA::GlobalArray::addConstantPatch(void\* alpha) Add the contant pointed by alpha to each element of the patch. - Fortran subroutine: `ga_recip `__\ (g_a) - C: void `GA_Recip `__\ (int g_a) - C++: void GA::GlobalArray::recip() Take element-wise reciprocal of the array. - Fortran subroutine: `ga_recip_patch `__\ (g_a, lo, hi) - C: void `GA_Recip_patch `__\ (int g_a, int lo[], int hi[]) - C++: void GA::GlobalArray::recipPatch(int lo[], int hi[]) Take element-wise reciprocal of the patch. - Fortran subroutine: `ga_elem_multiply `__\ (g_a, g_b, g_c) - C: void `GA_Elem_multiply `__\ (int g_a, int g_b, int g_c) - C++: void GA::GlobalArray::elemMultiply(const GA::GlobalArray \* g_a, const GA::GlobalArray \* g_b) Computes the element-wise product of the two arrays which must be of the same types and same number of elements. For two-dimensional arrays, c(i, j) = a(i,j)\*b(i,j) The result (c) may replace one of the input arrays (a/b). - Fortran subroutine: `ga_elem_multiply_patch `__\ (g_a, alo, ahi, g_b, blo, bhi, g_c, clo,chi) - C: void `GA_Elem_multiply_patch `__\ (int g_a, int alo[], int ahi[], int g_b, int blo[], int bhi[], int g_c, int clo[], int chi[]) - C++: void GA::GlobalArray::elemMultiplyPatch( const GA::GlobalArray \* g_a, int alo[], int ahi[], const GA::GlobalArray \* g_b, int blo[], int bhi[], int clo[], int chi[]) Computes the element-wise product of the two patches which must be of the same types and same number of elements. For two-dimensional arrays, c(i, j) = a(i,j)\*b(i,j) The result (c) may replace one of the input arrays (a/b). - Fortran subroutine: `ga_elem_divide `__\ (g_a, g_b, g_c) - C: void `GA_Elem_divide `__\ (Integer g_a, Integer g_b, Integer g_c) - C++: void GA::GlobalArray::elemDivide(const GA::GlobalArray \* g_a, const GA::GlobalArray \* g_b) Computes the element-wise quotient of the two arrays which must be of the same types and same number of elements. For two-dimensional arrays, c(i, j) = a(i,j)/b(i,j) The result (c) may replace one of the input arrays (a/b). If one of the elements of array g_b is zero, the quotient for the element of g_c will be set to GA_NEGATIVE_INFINITY. - Fortran subroutine: `ga_elem_divide_patch `__\ (g_a, alo, ahi, g_b, blo, bhi, g_c, clo, chi) - C: void `GA_Elem_divide_patch `__\ (int g_a, int alo[], int ahi[], int g_b, int blo[], int bhi[], int g_c, int clo[], int chi[]) - C++: void GA::GlobalArray::elemDividePatch( const GA::GlobalArray \* g_a, int alo[], int ahi[], const GA::GlobalArray \* g_b, int blo[], int bhi[], int clo[], int chi[]) Computes the element-wise quotient of the two patches which must be of the same types and same number of elements. For two-dimensional arrays, c(i, j) = a(i,j)/b(i,j) The result (c) may replace one of the input arrays (a/b). - Fortran subroutine: `ga_elem_maximum `__\ (g_a, g_b, g_c) - C: void `GA_Elem_maximum `__\ (Integer g_a, Integer g_b, Integer g_c) - C++: void GA::GlobalArray::elemMaximum(const GA::GlobalArray \* g_a, const GA::GlobalArray \* g_b) Computes the element-wise maximum of the two arrays which must be of the same types and same number of elements. For two dimensional arrays, c(i, j) = max{a(i,j), b(i,j)} The result (c) may replace one of the input arrays (a/b). - Fortran subroutine: `ga_elem_maximum_patch `__\ (g_a, alo, ahi, g_b, blo, bhi, g_c, clo, chi) - C: void `GA_Elem_maximum_patch `__\ (int g_a, int alo[], int ahi[], int g_b, int blo[], int bhi[], int g_c, int clo[], int chi[]) - C++: void GA::GlobalArray::elemMaximumPatch(const GA::GlobalArray \* g_a, int alo[], int ahi[], const GA::GlobalArray \* g_b, int blo[], int bhi[], int clo[], int chi[]) Computes the element-wise maximum of the two patches which must be of the same types and same number of elements. For two-dimensional of noncomplex arrays, c(i, j) = max{a(i,j), b(i,j)} If the data type is complex, then c(i, j).real = max{ \|a(i,j)|, \|b(i,j)|} while c(i,j).image = 0. The result (c) may replace one of the input arrays (a/b). - Fortran subroutine: `ga_elem_minimum `__\ (g_a, g_b, g_c) - C: void `GA_Elem_minimum `__\ (Integer g_a, Integer g_b, Integer g_c); - C++: void GA::GlobalArray::elemMinimum(const GA::GlobalArray \* g_a, const GA::GlobalArray \* g_b) Computes the element-wise minimum of the two arrays which must be of the same types and same number of elements. For two dimensional arrays, c(i, j) = min{a(i,j), b(i,j)} The result (c) may replace one of the input arrays (a/b). - Fortran subroutine: `ga_elem_minimum_patch `__\ (g_a, alo, ahi, g_b, blo, bhi, g_c, clo, chi) - C: void `GA_Elem_minimum_patch `__\ (int g_a, int alo[], int ahi[], int g_b, int blo[], int bhi[], int g_c, int clo[], int chi[]) - C++: void GA::GlobalArray::elemMinimumPatch (const GA::GlobalArray \* g_a, int alo[], int ahi[], const GA::GlobalArray \* g_b, int blo[], int bhi[], int clo[], int chi[]) Computes the element-wise minimum of the two patches which must be of the same types and same number of elements. For two-dimensional of noncomplex arrays, c(i, j) = min{a(i,j), b(i,j)} If the data type is complex, then c(i, j).real = min{ \|a(i,j)|, \|b(i,j)|} while c(i,j).image = 0. The result (c) may replace one of the input arrays (a/b). - Fortran subroutine: `ga_shift_diagonal `__\ (g_a, c) - C: void `GA_Shift_diagonal `__\ (int g_a, void \*c) - C++: void GA::GlobalArray::shiftDiagonal(void \*c) Adds this constant to the diagonal elements of the matrix. - Fortran subroutine: `ga_set_diagonal `__\ (g_a, g_v) - C: void `GA_Set_diagonal `__\ (int g_a, int g_v) - C++: void GA::GlobalArray::setDiagonal (const GA::GlobalArray \* g_v) Sets the diagonal elements of this matrix g_a with the elements of the vector g_v. - Fortran subroutine: `ga_zero_diagonal `__\ ( g_a) - C: void `GA_Zero_diagonal `__\ (int g_a) - C++: void GA::GlobalArray::zeroDiagonal() Sets the diagonal elements of this matrix g_a with zeros. - Fortran subroutine: `ga_add_diagonal `__\ (g_a, g_v) - C: void `GA_Add_diagonal `__\ (int g_a, int g_v) - C++: void GA::GlobalArray::addDiagonal(const GA::GlobalArray \* g_v) Adds the elements of the vector g_v to the diagonal of this matrix g_a. - Fortran subroutine: `ga_get_diag `__\ (g_a, g_v) - C: void `GA_Get_diag `__\ (int g_a, int g_v) - C++: void GA::GlobalArray::getDiagonal (const GA::GlobalArray \* g_v) Inserts the diagonal elements of this matrix g_a into the vector g_v. - Fortran subroutine: `ga_scale_rows `__\ ( g_a, g_v) - C: void `GA_Scale_rows `__\ (int g_a, int g_v) - C++: void GA::GlobalArray::scaleRows (const GA::GlobalArray \* g_v) Scales the rows of this matrix g_a using the vector g_v. - Fortran subroutine: `ga_scale_cols `__\ (g_a, g_v) - C: void `GA_Scale_cols `__\ (int g_a, int g_v) - C++: void GA::GlobalArray::scaleCols (const GA::GlobalArray \* g_v) Scales the columns of this matrix g_a using the vector g_v. - Fortran subroutine: `ga_norm1 `__\ (g_a, nm) - C: void `GA_Norm1 `__\ (int g_a, double \*nm) - C++: void GA::GlobalArray::norm1(double \*nm) Computes the 1-norm of the matrix or vector g_a. - Fortran subroutine: `ga_norm_infinity `__\ (g_a, nm) - C: void `GA_Norm_infinity `__\ (int g_a, double \*nm) - C++: void GA::GlobalArray::normInfinity(double \*nm) Computes the 1-norm of the matrix or vector g_a. - Fortran subroutine: `ga_median `__\ ( g_a, g_b, g_c, g_m) - C: void `GA_Median `__\ (int g_a, int g_b, int g_c, int g_m) - C++: void GA::GlobalArray::median(const GA::GlobalArray \* g_a, const GA::GlobalArray \* g_b, const GA::GlobalArray \* g_c) Computes the componentwise Median of three arrays ``g_a``, ``g_b``, and ``g_c``, and stores the result in this array ``g_m``. The result (m) may replace one of the input arrays (a/b/c). - Fortran subroutine: `ga_median_patch `__\ (g_a, alo, ahi, g_b, blo, bhi, g_c, clo, chi, g_m,mlo, mhi) - C: void `GA_Median_patch `__\ (int g_a, int alo[], int ahi[], int g_b, int blo[], int bhi[], int g_c, int clo[], int chi[], int g_m, int mlo[],int mhi[]) - C++: void GA::GlobalArray::medianPatch(const GA::GlobalArray\* g_a, int alo[], int ahi[], const GA::GlobalArray \* g_b, int blo[], int bhi[], const GA::GlobalArray \* g_c, int clo[], int chi[], int mlo[],int mhi[]) Computes the componentwise Median of three patches g_a, g_b, and g_c, and stores the result in this patch g_m. The result (m) may replace one of the input patches (a/b/c). - Fortran subroutine: `ga_step_max `__\ (g_a, g_b, step) - C: void `GA_Step_max `__\ (int g_a, int g_b, double \*step) - C++: void GA::GlobalArray::stepMax(const GA::GlobalArray \*g_a, double \*step) Calculates the largest multiple of a vector g_b that can be added to this vector g_a while keeping each element of this vector nonnegative. - Fortran subroutine: `ga_step_max2 `__\ ( g_xx, g_vv, g_xxll, g_xxuu, step2) - C: void `GA_Step_max2 `__\ (int g_xx, int g_vv, int g_xxll, int g_xxuu, double \*step2) - C++: void GA::GlobalArray::stepMax2(const GA::GlobalArray \*g_vv, const GA::GlobalArray \*g_xxll, const GA::GlobalArray \*g_xxuu, double \*step2) Calculates the largest step size that should be used in a projected bound line search. - Fortran subroutine: `ga_step_max_patch `__\ (g_a, alo, ahi, g_b, blo, bhi, step) - C: void `GA_Step_max_patch `__\ (int g_a, int \*alo, int \*ahi, int g_b, int \*blo, int \*bhi, double \*step) - C++: void GA::GlobalArray::stepMaxPatch(int \*alo, int \*ahi, const GA::GlobalArray \* g_b, int \*blo, int \*bhi, double \*step) Calculates the largest multiple of a vector g_b that can be added to this vector g_a while keeping each element of this vector nonnegative. - Fortran subroutine: `ga_step_max2_patch `__\ ( g_xx, xxlo, xxhi, g_vv,vvlo, vvhi, g_xxll, xxlllo, xxllhi, g_xxuu, xxuulo, xxuuhi, step2) - C: void `GA_Step_max2_patch `__\ (int g_xx, int \*xxlo, int \*xxhi, int g_vv, int \*vvlo, int \*vvhi, int g_xxll, int \*xxlllo, int \*xxllhi, int g_xxuu, int \*xxuulo, int \*xxuuhi, double \*step2) - C++: void GA::GlobalArray::stepMax2Patch(int \*xxlo, int \*xxhi, const GA::GlobalArray \* g_vv, int \*vvlo, int \*vvhi,const GA::GlobalArray \* g_xxll, int \*xxlllo, int \*xxllhi, const GA::GlobalArray \* g_xxuu, int \*xxuulo, int \*xxuuhi, double \*step2) Calculates the largest step size that should be used in a projected bound line search. Interfaces to Third Party Software Packages ------------------------------------------- There are many existing software packages designed for solving engineering problems. They are specialized in one or two problem domains, such as solving linear systems, eigen-vectors, and differential equations, etc. Global Arrays provide interfaces to several of these packages. Scalapack ~~~~~~~~~ `Scalapack `__ is a well known software library for linear algebra computations on distributed memory computers. Global Arrays uses this library to solve systems of linear equations and also to invert matrices. The function - Fortran integer function: `ga_solve `__\ (g_a, g_b) - C: int `GA_Solve `__\ (int g_a, int g_b) - C++: int GA::GlobalArray::solve(const GA::GlobalArray \* g_a) solves a system of linear equations *A \* X = B*. It first will call the Cholesky factorization routine and, if successful, will solve the system with the Cholesky solver. If Cholesky is not able to factorizeA, then it will call the LU factorization routine and will solve the system with forward/backward substitution. On exit *B* will contain the solution *X*. The function - Fortran integer function: `ga_llt_solve `__\ (g_a, g_b) - C: int `GA_Llt_solve `__\ (int g_a, int g_b) - C++: int GA::GlobalArray::lltSolve(const GA::GlobalArray \* g_a) also solves a system of linear equations *A \* X = B*, using the Cholesky factorization of an *NxN* double precision symmetric positive definite matrix *A* (handle ``g_a``). On successful exit *B* will contain the solution *X*. The function - Fortran subroutine: `ga_lu_solve `__\ (trans, g_a, g_b) - C: void `GA_Lu_solve `__\ (char trans, int g_a, int g_b) - C++: void GA::GlobalArray::luSolve(char trans, const GA::GlobalArray \* g_a) solves the system of linear equations *op(A)X = B* based on the LU factorization. *op(A) = A or A'* depending on the parameter ``trans``. Matrix *A* is a general real matrix. Matrix *B* contains possibly multiple *rhs* vectors. The array associated with the handle ``g_b`` is overwritten by the solution matrix *X*. The function - Fortran integer function: `ga_spd_invert `__\ (g_a) - C: int `GA_Spd_invert `__\ (int g_a) - C++: int GA::GlobalArray::spdInvert() computes the inverse of a double precision matrix using the Cholesky factorization of a *NxN* double precision symmetric positive definite matrix *A* stored in the global array represented by ``g_a``. On successful exit, *A* will contain the inverse. PeIGS ~~~~~ The PeIGS library contains subroutines for solving standard and generalized real symmetric eigensystems. All eigenvalues and eigenvectors can be computed. The library is implemented using a message-passing model and is portable across many platforms. Global Arrays use this library to solve eigenvalue problems. The function - Fortran subroutine: `ga_diag `__\ (ga_diag(g_a, g_s, g_v, eval) - C: void `GA_Diag `__\ (int g_a, int g_s, int g_v, void \*eval) - C++: void GA::GlobalArray::diag (const GA::GlobalArray\*g_s, const GA::GlobalArray\* g_v, void \*eval) solves the generalized eigenvalue problem returning all eigenvectors and values in ascending order. The input matrices are not overwritten or destroyed. The function - Fortran subroutine: `ga_diag_reuse `__ (control, g_a, g_s, g_v, eval) - C: void `GA_Diag_reuse `__ (int control, int g_a, int g_s, int g_v,void \*eval) - C++: void GA::GlobalArray::diagReuse (int control, const GA::GlobalArray\* g_s, const GA::GlobalArray\*g_v, void \*eval) solves the generalized eigen-value problem returning all eigenvectors and values in ascending order. Recommended for REPEATED calls if ``g_s`` is unchanged. The function - Fortran subroutine: `ga_diag_std `__\ (g_a, g_v, eval) - C: void `GA_Diag_std `__\ (int g_a, int g_v, void \*eval) - C++: void GA::GlobalArray::diagStd( const GA::GlobalArray\* g_v, void \*eval) solves the standard (non-generalized) eigenvalue problem returning all eigenvectors and values in the ascending order. The input matrix is neither overwritten nor destroyed. Interoperability with Others ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Global Arrays are interoperable with several other libraries, but do not provide direct interfaces for them. For example, one can make calls to and link with these libraries: `PETSc `__ (the Portable, Extensible Toolkit for Scientific Computation) is developed by `Argonne National Laboratory `__. PETSc is a suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations. It employs the MPI standard for all message-passing communication, and is written in a data-structure-neutral manner to enable easy reuse and flexibility. Here are the `instructions `__\ for using PETSc with GA. `CUMULVS `__\ (Collaborative User Migration User Library for Visualization and Steering) is developed by the `Oak Ridge National Laboratory `__. CUMULVS is a software framework that enables programmers to incorporate fault-tolerance, interactive visualization and computational steering into existing parallel programs. Here are the\ `instructions `__ for using CUMULVS with GA. Synchronization Control in Collective Operations ------------------------------------------------ GA collective array operations are implemented by exploiting locality information to minimize or even completely avoid interprocessor communication or data copying. Before each processor accesses its own portion of the GA data we must assure that the data is in a consistent state. That means that there are no outstanding communication operations targeting that given global array portion pending while the data owner is accessing it. To accomplish that the GA collective array operations have implicit synchronization points: at the beginning and at the end of the operation. However, in many cases when collective array operations are called back-to-back or if the user does an explicit sync just before a collective array operation, some of the internal synchronization points could be merged or even removed if user can guarantee that the global array data is in the consistent state. The library offers a call for the user to eliminate the redundant synchronization points based on his/her knowledge of the application. The function - Fortran subroutine: `ga_mask_sync `__\ (prior_sync_mask,post_sync_mask) - C: void `GA_Mask_sync `__\ (int prior_sync_mask,int post_sync_mask) - C++: void GA::GlobalArray::maskSync(int prior_sync_mask,int post_sync_mask) This operation should be used with a lot of care and only when the application code has been debugged and the user wishes to tune its performance. Making a call to this function with prior_sync_mask parameter set to false disables the synchronization done at the beginning of first collective array operation called after a call to this function. Similarly, making a call to this function by setting the post_sync_mask parameter to false disables the synchronization done at the ending of the first collective array operation called after a call to this function.