VectorSpace

class vectorspace.VectorSpaceHandles(inner_product=None, max_vecs_per_node=None, verbosity=1, print_interval=10)

Responsible for performing addition and multiplication in parallel.

Kwargs:

inner_product: Inner product function.

max_vecs_per_node: Max number of vecs in memory per node.

verbosity: 1 prints progress and warnings, 0 prints almost nothing.

print_interval: Min time (secs) between printed progress messages.

The class implements parallelized vector addition and scalar multiplication and is used in high-level classes in pod, bpod, dmd and ltigalerkinproj.

Note: Computations are often sped up by using all available processors, even if this lowers max_vecs_per_node proportionally. However, this depends on the computer and the nature of the functions supplied, and sometimes loading from file is slower with more processors.

compute_inner_product_mat(row_vec_handles, col_vec_handles)

Computes the matrix of inner product combinations between vectors.

Args:
row_vec_handles: List of row vector handles.
For example BPOD adjoints, Y.
col_vec_handles: List of column vector handles.
For example BPOD directs, X.
Returns:
IP_mat: 2D array of inner products.

The vecs are retrieved in memory-efficient chunks and are not all in memory at once. The row vecs and col vecs are assumed to be different. When they are the same, use compute_symmetric_inner_product_mat() for a 2x speedup.
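The chunked retrieval described above can be illustrated with a minimal serial sketch. It is not the library's implementation: the ArrayHandle class, the max_vecs argument and the chunking rule (reserving one memory slot for a column vec) are assumptions made for illustration, and only NumPy is used.

```python
import numpy as np


class ArrayHandle:
    """Hypothetical in-memory vec handle; real handles typically load from file."""

    def __init__(self, vec):
        self._vec = vec

    def get(self):
        return self._vec


def inner_product(v1, v2):
    # np.vdot conjugates its first argument.
    return np.vdot(v1, v2)


def chunked_inner_product_mat(row_handles, col_handles, max_vecs=4):
    """Fill ip_mat[i, j] = <row_i, col_j> with at most max_vecs vecs in memory."""
    n_r, n_c = len(row_handles), len(col_handles)
    ip_mat = np.zeros((n_r, n_c), dtype=complex)
    rows_per_chunk = max(max_vecs - 1, 1)  # assumption: keep one slot for a col vec
    for start in range(0, n_r, rows_per_chunk):
        # Retrieve one chunk of row vecs.
        row_chunk = [h.get() for h in row_handles[start:start + rows_per_chunk]]
        for j, col_handle in enumerate(col_handles):
            col = col_handle.get()  # a single col vec is held at a time
            for i, row in enumerate(row_chunk):
                ip_mat[start + i, j] = inner_product(row, col)
    return ip_mat


rng = np.random.default_rng(0)
Y = rng.standard_normal((6, 5))  # columns play the role of the row vecs
X = rng.standard_normal((6, 3))  # columns play the role of the col vecs
row_handles = [ArrayHandle(Y[:, i]) for i in range(Y.shape[1])]
col_handles = [ArrayHandle(X[:, j]) for j in range(X.shape[1])]
ip_mat = chunked_inner_product_mat(row_handles, col_handles, max_vecs=3)
```

For real-valued vecs stored as columns, the result should agree with Y.T @ X.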

Each MPI worker (processor) is responsible for retrieving a subset of the rows and columns. The processors then send/recv columns via MPI so they can be used to compute all IPs for the rows on each MPI worker. This is repeated until all MPI workers are done with all of their row chunks. If there are 2 processors:

      | x o |
rank0 | x o |
      | x o |
  -
      | o x |
rank1 | o x |
      | o x |

In the next step, rank 0 sends column 0 to rank 1 and rank 1 sends column 1 to rank 0. The remaining IPs are filled in:

      | x x |
rank0 | x x |
      | x x |
  -
      | x x |
rank1 | x x |
      | x x |

When the number of cols and rows is not divisible by the number of processors, the processors are assigned unequal numbers of tasks. However, all processors are always part of the passing cycle.
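The passing cycle can be mimicked serially to see why every entry ends up filled. The sketch below is an illustration under assumptions, not the library's MPI code: the ranks are simulated by loops, the helper names are hypothetical, and np.array_split stands in for whatever task assignment the library actually uses.

```python
import numpy as np


def split_indices(n, n_p):
    """Split range(n) into n_p contiguous, possibly unequal, index lists."""
    return [list(chunk) for chunk in np.array_split(np.arange(n), n_p)]


def simulate_passing_cycle(Y, X, n_p):
    """Serially mimic n_p ranks that each own some rows and some columns."""
    n_r, n_c = Y.shape[1], X.shape[1]
    row_owner = split_indices(n_r, n_p)
    col_owner = split_indices(n_c, n_p)
    ip_mat = np.full((n_r, n_c), np.nan)  # NaN marks an entry not yet computed
    for step in range(n_p):  # every rank takes part in every step
        for rank in range(n_p):
            # At this step, `rank` holds the columns first loaded by `source`.
            source = (rank + step) % n_p
            for i in row_owner[rank]:
                for j in col_owner[source]:
                    ip_mat[i, j] = np.vdot(Y[:, i], X[:, j])
    return ip_mat


rng = np.random.default_rng(1)
Y_even = rng.standard_normal((4, 6))  # 6 row vecs, 3 per rank
X_even = rng.standard_normal((4, 2))  # 2 col vecs, 1 per rank
ip_even = simulate_passing_cycle(Y_even, X_even, n_p=2)

Y_odd = rng.standard_normal((4, 5))   # 5 row vecs, split unequally
X_odd = rng.standard_normal((4, 3))   # 3 col vecs, split unequally
ip_odd = simulate_passing_cycle(Y_odd, X_odd, n_p=2)
```

After n_p steps each rank has seen every column chunk once, so no entry is left unset, including when the rows and columns do not divide evenly.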

The scaling is:

  • num gets / processor ~ (n_r*n_c/((max-2)*n_p*n_p)) + n_r/n_p
  • num MPI sends / processor ~ (n_p-1)*(n_r/((max-2)*n_p))*n_c/n_p
  • num inner products / processor ~ n_r*n_c/n_p

where n_r is number of rows, n_c number of columns, max is max_vecs_per_proc = max_vecs_per_node/num_procs_per_node, and n_p is the number of MPI workers (processors).
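To get a feel for these estimates, the three expressions can be evaluated directly. The function below is a hypothetical helper, not part of the module. It transcribes the formulas as written, so the results are rough per-processor counts, not exact figures.

```python
def inner_product_mat_scaling(n_r, n_c, n_p, max_vecs_per_proc):
    """Evaluate the approximate per-processor costs quoted above."""
    if max_vecs_per_proc <= 2:
        raise ValueError("max_vecs_per_proc must exceed 2 for these estimates")
    m = max_vecs_per_proc - 2
    return {
        "gets": n_r * n_c / (m * n_p * n_p) + n_r / n_p,
        "sends": (n_p - 1) * (n_r / (m * n_p)) * n_c / n_p,
        "inner_products": n_r * n_c / n_p,
    }


# 100 row vecs, 100 col vecs, 4 MPI workers, 12 vecs in memory per worker.
estimate = inner_product_mat_scaling(n_r=100, n_c=100, n_p=4, max_vecs_per_proc=12)
```

With these inputs the estimate is about 87.5 gets, 187.5 sends and 2500 inner products per processor.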

If there are more rows than columns, then an internal transpose and un-transpose is performed to improve efficiency (since n_c only appears in the scaling in the quadratic term).

compute_symmetric_inner_product_mat(vec_handles)

Computes a symmetric matrix of inner products, exploiting the symmetry by evaluating only the upper-triangular entries.

Args:
vec_handles: List of vector handles.
Returns:
IP_mat: Numpy array of inner products.

See the documentation for compute_inner_product_mat() for an idea of how this works.

TODO: JON, write detailed documentation similar to compute_inner_product_mat().
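The saving from symmetry can be shown with a small serial sketch. This is an assumption about the general idea only, not the library's parallel algorithm: it evaluates the entries on and above the diagonal and, as an illustrative choice, fills the lower triangle by conjugate transposition.

```python
import numpy as np


def symmetric_inner_product_mat(vecs):
    """Evaluate only entries with j >= i, then mirror them into the lower triangle."""
    n = len(vecs)
    ip_mat = np.zeros((n, n), dtype=complex)
    n_evals = 0
    for i in range(n):
        for j in range(i, n):
            ip_mat[i, j] = np.vdot(vecs[i], vecs[j])  # conjugates vecs[i]
            n_evals += 1
    # <v_j, v_i> is the complex conjugate of <v_i, v_j>.
    ip_mat = np.triu(ip_mat) + np.triu(ip_mat, 1).conj().T
    return ip_mat, n_evals


rng = np.random.default_rng(2)
V = rng.standard_normal((5, 4)) + 1j * rng.standard_normal((5, 4))
vecs = [V[:, j] for j in range(V.shape[1])]
ip_sym, n_evals = symmetric_inner_product_mat(vecs)
```

For n vecs this needs n*(n+1)/2 inner products instead of n*n, which approaches the factor of two mentioned for compute_inner_product_mat().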

lin_combine(sum_vec_handles, basis_vec_handles, coeff_mat, coeff_mat_col_indices=None)

Linearly combines the basis vecs and calls put on result.

Args:

sum_vec_handles: List of handles for the sum vectors.

basis_vec_handles: List of handles for the basis vecs.

coeff_mat: Matrix with rows corresponding to basis vecs
and columns to sum (lin. comb.) vecs. The rows and columns correspond, by index, to the lists basis_vec_handles and sum_vec_handles. sums = basis * coeff_mat
Kwargs:
coeff_mat_col_indices: List of column indices.
The sum_vecs corresponding to these col indices are computed.

Each processor retrieves a subset of the basis vecs to compute as many outputs as a processor can have in memory at once. Each processor computes the “layers” from the basis it is responsible for, and for as many modes as it can fit in memory. The layers from all procs are summed together to form the sum_vecs, which are then put.
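The layer idea can be sketched serially. The code below is not the library's implementation: the workers are simulated by a loop, the ArrayHandle class and function name are hypothetical, and it assumes that sum_vec_handles pairs one-to-one, in order, with the selected column indices.

```python
import numpy as np


class ArrayHandle:
    """Hypothetical in-memory handle exposing get and put."""

    def __init__(self, vec=None):
        self.vec = vec

    def get(self):
        return self.vec

    def put(self, vec):
        self.vec = vec


def lin_combine_sketch(sum_handles, basis_handles, coeff_mat,
                       col_indices=None, n_workers=2):
    """Form sums = basis * coeff_mat by summing per-worker layers."""
    coeff_mat = np.asarray(coeff_mat)
    if col_indices is None:
        col_indices = list(range(coeff_mat.shape[1]))
    # Each simulated worker owns a contiguous subset of the basis vecs.
    owned = np.array_split(np.arange(len(basis_handles)), n_workers)
    for sum_handle, col in zip(sum_handles, col_indices):
        layers = []
        for basis_indices in owned:
            layer = 0  # this worker's partial contribution to the sum vec
            for b in basis_indices:
                layer = layer + basis_handles[b].get() * coeff_mat[b, col]
            layers.append(layer)
        sum_handle.put(sum(layers))  # add the layers from all workers


rng = np.random.default_rng(3)
B = rng.standard_normal((6, 4))  # 4 basis vecs stored as columns
C = rng.standard_normal((4, 3))  # 4 basis vecs -> 3 sum vecs
basis_handles = [ArrayHandle(B[:, b]) for b in range(4)]

all_sums = [ArrayHandle() for _ in range(3)]
lin_combine_sketch(all_sums, basis_handles, C)

some_sums = [ArrayHandle() for _ in range(2)]
lin_combine_sketch(some_sums, basis_handles, C, col_indices=[0, 2])
```

Stacking the resulting sum vecs as columns should reproduce B @ C, or the selected columns of it when col_indices is given.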

Scaling is:

num gets/worker = n_s/(n_p*(max-2)) * n_b/n_p

passes/worker = (n_p-1) * n_s/(n_p*(max-2)) * (n_b/n_p)

scalar multiplies/worker = n_s*n_b/n_p

Where n_s is number of sum vecs, n_b is number of basis vecs, n_p is number of processors, max = max_vecs_per_node.

print_msg(msg, output_channel=sys.stdout)

Print a message from rank 0.

sanity_check(test_vec_handle)

Check user-supplied vec handle and vec objects.

Args:
test_vec_handle: A vector handle.

The add and mult functions are tested for the vector object. This is not a complete testing, but catches some common mistakes. Raises an error if a check fails.
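The kind of check described here might look like the following sketch. It is an assumption about the general approach, not the library's actual tests: it works on a vec object directly instead of a handle, and ArrayVec, BrokenVec and the tolerance are hypothetical.

```python
import numpy as np


class ArrayVec:
    """Hypothetical vec object supporting addition and scalar multiplication."""

    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)

    def __add__(self, other):
        return ArrayVec(self.data + other.data)

    def __mul__(self, scalar):
        return ArrayVec(self.data * scalar)

    __rmul__ = __mul__


class BrokenVec(ArrayVec):
    """Deliberately faulty: multiplication modifies the vec in place."""

    def __mul__(self, scalar):
        self.data *= scalar
        return self

    __rmul__ = __mul__


def inner_product(v1, v2):
    return float(np.vdot(v1.data, v2.data))


def sanity_check_sketch(vec, inner_product, tol=1e-10):
    """Raise ValueError if add or mult give wrong results or alter the input."""
    ip_vv = inner_product(vec, vec)
    if abs(ip_vv) < tol:
        raise ValueError("Test vec must be nonzero")
    scaled = vec * 2.0
    if abs(inner_product(vec, vec) - ip_vv) > tol * ip_vv:
        raise ValueError("Multiplication modified the original vec")
    if abs(inner_product(scaled, vec) - 2.0 * ip_vv) > tol * ip_vv:
        raise ValueError("Scalar multiplication gave an unexpected result")
    summed = vec + vec
    if abs(inner_product(vec, vec) - ip_vv) > tol * ip_vv:
        raise ValueError("Addition modified the original vec")
    if abs(inner_product(summed, vec) - 2.0 * ip_vv) > tol * ip_vv:
        raise ValueError("Addition gave an unexpected result")


# A well-behaved vec passes silently.
sanity_check_sketch(ArrayVec([1.0, 2.0, 3.0]), inner_product)
good_ok = True

# The in-place bug is caught.
try:
    sanity_check_sketch(BrokenVec([1.0, 2.0, 3.0]), inner_product)
    broken_caught = False
except ValueError:
    broken_caught = True
```

Like the method it illustrates, this catches only some common mistakes, such as operators that change their operands.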

TODO: Other things which could be tested:
get/put doesn’t affect other vecs (memory problems)
class vectorspace.VectorSpaceMatrices(weights=None)

Inner products and linear combinations with matrices.

Kwargs:
weights: 1D array or matrix of inner product weights.
It corresponds to W in inner product v_1^* W v_2.
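The weighted inner product v_1^* W v_2 can be written out in a few lines of NumPy. This is a minimal sketch, not the class's code: the function name is hypothetical, and treating a 1D weights array as the diagonal of W is an assumption based on the description above.

```python
import numpy as np


def weighted_inner_product(v1, v2, weights=None):
    """Return v1^* W v2, where W is the identity, a diagonal (1D) or a full matrix."""
    v1 = np.asarray(v1)
    v2 = np.asarray(v2)
    if weights is None:
        return np.vdot(v1, v2)            # np.vdot conjugates v1
    weights = np.asarray(weights)
    if weights.ndim == 1:
        return np.vdot(v1, weights * v2)  # 1D weights act as a diagonal W
    return np.vdot(v1, weights @ v2)      # full weight matrix


v1 = np.array([1.0, 2.0])
v2 = np.array([3.0, 4.0])
ip_diag = weighted_inner_product(v1, v2, weights=np.array([2.0, 0.5]))
ip_full = weighted_inner_product(v1, v2, weights=np.diag([2.0, 0.5]))
ip_complex = weighted_inner_product(np.array([1j, 1.0]), np.array([1.0, 1.0]))
```

Here ip_diag and ip_full both equal 1*2*3 + 2*0.5*4 = 10, and ip_complex equals 1 - 1j because the first argument is conjugated.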
