Pressio/rom-tools-and-workflows

CompositeVectorSpace

Closed this issue ยท 15 comments

This is in part motivated by #97

We presently have a splitter which enables the construction of block-type vector spaces with individual DOFs. A much more elegant solution to this is the notion of a CompositeVectorSpace which is a class that takes in a list of vector spaces and stitches them together. This:

  1. Eliminates the need for the splitter, which is helpful.
  2. Is more efficient in terms of CPU time since we aren't computing SVDs on block matrices, but instead on the individual blocks
  3. gives us the flexibility to store the composite vector space more efficiently if desired, which is very helpful downstream for hyper-reduction.

My first pass is:

class CompositeVectorSpace:
    '''
    Constructs a composite vector space out of a list of vector spaces
    Different vector spaces need to have the same number of spatial DOFs
    '''

    def __init__(self,list_of_vector_spaces : list[VectorSpace], variable_ordering=None, compact_basis_representation=False):
        '''
        Inputs: list_of_vector_spaces: list[VectorSpace] containing the list of vector spaces to combine
                variable_ordering: Either None or np.ndarray of integers. Specifies the desired variable ordering.
                compact_basis_representation: bool. If True, we compute a global basis. If false, we only store a list of bases
        '''

        self.__compact_basis_representation = compact_basis_representation
        self.list_of_vector_spaces_ = list_of_vector_spaces
        ## Computed dimensions and ensure vector spaces are compatable 
        dims = np.zeros(list_of_vector_spaces.size)
        self.n_vector_spaces_ = len(list_of_vector_spaces)
        n_vars = 0
        total_number_of_bases = 0
        for i in range(0,self.n_vector_spaces_):
            local_vector_space = list_of_vector_spaces[i]
            local_vector_space_dimensions = local_shift_vector.get_dimension()
            n_vars += local_vector_space_dimensions[0]
            dims[i] = local_vector_space_dimensions[1]
            total_number_of_bases += local_vector_space_dimensions[2]

        self.n_vars_ = n_vars
        self.nx_ = dims[0]
        self.total_number_of_bases_ = total_number_of_bases

        assert np.allclose( np.diff(dims),np.zeros(dims.size-1)) , " Error constructing composite vector space, not all spaces have the same spatial dimension "

        if variable_ordering == None:
            variable_ordering = np.arange(0,n_vars,dtype='int')
        else:
            ## Check that we have a valid variable ordering
            variables_left = np.arange(0,n_vars,dtype='int')
            assert(len(variable_ordering) == n_vars)
            for variable in variable_ordering:
                assert(np.isin(variable,variables_left))
                variables_left = np.delete(variables_left[np.where(variable==variables_left))

        ## Construct a global shift vector
        self.__shift_vector = list_of_vector_spaces[0].get_shift_vector()
        for i in range(1,self.n_vector_spaces_):
          local_shift_vector = list_of_vector_spaces[i].get_shift_vector()
          self.__shift_vector = np.append(self.shift_vector_,local_shift_vector())

        # Re-order according to variable ordering
        self.__shift_vector = self.__shift_vector[variable_ordering]

        ## Compute basis
        if compact_basis_representation == False:
            # If False, return a global array
            self.__basis = np.zeros((self.n_vars_,self.nx_,self.total_number_of_bases_))
            start_var_index = 0
            start_bases_index = 0
            for i in range(0,self.n_vector_spaces_):
                local_vector_space = self.list_of_vector_spaces_[i]
                local_vector_space_dimensions = local_shift_vector.get_dimension()
                local_basis = local_vector_space.get_basis()
                dim = local_b
                self.__basis[start_var_index:start_var_index+dim[0],:,start_basis_index:start_basis_index+dim[2]] = local_basis
                start_var_index += dim[0]
                start_basis_index += dim[2]
           self.__basis = self.__basis[variable_ordering]

        else:
            # If true, return a list of the bases.
            # This is much more efficient in terms of memory 
            self.__compact_basis = []
            for i in range(0,self.n_vector_spaces_):
                self.__compact_basis.append(self.list_of_vector_spaces_[i].get_basis())

    def get_shift_vector(self) -> np.ndarray:
        return self.shift_vector_


    def get_basis(self) -> np.ndarray:
        if self.__compact_basis_representation:
            print('Error, CompositeVectorSpace was constructed with a compact basis representation and does not have a direct global basis. Use get_compact_basis')
            sys.exit()
        else:
            return self.__basis

    def get_compact_basis(self) -> list[np.ndarray]:
        if self.__compact_basis_representation:
            return self.__compact_basis
        else:
            print('Error, CompositeVectorSpace was constructed with a full basis representation and does not have a compact basis representation. Use get_basis')
            sys.exit()

@fnrizzi @pjb236 @jtencer @ekrathSNL Thoughts?

Eliminates the need for the splitter, which is helpful.

why?

Eliminates the need for the splitter, which is helpful.

why?

@fnrizzi It's a somewhat complicated piece of code that isn't intuitive to use.

no i mean why it eliminates it? I fully agree that the splitter is obnoxious :)

@fnrizzi Ah, sorry. Now, instead of defining a "split" vector space using our current implementation w/ the splitter, you will define multiple vector spaces according to how you want to split. So the use changes from:

#snapshots = np.ndarray of size (2,nx,n_s)
my_splitter = BlockSplitter([[1],[2]],2)
split_vector_space = VectorSpace(snapshots,...,my_splitter)

to:

#snapshots = np.ndarray of size (2,nx,n_s)
v1 = VectorSpace(snapshots[0:1])
v2 = VectorSpace(snapshots[1:2])
composite_vector_space = CompositeVectorSpace([v1,v2])

I think the latter case is more intuitive, more elegant and clean, and is also computationally more efficient.

I really like the change from splitter to CompositeVectorSpace. Even without documentation and just a use case example, it makes more sense than splitter did.

I agree that this is a good change. I think we should make CompositeVectorSpace meet the VectorSpace API though (with some added functions to take advantage of the special structure if desired). That way we could still pass such a thing into all the workflow stuff without modification.

@jtencer Agree. The one fine-grained aspect I'd like to address is the ability to not store the sparse global basis, and instead store things in list (e.g., the "compact" representation in the branch I pushed). I'm not sure what the best way to handle this downstream is, but I think it's important to have this from a performance point of view.

I'd suggest storing it as compact but then having get_basis() either construct the sparse global basis (with an alternate function that returns the compact version) or having an optional use_compact argument to get_basis() that defaults to False. That way if you don't need it downstream, the global basis never gets constructed but if your downstream stuff isn't smart enough to use the compact representation, everything still works (just with a memory hit that you may or may not care about for your application)

Thinking about it more, I think having separate get_basis() and get_compact_basis() functions makes the most sense because then a downstream task could do something like

if hasattr(vector_space, 'get_compact_basis'):
    # do compact basis stuff
else:
    # do normal stuff

A final thought: should we leverage something like scipy.sparse.block_diag for the compact basis?

@jtencer Yeah, that was kinda my thought. This is what I have right now:

https://github.com/Pressio/rom-tools-and-workflows/blob/141-composite-vector-space/romtools/composite_vector_space/__init__.py (lines 133 on show the general API). I think this is not bad; the "composite_representation" is passed on initialization. I like the idea of storing in a sparse matrix for the global basis. I wonder if there is nice numpy representations for block matrices. Probably this: https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.block_diag.html

@jtencer Unfortunately block_diag won't work for our tensor representations.

I found this https://numpy.org/doc/stable/user/basics.dispatch.html#basics-dispatch but it might be more trouble than it's worth.

@jtencer this is really cool, I had never seen this. I vote to not implement this now, but keep it as future work.