aldro61/pyscm

Filtering of meta-attributes based on their cardinality

aldro61 opened this issue · 0 comments

This is currently implemented in MetaSetCoveringMachine._get_binary_attribute_utilities, but the attributes are filtered only after their utility has been computed. This cannot be optimized when the data resides in an HDF5 file, but it can when the data is a NumPy array in RAM. In that case, we could filter the columns of the attribute classification matrix in the MetaSetCoveringMachine.fit function. This would avoid computing the utility function for meta-attributes that will never be selected.
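
A rough sketch of the idea for the in-RAM case, not the actual pyscm code. The helper names (`filter_by_cardinality`, `toy_utilities`), the `cardinalities` vector, and the `min_cardinality` / `max_cardinality` thresholds are all hypothetical, and the utility shown is only a stand-in for whatever the real utility function computes. The point is just that the column mask is applied before any utility is evaluated, and that the kept indices are returned so results can be mapped back to the original attributes.

```python
import numpy as np


def filter_by_cardinality(attribute_classifications, cardinalities,
                          min_cardinality=1, max_cardinality=np.inf):
    """Keep only the columns whose meta-attribute cardinality is in range.

    Hypothetical helper: assumes one column per meta-attribute and a
    `cardinalities` vector aligned with those columns.
    """
    cardinalities = np.asarray(cardinalities)
    keep = (cardinalities >= min_cardinality) & (cardinalities <= max_cardinality)
    kept_idx = np.flatnonzero(keep)
    # Fancy indexing copies the selected columns, which is fine for an
    # in-memory NumPy array but is what we want to avoid with HDF5.
    return attribute_classifications[:, kept_idx], kept_idx


def toy_utilities(attribute_classifications, y, p=1.0):
    """Placeholder utility (an assumption, not pyscm's implementation):
    negatives classified as 0 minus p times positives classified as 0."""
    y = np.asarray(y)
    negatives = attribute_classifications[y == 0]
    positives = attribute_classifications[y == 1]
    return (negatives == 0).sum(axis=0) - p * (positives == 0).sum(axis=0)


# Toy data: 3 examples, 4 meta-attributes.
X = np.array([[1, 0, 1, 1],
              [0, 0, 1, 0],
              [1, 1, 0, 1]])
y = np.array([1, 0, 1])
cardinalities = np.array([1, 5, 2, 9])

# Filter first, then compute utilities only for the surviving columns.
X_kept, kept_idx = filter_by_cardinality(X, cardinalities, max_cardinality=3)
utilities = toy_utilities(X_kept, y)

# Map the best filtered column back to its original attribute index.
best_original_idx = kept_idx[np.argmax(utilities)]
```

With the toy data, only two of the four columns reach the utility computation, so the saving scales with the fraction of meta-attributes excluded by the cardinality bounds.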