A scalable two-level co-clustering algorithm
Marius Bartcus, Marc Boullé, Fabrice Clérot
The source code consists of the following Python files:
- script_test_2LKHC.py
- datasets.py
- TwoLevelCoclustering.py
- constants.py
- KHCUtils.py
- KhiopsCoclustering.py
- SubDataSet.py
- KHChoosePartitionSize.py
- CoclusteringModel.py
- khiopsStats.py
- utils.py
- DataUtils.py
To obtain detailed information on any of these source files, import the module (without the .py extension) and call help() on it:
    import file_name as f
    help(f)
This lists the classes and functions defined in the selected file, together with the documentation comments written for each of them.
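The help() lookup described above can be repeated for several files at once. The following is only a minimal sketch, assuming it is run from the directory that contains the source files; the helper names module_help_text and collect_help are hypothetical and are not part of the distributed code. script_test_2LKHC is deliberately left out of the list, on the assumption that importing the main script could start a run.

```python
import importlib
import pydoc

# Module names from the file list above, without the .py extension.
# script_test_2LKHC is omitted: importing the main script might execute it
# (an assumption, not something the source code was checked for).
MODULE_NAMES = [
    "datasets", "TwoLevelCoclustering", "constants", "KHCUtils",
    "KhiopsCoclustering", "SubDataSet", "KHChoosePartitionSize",
    "CoclusteringModel", "khiopsStats", "utils", "DataUtils",
]


def module_help_text(module_name):
    """Return the text that help() would show for a module.

    Returns None when the module cannot be imported, for example when the
    script is not run from the source directory.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # pydoc.plaintext renders without terminal bold/underline markup.
    return pydoc.render_doc(module, renderer=pydoc.plaintext)


def collect_help(module_names):
    """Map each module name to its help text (or None if unavailable)."""
    return {name: module_help_text(name) for name in module_names}
```

Calling collect_help(MODULE_NAMES) returns a dictionary whose values can be printed or written to a text file. Note that generic names such as utils or datasets may resolve to an unrelated installed package if the source directory is not first on the import path.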
Each file is described briefly below:
- script_test_2LKHC.py: main script that runs the optimized two-level coclustering
- datasets.py: names and settings of the data sets
- TwoLevelCoclustering.py: main implementation of the two-level coclustering algorithm
- constants.py: settings used to run the algorithm
- KHCUtils.py: Khiops routines for running KHCp, KHCc and KHCe; also creates the Khiops scripts
- KhiopsCoclustering.py: routine that runs the Khiops coclustering
- SubDataSet.py: class that stores the information of a sub-data set
- KHChoosePartitionSize.py: routine that computes the number of parts for the first variable and for the second variable
- CoclusteringModel.py: parser for the coclustering model
- khiopsStats.py: MODL routines
- utils.py: general Python routines: reading and writing files, computing MODL, dictionary and list manipulation, etc.
- DataUtils.py: generation of artificial data sets