Implementations

  • AL Strategies: ['EN', 'MS', 'LC', 'EN-CLU', 'MS-CLU', 'LC-CLU', 'RBE', 'DBE', 'MST-BE', 'MST-CLU-DS', 'MST-CLU-DDE', 'RDS', 'MST-CLU-RDS', 'MST-CLU-RDS2']
  • Classifiers: ['SVM', 'k-NN', 'RF', 'NB']
  • Datasets: ['LEA-53']
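The list above does not expand the strategy abbreviations. As a rough illustration only, the sketch below assumes that 'EN', 'MS' and 'LC' stand for the common uncertainty measures entropy, margin sampling and least confidence. The function names are hypothetical and are not part of dhaaActiveLearning; the package's own implementations may differ.

```python
import math


def entropy(probs):
    # Shannon entropy of one sample's class probabilities (higher = more uncertain).
    return -sum(p * math.log(p) for p in probs if p > 0)


def margin_uncertainty(probs):
    # One minus the gap between the two most likely classes (higher = more uncertain).
    top = sorted(probs, reverse=True)
    return 1.0 - (top[0] - top[1])


def least_confidence(probs):
    # One minus the probability of the most likely class (higher = more uncertain).
    return 1.0 - max(probs)


def query_index(prob_rows, score):
    # Index of the unlabeled sample that the given score rates as most uncertain.
    scores = [score(row) for row in prob_rows]
    return scores.index(max(scores))


# Hypothetical class probabilities predicted for three unlabeled samples.
pool = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.60, 0.30, 0.10],
]
for score in (entropy, margin_uncertainty, least_confidence):
    print(score.__name__, "selects sample", query_index(pool, score))
```

In each case the sample with the highest score would be sent to the annotator next.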

Installation

dhaaActiveLearning requires Python >= 3.5 and the following packages:

  • numpy
  • scipy
  • scikit-learn
  • tqdm
  • pandas
  • googledrivedownloader
  • modAL

You can install directly with pip:

pip3 install git+https://github.com/dhaalves/dhaaActiveLearning.git

Usage

First, create a folder named 'datasets'. For each dataset, this folder must contain at least two CSV files (features and labels) that follow this naming convention:

  • features: '<dataset_name>_features.csv' (required)
  • labels: '<dataset_name>_labels.csv' (required)
  • filenames: '<dataset_name>_filenames.csv' (optional)

An example dataset is available in the 'datasets' folder of this repository.
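The naming convention above can be checked before starting a run. The following is a minimal sketch that assumes the CSV files sit directly inside the 'datasets' folder. The helper `check_dataset_files` is hypothetical and is not part of dhaaActiveLearning.

```python
from pathlib import Path

REQUIRED_SUFFIXES = ("features", "labels")
OPTIONAL_SUFFIXES = ("filenames",)


def check_dataset_files(datasets_dir, dataset_name):
    """Return (missing_required, present_optional) for one dataset."""
    root = Path(datasets_dir)

    def exists(suffix):
        # Expected file name: '<dataset_name>_<suffix>.csv'
        return (root / "{}_{}.csv".format(dataset_name, suffix)).is_file()

    missing = [s for s in REQUIRED_SUFFIXES if not exists(s)]
    optional = [s for s in OPTIONAL_SUFFIXES if exists(s)]
    return missing, optional


if __name__ == "__main__":
    missing, optional = check_dataset_files("datasets", "LEA-53")
    if missing:
        print("Missing required files for:", missing)
    else:
        print("Dataset is complete; optional files found:", optional)
```

An empty first list means both required files are present.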

After that, you can run the following example (example.py):

import numpy as np

import dhaaActiveLearning
from dhaaActiveLearning import AL_Strategy, AL_Parameters
from dhaaActiveLearning.classification import Classifier
from dhaaActiveLearning.dataset import Dataset

if __name__ == '__main__':
    np.random.seed(1)  # for reproducibility of some AL strategies

    print('AL Strategies:', AL_Strategy.get_names())
    print('Classifiers:', Classifier.get_names())
    print('Datasets:', Dataset.get_names())

    al_params = AL_Parameters(dataset_name='LEA-53', classifier_name='RF', strategy_name='MS', max_iterations=20)
    results = dhaaActiveLearning.run(al_params=al_params, n_splits=1)
    results.save('LEA-53-results')