nf-core/deepmodeloptim

[feat] data augmentation


We should be able to augment data (as opposed to adding noise to it) as a way to construct better datasets: augmentation should improve training, whereas noise decreases training power.

Data augmentation should only happen on the train set.
Data augmentation should happen after noise has been applied and after the data has been joined (in the event of column splitting).
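
A minimal sketch of the ordering described above, not the project's actual code. The function `process_split`, the `Row` alias, and the `noise` and `augment` callables are hypothetical names introduced for illustration. It assumes the rows have already been joined back together after any column splitting, and that noise is applied to every split (the issue does not say which splits receive noise).

```python
from typing import Callable, Dict, List

# Hypothetical row representation: one dict per CSV row.
Row = Dict[str, str]


def process_split(
    rows: List[Row],
    split: str,
    noise: Callable[[Row], Row],
    augment: Callable[[List[Row]], List[Row]],
) -> List[Row]:
    """Apply noise first, then augmentation, and augment only the train split."""
    # Noise runs on a copy of each row so the caller's data is not mutated.
    noised = [noise(dict(row)) for row in rows]
    if split != "train":
        # Validation and test splits are never augmented.
        return noised
    # Augmentation sees the already-noised rows and appends new ones.
    return noised + augment(noised)
```

Keeping augmentation as the last step means the extra rows are derived from the final form of the training data and never leak into the validation or test splits.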

  • add an add_augmentation method to the csv class in Python that adds data (duplicates).
  • add augmentation methods in augmentation/augmentation_generators.py
  • check if the experiment class can handle data augmentation
  • write tests for both experiment and augmentation classes
  • allow the JSON schema to handle a user-supplied augmentation method
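
One possible shape for the first two tasks, as a sketch only. `AugmentationGenerator`, `ReverseSequence`, `CsvData`, and the `augment` method are hypothetical names, not the classes in the repository; only the method name `add_augmentation` comes from the list above. The sketch assumes the csv class stores its data as a mapping from column name to a list of values.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class AugmentationGenerator(ABC):
    """Hypothetical base class for augmentation methods."""

    @abstractmethod
    def augment(self, value: str) -> str:
        """Return an augmented version of a single value."""


class ReverseSequence(AugmentationGenerator):
    """Toy augmentation: return the input string reversed."""

    def augment(self, value: str) -> str:
        return value[::-1]


class CsvData:
    """Hypothetical stand-in for the csv class; columns are lists of values."""

    def __init__(self, columns: Dict[str, List[str]]) -> None:
        # Copy the lists so the caller's data is left untouched.
        self.columns = {name: list(values) for name, values in columns.items()}

    def add_augmentation(self, column: str, generator: AugmentationGenerator) -> None:
        """Append one augmented copy of every row.

        The target column receives augmented values; all other columns
        are duplicated unchanged so rows stay aligned.
        """
        n_rows = len(self.columns[column])
        for name, values in self.columns.items():
            originals = values[:n_rows]
            if name == column:
                values.extend(generator.augment(v) for v in originals)
            else:
                values.extend(originals)
```

For example, with columns `seq = ["ACG", "TT"]` and `label = ["1", "0"]`, calling `add_augmentation("seq", ReverseSequence())` doubles the row count: the new `seq` values are the reversed strings and the labels are copied over.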

The Python part is completed in #96.

The Nextflow part is completed in #118.