[feat] data augmentation
Closed this issue · 2 comments
mathysgrapotte commented
We should be able to augment data (as opposed to noising it) as a way to construct better datasets: augmentation should improve training, whereas noise decreases training power.
Data augmentation should only happen on the train set.
Data augmentation should happen after noising and after the data has been joined (in the event of column splitting).
- add an add_augmentation method to the csv class in python that adds data (duplicates).
- add augmentation methods in augmentation/augmentation_generators.py
- check if the experiment class can handle data augmentation
- write tests for both experiment and augmentation classes
- allow json schemer to handle user input augmentation method
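
A rough sketch of what the tasks above could look like, assuming rows are plain dicts with a `split` column. The function names, the `split`/`train` values, and the `reverse_sequence` generator are hypothetical and only for illustration; they are not the existing csv class API.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def reverse_sequence(value: str) -> str:
    """Hypothetical augmentation generator: return the value reversed."""
    return value[::-1]


def add_augmentation(
    rows: List[Row],
    column: str,
    generator: Callable[[str], str],
    split_column: str = "split",  # assumed name of the split column
    train_value: str = "train",   # assumed label of the train split
) -> List[Row]:
    """Return the original rows followed by augmented copies of the train rows.

    Intended to be called after noising and after split columns are joined.
    """
    augmented = []
    for row in rows:
        # Only the train set is augmented; other splits stay untouched.
        if row[split_column] != train_value:
            continue
        new_row = dict(row)  # duplicate the row, then transform one column
        new_row[column] = generator(row[column])
        augmented.append(new_row)
    return rows + augmented


rows = [
    {"seq": "ACGT", "label": "1", "split": "train"},
    {"seq": "TTGA", "label": "0", "split": "validation"},
    {"seq": "GGCA", "label": "1", "split": "test"},
]
result = add_augmentation(rows, "seq", reverse_sequence)
```

Here `result` keeps the three original rows and appends one augmented duplicate of the train row, so the validation and test rows are never augmented.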
alessiovignoli commented
The Nextflow part is completed in #118.