Oufattole/meds-torch

Noise Augmentation Stage


We can add stages that apply different kinds of noise augmentation, based on the TS-TCC model.

The TS-TCC paper defines a pretraining task for unlabeled time-series data: it applies augmentations to the data and trains a SimCLR-style contrastive objective on the resulting views. It uses two types of data augmentation, termed "strong" and "weak", to create different views of the data for the learning process.

Strong Augmentation: Applies more intense modifications to the data, which may include drastic changes like shuffling parts of the data sequence, adding significant noise, or other alterations that substantially change the data's original structure.

Weak Augmentation: Involves less intrusive changes such as slight jittering or scaling. These augmentations retain more of the data's original characteristics compared to strong augmentations.
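As a rough illustration of the two views, here is a minimal NumPy sketch for a dense array of shape `(channels, time)`. It pairs scaling with light jitter for the weak view and segment permutation with heavier jitter for the strong view. The function names and all default parameters are assumptions made for this example, not values taken from the paper or from meds-torch.

```python
import numpy as np


def jitter(x, sigma, rng):
    """Add zero-mean Gaussian noise to every value."""
    return x + rng.normal(0.0, sigma, size=x.shape)


def scale(x, sigma, rng):
    """Multiply each channel by one random factor drawn around 1."""
    factors = rng.normal(1.0, sigma, size=(x.shape[0], 1))
    return x * factors


def permute_segments(x, max_segments, rng):
    """Cut the time axis into random segments and shuffle their order."""
    n_steps = x.shape[1]
    n_segments = int(rng.integers(1, min(max_segments, n_steps) + 1))
    if n_segments < 2:
        return x.copy()
    # Pick distinct cut points strictly inside the sequence.
    cuts = np.sort(
        rng.choice(np.arange(1, n_steps), size=n_segments - 1, replace=False)
    )
    segments = np.split(x, cuts, axis=1)
    order = rng.permutation(len(segments))
    return np.concatenate([segments[i] for i in order], axis=1)


def weak_augment(x, rng, scale_sigma=0.1, jitter_sigma=0.05):
    """Weak view: small per-channel scaling plus light jitter (illustrative defaults)."""
    return jitter(scale(x, scale_sigma, rng), jitter_sigma, rng)


def strong_augment(x, rng, max_segments=5, jitter_sigma=0.3):
    """Strong view: segment permutation plus heavier jitter (illustrative defaults)."""
    return jitter(permute_segments(x, max_segments, rng), jitter_sigma, rng)
```

Both functions return arrays of the same shape as the input, so the two views can be fed to the same encoder.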

The pretraining task is essentially to predict future embeddings with an autoregressive model while aligning the representations derived from the weakly and strongly augmented data (I believe a SimCLR loss is used). It will require significant modifications to work on categorical data; however, we can probably add Gaussian noise to numeric values and locally shuffle the order of events or tokens.
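A minimal sketch of those two ideas for event-style data, assuming each event has a categorical code and an optional numeric value (with NaN marking a missing value). The helper names, the window size, and the noise scale are hypothetical choices for illustration only.

```python
import numpy as np


def noise_numeric_values(values, sigma, rng):
    """Add Gaussian noise to numeric values; NaN (missing) entries stay NaN."""
    values = np.asarray(values, dtype=float)
    return values + rng.normal(0.0, sigma, size=values.shape)


def local_shuffle_order(n_events, window, rng):
    """Return a permutation of range(n_events) that only reorders events
    inside consecutive, non-overlapping windows of length `window`."""
    order = np.arange(n_events)
    for start in range(0, n_events, window):
        block = order[start:start + window]
        order[start:start + window] = rng.permutation(block)
    return order


# Example: a short sequence of (code, numeric_value) events.
rng = np.random.default_rng(0)
codes = np.array([11, 42, 42, 7, 19, 3, 8])
values = np.array([1.0, np.nan, 3.5, 0.2, np.nan, 9.0, 4.4])

order = local_shuffle_order(len(codes), window=3, rng=rng)
aug_codes = codes[order]                      # codes are reordered, never altered
aug_values = noise_numeric_values(values, 0.1, rng)[order]
```

Shuffling within fixed windows keeps every event within `window - 1` positions of where it started, so the coarse temporal order is preserved while the codes themselves are left untouched.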

We should update the API to allow arbitrary intermediate stages, such as this noise augmentation or data masking.
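One possible shape for such an API, sketched in plain Python: each stage is a callable that takes a batch and a random generator and returns a new batch, and a runner applies them in order. Everything here (`run_stages`, `make_noise_stage`, `make_mask_stage`, the `"code"` and `"numeric_value"` keys) is hypothetical and is not the existing meds-torch interface.

```python
from typing import Callable, Dict, Sequence

import numpy as np

Batch = Dict[str, np.ndarray]
Stage = Callable[[Batch, np.random.Generator], Batch]


def run_stages(batch: Batch, stages: Sequence[Stage], rng: np.random.Generator) -> Batch:
    """Apply each intermediate stage in order and return the final batch."""
    for stage in stages:
        batch = stage(batch, rng)
    return batch


def make_noise_stage(sigma: float) -> Stage:
    """Build a stage that adds Gaussian noise to the numeric values."""
    def stage(batch: Batch, rng: np.random.Generator) -> Batch:
        out = dict(batch)  # shallow copy so the input batch is not mutated
        values = batch["numeric_value"]
        out["numeric_value"] = values + rng.normal(0.0, sigma, size=values.shape)
        return out
    return stage


def make_mask_stage(mask_prob: float, mask_code: int = 0) -> Stage:
    """Build a stage that replaces a random subset of codes with `mask_code`."""
    def stage(batch: Batch, rng: np.random.Generator) -> Batch:
        out = dict(batch)
        mask = rng.random(batch["code"].shape) < mask_prob
        out["code"] = np.where(mask, mask_code, batch["code"])
        out["mask"] = mask  # keep the mask so a loss can target masked positions
        return out
    return stage
```

With this shape, noise augmentation and masking are just two entries in a list, for example `run_stages(batch, [make_noise_stage(0.1), make_mask_stage(0.15)], rng)`, and new stages can be added without changing the runner.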