jku-vds-lab/paradime

Change how relations are specified in the ParametricDR class

Closed this issue · 2 comments

Distinguish between

  • "global" relations that are applied to the whole dataset (e.g., the perplexity-based affinities in t-SNE); and
  • "local" relations that are calculated on-the-fly/batch-wise.

The former should be defined once at the beginning, but must remain accessible to the losses later. The latter should be specified for each training phase, using the new TrainingPhase class or the equivalent add_training_phase method. To allow for multiple global relations, use the same key-value syntax that is already used for accessing batch data.

Users should be able to pass a single global and/or local relation, which will be assigned a default key. The pre-defined loss functions use this default key as the default value of the corresponding keyword argument, so it can be overridden easily.
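
A minimal sketch of the default-key idea in plain Python. The names DEFAULT_KEY, as_relation_dict, and relation_loss are hypothetical and not part of paradime's API; relations are represented as plain callables and relation data as nested lists, purely for illustration.

```python
from typing import Callable, Dict, List, Union

Matrix = List[List[float]]
Relation = Callable[[Matrix], Matrix]

# Hypothetical default key (an assumption, not paradime's actual name).
DEFAULT_KEY = "rel"


def as_relation_dict(
    rels: Union[Relation, Dict[str, Relation]]
) -> Dict[str, Relation]:
    """Wrap a single relation under DEFAULT_KEY; copy dicts unchanged."""
    if callable(rels):
        return {DEFAULT_KEY: rels}
    return dict(rels)


def relation_loss(
    global_rels: Dict[str, Matrix],
    batch_rels: Dict[str, Matrix],
    global_key: str = DEFAULT_KEY,  # default key, easy to override
    batch_key: str = DEFAULT_KEY,
) -> float:
    """Mean squared difference between the two selected relation matrices."""
    target, actual = global_rels[global_key], batch_rels[batch_key]
    diffs = [
        (t - a) ** 2
        for t_row, a_row in zip(target, actual)
        for t, a in zip(t_row, a_row)
    ]
    return sum(diffs) / len(diffs)
```

With a single relation, users never have to mention a key; with several, they would pass a dict and set global_key or batch_key on the loss.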

If users want to use the same batch-wise relations across multiple phases, they could do so through the set_training_defaults method, which would work but isn't elegant.

An alternative is to specify all relations "globally" (i.e., once, up front), without that implying that they are applied to the whole dataset, so that they could also be reused later in the training phases. I don't really like this idea because it would require a method that "prepares" a subset of the global relations, and a way to specify which relations are to be prepared.

Another possibility is to require that the user sets all relations at the beginning, under "global" and "batch-wise". Then only the loss functions would have to specify which ones to use at the phase level. This is probably the most promising approach, as it makes it clearest which relations are calculated straight away and which are deferred, while still keeping them reusable across phases.
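
A rough sketch of how this last option could look, assuming a toy stand-in for ParametricDR. ToyDR, KeyedLoss, sq_dists, and the embed argument are all hypothetical names introduced for illustration; none of this is paradime's actual API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Matrix = List[List[float]]
Relation = Callable[[Matrix], Matrix]


def sq_dists(points: Matrix) -> Matrix:
    """Pairwise squared Euclidean distances (stand-in for a real relation)."""
    return [
        [sum((a - b) ** 2 for a, b in zip(p, q)) for q in points]
        for p in points
    ]


@dataclass
class KeyedLoss:
    """Hypothetical loss that only names the relations it wants to compare."""
    global_key: str
    batch_key: str

    def __call__(self, target: Matrix, actual: Matrix) -> float:
        diffs = [
            (t - a) ** 2
            for t_row, a_row in zip(target, actual)
            for t, a in zip(t_row, a_row)
        ]
        return sum(diffs) / len(diffs)


class ToyDR:
    """Toy stand-in for ParametricDR (an assumption, not the real class)."""

    def __init__(
        self,
        data: Matrix,
        global_relations: Dict[str, Relation],
        batch_relations: Dict[str, Relation],
        embed: Callable[[List[float]], List[float]] = lambda p: p,
    ) -> None:
        self.data = data
        self.embed = embed
        # Global relations: computed once, straight away, on the whole dataset.
        self.global_rels = {k: r(data) for k, r in global_relations.items()}
        # Batch-wise relations: only stored here, evaluated per batch later.
        self.batch_relations = dict(batch_relations)
        self.phases: List[Tuple[str, KeyedLoss]] = []

    def add_training_phase(self, name: str, loss: KeyedLoss) -> None:
        # A phase only carries its loss; the loss picks relations by key.
        self.phases.append((name, loss))

    def batch_loss(self, phase: int, indices: Sequence[int]) -> float:
        _, loss = self.phases[phase]
        full = self.global_rels[loss.global_key]
        # Slice the precomputed global relation down to the current batch.
        target = [[full[i][j] for j in indices] for i in indices]
        # Deferred: the batch-wise relation is computed on the embedded batch.
        embedded = [self.embed(self.data[i]) for i in indices]
        actual = self.batch_relations[loss.batch_key](embedded)
        return loss(target, actual)
```

In this sketch the same batch-wise relation can be reused in any number of phases simply by adding phases whose losses refer to the same key, for example `dr.add_training_phase("main", KeyedLoss("hd", "ld"))`.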