Code and results for generating regularized energy functions learned from observed sequences of data.

RegularizedEnergyFunctions

This project explores an effective way to create regularized energy functions, which are a type of self-supervised learned manifold for observed representations. An energy function gives a compatibility score between two representations, where the two representations together form an N-dimensional point. The energy at that point is zero if the representations are compatible, e.g. they have been observed sequentially together, and greater than zero if they are not compatible or have not been observed. Ideally the energy function should be smooth so that its space is easy to navigate through gradient search. The "regularized" part means that the function only requires positive examples, i.e. observed sequences of representations, to learn the energy mapping; all other non-observed representation pairs automatically get non-zero energy. The regularization should also constrain the curvature so that the function is smooth rather than jagged.
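To make this concrete, here is a minimal sketch (not taken from this repo) of an idealized energy function for the 2-D case: given a dense sample of the observed manifold, the energy of a query point is its distance to the nearest observed point, so it is zero on the manifold and grows away from it. The function name and the brute-force nearest-point search are illustrative assumptions.

```python
import numpy as np

def ideal_energy(query, manifold_points):
    """Illustrative ground-truth energy: distance from a query (x, y)
    point to the nearest point sampled from the observed manifold.
    Zero on the manifold, positive and growing away from it."""
    diffs = manifold_points - np.asarray(query)
    return np.min(np.linalg.norm(diffs, axis=1))
```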

An example of a perfect energy function for pairs of (x, y) values from X(t) = -cos(-2t) + cos(t), Y(t) = -sin(-2t) + sin(t). An interactive graph is also available here: https://www.desmos.com/calculator/gddgywduza
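For reference, the observed training points for this curve can be sampled directly from the parametric definition above; a sketch along these lines (the array names and sampling resolution are assumptions):

```python
import numpy as np

# Sample the parametric curve X(t), Y(t) over one full period to get
# the observed (x, y) pairs the energy function is trained on.
t = np.linspace(0.0, 2.0 * np.pi, 1000)
x = -np.cos(-2.0 * t) + np.cos(t)
y = -np.sin(-2.0 * t) + np.sin(t)
curve_points = np.stack([x, y], axis=1)  # shape (1000, 2)
```

With `curve_points` in hand, the idealized energy sketch above evaluates to zero for points on the curve and to a positive value everywhere else.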

To create the regularized energy function, this project uses a variational autoencoder (VAE). The VAE takes in observed points from the function and trains its encoder/decoder pair to recreate the same points, hence the autoencoder part. What prevents the model from collapsing, or from assigning zero energy to unseen points off the function, is the variational part: a KL regularization. The regularization pushes the encoder to produce only latent values that correspond to the observed data manifold, meaning anything the encoder emits decodes back to a point on the function. As a result, any point outside the function is projected onto the (curve) representation space, and the distance between the point and its projection is the energy. If the point already lies on the function, so does its projection, and the distance/energy is zero.
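Below is a minimal PyTorch sketch of this setup. The repo's actual architecture, latent size, and loss weighting are not specified here, so the layer widths, the 1-D latent, and the `beta` weight on the KL term are assumptions; the energy follows the description above, projecting a point through the posterior mean and the decoder and measuring the distance to the reconstruction.

```python
import torch
import torch.nn as nn

class ManifoldVAE(nn.Module):
    """Sketch of a VAE used as an energy function over (x, y) pairs.
    Layer widths and the 1-D latent are illustrative assumptions."""
    def __init__(self, latent_dim=1):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(2, 64), nn.ReLU(),
            nn.Linear(64, 2 * latent_dim))  # outputs (mu, log_var)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(),
            nn.Linear(64, 2))

    def forward(self, x):
        mu, log_var = self.encoder(x).chunk(2, dim=-1)
        z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)
        return self.decoder(z), mu, log_var

def vae_loss(model, x, beta=1e-3):
    """Reconstruction term plus KL regularization; beta is an assumed weight."""
    recon, mu, log_var = model(x)
    recon_loss = ((recon - x) ** 2).sum(dim=-1).mean()
    kl = -0.5 * (1 + log_var - mu ** 2 - log_var.exp()).sum(dim=-1).mean()
    return recon_loss + beta * kl

def energy(model, x):
    """Energy of a point: distance between the point and its projection
    through the encoder (posterior mean) and decoder."""
    with torch.no_grad():
        mu, _ = model.encoder(x).chunk(2, dim=-1)
        projection = model.decoder(mu)
    return torch.linalg.norm(projection - x, dim=-1)
```

Training would then just minimize `vae_loss` over batches of the sampled curve points with a standard optimizer such as Adam.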

Below is an animation showing the learning process, using the VAE as the energy function and targeting a slightly rotated version of the function above. The energy function is not perfect: it has some phantom limbs (zero-energy values not on the function) around the boundary, as well as some small gaps in the boundary. The function is remarkably smooth, though, and achieves by far the best-looking results of all the methods I have seen or tried, including some state-of-the-art approaches such as BYOL, which did not work well for this task at all despite much effort.
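A landscape like the one in the animation can be rendered by evaluating the learned energy on a dense grid; a sketch along these lines, where the grid bounds, the resolution, and the reuse of a trained `ManifoldVAE` from the sketch above are all assumptions:

```python
import numpy as np
import torch
import matplotlib.pyplot as plt

# Evaluate the learned energy on a dense 2-D grid and render it as a
# heat map; 'model' is a trained ManifoldVAE from the sketch above.
xs = np.linspace(-3.0, 3.0, 200)
gx, gy = np.meshgrid(xs, xs)
grid = torch.tensor(np.stack([gx.ravel(), gy.ravel()], axis=1),
                    dtype=torch.float32)
landscape = energy(model, grid).reshape(200, 200).numpy()

plt.imshow(landscape, extent=(-3, 3, -3, 3), origin="lower")
plt.colorbar(label="energy")
plt.show()
```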