This test.py implements a simple multi-layer perceptron (MLP), also called a fully-connected network (FC). This is the most basic kind of network. At each layer, the computation is:
$$
\begin{aligned}
&\textbf{Input: }\ \mathbf{x}\in[\text{batch},\text{input dimension}]; \qquad \textbf{Weight: }\ \mathbf{w} \in [\text{input dimension},\text{output dimension}] \\
&\mathbf{y} = \operatorname{activation}(\mathbf{x}\cdot\mathbf{w})
\end{aligned}
$$
So the init API of this class should take the shape of each layer and the activation of each layer.
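A minimal sketch of what such an MLP class might look like in PyTorch (the class name and argument names are assumptions, not the actual contents of test.py):

```python
import torch.nn as nn

class MLP(nn.Module):
    """Fully-connected network: one Linear + activation per layer."""
    def __init__(self, shapeList, activationList):
        # shapeList: layer widths, e.g. [784, 50, 1]
        # activationList: one activation module per layer, e.g. [nn.ReLU(), nn.Sigmoid()]
        super(MLP, self).__init__()
        layers = []
        for i in range(len(shapeList) - 1):
            layers.append(nn.Linear(shapeList[i], shapeList[i + 1]))
            layers.append(activationList[i])
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        # x: [batch, shapeList[0]] -> [batch, shapeList[-1]]
        return self.net(x)
```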
**Import** code from test.py, and implement two dimension-transforming MLPs, $28\times28\rightarrow50\rightarrow1$ and $1\rightarrow 10\rightarrow 28\times28$; note that this kind of transformation is what a GAN is doing.
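For example (assuming test.py exposes the MLP class sketched above; the variable names are hypothetical):

```python
import torch
from test import MLP   # assuming test.py exposes the MLP class above

# 28*28 -> 50 -> 1: compresses an image down to one number (discriminator-like direction)
d = MLP([28 * 28, 50, 1], [torch.nn.ReLU(), torch.nn.Sigmoid()])
# 1 -> 10 -> 28*28: expands one number up to image size (generator-like direction)
g = MLP([1, 10, 28 * 28], [torch.nn.ReLU(), torch.nn.Sigmoid()])

x = torch.randn(16, 28 * 28)
z = torch.randn(16, 1)
print(d(x).shape, g(z).shape)   # torch.Size([16, 1]) torch.Size([16, 784])
```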
Here a bijective function is a function that maps a set one-to-one onto another set, so it has an inverse.
Implement this bijective function:
$$
\begin{aligned}
\mathbf{F} &= \mathbf{f}_0 \circ \mathbf{f}_1 \\
\text{where }\quad \mathbf{f}_0(\mathbf{x},\mathbf{y}) &= (\exp(\mathbf{sv}(\mathbf{y}))\,\mathbf{x}+\mathbf{v}(\mathbf{y}),\ \mathbf{y}) \\
\mathbf{f}_1(\mathbf{x},\mathbf{y}) &= (\mathbf{x},\ \exp(\mathbf{su}(\mathbf{x}))\,\mathbf{y}+\mathbf{u}(\mathbf{x})) \\
\mathbf{v},\mathbf{u},\mathbf{sv},\mathbf{su} &: \mathbb{R}^{\frac{n}{2}} \rightarrow \mathbb{R}^{\frac{n}{2}}; \qquad \mathbf{x},\mathbf{y}\in \mathbb{R}^{\frac{n}{2}}
\end{aligned}
$$
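A minimal sketch of the forward and inverse passes of one such block, assuming v, u, sv, su are callables (e.g. MLPs) mapping [batch, n/2] tensors to [batch, n/2] tensors; the function names are hypothetical:

```python
import torch

def coupling_forward(x, y, v, u, sv, su):
    """F = f0 ∘ f1: apply f1 first, then f0."""
    y = torch.exp(su(x)) * y + u(x)   # f1: keep x, update y
    x = torch.exp(sv(y)) * x + v(y)   # f0: keep y, update x
    return x, y

def coupling_inverse(x, y, v, u, sv, su):
    """Inverse of F: undo f0, then f1; this is what makes the map bijective."""
    x = (x - v(y)) * torch.exp(-sv(y))   # invert f0
    y = (y - u(x)) * torch.exp(-su(x))   # invert f1
    return x, y
```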
Write a class that has two methods: sample and logProbability.
This class takes a shape list as input to its init method.
sample takes one parameter, batchSize, and returns variables drawn from a Gaussian distribution with shape [batchSize, shape].
logProbability takes one parameter, a PyTorch variable of shape [batchSize, shape], and returns the log probability of each sample, so the result has shape [batchSize].
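A minimal sketch of such a prior class (the class name and the choice of a standard-normal distribution are assumptions):

```python
import numpy as np
import torch

class Gaussian(object):
    """Standard-normal prior over variables of a given shape."""
    def __init__(self, shapeList):
        self.shapeList = shapeList   # e.g. [784] for flattened MNIST images

    def sample(self, batchSize):
        # returns samples of shape [batchSize, *shapeList]
        return torch.randn([batchSize] + self.shapeList)

    def logProbability(self, z):
        # z: [batchSize, *shapeList] -> per-sample log N(z; 0, I), shape [batchSize]
        flat = z.reshape(z.shape[0], -1)
        return (-0.5 * flat ** 2 - 0.5 * np.log(2 * np.pi)).sum(dim=1)
```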
Deduce the Jacobian of a NICE transformation here (NICE is the additive special case with $\mathbf{sv}=\mathbf{su}=0$):
$$
J_0=\begin{pmatrix}I & \frac{\partial \mathbf{v}}{\partial \mathbf{y}}\\ 0 & I\end{pmatrix}, \qquad
J_1=\begin{pmatrix}I & 0\\ \frac{\partial \mathbf{u}}{\partial \mathbf{x}} & I\end{pmatrix}
$$
Deduce the Jacobian of a RealNVP transformation here:
$$
J_0=\begin{pmatrix}e^{\mathbf{sv}(\mathbf{y})} & \mathbf{x}\,e^{\mathbf{sv}(\mathbf{y})}\,\frac{\partial \mathbf{sv}}{\partial \mathbf{y}}+\frac{\partial \mathbf{v}}{\partial \mathbf{y}}\\ 0 & I\end{pmatrix}, \qquad
J_1=\begin{pmatrix}I & 0\\ \mathbf{y}\,e^{\mathbf{su}(\mathbf{x})}\,\frac{\partial \mathbf{su}}{\partial \mathbf{x}}+\frac{\partial \mathbf{u}}{\partial \mathbf{x}} & e^{\mathbf{su}(\mathbf{x})}\end{pmatrix}
$$
Both are triangular, so $\log|\det J|$ is just the sum of the log-diagonal: $0$ for NICE, and $\sum \mathbf{sv}(\mathbf{y}) + \sum \mathbf{su}(\mathbf{x})$ for RealNVP.
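A sketch of how the triangular Jacobian turns into a per-sample log-determinant in code, building on the coupling functions above (function names are hypothetical):

```python
import torch

def realnvp_forward_logdet(x, y, v, u, sv, su):
    """Forward pass of F = f0 ∘ f1 together with log|det J_F| per sample.
    Both Jacobians are triangular, so only the diagonal log-scales
    su(x) and sv(y) contribute; for NICE the log-determinant is 0."""
    s1 = su(x)                       # f1: log-scale for the y-half
    y = torch.exp(s1) * y + u(x)     # f1 updates y, keeps x
    s0 = sv(y)                       # f0: log-scale for the x-half (at the updated y)
    x = torch.exp(s0) * x + v(y)     # f0 updates x, keeps y
    logdet = s1.sum(dim=1) + s0.sum(dim=1)   # log|det J_1| + log|det J_0|
    return x, y, logdet
```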
Now, when we init the NICE or RealNVP class, we take another parameter as the prior. This transformation is a transformation of probability distributions: it transforms the prior distribution into a distribution we want. So the sample method will draw samples from the transformed distribution, and the logProbability method will give the log probabilities of a batch of given samples.
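A minimal sketch of a RealNVP class wired to a prior, assuming the Gaussian class above and a single coupling block; passing the coupling networks v, u, sv, su explicitly is an assumption about the interface:

```python
import torch
import torch.nn as nn

class RealNVP(nn.Module):
    """One-block RealNVP flow over a prior distribution."""
    def __init__(self, shapeList, prior, v, u, sv, su):
        super(RealNVP, self).__init__()
        self.shapeList = shapeList
        self.prior = prior
        self.v, self.u, self.sv, self.su = v, u, sv, su

    def sample(self, batchSize):
        # draw z from the prior, then push it forward through F = f0 ∘ f1
        z = self.prior.sample(batchSize)
        x, y = z.chunk(2, dim=1)
        y = torch.exp(self.su(x)) * y + self.u(x)    # f1
        x = torch.exp(self.sv(y)) * x + self.v(y)    # f0
        return torch.cat([x, y], dim=1)

    def logProbability(self, samples):
        # change of variables: log p(s) = log p_prior(F^{-1}(s)) + log|det J_{F^{-1}}(s)|
        x, y = samples.chunk(2, dim=1)
        s0 = self.sv(y)
        x = (x - self.v(y)) * torch.exp(-s0)         # invert f0
        s1 = self.su(x)
        y = (y - self.u(x)) * torch.exp(-s1)         # invert f1
        z = torch.cat([x, y], dim=1)
        logdet = -(s0.sum(dim=1) + s1.sum(dim=1))    # log|det J_{F^{-1}}|
        return self.prior.logProbability(z) + logdet
```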
Write a script to download and unzip MNIST data.
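A minimal sketch of such a script; the mirror URL is an assumption and may need to be replaced:

```python
# download_mnist.py -- download and unzip the four MNIST idx files.
import gzip
import os
import urllib.request

# Mirror of yann.lecun.com/exdb/mnist (an assumption; swap in another mirror if it moves).
BASE = "https://ossci-datasets.s3.amazonaws.com/mnist/"
FILES = [
    "train-images-idx3-ubyte.gz",
    "train-labels-idx1-ubyte.gz",
    "t10k-images-idx3-ubyte.gz",
    "t10k-labels-idx1-ubyte.gz",
]

os.makedirs("data", exist_ok=True)
for name in FILES:
    gz_path = os.path.join("data", name)
    if not os.path.exists(gz_path):
        print("downloading", name)
        urllib.request.urlretrieve(BASE + name, gz_path)
    with gzip.open(gz_path, "rb") as fin, open(gz_path[:-3], "wb") as fout:
        fout.write(fin.read())   # strip ".gz" and write the raw idx file
    print("unzipped", gz_path[:-3])
```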
Add a main.py
that trains RealNVP on MNIST. The process is: randomly draw a batch of MNIST data, let RealNVP give the log probability of every sample in this batch, take the mean over the batch, use the negative of this mean as the loss, and do gradient descent.
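A minimal training sketch under those assumptions; the module layout in the imports and the network sizes are hypothetical:

```python
# main.py -- train a RealNVP flow on MNIST by maximizing mean log-likelihood.
import torch
import torch.nn as nn
from torchvision import datasets, transforms

from test import MLP                  # the MLP class from test.py
from flow import Gaussian, RealNVP    # hypothetical module holding the flow classes

n = 28 * 28
half = n // 2
prior = Gaussian([n])

def make_net():
    # one small MLP per coupling function v, u, sv, su
    return MLP([half, 256, half], [nn.ReLU(), nn.Tanh()])

model = RealNVP([n], prior, make_net(), make_net(), make_net(), make_net())

loader = torch.utils.data.DataLoader(
    datasets.MNIST("data", train=True, download=True,
                   transform=transforms.ToTensor()),
    batch_size=64, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
for epoch in range(5):
    for images, _ in loader:
        batch = images.view(images.shape[0], -1)   # [batch, 784]
        logp = model.logProbability(batch)         # [batch]
        loss = -logp.mean()                        # negative mean log-likelihood
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print("epoch", epoch, "loss", loss.item())
```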