edouardelasalles/stnn

How to understand Theta0 and Theta1?

Einsturing opened this issue · 2 comments

image
For example, in this formula, the parameters which need to be optimized are d, Z, Theta0, Theta1.
In the code, I can find the params list, in which model.factors_parameters() is Z and model.decoder.parameters() is d. That means model.dynamic.parameters() must be Theta0 and Theta1, but I can't understand how the Thetas take part in the training.
The function
image
shows how the content of h() is calculated. In it I can find Z_t and W, and I guess z_context corresponds to Theta, but I don't understand why it is calculated this way.
If you see this issue, could you give me some guidance? I would be very grateful!

Hi, thank you for your interest in our work!

This code is written to efficiently perform several matrix multiplications at once.

First, note that Z_t Theta_0 + W Z_t Theta_1 can be rewritten as W_Id Z_t Theta_0 + W Z_t Theta_1, where W_Id is the identity matrix. In the variable rels we put both W and W_Id. z_context contains the concatenated results of W_Id Z_t and W Z_t. We then pass this variable to self.dynamic, a torch module that contains Theta_0 and Theta_1 and that performs the linear operation for both matrices at once.

So Theta_0 and Theta_1 are the parameters of the self.dynamic module.
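
To make the equivalence concrete, here is a minimal sketch in NumPy (not the repository's PyTorch code). It checks that concatenating W_Id Z_t and W Z_t and applying one stacked weight matrix gives the same result as Z_t Theta_0 + W Z_t Theta_1. The sizes n and k, the random values, and the names W_id, theta_0, theta_1 and theta are assumptions made for illustration. The stacking order in rels is also an assumption and may differ from the actual code.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 5, 3  # n series and k latent dimensions (arbitrary illustrative sizes)

Z_t = rng.normal(size=(n, k))      # latent factors at time t
W = rng.random(size=(n, n))        # relation matrix between series
W_id = np.eye(n)                   # identity matrix, treated as one more relation
theta_0 = rng.normal(size=(k, k))  # weights applied to Z_t
theta_1 = rng.normal(size=(k, k))  # weights applied to W Z_t

# Direct form of the argument of h(): two separate products, then a sum.
direct = Z_t @ theta_0 + W @ Z_t @ theta_1

# Batched form: stack the relations and multiply them by Z_t in one call.
rels = np.stack([W_id, W])                           # shape (2, n, n)
per_relation = rels @ Z_t                            # shape (2, n, k)
z_context = np.concatenate(list(per_relation), axis=1)  # shape (n, 2k)

# One linear map holding both Thetas, stacked along the input dimension.
theta = np.concatenate([theta_0, theta_1], axis=0)   # shape (2k, k)
batched = z_context @ theta

print(np.allclose(direct, batched))
```

The check should print True, because [A, B] multiplied by the vertically stacked [Theta_0; Theta_1] equals A Theta_0 + B Theta_1. In this sketch theta plays the role of the weight of a single bias-free linear layer, which is how one module can hold both Theta_0 and Theta_1.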

I hope this answer clarifies the code. Don't hesitate to ask further questions if needed!

Your answer has solved my problem very well. Thank you very much for this!