Regarding the differences between the MATLAB and Python implementations of TP-GMM
Reichenbar opened this issue · 3 comments
Dear authors,
Hello, and thanks for your great work. I am working on improving the TP-GMM proposed by Sylvain Calinon, need to implement it in Python, and found the Rofunc package useful. I have several questions regarding the TP-GMM implementation.
- I see that both you and Sylvain implement TP-GMM using an HMM instead of a GMM, which is inconsistent with its original definition and with the MATLAB version. I know an HMM certainly works here, but can you explain why you use it rather than a GMM?
- You use the marginal_model method to obtain the local GMM in each task frame (see the sketch after this list). Looking at its definition, the local model keeps the same priors as the original model. But when the original HMM is trained via EM, the priors are not updated; only the init_priors are. So I think the priors of the local GMM models obtained this way are not correct.
- In the reproduce method, why do you use LQR to estimate a controller instead of directly using GMR? The same question applies to the uni method of your TP-GMR implementation.
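For concreteness, here is a minimal sketch of what I understand the marginalization step to do, assuming the joint model stacks the dimensions of all task frames; the signature and variable names are illustrative, not the actual pbdlib API:

```python
import numpy as np

def marginal_model(priors, mu, sigma, slc):
    """Hypothetical sketch: marginalize a joint (TP-)GMM onto one task frame.

    priors : (K,)       mixing weights of the joint model
    mu     : (K, D)     means stacked over all task frames
    sigma  : (K, D, D)  covariances stacked over all task frames
    slc    : slice selecting this frame's dimensions within D
    """
    # Marginalizing a Gaussian simply keeps the block of the mean and
    # covariance belonging to the selected dimensions ...
    mu_loc = mu[:, slc]
    sigma_loc = sigma[:, slc, slc]
    # ... and copies the joint priors unchanged. This is exactly the point
    # questioned above: when the joint model is an HMM trained by EM, EM
    # re-estimates init_priors (and the transition matrix), so the copied
    # `priors` may never have been updated.
    return priors, mu_loc, sigma_loc
```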
I initially referred to the MATLAB version of TP-GMM provided by Sylvain, which is more detailed and consistent with the descriptions in his papers. I do not understand why the Python implementation differs slightly from the MATLAB one. I see that the Rofunc package keeps this difference because it builds on the pbd package. I am confused about why TP-GMM is not implemented with a plain GMM, hence the questions above.
Thanks in advance for your answers.
Thank you for your question and your support of our package! Apologies for my late reply; we are on vacation these days.
For the HMM issue, the relevant explanation is provided in one of Dr. Calinon's papers, Learning Control.
Whether we use a GMM or an HMM, the goal of this part is to encode the demonstrations (trajectories). Compared with the GMM in the original definition, an HMM has extra latent variables that change over time. With this transition model, TP-GMM (with HMM encoding) is suitable for generating data from the learned encoding and can also handle missing data or partial sequences of various lengths.
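To make the contrast concrete, below is a minimal sketch (plain NumPy/SciPy, not the pbdlib API) of the HMM forward pass. The transition matrix lets the model propagate a state belief through time and score a partial sequence of any length, which a plain GMM cannot do because it treats samples as independent:

```python
import numpy as np
from scipy.stats import multivariate_normal

def hmm_forward(obs, init_priors, trans, mus, sigmas):
    """Forward pass: alpha[t, k] ~ p(state_t = k | o_1..o_t), per-step normalized.

    obs: (T, D) observations; trans: (K, K) transition matrix;
    mus/sigmas: per-state Gaussian emission parameters.
    """
    T, K = len(obs), len(init_priors)
    alpha = np.zeros((T, K))
    for k in range(K):
        alpha[0, k] = init_priors[k] * multivariate_normal.pdf(obs[0], mus[k], sigmas[k])
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        for k in range(K):
            pred = alpha[t - 1] @ trans[:, k]  # temporal prediction via transitions
            alpha[t, k] = pred * multivariate_normal.pdf(obs[t], mus[k], sigmas[k])
        alpha[t] /= alpha[t].sum()             # normalize for numerical stability
    return alpha
```

The same recursion is what enables generation and completion of partial sequences: the belief at time t constrains which states (and hence which Gaussians) are plausible at time t + 1.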
For the third question, you can also find the explanation in the Learning Control paper, Chapter 4.1:

> The previous section (TP-GMM / TP-GMR) discussed the problem of generating and adapting reference trajectories, by assuming that a controller is available to track the retrieved reference. In this section (LQT / LQR), the problem is extended to that of directly estimating a controller u_t for a discrete linear dynamical system.
Therefore, as shown in the pipeline figure of Rofunc, TP-GMM and TP-GMR are methods for learning a representation of the demonstrations, while LQT and LQR are control methods that provide the control commands.
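As an illustration of this division of labor, here is a minimal sketch of a finite-horizon discrete LQR tracking a reference trajectory such as one retrieved by GMR / TP-GMR. It uses a simplified error-feedback form (the full LQT formulation in the paper also includes a feedforward term), and the dynamics A, B and weights Q, R below are assumptions for the example, not values taken from Rofunc:

```python
import numpy as np

def lqr_track(A, B, Q, R, x_ref, x0):
    """Finite-horizon LQR: backward Riccati pass, then error-feedback rollout."""
    T = len(x_ref)
    P = Q.copy()                     # terminal cost-to-go weight
    gains = []
    for _ in range(T - 1):           # backward Riccati recursion
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    gains.reverse()
    x, xs, us = x0, [x0], []
    for t in range(T - 1):           # forward rollout
        u = -gains[t] @ (x - x_ref[t])   # control command from tracking error
        x = A @ x + B @ u
        xs.append(x)
        us.append(u)
    return np.array(xs), np.array(us)

# Example: a double integrator tracking a sine reference (illustrative only).
dt = 0.01
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q = np.diag([100.0, 1.0])
R = np.array([[0.01]])
ts = np.arange(500) * dt
x_ref = np.stack([np.sin(ts), np.cos(ts)], axis=1)   # stands in for GMR output
xs, us = lqr_track(A, B, Q, R, x_ref, np.zeros(2))
```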
Closed due to no reply for a long time.