ragulpr/wtte-rnn

Problems replicating the CMAPSS data score


Dear Egil,

First, I would like to thank you for your great work! I am in the process of implementing your approach in a distributed collaborative learning platform for turbine prognostics, and as a first step I am trying to replicate your results.

I have noticed that in the uploaded custom activation function, alpha is limited to a minimum of init_alpha/e if an activation bounded to [-1, 1] (such as tanh) is used in the last Dense(2) layer.
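If I read the custom activation correctly, the floor comes from the exponential rescaling. A minimal NumPy sketch of what I mean (my own illustration, not the library code):

import numpy as np

# With a tanh-activated Dense(2), the raw alpha output x lies in [-1, 1],
# and the custom activation maps it as alpha = init_alpha * exp(x),
# so alpha is confined to [init_alpha / e, init_alpha * e].
init_alpha = 10.0
x = np.array([-1.0, 0.0, 1.0])   # extremes of a tanh output
print(init_alpha * np.exp(x))    # [3.68, 10.0, 27.18]; the floor is init_alpha/e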

I figured this could be overcome by simply using a linear activation function in the last layer of the network, and it did indeed work. Unfortunately, training was very unstable and the loss went to NaN after 400-500 epochs.
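Roughly, the setup was the following (a simplified sketch; model and init_alpha defined as in the repo's examples):

from keras.layers import Dense, Lambda
import wtte.wtte as wtte

# No bounded activation before the custom layer, so the raw outputs
# are unconstrained.
model.add(Dense(2))   # linear activation (the Keras default)
model.add(Lambda(wtte.output_lambda,
                 arguments={"init_alpha": init_alpha}))
# With an unbounded input, alpha = init_alpha * exp(x) can collapse towards 0
# or blow up, which I suspect is what eventually drives the loss to NaN.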

This, I thought, was not a problem, so I simply changed the custom activation function so that alpha could still approach 0. It works, but I am still having a lot of trouble replicating your results... the best score I have obtained is around 800.

I must confess that I do not have a great deal of computational power at my disposal, so I am only training for around 1500 epochs.

I am writing this post just to ask whether you also encountered this problem of the loss going to NaN when using a linear activation in the last layer of the network (before the custom activation layer).

This instability has become especially relevant when using real industrial data, where I am forced to use a 'tanh' activation before the custom activation layer.

Yours

P.S. Another slightly strange issue: when I split the dataset using the same split you mention in your thesis, I obtain a different number of trajectories... I assume this is a typo?

I encountered the same issue: the loss went to NaN around epoch 36 when running the simple_example notebook. I think the problem is in the loss function, which may have a divide-by-zero or NaN problem.
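My rough reading of the discrete Weibull log-likelihood and where I suspect it can blow up (a NumPy sketch of the math, not the library's exact code):

import numpy as np

def discrete_loglik(t, u, a, b, eps=1e-35):
    # t: elapsed time step, u: 1 if uncensored else 0,
    # a: Weibull alpha (scale), b: Weibull beta (shape).
    hazard0 = np.power((t + eps) / a, b)
    hazard1 = np.power((t + 1.0) / a, b)
    # If hazard1 - hazard0 underflows to 0 (huge alpha and/or tiny beta),
    # exp(hazard1 - hazard0) - 1 is exactly 0 and the log returns -inf,
    # which turns into nan once it reaches the gradients.
    return u * np.log(np.exp(hazard1 - hazard0) - 1.0) - hazard1

print(discrete_loglik(t=5.0, u=1.0, a=1e30, b=1.0))   # -> -inf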

Dear Hang,

I also suspected that, but I failed to locate any division by zero. I am a bit new to Python, so I am not very good at spotting NaN sources. Let me know if you find where the problem is!

Yours

@as2636, I could finally reproduce the results by changing max_beta_value to 3.0 in these lines:

from keras.layers import Lambda   # assuming the usual imports
import wtte.wtte as wtte          # from the repo's examples

model.add(Lambda(wtte.output_lambda,
                 arguments={"init_alpha": init_alpha,
                            "max_beta_value": 3.0}))   # cap beta at 3.0

You can try it out.
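For what it's worth, my understanding of what the cap does mechanically (a sketch of the idea; the library's exact scaling may differ):

import numpy as np

# Illustration only: the custom activation squashes the raw beta output
# through a sigmoid scaled by max_beta_value, so beta stays in
# (0, max_beta_value). A lower cap keeps the Weibull shape parameter,
# and hence the gradients, from becoming extreme.
def bounded_beta(x, max_beta_value):
    return max_beta_value / (1.0 + np.exp(-x))

print(bounded_beta(np.array([-5.0, 0.0, 5.0]), max_beta_value=3.0))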

Regards.

Hi Hang,

I do wonder why it makes such a difference, though. Thanks for the tip anyway!

@as2636 Hi, it would be great if you could share what you did as an example. I'm also trying to replicate the results, but so far they have been pretty bad.