DAE References and Performance

Question

ppstacy opened this issue 4 years ago · 2 comments

Hi @jeongyoonlee, I saw you added the DAE in the recent release! I couldn't find many references for DAE, so I was wondering if you could share a bit more. Also, how should we choose the probability of adding swap noise to the features? Is there a rule of thumb to follow?

I assume DAE performs better on datasets with noisy features. Do you by any chance have some examples to share, ideally comparing its performance with the other feature engineering methods in the package?

Thanks a lot!!

Answer 1 · 2021-04-29T20:50:27.000Z

Hi @ppstacy, you can find a comprehensive study of various types of autoencoders in Vincent et al. 2010. It compares swap noise, zero noise, and Gaussian noise as well.
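For readers unfamiliar with the three noise types named above, here is a minimal NumPy sketch of how each one corrupts a feature matrix. It is an illustration only, not the package's implementation: the function names `swap_noise`, `zero_noise`, and `gaussian_noise` are hypothetical, and the probability `p=0.2` is an arbitrary value chosen for the demo, not a recommended setting.

```python
import numpy as np


def swap_noise(X, p, rng):
    """With probability p, replace each cell by the value of the same
    column taken from a randomly chosen row."""
    X = np.asarray(X)
    n_rows, n_cols = X.shape
    mask = rng.random(X.shape) < p                      # cells to corrupt
    donor_rows = rng.integers(0, n_rows, size=X.shape)  # random donor row per cell
    col_idx = np.broadcast_to(np.arange(n_cols), X.shape)
    return np.where(mask, X[donor_rows, col_idx], X)


def zero_noise(X, p, rng):
    """With probability p, set each cell to zero."""
    X = np.asarray(X)
    mask = rng.random(X.shape) < p
    return np.where(mask, 0, X)


def gaussian_noise(X, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma to every cell."""
    X = np.asarray(X, dtype=float)
    return X + rng.normal(0.0, sigma, size=X.shape)


# Toy data: 5 rows, 4 features (illustrative values only).
rng = np.random.default_rng(0)
X = np.arange(20, dtype=float).reshape(5, 4)
X_swapped = swap_noise(X, p=0.2, rng=rng)
```

Swap noise keeps every corrupted value inside the observed range of its own column, which is one reason it is often used for tabular features with mixed scales.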

DAE works well when you have the test features in advance, because you can extract a good representation of the input features from both the training and test data, then use that representation to train models on the training data. Three tabular data competitions at Kaggle were won by approaches that mainly used DAE.

Hope it helps.
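The workflow described here can be sketched roughly as follows. This is a toy one-hidden-layer denoising autoencoder written in plain NumPy, assuming swap noise and a squared-error loss; it is not the DAE class shipped in the package, and `fit_dae`, `encode`, and all hyperparameter values are hypothetical choices made for illustration.

```python
import numpy as np


def fit_dae(X, n_hidden, p_swap, n_epochs, lr, rng):
    """Train a tiny denoising autoencoder with full-batch gradient descent.
    Returns only the encoder weights."""
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, size=(d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, size=(n_hidden, d))
    b2 = np.zeros(d)
    cols = np.broadcast_to(np.arange(d), X.shape)
    for _ in range(n_epochs):
        # Corrupt the input with swap noise; the target stays the clean X.
        mask = rng.random(X.shape) < p_swap
        donors = rng.integers(0, n, size=X.shape)
        X_noisy = np.where(mask, X[donors, cols], X)
        # Forward pass.
        H = np.tanh(X_noisy @ W1 + b1)
        X_hat = H @ W2 + b2
        # Backward pass for the loss 0.5 * mean over rows of squared error.
        err = (X_hat - X) / n
        dH = (err @ W2.T) * (1.0 - H ** 2)
        W2 -= lr * (H.T @ err)
        b2 -= lr * err.sum(axis=0)
        W1 -= lr * (X_noisy.T @ dH)
        b1 -= lr * dH.sum(axis=0)
    return W1, b1


def encode(X, W1, b1):
    """Map clean features to the learned hidden representation."""
    return np.tanh(X @ W1 + b1)


rng = np.random.default_rng(0)
X_train = rng.normal(size=(80, 5))  # synthetic stand-in for training features
X_test = rng.normal(size=(20, 5))   # synthetic stand-in for test features

# Fit the representation on train and test features together (no labels needed).
X_all = np.vstack([X_train, X_test])
W1, b1 = fit_dae(X_all, n_hidden=3, p_swap=0.2, n_epochs=200, lr=0.1, rng=rng)

# Encode each split; a supervised model would then be trained on Z_train only.
Z_train = encode(X_train, W1, b1)
Z_test = encode(X_test, W1, b1)
```

Because the autoencoder never sees labels, fitting it on the stacked train and test features does not leak targets; it only lets the representation cover the test distribution as well.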

Answer 2 · 2021-04-30T23:06:53.000Z

Thanks for sharing, Jeong! I will look into those. It sounds like a very promising approach!