Ideas for improving Petridish
christofer-f opened this issue · 1 comment
Hi...
I have an idea that I would like to hear your thoughts about.
In the Petridish algorithm, you gradually grow a neural network according to some objective.
What if you combined Petridish with ideas from this paper:
https://arxiv.org/abs/2006.04647
https://github.com/BayesWatch/nas-without-training
So instead of taking incremental steps, you would run a Monte Carlo Tree Search (MCTS), using the training-free scoring idea from the paper above to find suitable candidates.
So you alternate between two modes:
- In one pass, you add several growing steps at once (basically traversing from the root node to the candidate found by the MCTS).
- In the other pass, you run the MCTS, scoring candidates without training.
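
To make the search half of this concrete, here is a rough sketch of a UCT-style MCTS over sequences of growth steps, where each rollout is scored without any training. Everything in it is an assumption for illustration: `OPS`, `MAX_DEPTH`, `Node`, `uct_child`, `mcts` and `toy_score` are made-up names, none of them comes from the Petridish code base, and `toy_score` is only a placeholder for a real training-free proxy.

```python
import math
import random

# Hypothetical search space: an architecture is a tuple of growth steps (op names).
OPS = ("conv3x3", "conv5x5", "skip")
MAX_DEPTH = 3  # how many growth steps a single MCTS pass may add


def toy_score(arch):
    """Placeholder for a training-free score; swap in a real proxy here."""
    return sum({"conv3x3": 1.0, "conv5x5": 0.6, "skip": 0.2}[op] for op in arch)


class Node:
    def __init__(self, arch, parent=None):
        self.arch, self.parent = arch, parent
        self.children = {}  # op name -> Node
        self.visits = 0
        self.total = 0.0    # sum of rollout scores seen through this node


def uct_child(node, c=1.4):
    # Mean score plus an exploration bonus (scores are not normalised here).
    return max(
        node.children.values(),
        key=lambda ch: ch.total / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def mcts(root_arch, score_fn, iterations=200, rng=None):
    rng = rng or random.Random(0)
    root = Node(tuple(root_arch))
    depth0 = len(root.arch)
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded and not terminal.
        while len(node.arch) - depth0 < MAX_DEPTH and len(node.children) == len(OPS):
            node = uct_child(node)
        # Expansion: add one untried growth step.
        if len(node.arch) - depth0 < MAX_DEPTH:
            op = rng.choice([o for o in OPS if o not in node.children])
            node.children[op] = Node(node.arch + (op,), parent=node)
            node = node.children[op]
        # Rollout: finish with random growth steps, then score without training.
        arch = node.arch
        while len(arch) - depth0 < MAX_DEPTH:
            arch += (rng.choice(OPS),)
        reward = score_fn(arch)
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    # Follow the most-visited children: these are the growth steps to apply at once.
    node, path = root, []
    while node.children:
        node = max(node.children.values(), key=lambda ch: ch.visits)
        path.append(node.arch[-1])
    return path
```

Calling `mcts((), toy_score)` returns a list of op names; the growing pass would then apply those steps to the parent network in one go and restart the search from the new root.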
This is at least something I would like to try out...
//Christofer
@christofer-f that is a great idea, and something that has been on our plate to try out for a while. One catch, though: this paper basically uses NTK (neural tangent kernel) ideas to come up with a scoring function that is meant to be predictive of the network's final performance. It is well known by now that the kernel regime does not model asymptotic performance well, although it does approximate the network well at initialization. And if you look at their correlation plots for small ImageNet, they are noisy.
Nevertheless it may be worth trying out.
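
For reference, here is a minimal sketch of what such a training-free score can look like. It loosely follows the activation-pattern formulation that, as far as I recall, later revisions of the paper use (binary ReLU codes per input, a kernel of agreement counts, and its log-determinant); treat that as my reading, not as the authors' code. The tiny pure-Python MLP and the names `relu_codes`, `log_abs_det`, `training_free_score` and `random_mlp` are hypothetical stand-ins.

```python
import math
import random


def relu_codes(x, weights):
    """Return the binary ReLU activation pattern of a small MLP for input x."""
    code, h = [], x
    for w in weights:  # w is a list of rows, shape (out, in)
        pre = [sum(wi * hi for wi, hi in zip(row, h)) for row in w]
        code.extend(1 if p > 0 else 0 for p in pre)
        h = [max(p, 0.0) for p in pre]
    return code


def log_abs_det(m):
    """log|det(m)| via Gaussian elimination with partial pivoting."""
    m = [list(row) for row in m]
    n, total = len(m), 0.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return float("-inf")  # singular kernel: worst possible score
        m[col], m[pivot] = m[pivot], m[col]
        total += math.log(abs(m[col][col]))
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= f * m[col][c]
    return total


def training_free_score(weights, batch):
    """Score an untrained network by how distinct its activation patterns are."""
    codes = [relu_codes(x, weights) for x in batch]
    n_units = len(codes[0])
    # Kernel entry = number of ReLU units on which two inputs agree.
    kernel = [[n_units - sum(a != b for a, b in zip(ci, cj)) for cj in codes]
              for ci in codes]
    return log_abs_det(kernel)


def random_mlp(sizes, rng):
    """Random Gaussian weights for an MLP with the given layer sizes."""
    return [[[rng.gauss(0, 1) for _ in range(i)] for _ in range(o)]
            for i, o in zip(sizes, sizes[1:])]


# Usage: score two untrained networks on the same small batch.
rng = random.Random(0)
batch = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(6)]
narrow = training_free_score(random_mlp([8, 4, 4], rng), batch)
wide = training_free_score(random_mlp([8, 32, 32], rng), batch)
```

A higher score means the inputs in the batch produce more distinct activation patterns at initialization. Whether that ranking carries over to trained accuracy is exactly the open question raised above.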
Do you need the MCTS though? The version I was thinking of keeps everything exactly as it is now in Petridish, except for the step where we decide which candidate is best. Currently we do that by training for 80 epochs with L1 regularization over the candidate layer's architecture weights. Instead, we would use the scoring function proposed in this paper to rank the parent+candidate networks, pick the best-scoring one, and add it to the parent. Rinse and repeat?
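
Roughly, the loop I have in mind looks like the sketch below. It is only an illustration under stated assumptions: networks are modeled as plain tuples of op names, and `grow_once`, `grow`, `WEIGHTS` and `toy_score` are hypothetical names, not Petridish APIs. `toy_score` is a placeholder for the paper's scoring function; it gives diminishing returns for repeated ops so the example does not always pick the same candidate.

```python
# Hypothetical per-op values used only by the placeholder score below.
WEIGHTS = {"conv3x3": 1.0, "conv5x5": 0.8, "skip": 0.3}


def toy_score(arch):
    """Placeholder for a training-free score, with diminishing returns per op."""
    seen, total = {}, 0.0
    for op in arch:
        total += WEIGHTS.get(op, 0.0) / (1 + seen.get(op, 0))
        seen[op] = seen.get(op, 0) + 1
    return total


def grow_once(parent, candidates, score_fn):
    """Attach each candidate to the parent, score without training, keep the best."""
    scored = [(score_fn(parent + (cand,)), cand) for cand in candidates]
    best_score, best_cand = max(scored, key=lambda t: t[0])
    return parent + (best_cand,), best_score


def grow(parent, candidates, score_fn, steps):
    """Rinse and repeat: greedily add the best-scoring candidate `steps` times."""
    for _ in range(steps):
        parent, _score = grow_once(parent, candidates, score_fn)
    return parent


final = grow(("stem",), ("conv3x3", "conv5x5", "skip"), toy_score, steps=3)
print(final)  # ('stem', 'conv3x3', 'conv5x5', 'conv3x3')
```

The only change from the current procedure is inside `grow_once`: the ranking comes from `score_fn` on the untrained parent+candidate network instead of from the 80-epoch training run.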