dhammack/DSB2017

what is branching?

dearkafka opened this issue · 2 comments

Could you elaborate on what you mean by "branching":

It was also found that ‘branching’ the model earlier produced better results when training with multiple objectives. If branching is done too late, the model outputs are too correlated (as they share too many parameters) and thus they provide overall less information in the next stage of the pipeline.

My understanding is that you run a few iterations, adding (?) blocks and checking whether the network has started predicting. Is that correct?

No. My neural networks had 4 outputs (malignancy, lobulation, diameter, spiculation). Think of two extremes when it comes to a multi-output problem:

  • you could train one neural network per output
  • you could train one neural network for all the outputs together, ending with a layer like Dense(4)

I found neither of these to be ideal. What worked best was to share the lower layers of the network across all outputs but stop parameter sharing after a few layers. It's sort of a hybrid between the two extremes.

Take a look here for an example, starting around line 58. From that point on there is one conv layer per output, and those conv layers no longer interact. Before this point, all the conv layers were shared between outputs.
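
For readers without the linked file at hand, the idea can be sketched roughly as follows. This is not the code from the repository: it is a small NumPy forward pass that uses dense layers instead of conv layers to stay short, and the layer sizes, the function names (`build_branched`, `forward`) and the `n_shared` / `n_private` parameters are all assumptions made up for illustration. Only the four output names come from the description above.

```python
import numpy as np

# The four outputs named in the reply above.
OUTPUTS = ["malignancy", "lobulation", "diameter", "spiculation"]


def dense(rng, n_in, n_out):
    """Create (weights, bias) for one fully connected layer."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)


def relu(x):
    return np.maximum(x, 0.0)


def build_branched(rng, n_in=32, n_hidden=16, n_shared=2, n_private=2):
    """Build a shared trunk followed by one private branch per output.

    n_shared  : layers whose parameters are shared by every output
    n_private : hidden layers owned by a single output (after the branch point)
    All sizes are illustrative assumptions, not values from the repository.
    """
    trunk, width = [], n_in
    for _ in range(n_shared):
        trunk.append(dense(rng, width, n_hidden))
        width = n_hidden

    branches = {}
    for name in OUTPUTS:
        layers, w = [], width
        for _ in range(n_private):
            layers.append(dense(rng, w, n_hidden))
            w = n_hidden
        layers.append(dense(rng, w, 1))  # scalar prediction for this output
        branches[name] = layers
    return trunk, branches


def forward(model, x):
    """Run the shared trunk once, then each branch independently."""
    trunk, branches = model
    h = x
    for W, b in trunk:
        h = relu(h @ W + b)

    preds = {}
    for name, layers in branches.items():
        z = h  # every branch starts from the same shared features
        for W, b in layers[:-1]:
            z = relu(z @ W + b)
        W, b = layers[-1]
        preds[name] = (z @ W + b)[:, 0]
    return preds


rng = np.random.default_rng(0)
model = build_branched(rng)
x = rng.normal(size=(5, 32))  # a batch of 5 made-up feature vectors
preds = forward(model, x)
```

In this sketch, `n_shared=0` corresponds to the first extreme (one network per output), while `n_private=0` leaves only a single linear unit per output on top of a fully shared trunk, which is equivalent to ending with `Dense(4)`. Branching "earlier" means a smaller `n_shared` and a larger `n_private`.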

Thank you, Daniel, that was helpful.