Got graph disconnected error when using a pre-trained sequential model.

Question

tohnperfect opened this issue 4 years ago · 9 comments

I have a trained sequential model which is composed of a pre-trained headless EfficientNet and the final layers. The model.summary() output looks as follows:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
efficientnet-b3 (Model)      (None, 5, 5, 1536)        10783528  
_________________________________________________________________
gap (GlobalMaxPooling2D)     (None, 1536)              0         
_________________________________________________________________
dropout_out (Dropout)        (None, 1536)              0         
_________________________________________________________________
fc_out (Dense)               (None, 1)                 1537      
=================================================================
Total params: 10,785,065
Trainable params: 1,479,937
Non-trainable params: 9,305,128
_________________________________________________________________

My efficientnet-b3 model looks like this:

Model: "efficientnet-b3"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            [(None, 150, 150, 3) 0                                            
__________________________________________________________________________________________________
conv2d (Conv2D)                 (None, 75, 75, 40)   1080        input_1[0][0]                    
__________________________________________________________________________________________________   
.
.
.
__________________________________________________________________________________________________  
add_18 (Add)                    (None, 5, 5, 384)    0           drop_connect_18[0][0]            
                                                                 batch_normalization_73[0][0]     
__________________________________________________________________________________________________
conv2d_103 (Conv2D)             (None, 5, 5, 1536)   589824      add_18[0][0]                     
__________________________________________________________________________________________________
batch_normalization_77 (BatchNo (None, 5, 5, 1536)   6144        conv2d_103[0][0]                 
__________________________________________________________________________________________________
swish_77 (Swish)                (None, 5, 5, 1536)   0           batch_normalization_77[0][0]     
==================================================================================================
Total params: 10,783,528
Trainable params: 1,478,400
Non-trainable params: 9,305,128
__________________________________________________________________________________________________

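I tried to use the core API GradCAM on the trained model as follows:

from tf_explain.core.grad_cam import GradCAM
explainer = GradCAM()
img = load_image(img_path)  # tf image; load_image is my own helper
data = ([img], None)
# The explain() call was missing from the snippet as posted; class_index=0 is assumed here (single output unit)
grid = explainer.explain(data, model, class_index=0)
explainer.save(grid, ".", "grad_cam.png")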

which outputs this error:

ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_1:0", shape=(None, 150, 150, 3), dtype=float32) at layer "input_1". The following previous layers were accessed without issue: []

Please note that the model prediction works fine.

Thank you for your help!

Answer 1 · 2020-06-04T12:49:59.000Z

@tohnperfect: I faced the same problem a few days ago. Try this; it worked very well in my case.

Convert it to a functional model:

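from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D
from tensorflow.keras.models import Model

# pre_trained_model is the headless base network (efficientnet-b3 in your case)
x = pre_trained_model.output
global_average_layer = GlobalAveragePooling2D()(x)
dropout_layer_1 = Dropout(0.50)(global_average_layer)
prediction_layer = Dense(1, activation='sigmoid')(dropout_layer_1)

# Build the model from the base network's own input so the whole graph is connected
model = Model(inputs=pre_trained_model.input, outputs=prediction_layer)
model.summary()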

Now, train the model and use it for GradCAM. I hope it works for you too. Please let us know if you find any other solution, i.e. a workaround for Sequential models.

Answer 2 · 2020-06-04T13:42:11.000Z

Thanks a lot @rao208

This means creating a model in a functional way before training and train it again, right?
I will try that and post an update here.

I still wonder if the sequential model can be used with TF-explain because I have several trained sequential models to be explored with GradCAM.

Answer 3 · 2020-06-04T14:28:40.000Z

@tohnperfect

> This means creating a model in a functional way before training and train it again, right?

Yes, this means creating the classifier part, i.e. the global average pooling layer and the dense layer (as written in my reply above), in a functional way before training, and then training your entire model again. I used VGG16 as my pre-trained model with include_top=False. It is okay when the pre-trained model is a sequential model.

> I still wonder if the sequential model can be used with TF-explain because I have several trained sequential models to be explored with GradCAM.

Well, you can use the sequential models with GradCAM, but the problem here is you are using a pre-trained network without the classifier. When you try to connect this pre-trained network with your classifier, the model is viewed as two separate graphs. That is why you are getting this error.
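
The "two separate graphs" idea above can be illustrated with a small pure-Python toy. This is only a sketch of the general principle and not Keras's actual implementation; the function trace_to_inputs and the tensor names outer_input and efficientnet_b3_out are hypothetical and chosen for illustration. The assumption is that the nested network's internal tensors depend on its own input_1, while the wrapper model declares a different input.

```python
def trace_to_inputs(tensor, producers, declared_inputs):
    """Walk back from `tensor` towards the tensors it was computed from.

    `producers` maps a tensor name to the names of its parent tensors.
    Raises ValueError if the walk ends at a tensor that has no parents
    and is not one of the declared model inputs.
    """
    visited = []
    stack = [tensor]
    while stack:
        name = stack.pop()
        if name in declared_inputs:
            continue  # reached a declared input, this branch is fine
        parents = producers.get(name, [])
        if not parents:
            raise ValueError(
                f"Graph disconnected: cannot obtain value for tensor {name!r}"
            )
        visited.append(name)
        stack.extend(parents)
    return visited


# Toy dependency table (all names below are illustrative).
producers = {
    # Inner graph of the pre-trained network: depends on its own input_1.
    "swish_77": ["input_1"],
    # Outer graph of the wrapper: the nested network called on the outer input.
    "efficientnet_b3_out": ["outer_input"],
    "gap": ["efficientnet_b3_out"],
    "fc_out": ["gap"],
}

# The final prediction is reachable from the wrapper's declared input.
path = trace_to_inputs("fc_out", producers, {"outer_input"})

# An internal tensor of the nested network is not: it leads back to input_1.
try:
    trace_to_inputs("swish_77", producers, {"outer_input"})
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

In this toy, asking for an inner tensor together with the wrapper's input fails at input_1, which mirrors the layer named in the error message from the question.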

When you build a sequential model from scratch, you won't get the 'graph disconnected' error. I have used many sequential models that were built from scratch, and they work well with GradCAM (tf_explain).

Honestly, I don't think it will make any difference. The way I understand it, the functional API is used for more complex architectures, for example when you have skip connections as in ResNet, and Sequential models are used when you simply stack layer after layer, i.e. for simpler architectures. If there is any major difference, then I am not aware of it.
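
For Sequential models that are already trained, one possible alternative to retraining is to re-apply the same head layer objects to the base network's own output, so that a functional model shares the trained weights. Nobody in this thread confirmed this approach, so treat it as a sketch under assumptions: the tiny base network, the layer name last_conv and the 32x32 input are stand-ins invented for the example, not the EfficientNet-B3 setup from the question.

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in for the pre-trained headless network (hypothetical).
base_in = tf.keras.Input(shape=(32, 32, 3), name="base_input")
features = tf.keras.layers.Conv2D(8, 3, activation="relu", name="last_conv")(base_in)
base = tf.keras.Model(base_in, features, name="base")

# Nested Sequential model, analogous to the one in the question.
seq = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalMaxPooling2D(name="gap"),
    tf.keras.layers.Dropout(0.5, name="dropout_out"),
    tf.keras.layers.Dense(1, activation="sigmoid", name="fc_out"),
])

x = np.random.rand(2, 32, 32, 3).astype("float32")
seq_preds = seq.predict(x, verbose=0)

# Re-apply the *same* head layer objects to base.output. No weights are
# copied or retrained: both models share the layers.
out = base.output
for layer in seq.layers:
    if layer is not base:
        out = layer(out)
flat = tf.keras.Model(base.input, out, name="flat")
flat_preds = flat.predict(x, verbose=0)

# In the flat model, an inner layer's output and the final output belong to
# one connected graph, so a two-output model of the kind Grad-CAM needs
# can be built.
grad_model = tf.keras.Model(
    flat.inputs, [flat.get_layer("last_conv").output, flat.output]
)
conv_out, preds = grad_model(x)
```

If this holds for a real model, flat could be passed to GradCAM in place of the Sequential model; that last step is untested here.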

Answer 4 · 2020-06-04T15:07:26.000Z

I got it. Thanks! @rao208

Answer 5 · 2020-08-13T08:21:08.000Z

Thanks @rao208 @tohnperfect

However, I cannot figure out how to add additional input layers (e.g. aug, pre) at the bottom of the model in a way that works with GradCAM().

Here is an example that is not fully working, just for reference.

# MobileNetV2
import tensorflow as tf

num_classes = 5

# data_augmentation, preprocess_input and base_model are defined elsewhere (not shown in this post)
inputs = tf.keras.Input(shape=(224, 224, 3))
aug = data_augmentation(inputs)
pre = preprocess_input(aug)

# base_model is called as a single nested layer here
bm_output = base_model(pre, training=False)

gap2d = tf.keras.layers.GlobalAveragePooling2D()(bm_output)
dro = tf.keras.layers.Dropout(0.2)(gap2d)
outputs = tf.keras.layers.Dense(num_classes)(dro)

model = tf.keras.Model(inputs, outputs, name='model-re-mbnetv2')

This new model adds a few layers below base_model. It trains and runs inference well, but it raises the following error when GradCAM() is applied:

ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_97_2:0", shape=(None, 224, 224, 3), dtype=float32) at layer "input_97". The following previous layers were accessed without issue: []

Model: "model-re-mbnetv2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_98 (InputLayer)        [(None, 224, 224, 3)]     0         
_________________________________________________________________
sequential_2 (Sequential)    (None, 224, 224, 3)       0         
_________________________________________________________________
tf_op_layer_RealDiv_17 (Tens (None, 224, 224, 3)       0         
_________________________________________________________________
tf_op_layer_Sub_17 (TensorFl (None, 224, 224, 3)       0         
_________________________________________________________________
mobilenetv2_1.00_224 (Model) (None, 7, 7, 1280)        2257984   
_________________________________________________________________
global_average_pooling2d_39  (None, 1280)              0         
_________________________________________________________________
dropout_15 (Dropout)         (None, 1280)              0         
_________________________________________________________________
dense_48 (Dense)             (None, 5)                 6405      
=================================================================
Total params: 2,264,389
Trainable params: 6,405
Non-trainable params: 2,257,984
_________________________________________________________________

# ResNet50

num_classes = 5

# Note: bm_output is not used below; the model is built from base_model.input and base_model.output
bm_output = base_model(inputs, training=False)
gap2d = tf.keras.layers.GlobalAveragePooling2D()(base_model.output)
outputs = tf.keras.layers.Dense(num_classes)(gap2d)

model = tf.keras.Model(base_model.input, outputs, name='model-re-resnet50')

This ResNet50 model trains, runs inference, and works with GradCAM() because it does not include additional input layers.


Model: "model-re-resnet50"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_85 (InputLayer)           [(None, 224, 224, 3) 0                                            
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D)       (None, 230, 230, 3)  0           input_85[0][0]                   
__________________________________________________________________________________________________
conv1_conv (Conv2D)             (None, 112, 112, 64) 9472        conv1_pad[0][0]                  
__________________________________________________________________________________________________
conv1_bn (BatchNormalization)   (None, 112, 112, 64) 256         conv1_conv[0][0]                 
__________________________________________________________________________________________________
conv1_relu (Activation)         (None, 112, 112, 64) 0           conv1_bn[0][0]                   
__________________________________________________________________________________________________
pool1_pad (ZeroPadding2D)       (None, 114, 114, 64) 0           conv1_relu[0][0]                 
__________________________________________________________________________________________________
pool1_pool (MaxPooling2D)       (None, 56, 56, 64)   0           pool1_pad[0][0]                  
__________________________________________________________________________________________________  
...
...
...
conv5_block3_3_conv (Conv2D)    (None, 7, 7, 2048)   1050624     conv5_block3_2_relu[0][0]        
__________________________________________________________________________________________________
conv5_block3_3_bn (BatchNormali (None, 7, 7, 2048)   8192        conv5_block3_3_conv[0][0]        
__________________________________________________________________________________________________
conv5_block3_add (Add)          (None, 7, 7, 2048)   0           conv5_block2_out[0][0]           
                                                                 conv5_block3_3_bn[0][0]          
__________________________________________________________________________________________________
conv5_block3_out (Activation)   (None, 7, 7, 2048)   0           conv5_block3_add[0][0]           
__________________________________________________________________________________________________
global_average_pooling2d_26 (Gl (None, 2048)         0           conv5_block3_out[0][0]           
__________________________________________________________________________________________________
dense_35 (Dense)                (None, 5)            10245       global_average_pooling2d_26[0][0]
==================================================================================================
Total params: 23,597,957
Trainable params: 10,245
Non-trainable params: 23,587,712

Answer 6 · 2020-08-13T11:36:11.000Z

@vscv There are some techniques online to add the additional input layers to the pretrained model (example: https://stackoverflow.com/questions/59695637/i-am-trying-to-merge-2-pretrained-keras-model-but-failed or https://stackoverflow.com/questions/40755914/prepending-downsample-layer-to-resnet50-pretrained-model)

I tried the approach suggested in the answers on Stack Overflow. If your end goal is to apply GradCAM to layers of the pre-trained model, then this does not work. If you look at the output in

https://stackoverflow.com/questions/40755914/prepending-downsample-layer-to-resnet50-pretrained-model

you will see that one cannot extract the layers of pretrained models and hence GradCAM cannot be implemented on the new model.

If you find any solution, please post it here. I tried looking everywhere online, but could not figure out how to add an additional input layer and still extract the layers of the pre-trained model.

Hope this answers your question :)

Answer 7 · 2020-08-14T01:22:06.000Z

@rao208 Thanks for the hint. I revised the problem statement.

Answer 8 · 2020-12-05T14:37:39.000Z

@rao208 @vscv did you find any solution?

Answer 9 · 2020-12-07T05:41:15.000Z

> @rao208 @vscv did you find any solution?

The problem has not been resolved.
My current workaround is to put preprocessing and augmentation into tf.data.map instead of adding them to the base_model.
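
The tf.data.map workaround could look roughly like the sketch below. It assumes MobileNetV2 preprocessing and a random horizontal flip as the augmentation; the function name augment_and_preprocess and the dummy data are made up for the example and are not from the original setup.

```python
import tensorflow as tf


def augment_and_preprocess(image, label):
    """Augment and preprocess one (image, label) pair inside the input pipeline."""
    image = tf.image.random_flip_left_right(image)  # example augmentation
    # Scales pixel values from [0, 255] to [-1, 1] for MobileNetV2.
    image = tf.keras.applications.mobilenet_v2.preprocess_input(image)
    return image, label


# Dummy data standing in for a real image dataset.
images = tf.random.uniform((8, 224, 224, 3), minval=0.0, maxval=255.0)
labels = tf.zeros((8,), dtype=tf.int32)

train_ds = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .map(augment_and_preprocess, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(4)
)

batch_images, batch_labels = next(iter(train_ds))
```

With the preprocessing moved into the dataset, the model itself can be built directly from base_model.input and base_model.output, as in the ResNet50 example above, so no extra input layers sit in front of the pre-trained network. Images passed to GradCAM would then need the same preprocessing applied by hand.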