google-research/deeplab2

Export model for kMaX-DeepLab fails

hannes09 opened this issue · 7 comments

I have tried to export kMaX-DeepLab via export_model.py and I run into the following error:

Traceback (most recent call last):
  File "deeplab2/export_model.py", line 157, in <module>
    app.run(main)
  File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "deeplab2/export_model.py", line 152, in main
    tf.saved_model.save(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/saved_model/save.py", line 1290, in save
    save_and_return_nodes(obj, export_dir, signatures, options)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/saved_model/save.py", line 1325, in save_and_return_nodes
    _build_meta_graph(obj, signatures, options, meta_graph_def))
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/saved_model/save.py", line 1491, in _build_meta_graph
    return _build_meta_graph_impl(obj, signatures, options, meta_graph_def)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/saved_model/save.py", line 1443, in _build_meta_graph_impl
    saveable_view = _SaveableView(augmented_graph_view, options)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/saved_model/save.py", line 229, in __init__
    self.augmented_graph_view.objects_ids_and_slot_variables_and_paths())
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/training/tracking/graph_view.py", line 544, in objects_ids_and_slot_variables_and_paths
    trackable_objects, node_paths = self._breadth_first_traversal()
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/training/tracking/graph_view.py", line 255, in _breadth_first_traversal
    for name, dependency in self.list_children(current_trackable):
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/saved_model/save.py", line 143, in list_children
    for name, child in super(_AugmentedGraphView, self).list_children(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/training/tracking/graph_view.py", line 203, in list_children
    in obj._trackable_children(save_type, **kwargs).items()]
  File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 3201, in _trackable_children
    children = super(Model, self)._trackable_children(save_type, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/keras/engine/base_layer.py", line 3174, in _trackable_children
    children = self._trackable_saved_model_saver.trackable_children(cache)
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/base_serialization.py", line 59, in trackable_children
    children = self.objects_to_serialize(serialization_cache)
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/layer_serialization.py", line 68, in objects_to_serialize
    return (self._get_serialized_attributes(
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/layer_serialization.py", line 88, in _get_serialized_attributes
    object_dict, function_dict = self._get_serialized_attributes_internal(
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/model_serialization.py", line 56, in _get_serialized_attributes_internal
    super(ModelSavedModelSaver, self)._get_serialized_attributes_internal(
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/layer_serialization.py", line 98, in _get_serialized_attributes_internal
    functions = save_impl.wrap_layer_functions(self.obj, serialization_cache)
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/save_impl.py", line 149, in wrap_layer_functions
    original_fns = _replace_child_layer_functions(layer, serialization_cache)
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/save_impl.py", line 276, in _replace_child_layer_functions
    child_layer._trackable_saved_model_saver._get_serialized_attributes(
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/layer_serialization.py", line 88, in _get_serialized_attributes
    object_dict, function_dict = self._get_serialized_attributes_internal(
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/model_serialization.py", line 56, in _get_serialized_attributes_internal
    super(ModelSavedModelSaver, self)._get_serialized_attributes_internal(
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/layer_serialization.py", line 98, in _get_serialized_attributes_internal
    functions = save_impl.wrap_layer_functions(self.obj, serialization_cache)
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/save_impl.py", line 197, in wrap_layer_functions
    fn.get_concrete_function()
  File "/usr/lib/python3.8/contextlib.py", line 120, in __exit__
    next(self.gen)
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/save_impl.py", line 359, in tracing_scope
    fn.get_concrete_function(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/def_function.py", line 1239, in get_concrete_function
    concrete = self._get_concrete_function_garbage_collected(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/def_function.py", line 1230, in _get_concrete_function_garbage_collected
    concrete = self._stateful_fn._get_concrete_function_garbage_collected(  # pylint: disable=protected-access
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/function.py", line 2533, in _get_concrete_function_garbage_collected
    graph_function, _ = self._maybe_define_function(args, kwargs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/function.py", line 2711, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/function.py", line 2627, in _create_graph_function
    func_graph_module.func_graph_from_py_func(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/func_graph.py", line 1141, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/def_function.py", line 677, in wrapped_fn
    out = weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/save_impl.py", line 572, in wrapper
    ret = method(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/utils.py", line 168, in wrap_with_training_arg
    return control_flow_util.smart_cond(
  File "/usr/local/lib/python3.8/dist-packages/keras/utils/control_flow_util.py", line 105, in smart_cond
    return tf.__internal__.smart_cond.smart_cond(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/smart_cond.py", line 53, in smart_cond
    return true_fn()
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/utils.py", line 169, in <lambda>
    training, lambda: replace_training_and_call(True),
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/utils.py", line 166, in replace_training_and_call
    return wrapped_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/save_impl.py", line 652, in call
    return call_and_return_conditional_losses(inputs, *args, **kwargs)[0]
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/save_impl.py", line 610, in __call__
    return self.wrapped_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/save_impl.py", line 572, in wrapper
    ret = method(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/utils.py", line 168, in wrap_with_training_arg
    return control_flow_util.smart_cond(
  File "/usr/local/lib/python3.8/dist-packages/keras/utils/control_flow_util.py", line 105, in smart_cond
    return tf.__internal__.smart_cond.smart_cond(
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/utils.py", line 169, in <lambda>
    training, lambda: replace_training_and_call(True),
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/utils.py", line 166, in replace_training_and_call
    return wrapped_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/keras/saving/saved_model/save_impl.py", line 634, in call_and_return_conditional_losses
    call_output = layer_call(*args, **kwargs)
  File "/panoptic_segmentation/deeplab2/model/layers/axial_block_groups.py", line 431, in call
    pixel_space_drop_path_mask = drop_path.generate_drop_path_random_mask(
  File "/panoptic_segmentation/deeplab2/model/layers/drop_path.py", line 78, in generate_drop_path_random_mask
    random_tensor += tf.random.uniform(
TypeError: Failed to convert elements of (None, 1, 1) to Tensor. Consider casting elements to a supported type. See https://www.tensorflow.org/api_docs/python/tf/dtypes for supported TF dtypes.

I am using kmax_meta_r50_os32.textproto.
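
The last frames of the traceback hint at the mechanism. While building the SavedModel, Keras traces each layer's call with training=True (the replace_training_and_call(True) frames), so the drop-path branch runs even though the export targets inference, and tf.random.uniform is asked for a shape of (None, 1, 1). A plausible cause, assuming the mask shape is taken from the input's static shape, is that the batch dimension is still unknown at trace time. The sketch below reproduces that situation outside deeplab2. The helpers mask_from_static_shape and mask_from_dynamic_shape are hypothetical and are not the repository's code.

```python
import tensorflow as tf

KEEP_PROB = 0.8  # illustrative value, not taken from the config
SPEC = tf.TensorSpec([None, 16, 8], tf.float32)  # unknown batch size


def mask_from_static_shape(x, keep_prob):
    # x.shape[0] is None when the batch size is unknown at trace time,
    # so the requested shape becomes [None, 1, 1].
    shape = [x.shape[0]] + [1] * (x.shape.rank - 1)
    return tf.floor(keep_prob + tf.random.uniform(shape, dtype=x.dtype))


def mask_from_dynamic_shape(x, keep_prob):
    # tf.shape(x) is a tensor, so the batch size is resolved at run time.
    ones = tf.ones([x.shape.rank - 1], dtype=tf.int32)
    shape = tf.concat([tf.shape(x)[:1], ones], axis=0)
    return tf.floor(keep_prob + tf.random.uniform(shape, dtype=x.dtype))


@tf.function(input_signature=[SPEC])
def static_fn(x):
    return mask_from_static_shape(x, KEEP_PROB)


@tf.function(input_signature=[SPEC])
def dynamic_fn(x):
    return mask_from_dynamic_shape(x, KEEP_PROB)


try:
    static_fn.get_concrete_function()  # tracing hits the None dimension
    static_trace_ok = True
except (TypeError, ValueError):
    static_trace_ok = False

mask = dynamic_fn(tf.zeros([4, 16, 8]))  # one 0/1 entry per batch element
```

Under these assumptions the static variant fails during tracing, while the dynamic variant traces with an unknown batch size and yields a per-sample mask once a concrete batch is fed.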

Hi,

Thanks for asking. Could you please try setting drop-path here to 1.0 (don't worry, it will not affect the inference result) and see whether that resolves the issue?

Best,

Yes, changing drop-path to 1.0 fixes the export error.
Can I also set the drop-path to 1.0 during training/validation?

Were you able to get reasonable results from the export?
Were you bombarded with
"WARNING:absl:Importing a function (__inference_internal_grad_fn_355665) with ops with unsaved custom gradients. Will likely fail if a gradient is requested." messages?
I trained a kMaX model as well, and I keep getting hundreds of those messages. The inference results also do not look the same as the eval results during training.

Thanks for letting us know! Setting drop_path_keep_prob to 1.0 will not affect inference performance (e.g., during validation or in the exported model), so it is fine for your trained model. But I would suggest keeping the original setting for training. We will look into the issue that occurs when exporting models with drop_path_keep_prob < 1.0 and fix it soon.
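
To illustrate why the keep probability does not matter at inference time, here is a minimal NumPy sketch of the general drop-path (stochastic depth) technique. It is an assumption-based illustration, not deeplab2's implementation: the function drop_path and its arguments are hypothetical names, and the rescaling by 1 / keep_prob follows the common formulation.

```python
import numpy as np


def drop_path(x, keep_prob, training, rng):
    """Randomly zeroes whole samples of a residual branch during training."""
    if not training or keep_prob >= 1.0:
        # Inference (or keep_prob == 1.0): the branch passes through unchanged.
        return x
    # One Bernoulli(keep_prob) draw per sample, broadcast over other axes.
    mask_shape = (x.shape[0],) + (1,) * (x.ndim - 1)
    mask = np.floor(keep_prob + rng.uniform(size=mask_shape))
    # Rescale kept samples so the expected value matches the input.
    return x / keep_prob * mask


rng = np.random.default_rng(0)
x = np.ones((4, 2, 2))

inference_a = drop_path(x, keep_prob=0.8, training=False, rng=rng)
inference_b = drop_path(x, keep_prob=1.0, training=False, rng=rng)
training_out = drop_path(x, keep_prob=0.8, training=True, rng=rng)
```

In this sketch the two inference outputs are identical regardless of keep_prob, while the training output differs because some samples are dropped and the rest are rescaled.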

Were you able to get reasonable results from the export? Were you bombarded with "WARNING:absl:Importing a function (__inference_internal_grad_fn_355665) with ops with unsaved custom gradients. Will likely fail if a gradient is requested." messages? I trained a kMaX model as well, and I keep getting hundreds of those messages. The inference results also do not look the same as the eval results during training.

In our experiments, the exported model can produce the same results as the validation results. Please check if you are using the exact same input images. The warning you pasted looks fine to me, as it complains about "gradients" which are not needed for inference. Thanks.

Closing the issue, as there is no activity for a while.
We hope your issue has been resolved.
If not, please feel free to open a new one.

I am getting the same error even when drop_path is set to 1.0:
def __init__(self,
             name: str,
             auxiliary_predictor_func: Optional[Callable[[], tf.keras.Model]],
             norm_layer: Optional[Callable[
                 [],
                 tf.keras.layers.Layer]] = tf.keras.layers.BatchNormalization,
             num_blocks: Tuple[int, int, int] = (2, 2, 2),
             num_mask_slots: int = 128,
             transformer_decoder_drop_path_keep_prob: float = 1.0):

Am I making any other mistake here?
Thanks in advance.