keras-team/tf-keras

Cloning a TextVectorization Layer with Split Function Doesn't Work

Closed · 6 comments

System information.
Please see the Google Colab notebook. The TensorFlow version is 2.15.0.

Describe the problem.
Cloning a TextVectorization layer constructed with a custom split function raises:
TypeError: Could not parse config: <function pipe_split_fn at 0x7d103d1f7640>

Describe the expected behavior.
It should successfully clone the TextVectorization layer.

Standalone code to reproduce the issue.
https://colab.research.google.com/drive/1tIS0Ynjck80kKGWv1tTQ6oCc8Nj1xnNa?usp=sharing

I originally reported the issue here.

Inspecting the code for TextVectorization (https://github.com/keras-team/keras/blob/master/keras/layers/preprocessing/text_vectorization.py#L491) and deserialize_keras_object (https://github.com/keras-team/keras/blob/master/keras/saving/serialization_lib.py#L392), I don't see any path by which the logic that deserializes the split function can run. That logic sits inside an if module_objects is not None: block, but TextVectorization.from_config() doesn't pass a module_objects argument to deserialize_keras_object, so the block never executes.
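To make the control flow described above concrete, here is a toy model in plain Python. It is not the Keras source: the deserialize function, its branches, and the body of pipe_split_fn are all hypothetical stand-ins, written only to mirror the behavior reported in this issue (a callable is handled only when module_objects is supplied).

```python
def deserialize(config, module_objects=None):
    """Toy stand-in for the deserializer's control flow (NOT the Keras code)."""
    if module_objects is not None:
        # Assumption: only this branch knows what to do with a value
        # that is already a callable.
        if callable(config):
            return config
    if config is None or isinstance(config, (str, int, float, bool)):
        return config
    if not isinstance(config, dict):
        # Same message shape as the TypeError reported in this issue.
        raise TypeError(f"Could not parse config: {config}")
    return config


def pipe_split_fn(text):
    # Hypothetical body; the real function lives in the Colab notebook.
    return text.split("|")


# A from_config-style call that omits module_objects fails.
error_message = None
try:
    deserialize(pipe_split_fn)
except TypeError as err:
    error_message = str(err)

# Supplying module_objects, even an empty list, enables the guarded branch.
restored = deserialize(pipe_split_fn, module_objects=[])
```

In this sketch the first call ends in the TypeError and the second returns the function unchanged, which matches the difference between the stock from_config() and the workaround that follows.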

As a workaround, I extended the tf.keras.layers.TextVectorization class with:

import tensorflow as tf

class PatchedTextVectorization(tf.keras.layers.TextVectorization):

  @classmethod
  def from_config(cls, config):
    if not isinstance(config["standardize"], str):
      config["standardize"] = tf.keras.saving.deserialize_keras_object(config["standardize"])
    if not isinstance(config["split"], str):
      # Pass module_objects so the deserializer enters the block that handles the split function.
      config["split"] = tf.keras.saving.deserialize_keras_object(config["split"], module_objects=[])

    return cls(**config)

Cloning an instance of PatchedTextVectorization constructed with the split function works fine. You can see I shoehorned module_objects = [] into its invocation of tf.keras.saving.deserialize_keras_object.

@sachinprasadhs,
I was able to reproduce the issue on TensorFlow v2.14 and v2.15. Kindly find the gist here.

@rlcauvin, from TensorFlow 2.16 onward, tf.keras will be backed by Keras 3. I see this working fine with Keras 3, so that should fix your issue. Is there a specific reason you're using tf.keras with 2.15?
You can also use TensorFlow 2.15 with Keras 3: install TensorFlow first, and then install Keras 3.


Thank you, @sachinprasadhs. Using !pip install -U keras to install Keras 3 worked for that isolated case. However, I want to use tensorflow_decision_forests. Unfortunately, import tensorflow_decision_forests as tfdf results in the error below. Please see the modified Google Colab notebook for the full code.

/usr/local/lib/python3.10/dist-packages/tensorflow_decision_forests/keras/core.py in <module>
     75   # tf>1.12
---> 76   import keras.src.engine.data_adapter as data_adapter
     77 except ImportError:

ModuleNotFoundError: No module named 'keras.src.engine'

During handling of the above exception, another exception occurred:

ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-3-b4486e63aff0> in <cell line: 1>()
----> 1 import tensorflow_decision_forests as tfdf
      2 import tensorflow as tf
      3 from typing import Text

/usr/local/lib/python3.10/dist-packages/tensorflow_decision_forests/__init__.py in <module>
     62 check_version.check_version(__version__, compatible_tf_versions)
     63 
---> 64 from tensorflow_decision_forests import keras
     65 from tensorflow_decision_forests.component import py_tree
     66 from tensorflow_decision_forests.component.builder import builder

/usr/local/lib/python3.10/dist-packages/tensorflow_decision_forests/keras/__init__.py in <module>
     51 from typing import Callable, List
     52 
---> 53 from tensorflow_decision_forests.keras import core
     54 from tensorflow_decision_forests.keras import wrappers
     55 

/usr/local/lib/python3.10/dist-packages/tensorflow_decision_forests/keras/core.py in <module>
     77 except ImportError:
     78   # tf<=1.12
---> 79   import keras.engine.data_adapter as data_adapter
     80 get_data_handler = data_adapter.get_data_handler
     81 

ModuleNotFoundError: No module named 'keras.engine'
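For anyone hitting the same import failure, a small diagnostic like the one below may help confirm which of the module paths from the traceback exist in the installed Keras before importing tensorflow_decision_forests. This is only a sketch using the standard library; module_available and first_available are hypothetical helper names, not part of Keras or TF-DF.

```python
import importlib.util


def module_available(name):
    """Return True if `name` can be located.

    Note: for a dotted name, find_spec imports the parent packages,
    so a missing parent surfaces as an ImportError here.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Covers ModuleNotFoundError for a missing parent package.
        return False


def first_available(names):
    """Return the first locatable module name, or None if none is found."""
    for name in names:
        if module_available(name):
            return name
    return None
```

Calling first_available(["keras.src.engine.data_adapter", "keras.engine.data_adapter"]) checks the two paths that core.py tries in the traceback above. Given that both imports failed there, I would expect it to return None in the same environment.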

Fixed in this commit! Thank you, closing this issue.