make_inspector() throws object of type 'NoneType' has no len() when I retrieve TF DF RF model layer in the hybrid model
Geerthy1130 opened this issue · 3 comments
I am developing a hybrid model that combines a CNN (time series data) with a TF-DF random forest model (tabular data) on WSL + Ubuntu 20.04. I also used Keras Tuner on the hybrid model to find the best hyperparameters.
import tensorflow_decision_forests as tfdf
import tensorflow as tf
import tf_keras
# Define fixed hyperparameters
fixed_hyperparameters = {
    'rf_num_trees': 100,
    'rf_max_depth': 6,
    'min_examples': 10,
}
This is the model:
# Build the combined model
cnn_input = tf.keras.Input(shape=(12, 301, 1))
rf_input = tf.keras.Input(shape=(60,))
cnn_output = tf.keras.layers.Conv2D(filters=16, kernel_size=(12, 125), activation='relu')(cnn_input)
cnn_output = tf.keras.layers.Conv2D(filters=32, kernel_size=(1, 40), activation='relu')(cnn_output)
cnn_output = tf.keras.layers.Flatten()(cnn_output)
rf_output = tfdf.keras.RandomForestModel(
    num_trees=fixed_hyperparameters['rf_num_trees'],
    max_depth=fixed_hyperparameters['rf_max_depth'],
    min_examples=fixed_hyperparameters['min_examples']
)(rf_input)
combined_output = tf.keras.layers.concatenate([cnn_output, rf_output])
fc_output = tf.keras.layers.Dense(32, activation='relu')(combined_output)
output = tf.keras.layers.Dense(1, activation='relu')(fc_output)
model = tf.keras.Model(inputs=[cnn_input, rf_input], outputs=output)
Here's my model summary :
Layer (type)                             Output Shape          Param #   Connected to
==================================================================================================
input_1 (InputLayer)                     [(None, 12, 301, 1)]  0         []
conv2d (Conv2D)                          (None, 1, 177, 16)    24016     ['input_1[0][0]']
conv2d_1 (Conv2D)                        (None, 1, 138, 32)    20512     ['conv2d[0][0]']
input_2 (InputLayer)                     [(None, 60)]          0         []
flatten (Flatten)                        (None, 4416)          0         ['conv2d_1[0][0]']
random_forest_model (RandomForestModel)  (None, 1)             1         ['input_2[0][0]']
concatenate (Concatenate)                (None, 4417)          0         ['flatten[0][0]', 'random_forest_model[0][0]']
dense (Dense)                            (None, 32)            141376    ['concatenate[0][0]']
dense_1 (Dense)                          (None, 1)             33        ['dense[0][0]']
==================================================================================================
Total params: 185938 (726.32 KB)
Trainable params: 185937 (726.32 KB)
Non-trainable params: 1 (1.00 Byte)
I would like to retrieve the feature importance from the RF model for the tabular data alone. When I try the following steps, I get an error.
# Calculate feature importance for the random forest model using the test data
rf_model_layer = model.layers[5] # Assuming the random forest model is at index 5 (the sixth layer)
inspector = rf_model_layer.make_inspector()
feature_importances = inspector.variable_importances(test_data=dataTestRF)
Error :
Traceback (most recent call last):
Cell In[6], line 1
inspector = rf_model_layer.make_inspector()
File ~/anaconda3/lib/python3.11/site-packages/tensorflow_decision_forests/keras/core_inference.py:411 in make_inspector
path = self.yggdrasil_model_path_tensor().numpy().decode("utf-8")
File ~/anaconda3/lib/python3.11/site-packages/tensorflow/python/util/traceback_utils.py:153 in error_handler
raise e.with_traceback(filtered_tb) from None
File /tmp/autograph_generated_fileriaxrjs2.py:38 in tf__yggdrasil_model_path_tensor
ag.if_stmt(ag__.ld(multitask_model_index) >= ag__.converted_call(ag__.ld(len), (ag__.ld(self)._models,), None, fscope), if_body, else_body, get_state, set_state, (), 0)
TypeError: in user code:
File "/home/hybrid/anaconda3/lib/python3.11/site-packages/tensorflow_decision_forests/keras/core_inference.py", line 436, in yggdrasil_model_path_tensor *
if multitask_model_index >= len(self._models):
TypeError: object of type 'NoneType' has no len()
Any suggestions would be appreciated.
Thank you.
Hi, thank you for reporting this.
The error message given by TF-DF is unfortunately not very clear, but I believe that the Random Forest model has not been trained. Note that Random Forests cannot be trained with backpropagation the way Neural Networks are trained. As a consequence, the Random Forest model must be trained separately. Our Model composition tutorial shows in detail how this is done with a few more lines of code.
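As a rough illustration of training the forest on its own before inspecting it, here is a minimal sketch. It is not the tutorial's code: the helper names `to_feature_dict` and `train_rf_and_inspect`, the `f0`, `f1`, ... feature names, the batch size, and the choice of a regression task are all assumptions made for this example. It needs TF-DF and a compatible TensorFlow installed to run the training part.

```python
import numpy as np


def to_feature_dict(features, prefix="f"):
    """Split an (n_examples, n_features) array into one named column per feature."""
    features = np.asarray(features, dtype=np.float32)
    return {f"{prefix}{i}": features[:, i] for i in range(features.shape[1])}


def train_rf_and_inspect(features, labels, num_trees=100, max_depth=6, min_examples=10):
    """Train a TF-DF random forest by itself, then read its variable importances."""
    # Imported lazily so that to_feature_dict can be used without TF-DF installed.
    import tensorflow as tf
    import tensorflow_decision_forests as tfdf

    dataset = tf.data.Dataset.from_tensor_slices(
        (to_feature_dict(features), np.asarray(labels, dtype=np.float32))
    ).batch(256)

    rf = tfdf.keras.RandomForestModel(
        task=tfdf.keras.Task.REGRESSION,  # assumption: continuous target (the issue uses an MSE loss)
        num_trees=num_trees,
        max_depth=max_depth,
        min_examples=min_examples,
        verbose=0,
    )
    rf.fit(dataset)  # the forest is fitted directly, not by backpropagation

    # make_inspector() is only usable once the forest has been trained.
    importances = rf.make_inspector().variable_importances()
    return rf, importances
```

With the arrays from the issue, a call would look like `rf, importances = train_rf_and_inspect(dataTrainRF, labelsTrain)`; `importances` is a dictionary keyed by importance name. How the trained forest is then wired into the neural network is covered by the model composition tutorial mentioned above.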
One more thing: variable importances in TF-DF are structural variable importances, i.e. based on the model structure alone (e.g. the number of times a variable is used as a root). They are not computed with a test dataset, so the test_data argument in inspector.variable_importances(test_data=dataTestRF) is not valid. Advanced variable importance computation is possible with YDF. In your case, you would have to save the TF-DF model, import it in YDF and run the analysis as explained in this tutorial. Let me know if you want to know more about this.
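To make "structural" concrete, here is a small pure-Python sketch that counts how often each feature appears as a root and how often it appears in any split. The nested-dict tree format, the feature names, and the function names are hypothetical and only serve the illustration; this is not how TF-DF stores its trees. The two counts correspond in spirit to the NUM_AS_ROOT and NUM_NODES importances, and no dataset is involved in computing them.

```python
from collections import Counter

# Toy forest (hypothetical format): a node is either a leaf (a float prediction)
# or a split {"feature": name, "left": node, "right": node}.
toy_forest = [
    {"feature": "age", "left": 1.0,
     "right": {"feature": "bmi", "left": 2.0, "right": 3.0}},
    {"feature": "age",
     "left": {"feature": "hr", "left": 0.5, "right": 1.5}, "right": 2.5},
    {"feature": "bmi", "left": 1.0,
     "right": {"feature": "age", "left": 2.0, "right": 4.0}},
]


def num_as_root(forest):
    """Count how many trees use each feature in their root split."""
    return Counter(tree["feature"] for tree in forest if isinstance(tree, dict))


def num_nodes(forest):
    """Count how many split nodes use each feature, over all trees."""
    counts = Counter()
    stack = list(forest)
    while stack:
        node = stack.pop()
        if isinstance(node, dict):  # leaves are plain floats and are skipped
            counts[node["feature"]] += 1
            stack.extend((node["left"], node["right"]))
    return counts


root_counts = num_as_root(toy_forest)   # age: 2, bmi: 1
node_counts = num_nodes(toy_forest)     # age: 3, bmi: 2, hr: 1
```

Both functions only walk the tree structure, which is why passing a test dataset to such a computation has no meaning.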
Thank you very much for your prompt response and suggestions.
I am trying to find the best hyperparameters for the combined model (which uses different input data for the CNN and the RF) using the Keras tuner. Here's the sample code:
class CombinedModel(HyperModel):
    def __init__(self, cnn_input_shape, rf_input_shape):
        self.cnn_input_shape = cnn_input_shape
        self.rf_input_shape = rf_input_shape
    def build(self, hp):
        # CNN part
        cnn_input = Input(shape=self.cnn_input_shape)
        RawInputECG = Input(shape=(12,301,1))
        cnn_output = tf.keras.layers.Conv2D(filters=16, kernel_size=(12, 125), activation='relu')(cnn_input)
        cnn_output = tf.keras.layers.Conv2D(filters=32, kernel_size=(1, 40), activation='relu')(cnn_output)
        cnn_output = tf.keras.layers.Flatten()(cnn_output) # here the parameters are fixed to test the model.
        # RF part
        rf_input = Input(shape=self.rf_input_shape) # not in the posted snippet, which used rf_input without defining it
        rf_output = tfdf.keras.RandomForestModel(
            num_trees=fixed_hyperparameters['rf_num_trees'],
            max_depth=fixed_hyperparameters['rf_max_depth'],
            min_examples=fixed_hyperparameters['min_examples']
        )(rf_input)
        # Combine CNN and RF outputs
        combined_layer = concatenate([cnn_output, rf_output])
        # Fully Connected layer
        fc_activation = hp.Choice('fc_activation', values=['relu', 'sigmoid'])
        fc_layer = Dense(32, activation=fc_activation)(combined_layer)
        # Output layer
        output_layer = Dense(1, activation='relu')(fc_layer)
        model = Model(inputs=[cnn_input, rf_input], outputs=output_layer)
        model.compile(optimizer=optimizer, loss='mse', metrics=[tf.keras.metrics.RootMeanSquaredError(), mae_error])
        return model
if __name__ == '__main__':
    cnn_input_shape = (12, 301, 1)
    rf_input_shape = (60,)
    combined_model = CombinedModel(cnn_input_shape, rf_input_shape)
    # loading the data for CNN and RF
    # loading the tuner
    tuner_bo = RandomSearch(
        combined_model,
        objective=keras_tuner.Objective("val_loss", direction="min"),
        max_trials=50,
        seed=16,
        executions_per_trial=1,
        overwrite=False,
        project_name="Hybrid_model")
    es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=20)
    mc = ModelCheckpoint(saveFN, monitor='val_loss', mode='min', verbose=1, save_best_only=True)
    tuner_bo.search([dataTrain,dataTrainRF], labelsTrain,
                    validation_data = ([dataVal, dataValRF], labelsVal))
I understand that in the above code the RF model is not trained; only the CNN part is trained. Can I combine the Keras tuner and the TF-DF tuner to find the best hyperparameters for the combined model, rather than training the two parts separately? Also, the model composition tutorial stacks neural networks and RF models together. Is it possible to run the tuner search on the stacked model to find the best hyperparameters for the CNN and the RF together?
Does the YDF library run both on Windows and WSL + Ubuntu?
Any suggestions would be appreciated.
Thank you very much
I don't think the Keras tuner or the TF-DF tuner support tuning CNN and RF at the same time - @achoum can you please confirm?
YDF runs on Windows and Ubuntu. I haven't tried it on WSL, but I do not expect any issues; please report them if you find any. We provide pip packages for Windows, Linux and Mac (Arm CPU).