keras-team/keras

Potential bug in legacy h5 weights loading.

torzdf opened this issue · 2 comments

This may be working as intended, in which case feel free to close this issue.

There is a difference in top_level_model_weights loading between the functions load_weights_from_hdf5_group_by_name and load_weights_from_hdf5_group_by

Specifically, the load_weights_from_hdf5_group_by_name collects symbolic weights using the following code:

symbolic_weights = model.trainable_weights + model.non_trainable_weights

symbolic_weights = model.trainable_weights + model.non_trainable_weights

In my usecase this returns 200 weights

Meanwhile, load_weights_from_hdf5_group collects symbolic weights using the following code:

        symbolic_weights = list(
            # model.weights
            v
            for v in model._trainable_variables + model._non_trainable_variables
            if v in model.weights
        )

symbolic_weights = list(

In my usecase this returns 0 weights (in fact model._trainable_variables + model._non_trainable_variables returns 0 weights even without filtering for variables within model.weights

I would expect both of these loading methodologies to return the same number of symbolic weights, but this is clearly not the case

I suspect that by_name is incorrect whilst the alternative is correct, but cannot be sure.

Hi @torzdf ,

Thanks for reporting. It would be appreciated if you can submit a code snippet for reproducing the error.

Hi @torzdf ,

Thanks for reporting. It would be appreciated if you can submit a code snippet for reproducing the error.

Not straightforward, as I discovered this by injecting print statements at the above linked locations.

Either way, I have demonstrated where the code diverges, The difference is clear to see from the different variables that are used to collect the symbolic weights.

This was discovered by loading a model from Keras 2 into Keras 3 which adds further challenges providing reproducible code. I do not know if this issue exists for legacy models generated in Keras 3 as I resolved the issue my end by not using the by_name variant.

However, I raised this issue as I thought it may be a bug. so thought you may wish to be aware. It also may not be a bug.

At this point, what you do with this information is up to you.