Test fails to restore from pretrained model checkpoint linked in README
jpgard opened this issue · 0 comments
Thanks for the great repo, @taki0112.
I am trying to use the pretrained model located here and linked in the README. I'm able to instantiate the model with .build_model(), and I can see the expected output when I run show_all_variables(), but when I attempt to run .test() on the model, it fails to restore from the provided checkpoint.
It seems that there is a mismatch between the variables in the checkpoint and those in the model. It is possible that the hyperparameters or architecture I am using (all of the default values from train.py) do not match those with which the model was trained, or that variables were renamed between the model saved in the checkpoint and the model instantiated by the GitHub code. There is no information about the pretrained model itself, so I cannot tell which.
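One way to narrow down such a mismatch is to list the variable names stored in the checkpoint and compare them with the names in the instantiated graph. The following is a minimal sketch, not part of the repository: the helper names diff_variable_names and compare_graph_to_checkpoint are hypothetical, and it assumes TensorFlow 1.x, where tf.train.list_variables and tf.global_variables are available, and a Saver created with default arguments.

```python
def diff_variable_names(graph_names, checkpoint_names):
    """Return (missing_from_checkpoint, unused_in_graph) as sorted lists."""
    graph_set = set(graph_names)
    ckpt_set = set(checkpoint_names)
    return sorted(graph_set - ckpt_set), sorted(ckpt_set - graph_set)


def compare_graph_to_checkpoint(checkpoint_prefix):
    """Compare variables in the default graph with those in a checkpoint.

    Assumes TensorFlow 1.x. The import is local so that the pure-Python
    helper above can be used without TensorFlow installed.
    """
    import tensorflow as tf

    # tf.train.list_variables yields (name, shape) pairs stored in the checkpoint.
    checkpoint_names = [name for name, _shape in tf.train.list_variables(checkpoint_prefix)]
    # Variable.op.name is the variable name without the ":0" output suffix,
    # which is the form a default Saver uses for checkpoint keys.
    graph_names = [v.op.name for v in tf.global_variables()]
    return diff_variable_names(graph_names, checkpoint_names)
```

Calling compare_graph_to_checkpoint with the checkpoint prefix (the path ending in StyleGAN.model-224999) after .build_model() would return the graph variables that the checkpoint lacks and the checkpoint entries that the graph does not use. Names that differ only by a prefix or suffix would point to renaming; whole groups missing on one side would point to an architecture or hyperparameter difference.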
Here is the output of my model, printed by the StyleGAN class on instantiation:
##### Information #####
# dataset : FFHQ
# dataset number : 99
# gpu : 1
# batch_size in train phase : OrderedDict([(4, 128), (8, 128), (16, 128), (32, 64), (64, 32), (128, 16), (256, 8), (512, 4), (1024, 4)])
# batch_size in test phase : 1
# start resolution : 8
# target resolution : 1024
# iteration per resolution : 1200000
# progressive training : True
# spectral normalization : False
Partial stack trace is shown below -- thanks for any suggestions you could provide.
gan.test()
[*] Reading checkpoints...
INFO:tensorflow:Restoring parameters from /jpgard/StyleGAN-Tensorflow/checkpoint/StyleGAN_FFHQ_8to1024_progressive/StyleGAN.model-224999
---------------------------------------------------------------------------
NotFoundError Traceback (most recent call last)
~/stylegan-demo/stylegan-venv/lib64/python3.6/site-packages/tensorflow_core/python/client/session.py in _do_call(self, fn, *args)
1364 try:
-> 1365 return fn(*args)
1366 except errors.OpError as e:
~/stylegan-demo/stylegan-venv/lib64/python3.6/site-packages/tensorflow_core/python/client/session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
1349 return self._call_tf_sessionrun(options, feed_dict, fetch_list,
-> 1350 target_list, run_metadata)
1351
~/stylegan-demo/stylegan-venv/lib64/python3.6/site-packages/tensorflow_core/python/client/session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
1442 fetch_list, target_list,
-> 1443 run_metadata)
1444
NotFoundError: 2 root error(s) found.
(0) Not found: Key D/1024x1024/Conv0/bias not found in checkpoint
[[{{node save/RestoreV2}}]]
(1) Not found: Key D/1024x1024/Conv0/bias not found in checkpoint
[[{{node save/RestoreV2}}]]
[[save/RestoreV2/_3385]]
0 successful operations.
0 derived errors ignored.
...
During handling of the above exception, another exception occurred:
NotFoundError Traceback (most recent call last)
~/stylegan-demo/stylegan-venv/lib64/python3.6/site-packages/tensorflow_core/python/training/saver.py in restore(self, sess, save_path)
1299 try:
-> 1300 names_to_keys = object_graph_key_mapping(save_path)
1301 except errors.NotFoundError:
~/stylegan-demo/stylegan-venv/lib64/python3.6/site-packages/tensorflow_core/python/training/saver.py in object_graph_key_mapping(checkpoint_path)
1617 reader = pywrap_tensorflow.NewCheckpointReader(checkpoint_path)
-> 1618 object_graph_string = reader.get_tensor(trackable.OBJECT_GRAPH_PROTO_KEY)
1619 object_graph_proto = (trackable_object_graph_pb2.TrackableObjectGraph())
~/stylegan-demo/stylegan-venv/lib64/python3.6/site-packages/tensorflow_core/python/pywrap_tensorflow_internal.py in get_tensor(self, tensor_str)
914
--> 915 return CheckpointReader_GetTensor(self, compat.as_bytes(tensor_str))
916
NotFoundError: Key _CHECKPOINTABLE_OBJECT_GRAPH not found in checkpoint
During handling of the above exception, another exception occurred:
NotFoundError Traceback (most recent call last)
<ipython-input-16-822286d4027f> in <module>
1 # os.listdir(os.path.join(gan.checkpoint_dir, gan.model_dir))
----> 2 gan.test()
~/StyleGAN-Tensorflow/stylegan_tf/StyleGAN.py in test(self)
554
555 self.saver = tf.train.Saver()
--> 556 could_load, checkpoint_counter = self.load(self.checkpoint_dir)
557 result_dir = os.path.join(self.result_dir, self.model_dir)
558 check_folder(result_dir)
~/StyleGAN-Tensorflow/stylegan_tf/StyleGAN.py in load(self, checkpoint_dir)
542 if ckpt and ckpt.model_checkpoint_path:
543 ckpt_name = os.path.basename(ckpt.model_checkpoint_path)
--> 544 self.saver.restore(self.sess, os.path.join(checkpoint_dir, ckpt_name))
545 counter = int(ckpt_name.split('-')[-1])
546 print(" [*] Success to read {}".format(ckpt_name))
~/stylegan-demo/stylegan-venv/lib64/python3.6/site-packages/tensorflow_core/python/training/saver.py in restore(self, sess, save_path)
1304 # a helpful message (b/110263146)
1305 raise _wrap_restore_error_with_msg(
-> 1306 err, "a Variable name or other graph key that is missing")
1307
1308 # This is an object-based checkpoint. We'll print a warning and then do
NotFoundError: Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:
2 root error(s) found.
(0) Not found: Key D/1024x1024/Conv0/bias not found in checkpoint
[[node save/RestoreV2 (defined at /homes/gws/jpgard/stylegan-demo/stylegan-venv/lib64/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
(1) Not found: Key D/1024x1024/Conv0/bias not found in checkpoint
[[node save/RestoreV2 (defined at /homes/gws/jpgard/stylegan-demo/stylegan-venv/lib64/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
[[save/RestoreV2/_3385]]
0 successful operations.
0 derived errors ignored.
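Since the missing key is D/1024x1024/Conv0/bias, it may help to check which resolution blocks the checkpoint actually contains. Below is a rough sketch under the assumption that variable names follow the scope/<res>x<res>/... pattern seen in the error message. The function name resolutions_by_scope is hypothetical, and the naming pattern of the other variables is a guess.

```python
import re
from collections import defaultdict

# Matches a path component such as "1024x1024" inside a variable name.
_RES_PATTERN = re.compile(r"(?:^|/)(\d+)x(\d+)(?:/|$)")


def resolutions_by_scope(variable_names):
    """Map each top-level scope (e.g. "D") to the sorted resolutions found in it."""
    found = defaultdict(set)
    for name in variable_names:
        match = _RES_PATTERN.search(name)
        # Only count square resolutions such as 512x512.
        if match and match.group(1) == match.group(2):
            top_scope = name.split("/", 1)[0]
            found[top_scope].add(int(match.group(1)))
    return {scope: sorted(res) for scope, res in found.items()}
```

The input can be the names returned by tf.train.list_variables for the checkpoint. If the largest resolution reported for D is below 1024, the checkpoint was probably saved from a graph built with a smaller target resolution than the default used here, though this alone would not rule out renamed variables.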