secsilm/understaing-datasets-estimators-tfrecords

How do I use an Estimator model after it has been trained? Could you provide a demo for reference? Thanks

WenmuZhou opened this issue · 11 comments

How do I use an Estimator model after it has been trained? Could you provide a demo for reference? Thanks.

@WenmuZhou I wrote a simple demo and also updated cifar10_estimator_dataset.py. The instructions are in README.md, where you can read them.

After training, I ran this script to predict and got the following error:

Traceback (most recent call last):
  File "/data/zj/understaing-datasets-estimators-tfrecords/cifar10_estimator_dataset_predict.py", line 58, in <module>
    tf.app.run(main=infer)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/data/zj/understaing-datasets-estimators-tfrecords/cifar10_estimator_dataset_predict.py", line 32, in infer
    for r in result:
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 409, in predict
    input_fn, model_fn_lib.ModeKeys.PREDICT)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 567, in _get_features_from_input_fn
    result = self._call_input_fn(input_fn, mode)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 663, in _call_input_fn
    return input_fn(**kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/inputs/numpy_io.py", line 99, in input_fn
    raise TypeError('x must be dict; got {}'.format(type(x).__name__))
TypeError: x must be dict; got ndarray
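The last frame of the traceback shows that, in this TensorFlow version, the input function created by numpy_io (tf.estimator.inputs.numpy_input_fn) rejects a bare ndarray and expects `x` to be a dict of features. A minimal sketch of the wrapping step follows. The helper `as_feature_dict` and the feature name `"x"` are hypothetical; the key has to match whatever the model_fn reads from `features`, and a plain nested list stands in for the image array so the snippet runs without TensorFlow.

```python
def as_feature_dict(x, feature_name="x"):
    """Return x unchanged if it is already a dict, else wrap it under feature_name.

    feature_name is an assumption: it must match the key the model_fn
    looks up in `features`.
    """
    if isinstance(x, dict):
        return x
    return {feature_name: x}


# Stand-in for the image batch; in the real script this would be an ndarray.
images = [[0.0, 0.1], [0.2, 0.3]]

features = as_feature_dict(images)
print(type(features).__name__, list(features))

# The wrapped dict, not the bare array, is what would be passed as `x`
# to tf.estimator.inputs.numpy_input_fn in the prediction script.
```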

Using the predict_input_fn function below as the input worked, but it seems that the prediction is not executed unless I print(result). Why is that?

@WenmuZhou Which TensorFlow version are you using? And how did you determine that the prediction does not run without printing?

@secsilm I added some timing code:

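import time

start = time.time()
result = estimator.predict(input_fn=predict_input_fn)  # returns a generator
for r in result:  # iterating over the generator is what runs the predictions
    print(str(r))
print((time.time() - start) / 10)  # elapsed seconds divided by 10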

Log when the results are not printed:

INFO:tensorflow:Using config: {'_model_dir': 'models/cifar10_cnn_model', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': gpu_options {
  per_process_gpu_memory_fraction: 0.4
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f960c88f940>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
1.7.0
1.6689300537109374e-07
<generator object Estimator.predict at 0x7f960c90b9e8>

Log when the results are printed:

INFO:tensorflow:Using config: {'_model_dir': 'models/cifar10_cnn_model', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': gpu_options {
  per_process_gpu_memory_fraction: 0.4
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fce5aa21898>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
1.7.0
INFO:tensorflow:Calling model_fn.
WARNING:tensorflow:From /root/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
2018-05-02 05:50:05.191205: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-05-02 05:50:07.243276: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:89:00.0
totalMemory: 10.92GiB freeMemory: 10.75GiB
2018-05-02 05:50:07.243345: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-05-02 05:50:08.061795: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-05-02 05:50:08.061833: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917]      0
2018-05-02 05:50:08.061843: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0:   N
2018-05-02 05:50:08.062271: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4471 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:89:00.0, compute capability: 6.1)
INFO:tensorflow:Restoring parameters from models/cifar10_cnn_model/model.ckpt-7813
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
{'classes': 7, 'probabilities': array([0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 1.0698529e-28,
       3.5165885e-24, 0.0000000e+00, 0.0000000e+00, 1.0000000e+00,
       0.0000000e+00, 0.0000000e+00], dtype=float32)}
{'classes': 4, 'probabilities': array([3.8984444e-02, 1.7290542e-02, 7.2144612e-05, 2.1279871e-02,
       8.7282175e-01, 7.9489204e-05, 3.9473740e-05, 8.1507285e-05,
       3.8754847e-02, 1.0595908e-02], dtype=float32)}
{'classes': 8, 'probabilities': array([0.030588  , 0.00331565, 0.01806127, 0.16966306, 0.11403419,
       0.00940617, 0.00433081, 0.00933796, 0.6280342 , 0.01322871],
      dtype=float32)}
{'classes': 8, 'probabilities': array([1.0796245e-04, 4.3053915e-06, 1.0054865e-05, 2.1660354e-04,
       4.5051434e-05, 5.7216744e-06, 2.4732158e-06, 1.6087660e-06,
       9.9959379e-01, 1.2398613e-05], dtype=float32)}
{'classes': 0, 'probabilities': array([9.9999940e-01, 1.6012014e-26, 2.7089411e-09, 2.6060352e-11,
       4.2905840e-10, 1.0565774e-21, 1.1225464e-12, 5.6888535e-07,
       1.5357675e-16, 1.2642179e-24], dtype=float32)}
{'classes': 8, 'probabilities': array([3.4681739e-07, 9.7503056e-22, 2.3702473e-06, 1.7765029e-05,
       1.2646311e-04, 8.8681292e-17, 4.1720746e-08, 2.4228942e-05,
       9.9982870e-01, 4.1513161e-20], dtype=float32)}
{'classes': 7, 'probabilities': array([7.0063214e-07, 8.9882218e-30, 1.2738653e-11, 6.1955408e-04,
       8.7317900e-04, 1.0365757e-15, 9.8804117e-07, 9.9850559e-01,
       7.2258430e-09, 1.0738620e-24], dtype=float32)}
{'classes': 2, 'probabilities': array([6.2257884e-04, 4.8029875e-07, 9.2451119e-01, 5.4313014e-03,
       5.4636996e-02, 2.0337410e-04, 1.4482945e-02, 9.6699063e-05,
       1.3326620e-05, 1.0561298e-06], dtype=float32)}
{'classes': 7, 'probabilities': array([4.5048432e-31, 0.0000000e+00, 6.6054202e-36, 4.7261850e-25,
       3.0333325e-21, 0.0000000e+00, 1.3007781e-37, 1.0000000e+00,
       0.0000000e+00, 0.0000000e+00], dtype=float32)}
{'classes': 3, 'probabilities': array([1.4005458e-03, 1.2714723e-05, 4.3188229e-06, 9.0923738e-01,
       8.6951755e-02, 1.8374063e-05, 6.0670868e-07, 1.4955220e-04,
       2.1944416e-03, 3.0232490e-05], dtype=float32)}
0.8427124261856079
<generator object Estimator.predict at 0x7fce5aa8b938>

When the results are not printed, it looks as if the computation graph is never executed.

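@WenmuZhou As you can see, result is a generator. A generator does not execute anything by default; it runs only when its __next__ method is called. If you want a list rather than a generator here, you can use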

result = list(result)
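To make the deferred execution visible without TensorFlow, here is a small sketch. `fake_predict` is a hypothetical stand-in for estimator.predict, not the real API; it only shares the property that matters here, namely that it is a generator function. The `events` list records when the generator body actually starts running.

```python
events = []


def fake_predict(samples):
    """Hypothetical stand-in for estimator.predict: a generator function."""
    # In the real case, this is roughly where the graph would be built
    # and the checkpoint restored.
    events.append("started")
    for s in samples:
        yield {"classes": s}


result = fake_predict([7, 4, 8])

# Creating the generator has not run any of its body yet.
started_before = list(events)

# Consuming the generator (here via list) is what triggers execution.
predictions = list(result)

print(started_before)
print(events)
print(predictions)
```

This matches the two logs above: timing only the call that creates the generator measures almost nothing, while timing the loop or the `list(...)` call measures the actual prediction work.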

So it seems that result is part of the computation graph.
After I changed the main function to this:
```python
result = infer()
start = time.time()
print(list(result))
print(time.time() - start)
```

The log became this:
```sh
INFO:tensorflow:Using config: {'_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fed40619a90>, '_num_ps_replicas': 0, '_master': '', '_keep_checkpoint_every_n_hours': 10000, '_tf_random_seed': None, '_is_chief': True, '_task_id': 0, '_task_type': 'worker', '_model_dir': 'models/cifar10_cnn_model', '_session_config': gpu_options {
  per_process_gpu_memory_fraction: 0.4
}
, '_keep_checkpoint_max': 5, '_service': None, '_save_summary_steps': 100, '_save_checkpoints_secs': 600, '_num_worker_replicas': 1, '_log_step_count_steps': 100, '_save_checkpoints_steps': None}
1.4.0
2.6226043701171877e-07
WARNING:tensorflow:Input graph does not contain a QueueRunner. That means predict yields forever. This is probably a mistake.
2018-05-04 10:52:09.255450: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-05-04 10:52:11.408233: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties: 
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:89:00.0
totalMemory: 10.92GiB freeMemory: 10.74GiB
2018-05-04 10:52:11.408301: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:89:00.0, compute capability: 6.1)
INFO:tensorflow:Restoring parameters from models/cifar10_cnn_model/model.ckpt-7813
[{'classes': 7, 'probabilities': array([0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 1.0698529e-28,
       3.5165885e-24, 0.0000000e+00, 0.0000000e+00, 1.0000000e+00,
       0.0000000e+00, 0.0000000e+00], dtype=float32)}, {'classes': 4, 'probabilities': array([3.8984444e-02, 1.7290542e-02, 7.2144612e-05, 2.1279871e-02,
       8.7282175e-01, 7.9489204e-05, 3.9473740e-05, 8.1507285e-05,
       3.8754847e-02, 1.0595908e-02], dtype=float32)}, {'classes': 8, 'probabilities': array([0.030588  , 0.00331565, 0.01806127, 0.16966306, 0.11403419,
       0.00940617, 0.00433081, 0.00933796, 0.6280342 , 0.01322871],
      dtype=float32)}, {'classes': 8, 'probabilities': array([1.0796245e-04, 4.3053915e-06, 1.0054865e-05, 2.1660354e-04,
       4.5051434e-05, 5.7216744e-06, 2.4732158e-06, 1.6087660e-06,
       9.9959379e-01, 1.2398613e-05], dtype=float32)}, {'classes': 0, 'probabilities': array([9.9999940e-01, 1.6012014e-26, 2.7089411e-09, 2.6060352e-11,
       4.2905840e-10, 1.0565774e-21, 1.1225464e-12, 5.6888535e-07,
       1.5357675e-16, 1.2642179e-24], dtype=float32)}, {'classes': 8, 'probabilities': array([3.4681739e-07, 9.7503056e-22, 2.3702473e-06, 1.7765029e-05,
       1.2646311e-04, 8.8681292e-17, 4.1720746e-08, 2.4228942e-05,
       9.9982870e-01, 4.1513161e-20], dtype=float32)}, {'classes': 7, 'probabilities': array([7.0063214e-07, 8.9882218e-30, 1.2738653e-11, 6.1955408e-04,
       8.7317900e-04, 1.0365757e-15, 9.8804117e-07, 9.9850559e-01,
       7.2258430e-09, 1.0738620e-24], dtype=float32)}, {'classes': 2, 'probabilities': array([6.2257884e-04, 4.8029875e-07, 9.2451119e-01, 5.4313014e-03,
       5.4636996e-02, 2.0337410e-04, 1.4482945e-02, 9.6699063e-05,
       1.3326620e-05, 1.0561298e-06], dtype=float32)}, {'classes': 7, 'probabilities': array([4.5048432e-31, 0.0000000e+00, 6.6054202e-36, 4.7261850e-25,
       3.0333325e-21, 0.0000000e+00, 1.3007781e-37, 1.0000000e+00,
       0.0000000e+00, 0.0000000e+00], dtype=float32)}, {'classes': 3, 'probabilities': array([1.4005458e-03, 1.2714723e-05, 4.3188229e-06, 9.0923738e-01,
       8.6951755e-02, 1.8374063e-05, 6.0670868e-07, 1.4955220e-04,
       2.1944416e-03, 3.0232490e-05], dtype=float32)}]
15.005367040634155
```

In other words, the computation runs only when result is consumed.

@WenmuZhou Strictly speaking, this is the generator at work: its body executes only when the __next__ method is called.
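As a rough illustration of the `__next__` protocol itself, the sketch below steps through a generator by hand. `fake_predictions` is a hypothetical stand-in for the object returned by predict. A `for` loop or `list(...)` does the same thing implicitly: it calls `__next__` repeatedly until StopIteration is raised.

```python
def fake_predictions(n):
    """Hypothetical generator standing in for the object returned by predict."""
    for i in range(n):
        yield {"classes": i}


gen = fake_predictions(2)

first = gen.__next__()  # explicit call; equivalent to next(gen)
second = next(gen)      # the usual spelling

# Once the generator is exhausted, a further call raises StopIteration,
# which is the signal a for loop uses to stop.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, exhausted)
```

Calling `next` only once is also a cheap way to fetch a single prediction without consuming the rest.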

So result is still embedded inside the computation graph.

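It looks like this problem has been solved, so I am going to close this issue. Feel free to open another one if you have more questions.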