ilivans/tf-rnn-attention

Can't see any graph/scalar in tensorboard

KaushikNathMIT opened this issue · 7 comments

Can't see any graph/scalar in tensorboard

@HudsonHuang can you have a look at this pls?

I can't reproduce this.
I suppose there might be two reasons:

  1. In this repo, log files are only generated once training has run.
    So first you have to run
    python train.py
    to train the model.
    Then you can run
    tensorboard --logdir=./logdir
    to look at the scalars.

  2. The "--logdir" parameter of tensorboard might be wrong.
    When you run:
    tensorboard --logdir=./logdir
    it sets the path of the log files to "./logdir", so you must run this command from the same folder as "train.py".
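
Both points come down to whether "./logdir", resolved from the folder where tensorboard is started, actually contains event files. A minimal sketch for checking that, assuming the summary writers use TensorFlow's usual events.out.tfevents.* file naming. The helper name find_event_files is made up for illustration and is not part of this repo:

```python
import os


def find_event_files(logdir):
    """Return sorted paths of TensorBoard event files found under logdir."""
    matches = []
    # os.walk yields nothing for a missing directory, so the result is then [].
    for root, _dirs, files in os.walk(logdir):
        for name in files:
            # Assumption: event files follow the events.out.tfevents.* pattern.
            if name.startswith("events.out.tfevents."):
                matches.append(os.path.join(root, name))
    return sorted(matches)


if __name__ == "__main__":
    # "./logdir" is resolved against the current working directory,
    # which is why the folder you run the command from matters.
    logdir = os.path.abspath("./logdir")
    print("Looking in:", logdir)
    for path in find_event_files(logdir):
        print("  ", path)
```

If nothing is printed below the "Looking in:" line, either training has not written any logs yet or the script was started from a different folder than train.py.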

I would be happy to help if you give me more information.

It ran perfectly for me this way:

Administrator MINGW64 ~/Desktop/temp/tf-rnn-attention (master)
$ python train.py
Start learning...
epoch: 0
C:\Users\Administrator\Anaconda3\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
Using TensorFlow backend.
100%|▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒| 97/97 [04:01<00:00, 2.49s/it]
100%|▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒| 97/97 [02:20<00:00, 1.45s/it]
loss: 0.456, val_loss: 0.443, acc: 0.702, val_acc: 0.796
100%|▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒| 97/97 [04:24<00:00, 2.72s/it]
100%|▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒| 97/97 [02:21<00:00, 1.46s/it]
loss: 0.306, val_loss: 0.349, acc: 0.843, val_acc: 0.846
100%|▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒| 97/97 [04:31<00:00, 2.80s/it]
100%|▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒| 97/97 [02:21<00:00, 1.46s/it]
loss: 0.259, val_loss: 0.319, acc: 0.888, val_acc: 0.862
Run 'tensorboard --logdir=./logdir' to checkout tensorboard logs.
2018-04-18 12:35:50.441207: I C:\tf_jenkins\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2

Administrator MINGW64 ~/Desktop/temp/tf-rnn-attention (master)
$ tensorboard --logdir=./logdir
c:\users\administrator\anaconda3\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
TensorBoard 1.6.0 at http://YYINC:6006 (Press CTRL+C to quit)

I followed the exact steps mentioned above, and the loss and accuracy values come out correct. But when I open tensorboard I can only see the projector tab. There is no scalar tab or any other tab.
I did a bit of debugging and found that the summary contains Metrics/loss and Metrics/accuracy, but somehow they aren't reflected in tensorboard.

  1. Provide us with your tensorflow & tensorboard versions.
  2. Ensure that the ./logdir/train folder contains a log file. If it does, reload the tensorboard page.
  3. What do you mean by 'There is no scalar tab'? All the tabs should be in a drop-down list next to the 'Projector' tab even if logdir is empty.
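
For the first item, a small sketch that prints both versions without importing TensorFlow itself. It assumes Python 3.8 or newer (for importlib.metadata) and that the packages are installed under the distribution names "tensorflow" and "tensorboard"; a GPU or nightly build may use a different name. The helper installed_version is hypothetical:

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumed distribution names; adjust them if your install differs.
    for name in ("tensorflow", "tensorboard"):
        print(name, installed_version(name))
```

On an older interpreter, printing tensorflow.__version__ from a Python session gives the TensorFlow version, and the tensorboard startup line also reports its own version.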

If I click on the scalar tab, no data is shown.
My tensorflow version is 1.7.

These are the events I am getting.

Processing event files... (this can take a few minutes)

Found event files in:
logdir/test
logdir/train

These tags are in logdir/test:
audio -
histograms
   Attention_layer/alphas_1
   Embedding_layer/embeddings_var
   Fully_connected_layer/W
   RNN_outputs
images -
scalars
   Metrics/accuracy
   Metrics/loss
   accuracy
   loss
tensor -

Event statistics for logdir/test:
audio -
graph
   first_step 0
   last_step 0
   max_step 0
   min_step 0
   num_steps 1
   outoforder_steps []
histograms
   first_step 0
   last_step 96
   max_step 290
   min_step 0
   num_steps 291
   outoforder_steps [(290L, 0L), (290L, 0L)]
images -
scalars
   first_step 0
   last_step 96
   max_step 290
   min_step 0
   num_steps 291
   outoforder_steps [(290L, 0L), (290L, 0L)]
sessionlog:checkpoint -
sessionlog:start -
sessionlog:stop -
tensor -

These tags are in logdir/train:
audio -
histograms
   Attention_layer/alphas_1
   Embedding_layer/embeddings_var
   Fully_connected_layer/W
   RNN_outputs
images -
scalars
   Metrics/accuracy
   Metrics/loss
   accuracy
   loss
tensor -

Event statistics for logdir/train:
audio -
graph
   first_step 0
   last_step 0
   max_step 0
   min_step 0
   num_steps 1
   outoforder_steps []
histograms
   first_step 0
   last_step 31
   max_step 290
   min_step 0
   num_steps 291
   outoforder_steps [(290L, 0L), (290L, 0L), (290L, 0L), (102L, 0L)]
images -
scalars
   first_step 0
   last_step 31
   max_step 290
   min_step 0
   num_steps 291
   outoforder_steps [(290L, 0L), (290L, 0L), (290L, 0L), (102L, 0L)]
sessionlog:checkpoint -
sessionlog:start -
sessionlog:stop -
tensor -
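
The outoforder_steps entries in these statistics, such as (290L, 0L), read as pairs of adjacent recorded steps where the second is lower than the first. A minimal sketch of that computation, under that assumed reading of the statistic; the function name and the example step sequence are hypothetical and not taken from the actual event files:

```python
def out_of_order_pairs(steps):
    """Return (previous, current) pairs where the step value decreases."""
    pairs = []
    # Compare every step with the one recorded immediately before it.
    for previous, current in zip(steps, steps[1:]):
        if current < previous:
            pairs.append((previous, current))
    return pairs


if __name__ == "__main__":
    # Hypothetical sequence: steps 0..290 written twice, back to back.
    steps = list(range(291)) + list(range(291))
    print(out_of_order_pairs(steps))  # prints [(290, 0)]
```

A drop from 290 back to 0 would be consistent with steps from more than one run ending up in the same directory, but nothing in this thread confirms that this is what happened here.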

So the expected tags are indeed in the directory. Both @HudsonHuang and I have used this code successfully with TF 1.7, so the problem is probably with how you use tensorboard. I don't think we can help here.
Have a look at the issue tensorflow/tensorboard#456, though I'm not sure it applies to your case if you followed the steps @HudsonHuang mentioned.
I would suggest asking the question on StackOverflow or in the TF repo with all the details provided.