awslabs/mxboard

add_embedding() does not support a list of strings as labels

fhieber opened this issue · 7 comments

The documentation states that sw.add_embedding() can take a list of elements convertible to string as the labels argument.
However, the code prevents this in both add_embedding() and _make_metadata_tsv() by raising an error if labels is a Python list rather than an MXNet NDArray or NumPy array.
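
For reference, this is the kind of call the documentation describes; a minimal sketch with placeholder logdir, tag, and label values:

import mxnet as mx
from mxboard import SummaryWriter

embedding = mx.nd.random.normal(shape=(5, 16))    # 5 vectors of dimension 16
labels = ['cat', 'dog', 'bird', 'fish', 'horse']  # plain list of strings, as documented

with SummaryWriter(logdir='./logs') as sw:
    # before the fix this raises, because labels is a list rather than an NDArray/NumPy array
    sw.add_embedding(tag='animals', embedding=embedding, labels=labels)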

Thank you very much for providing valuable feedback. I have fixed the bug and corrected the doc in this PR.
#12

More unit tests were also added.

Both master branch and pip package have been updated.

Please let me know if it works and close the issue if possible.

thank you for this great package! This should work now.

@reminisce One more question regarding add_embedding(): Is this supposed to be used multiple times or only at the end of a training run? My idea was to log various embedding parameters (source embedding, target embedding, output embedding) at each checkpoint by using the checkpoint as global_step, something like this:

sw.add_embedding(tag="source", embedding=embedding, labels=self.source_labels, global_step=checkpoint)
sw.add_embedding(tag="target", embedding=embedding, labels=self.target_labels, global_step=checkpoint)
sw.add_embedding(tag="output", embedding=embedding, labels=self.target_labels, global_step=checkpoint)

This seems to work, but I observe the following logging message at every checkpoint:

[WARNING:root] embedding dir exists, did you set global_step for add_embedding()?
[WARNING:root] embedding dir exists, did you set global_step for add_embedding()?

This is presumably caused by the 2nd and 3rd calls to sw.add_embedding(): they use the same checkpoint, but different tags.
Would it make sense to include the tag in the save_path in sw.add_embedding() (line 414 of writer.py)?
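
To illustrate why the warning fires: the embedding directory seems to be derived from global_step alone, so calls with the same checkpoint but different tags all land in the same folder. A rough sketch of that collision (the path construction and zero padding here are assumptions, not mxboard's exact internals):

import os

logdir, checkpoint = './logs', 27
for tag in ('source', 'target', 'output'):
    save_path = os.path.join(logdir, str(checkpoint).zfill(5))  # tag not part of the path
    print(tag, '->', save_path)  # all three calls resolve to ./logs/00027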

It also seems that at some point embeddings are no longer loaded into TensorBoard if too many exist. For example, I can only see up to 9 checkpoints in the "PROJECTOR" tab for each of the embedding types, whereas there are 27 checkpoint subfolders in my logging directory.
Do you know if TensorBoard has a limit on these?

I manually tested using the concatenation of the tag name and global_step as the folder name for one set of embeddings, and it works. Thanks for the suggestion. I will make the change in the code accordingly.

Regarding the missing tabs, I tried 27 sets of embeddings and all of them display normally. Could you check the file projector_config.pbtxt under your logdir and count how many embedding entries there are? Each set of embeddings should have a config entry like the following. Please also note that the tensor_name in the config must be unique. In my case, if I change two sets of embeddings to use the same tensor_name, two tabs show up but one of them is not effective. Let me know. Thanks.

embeddings {
  tensor_name: "mnist000:00000"
  tensor_path: "00000/tensors.tsv"
  metadata_path: "00000/metadata.tsv"
  sprite {
    image_path: "00000/sprite.png"
    single_image_dim: 28
    single_image_dim: 28
  }
}
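
If it helps, a quick sketch to count the entries and check for duplicate tensor_name values (it assumes projector_config.pbtxt sits directly under your logdir; adjust the path as needed):

import re

with open('./logs/projector_config.pbtxt') as f:  # path is an assumption, use your own logdir
    config = f.read()

names = re.findall(r'tensor_name:\s*"([^"]*)"', config)
print('embedding entries:', len(names))
print('unique tensor_name values:', len(set(names)))  # fewer unique names means a clash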

@fhieber I have merged PR #15 to resolve the naming conflict of embedding data.
The rule is:
If global_step=None, the folder name is just tag.
Else, the folder name is tag + '_' + str(global_step).zfill(6).
See doc here: https://github.com/awslabs/mxboard/blob/master/python/mxboard/writer.py#L389
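
Restated as a small sketch (just the rule above expressed as code, not the library's actual implementation):

def embedding_folder_name(tag, global_step=None):
    # folder naming rule introduced in PR #15
    if global_step is None:
        return tag
    return tag + '_' + str(global_step).zfill(6)

print(embedding_folder_name('source'))      # 'source'
print(embedding_folder_name('source', 27))  # 'source_000027'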

You just need to keep either the tag, or the combination of tag and global_step, unique throughout training to prevent overwriting embedding data. Let me know if it serves your needs.

thanks again, works perfectly!