Why does ww_t have nb_reads components?
markpwoodward opened this issue · 4 comments
Thank you @tristandeleu for this library; it has helped me better understand the paper. I am implementing a TensorFlow version of the ntm-lrua model, and I have a question about your implementation.
Why do `W_add`, `b_add`, `a_t`, `sigma_t`, and `ww_t` all have `nb_reads` elements? The paper seems to have only one "write head", as I gathered from the text and Figure 7.
The paper does explicitly talk about `wlu_tm1` containing `nb_reads` 1's, which would mean we write the single `a_t` identically to `nb_reads` locations. That doesn't seem to make sense.
Any thoughts would be greatly appreciated. Thank you!
Hey Mark, I'm glad the code helped you! Indeed I may have taken some liberties from the paper:
- Figure 7 indeed hints at having only one single write head. If I remember correctly, I followed Equation 7, which suggests that every read head `wr_tm1` has a corresponding `ww_t`. Similarly, I considered that `wlu_tm1` was not a single vector with `nb_heads` 1's but `nb_heads` one-hot vectors, to ensure that `ww_t` remains a proper distribution (sums to 1).
- This also explains why `sigma_t` has `nb_reads` elements as well; one for each write head.
- The paper doesn't explicitly separate `k_t` and `a_t`. I did that to match the NTM paper, which separates the key used to query the memory from what is added to the memory. Again, these have `nb_reads` elements as well, one for each read/write head.
Hope it makes more sense!
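For what it's worth, the per-head interpolation described above can be sketched in NumPy. This is a minimal illustration, not the actual code from this repo: the `write_weights` helper, its signature, and the toy shapes are my own assumptions, following the convention that `wlu_tm1` holds one one-hot vector per read head so each row of `ww_t` sums to 1.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def write_weights(wr_tm1, wlu_tm1, sigma_t):
    """Per-head LRUA write weights (hypothetical sketch).

    Each of the nb_reads heads interpolates between its previous
    read weights and its own one-hot least-used weights, so every
    row of ww_t remains a proper distribution.

    wr_tm1:  (nb_reads, memory_size) previous read weights
    wlu_tm1: (nb_reads, memory_size) one-hot least-used weights
    sigma_t: (nb_reads, 1)           interpolation gate (pre-sigmoid)
    """
    gate = sigmoid(sigma_t)                      # (nb_reads, 1), broadcast over slots
    return gate * wr_tm1 + (1.0 - gate) * wlu_tm1

# Toy example: 2 read heads, a memory with 5 slots.
wr_tm1 = np.array([[0.7, 0.3, 0.0, 0.0, 0.0],
                   [0.0, 0.5, 0.5, 0.0, 0.0]])
wlu_tm1 = np.array([[0.0, 0.0, 0.0, 1.0, 0.0],   # one one-hot vector per head,
                    [0.0, 0.0, 0.0, 0.0, 1.0]])  # not a single vector with nb_reads 1's
sigma_t = np.zeros((2, 1))                       # sigmoid(0) = 0.5 for both heads

ww_t = write_weights(wr_tm1, wlu_tm1, sigma_t)
print(ww_t.sum(axis=1))  # each head's write weights sum to 1
```

With `nb_reads` one-hot rows in `wlu_tm1`, each head writes to its own least-used slot rather than all heads writing the same `a_t` to the same locations.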
Hi Tristan, Thanks again. That all makes sense. Your choices seem like the best fit to the paper to me.
Hi @tristandeleu. I'm trying to understand the work of one shot learning by Google and found your project to be a nice learning material. But I'm a little confused by the usage of omniglot.py and test_model.py.
It seems the controller of the MANN first has to be trained the hard way. Does the training in omniglot.py correspond to this pre-training stage?
I don't understand how test_batch_size() and test_shape() relate to the description in the paper. Would you please give some instructions on that, or add more comments in the code?
Thanks in advance!
Hi @markpwoodward, can you execute this program? I met two errors (see my issue). Could you offer me some help? Thank you so much. 😭