why only mask the last channel?
Suyuanhang opened this issue · 1 comments
Suyuanhang commented
I noticed that in the create_first/other_mask functions in probclass.py, you only apply the causality mask to the last slice along the C or D channel axis. This differs from Algo 1 in the supplemental material, where causality is supposed to be enforced across all of C/D.
In other words, why not
mask[:, K // 2, K // 2:] instead of mask[-1, K // 2, K // 2:] ?
Thank you
fab-jul commented
I'm not sure I fully understand, but the idea is that we can use the information of all previous channels when encoding/decoding, since we encode/decode channel by channel. Only within a channel do we have to be careful, since we do a raster scan.
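To make the reply concrete, here is a minimal NumPy sketch of such a mask. It is not the code from probclass.py: the helper name make_causal_mask, the (D, K, K) layout with D = K // 2 + 1, and the mask_center flag are assumptions made for illustration. It assumes the last slice along the first axis is the channel currently being coded and that all earlier slices are channels that were already decoded.

```python
import numpy as np


def make_causal_mask(K, mask_center=True):
    """Hypothetical helper: 3D causal mask of shape (D, K, K), D = K // 2 + 1.

    Assumption: index -1 on the first axis is the current channel; all
    earlier slices are previous channels, which are fully known already.
    """
    D = K // 2 + 1
    mask = np.ones((D, K, K), dtype=np.float32)
    # Within the current channel, a raster scan has not yet visited the
    # centre pixel (if mask_center), the pixels to its right, or any row below.
    start = K // 2 if mask_center else K // 2 + 1
    mask[-1, K // 2, start:] = 0   # rest of the current row
    mask[-1, K // 2 + 1:, :] = 0   # all rows below the current row
    return mask


mask = make_causal_mask(3)
print(mask[0])   # previous channel: nothing is masked
print(mask[-1])  # current channel: only already-scanned pixels remain
```

Using mask[:, K // 2, K // 2:] = 0 instead would also hide pixels in the previous channels. Under the channel-by-channel assumption above, those pixels are already available to the decoder, so masking them would discard usable context.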