Discrepancies vs Table A1 in paper
alexander-soare opened this issue · 3 comments
alexander-soare commented
I noticed some possible discrepancies between the architecture parameters here and those in Table A1 of the paper.
For ImageNet models, is it correct that:
- The table should say `h=[3,3,4]`?
- The order of the `scale_hidden_dims` in the table is inverted. That is, hierarchies 1, 2 and 3 should say [4d, 4h] × 2, 1; [2d, 2h] × 2, 4; [d, h] × k, 16?
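To make the ordering in question concrete, here is a minimal sketch that writes out the three rows exactly as listed above. It assumes each table entry reads as [embedding dim, heads] × layers, blocks. The `Hierarchy` tuple, the `proposed_table` helper, and the sample values of `d`, `h`, and `k` are hypothetical and only for illustration; they are not taken from the repository or the paper.

```python
from typing import List, NamedTuple


class Hierarchy(NamedTuple):
    """One row of the table, assuming the format [dim, heads] x layers, blocks."""
    embed_dim: int
    num_heads: int
    num_layers: int
    num_blocks: int


def proposed_table(d: int, h: int, k: int) -> List[Hierarchy]:
    """Rows for hierarchies 1, 2 and 3 as written in the question above."""
    return [
        Hierarchy(4 * d, 4 * h, 2, 1),   # hierarchy 1: [4d, 4h] x 2, 1
        Hierarchy(2 * d, 2 * h, 2, 4),   # hierarchy 2: [2d, 2h] x 2, 4
        Hierarchy(d, h, k, 16),          # hierarchy 3: [d, h] x k, 16
    ]


# Illustrative base values only (assumed, not from the paper).
for index, row in enumerate(proposed_table(d=96, h=3, k=8), start=1):
    print(f"hierarchy {index}: {row}")
```

Printing the rows side by side like this makes it easier to compare against whatever the config actually builds.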
zizhaozhang commented
Hi @alexander-soare, I think you are right. Thanks for spotting the details! We will correct this in the next version.
alexander-soare commented
@zizhaozhang thanks for confirming! I'm working on a PyTorch implementation
alexander-soare commented
@zizhaozhang FYI I just finished converting the weights and I can also confirm that the [4d, 4h] × k, 1 [2d, 2h] × 2, 4 [d, h] × 2, 16