Questions about the details of the model

Question

Closed this issue 2 years ago · 6 comments

Dear authors,

I'm confused by some details of the model. Specifically, as I understand it, before the 2D conv in the STRF Extractor we need to extract the subbands with F.unfold, so that the input tensor to the 2D conv has shape [B, F, T, 1, 31, 15]. I also guess that after the batchnorm a fully connected layer is needed to project the tensor from [B, F, T, 31 * 15] to [B, F, T, D1].
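
A minimal sketch of the shapes described above, in case it helps make the question concrete. The batch size, number of frequency bins, number of frames, D1, and the zero padding are illustrative assumptions, and this is only the reading given in the question, not necessarily what the repository does.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sizes (assumptions); only the 31 x 15 patch comes from the question.
B, n_freq, T = 2, 40, 20
KF, KT, D1 = 31, 15, 8

spec = torch.randn(B, 1, n_freq, T)  # [B, 1, F, T]

# F.unfold extracts sliding KF x KT patches. Zero padding by half the kernel
# (an assumption) gives exactly one patch per (f, t) bin.
cols = F.unfold(spec, kernel_size=(KF, KT), padding=(KF // 2, KT // 2))
# cols: [B, 1 * KF * KT, F * T]

patches = cols.transpose(1, 2).reshape(B, n_freq, T, 1, KF, KT)
# patches: [B, F, T, 1, 31, 15]

# Fully connected projection from 31 * 15 values to D1 features per bin.
proj = nn.Linear(KF * KT, D1)
feats = proj(patches.flatten(3))  # [B, F, T, D1]
```

The centre of each patch is the original bin, so `patches[:, :, :, 0, KF // 2, KT // 2]` reproduces `spec[:, 0]`.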

I'm not sure whether I've understood this correctly. I sincerely look forward to your reply.

Best.

Answer 1 · 2023-01-06T02:41:30.000Z

Hi,

The 2D conv (self.conv = nn.Conv2d(1, D1, (31, 5))) takes the padded S x T input (padded with nn.functional.pad) directly.
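
A rough sketch of what this could look like. The kernel size is the one quoted in the reply; the tensor sizes, the value of D1, and the symmetric zero padding are assumptions made for illustration (the actual model may pad differently, for example only on the past side of the time axis).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sizes (assumptions); the (31, 5) kernel is taken from the reply.
B, S, T, D1 = 2, 40, 20, 8
KS, KT = 31, 5

conv = nn.Conv2d(1, D1, (KS, KT))
x = torch.randn(B, 1, S, T)  # [B, 1, S, T]

# F.pad lists the pads for the last dimension first: (left, right, top, bottom).
# Padding by half the kernel on each side keeps the S x T size after the conv.
x_pad = F.pad(x, (KT // 2, KT // 2, KS // 2, KS // 2))  # [B, 1, S + 30, T + 4]

y = conv(x_pad)  # [B, D1, S, T]
```

With this padding no explicit F.unfold step is needed: the convolution slides over the padded input and produces D1 features for every (s, t) position.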

Answer 2 · 2023-01-06T05:35:05.000Z

Got it, thanks for your reply.

Answer 3 · 2023-05-12T08:39:52.000Z

Hello author,
Can you help me with my doubts? Is the input size of the SubNet module [B*F, D_3, T] (D_3 is hidden_size)?

Answer 4 · 2023-05-12T08:58:56.000Z

I think so.

Answer 5 · 2023-05-12T09:03:49.000Z

OK, thanks. This basically resolved my confusion.

Answer 6 · 2023-05-12T10:18:20.000Z

Thanks, hopkin-ghp.
Yes, the input to the SubNet is [B x F, D_3, T], e.g., [32 x 481, 128, 401].
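
A small sketch of how such an input could be formed, assuming the features arrive as [B, F, D_3, T] before the batch and frequency axes are merged. That layout and the tiny sizes are assumptions for illustration; the real sizes would be e.g. B = 32, F = 481, D_3 = 128, T = 401.

```python
import torch

# Small stand-ins for e.g. B = 32, F = 481, D_3 = 128, T = 401.
B, n_freq, D3, T = 2, 5, 4, 6

# Assumed layout before the SubNet: one D_3 x T feature map per frequency bin.
feats = torch.randn(B, n_freq, D3, T)  # [B, F, D_3, T]

# Merge batch and frequency so each subband is processed as its own sequence.
subnet_in = feats.reshape(B * n_freq, D3, T)  # [B * F, D_3, T]

# The merge is lossless: row b * F + f holds batch item b, frequency bin f.
restored = subnet_in.reshape(B, n_freq, D3, T)
```

If the features were stored as [B, D_3, F, T] instead, a `permute(0, 2, 1, 3)` would be needed before the reshape so that the rows are not mixed.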