frgfm/torch-scan

Receptive Field & Dilation

Closed this issue · 5 comments

This library is truly amazing!

In my search for a library to calculate the receptive field size, I was really excited to find this library.
(I especially appreciate the support for Conv1d.)

But I have two questions.

  1. It does not seem to support dilation.
  2. I would expect the receptive field to be smaller near the input layer and larger near the output layer, but the library shows the opposite.

P.S.

I've written a patch for both, but one problem remains: the receptive field size is not displayed properly when max_depth is reduced...

# Applied patch

import torch.nn as nn
from torchscan import summary


m = nn.Sequential(
    nn.Conv1d(1, 1, 3, dilation=1),
    nn.ReLU(),
    nn.Conv1d(1, 1, 3, dilation=2),
    nn.ReLU(),
    nn.Conv1d(1, 1, 3, dilation=3)
)

n = nn.Sequential(
    nn.Conv1d(1, 1, 3, dilation=1),
    nn.ReLU(),
    nn.Conv1d(1, 1, 3, dilation=2),
    nn.ReLU(),
    nn.Conv1d(1, 1, 3, dilation=3)
)

l = nn.Sequential(
    m, nn.ReLU(), n
)

summary(l, (1, 100), receptive_field=True, max_depth=2)
# max_depth=2
____________________________________________________________________________________________________________
Layer                        Type                  Output Shape              Param #         Receptive field
============================================================================================================
sequential                   Sequential            (-1, 1, 76)               0               1              
├─0                          Sequential            (-1, 1, 88)               0               13             
|    └─0                     Conv1d                (-1, 1, 98)               4               3              
|    └─1                     ReLU                  (-1, 1, 98)               0               3              
|    └─2                     Conv1d                (-1, 1, 94)               4               7              
|    └─3                     ReLU                  (-1, 1, 94)               0               7              
|    └─4                     Conv1d                (-1, 1, 88)               4               13             
├─1                          ReLU                  (-1, 1, 88)               0               13             
├─2                          Sequential            (-1, 1, 76)               0               25             
|    └─0                     Conv1d                (-1, 1, 86)               4               15             
|    └─1                     ReLU                  (-1, 1, 86)               0               15             
|    └─2                     Conv1d                (-1, 1, 82)               4               19             
|    └─3                     ReLU                  (-1, 1, 82)               0               19             
|    └─4                     Conv1d                (-1, 1, 76)               4               25             
============================================================================================================
# max_depth=1
____________________________________________________________________________________________________________
Layer                        Type                  Output Shape              Param #         Receptive field
============================================================================================================
sequential                   Sequential            (-1, 1, 76)               0               1              
├─0                          Sequential            (-1, 1, 88)               12              3              
├─1                          ReLU                  (-1, 1, 88)               0               13             
├─2                          Sequential            (-1, 1, 76)               12              15             
============================================================================================================
# max_depth=0
____________________________________________________________________________________________________________
Layer                        Type                  Output Shape              Param #         Receptive field
============================================================================================================
sequential                   Sequential            (-1, 1, 76)               24              13             
============================================================================================================
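For reference, the per-layer values in the max_depth=2 table can be reproduced by hand. The sketch below is not torchscan code. It assumes the usual forward recurrence (each layer adds (effective kernel - 1) times the accumulated stride) and treats dilation as enlarging the kernel to d * (k - 1) + 1. The helper name `cumulative_receptive_fields` is made up for illustration.

```python
def cumulative_receptive_fields(layers):
    """Input-relative receptive field after each layer.

    `layers` is a list of (kernel_size, stride, dilation) tuples;
    element-wise layers such as ReLU are written as (1, 1, 1).
    """
    rf, jump = 1, 1  # receptive field and accumulated stride so far
    fields = []
    for kernel, stride, dilation in layers:
        k_eff = dilation * (kernel - 1) + 1  # extent covered by the dilated kernel
        rf += (k_eff - 1) * jump
        jump *= stride
        fields.append(rf)
    return fields


relu = (1, 1, 1)
block = [(3, 1, 1), relu, (3, 1, 2), relu, (3, 1, 3)]  # same layout as m and n
model = block + [relu] + block

print(cumulative_receptive_fields(model))
# [3, 3, 7, 7, 13, 13, 15, 15, 19, 19, 25]
```

These match the leaf rows of the max_depth=2 output, so a collapsed Sequential row would be expected to show the value of its last child (13 and 25 here).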
frgfm commented

Hi @khirotaka,

Glad it was helpful and thanks for the issue!
Regarding dilation, you are correct. I actually was trying to come up with a more modularized way to handle this. If you do have a suggestion, I'm more than happy to discuss a PR 🙏

Regarding the receptive field, it's actually a design choice I made earlier:

  • the receptive field value the library outputs for each layer represents the receptive field relative to the output of the model, rather than its input.

This might be misleading but the idea was to be able to trace back the relative spatial dependency to a node in the output layer. However, I understand this can be confusing and perhaps it would be best to add an option to specify whether it is relative to the output of the model or the input (and to set the default in the order you suggested).
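To make the two conventions concrete, here is a small sketch that computes both for a toy stack. It is an illustration only, not the library's implementation: the function names and the three-layer example are hypothetical, and the recurrences are the standard ones for layers described by a kernel size and a stride.

```python
def rf_relative_to_input(layers):
    """Receptive field of each layer's output, measured on the model input."""
    rf, jump, fields = 1, 1, []
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
        fields.append(rf)
    return fields


def rf_relative_to_output(layers):
    """How many positions of each layer's input feed one unit of the model output."""
    rf, fields = 1, []
    for kernel, stride in reversed(layers):
        rf = stride * rf + (kernel - stride)
        fields.append(rf)
    return fields[::-1]  # back to input-to-output order


# Hypothetical stack: conv (k=3, s=1), pool (k=2, s=2), conv (k=3, s=1)
layers = [(3, 1), (2, 2), (3, 1)]

print(rf_relative_to_input(layers))   # [3, 4, 8]
print(rf_relative_to_output(layers))  # [8, 6, 3]
```

Under these assumptions, the input-relative values grow toward the output while the output-relative values shrink, and the two agree on the total (8) at opposite ends of the list.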

Regarding the reduction process, I implemented it assuming the receptive field ordering of my initial PR, see: https://github.com/frgfm/torch-scan/blob/master/torchscan/utils.py#L175-L177

So if we are to move forward with PRs, I would suggest splitting them up:

  • option to take the input or the output as the receptive field basis, and adapt the reduction accordingly
  • support of dilation in conv layers

I'll look into this, but let me know if you want to help / submit a PR 👍

frgfm commented

Hey @khirotaka,

Actually, I didn't manage to really fix the receptive field computation, as the values have to be computed in reverse (cf. https://distill.pub/2019/computing-receptive-fields/) for the whole model. If you have a draft that gets a value of 212 on torchvision.models.vgg16, I'd be happy to take a look at it.

I tried a few things but it was far from ideal (had to reverse some changes from #31 in #33)
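As a sanity check on the 212 target, the backward recurrence r <- stride * r + (kernel - stride) can be run by hand over the VGG16 convolutional stack. This is only a sketch, assuming the standard layout (3x3 convolutions with stride 1, 2x2 max-pools with stride 2) and covering the feature extractor only. It avoids importing torchvision and hard-codes the layer list, with `VGG16_CFG` and `receptive_field` being names chosen for this example.

```python
# Standard VGG16 layout: numbers are conv output channels (unused here,
# kept to mirror the usual configuration list); "M" marks a 2x2 max-pool.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def receptive_field(cfg):
    """Walk the layers from output to input, growing the receptive field."""
    r = 1  # start from a single unit of the final feature map
    for entry in reversed(cfg):
        kernel, stride = (2, 2) if entry == "M" else (3, 1)
        r = stride * r + (kernel - stride)
    return r


print(receptive_field(VGG16_CFG))  # 212
```

ReLU layers and padding are left out because they do not change the receptive field size.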

khirotaka commented

Hey! I'm going to share my idea for a patch. I hope this helps a little bit...

khirotaka@e0932c2

frgfm commented

Hi @khirotaka,

Thanks for the suggestion! I actually managed to handle this by reversing the recurrence equations in #34 👌
For the dilation part, I only folded it into the module's kernel size to avoid changing the interface. It's working like a charm!

Thanks again 🙏
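One way to read "integrated into the module kernel size" is the usual effective-kernel identity: a kernel of size k with dilation d spans d * (k - 1) + 1 input positions. The sketch below is an assumption about the idea, not the code from #34, and the helper names are hypothetical. It checks the identity against the output lengths of the six Conv1d layers from the example at the top of this issue (stride 1, no padding, input length 100).

```python
def effective_kernel_size(kernel_size, dilation=1):
    """Extent covered by a dilated kernel along one dimension."""
    return dilation * (kernel_size - 1) + 1


def conv1d_out_length(length, kernel_size, dilation=1):
    """Output length of a Conv1d with stride 1 and no padding."""
    return length - effective_kernel_size(kernel_size, dilation) + 1


length = 100
lengths = []
for dilation in (1, 2, 3, 1, 2, 3):  # dilations of the six Conv1d layers
    length = conv1d_out_length(length, 3, dilation)
    lengths.append(length)

print(lengths)  # [98, 94, 88, 86, 82, 76]
```

The lengths agree with the Output Shape column of the Conv1d rows in the max_depth=2 table.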

khirotaka commented

Thank you as well! I'm looking forward to the latest release!