Note: a LaTeX browser extension (such as MathJax for GitHub) may be needed to display the following text correctly. For example, pi=$\pi$ should show "pi=" followed by the symbol for pi.
The model
The feature-weighted receptive field (fwRF) is a new approach to building voxelwise encoding models for visual brain areas. The results of this study suggest that the fwRF modeling approach can be used to achieve the performance goals of expressiveness, scalability, interpretability and compatibility laid out in detail in the paper. The key design principle of the fwRF modeling approach is space-feature separability, which endows the model with an explicit receptive-field-like component that facilitates interpretation, and makes it possible to scale the number of feature maps in the model without incurring a per-pixel increase in model parameters. We find that when this approach is applied to a deep neural network with thousands of feature maps, the resulting encoding model achieves better prediction accuracy than comparable encoding models for most voxels in the visual system.
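As a rough illustration of the scaling argument, suppose the model uses $K$ feature maps, each sampled at $D \times D$ pixels ($K$ and $D$ are symbols introduced here for illustration only). A model that assigned a separate weight to every pixel of every feature map would need

$$N_\text{per-pixel} = K D^2$$

parameters per voxel, whereas a space-feature separable model with one weight per feature map and a single shared Gaussian pooling field ($\mu_x$, $\mu_y$, $\sigma_\text{g}$) needs only

$$N_\text{fwRF} = K + 3$$

not counting any bias term or nonlinearity parameters. For example, with $K = 1000$ and $D = 32$, this is $1{,}024{,}000$ versus $1{,}003$ parameters.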
Figure 1: The fwRF model.
(A) A schematic illustration of a fwRF model for a single voxel (grey box on brain, top right). The fwRF predicts the brain activity measured in the voxel, $r$, in response to any visual stimulus, $S$ (bottom left). The stimulus is transformed into one or more feature maps (three feature maps, $\Phi_k$, $\Phi_l$, and $\Phi_m$, are shown in blue with pink borders). The choice of feature maps is entirely up to the user, and reflects her hypotheses about the visual features that are relevant to the brain regions of interest. The resolution of the feature maps ($\Delta$, indicated by pink grids) can vary, although each feature map spans the same degrees of visual angle as the stimulus $S$. Each feature map is filtered by a 2D Gaussian feature pooling field, $g$, that is sampled from a grid of candidate feature pooling fields (grey box at top left; candidate feature pooling field centers ($\mu_x,\mu_y$) are illustrated by the grid of black points, while candidate feature pooling field radii ($\sigma_\text{g}$) are illustrated by dashed circles). The feature pooling field radius and location are the same for each feature map. The output of the feature pooling operation for each feature map (illustrated as black dots in the center of the dashed feature pooling fields on each feature map) is then weighted by a feature weight (black curves labeled $w_k$, $w_l$, $w_m$). These weighted outputs are summed to produce a prediction of the activity $r$. In the text we describe an algorithm for selecting the optimal feature pooling field and feature weights for each voxel. (B) Gabor wavelet feature maps are constructed by convolving the input images with complex Gabor wavelets followed by a compressive nonlinearity (see text for details). (C) Deepnet feature maps are extracted from the layers (labeled $K_i$) of a deep convolutional network pre-trained to label images according to object category.
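The forward pass described in panel (A) can be sketched in a few lines of NumPy. This is a minimal illustration, not the code from this repository: the function names `gaussian_pooling_field` and `fwrf_predict`, the `extent` parameter, the unit-sum normalization of the pooling field, and the absence of a bias term or output nonlinearity are all assumptions made for the example.

```python
import numpy as np


def gaussian_pooling_field(resolution, mu_x, mu_y, sigma, extent=1.0):
    """Sample a 2D Gaussian on a resolution x resolution grid.

    The grid spans `extent` units of visual angle, centered on zero.
    Assumption: the field is normalized to sum to 1, and rows follow
    increasing y (no image-style vertical flip is applied).
    """
    coords = np.linspace(-extent / 2, extent / 2, resolution)
    x, y = np.meshgrid(coords, coords)
    g = np.exp(-((x - mu_x) ** 2 + (y - mu_y) ** 2) / (2 * sigma ** 2))
    return g / g.sum()


def fwrf_predict(feature_maps, weights, mu_x, mu_y, sigma, extent=1.0):
    """Predict a voxel's activity r for one stimulus.

    feature_maps: list of square 2D arrays (one per feature map);
                  resolutions may differ between maps.
    weights:      one feature weight per feature map.
    The same pooling field (mu_x, mu_y, sigma) is used for every map.
    """
    pooled = []
    for phi in feature_maps:
        # Resample the shared pooling field at this map's resolution.
        g = gaussian_pooling_field(phi.shape[-1], mu_x, mu_y, sigma, extent)
        # Spatial pooling: one scalar per feature map.
        pooled.append(np.sum(g * phi))
    # Weighted sum of the pooled feature-map outputs.
    return float(np.dot(weights, pooled))


if __name__ == "__main__":
    # Three toy feature maps at different resolutions (constant-valued).
    maps = [np.ones((8, 8)), 2 * np.ones((16, 16)), 3 * np.ones((4, 4))]
    w = np.array([0.5, -1.0, 2.0])
    r = fwrf_predict(maps, w, mu_x=0.1, mu_y=-0.1, sigma=0.2)
    # Each pooling field sums to 1, so r is close to
    # 0.5 * 1 - 1.0 * 2 + 2.0 * 3 = 4.5.
    print(r)
```

Because the pooling field is shared across feature maps, adding a feature map adds only one weight to the model, regardless of that map's resolution.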