brian-team/brian2modelfitting

Investigate using dimensionality-reduction methods on outputs


Instead of asking the user to provide metrics to extract features, might it be possible to automatically reduce the dimensionality of the output (e.g., the voltage trace)?

It is possible; check out the figure below (Fig. 1B in [1]):
[Figure: Fig. 1B from [1]]
The authors in [1] claim, "A minimally invasive extension of (approximate Bayesian computation and classical density estimation-based inference methods) is to first learn summary statistics that have certain optimality properties, before running a standard inference algorithm..."
There are different methods to automatically construct summary statistics; see [2, 3] for example.
On the other hand, there is an interesting, brief discussion in [4] where the authors claim that simulation-based inference algorithms such as sequential neural posterior estimation (SNPE) can be applied directly to raw data or to high-dimensional summary features.
Additionally, one of the examples in the official sbi documentation covers learning summary statistics with an embedding neural network trained jointly with the density estimator, but there the data of interest are not time series as encountered when fitting electrophysiological data.

I would say that it is a good idea to stick with an expert-defined set of summary features for now; it would be straightforward to implement the automatic approach later if necessary.
We could make use of the existing function in brian2modelfitting, calc_eFEL, which takes advantage of the eFEL package. We could also manually calculate additional features if requested by the user.
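To make this concrete, here is a minimal sketch (my illustration, not the toolbox's actual code) of the kind of computation calc_eFEL wraps, using the eFEL package directly; the trace values and the stimulus window are placeholders:

```python
# Minimal sketch of eFEL-based feature extraction, i.e., the kind of
# computation calc_eFEL wraps; trace values and the stimulus window
# are placeholders.
import efel
import numpy as np

t = np.arange(0, 1000.0, 0.1)   # time, in ms
v = np.zeros_like(t)            # placeholder voltage trace, in mV
trace = {'T': t, 'V': v, 'stim_start': [100.0], 'stim_end': [900.0]}

# extract a couple of standard electrophysiological features
features = efel.getFeatureValues([trace], ['Spikecount', 'AP_amplitude'])
print(features[0])
```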

[1] Cranmer, Brehmer and Louppe. PNAS (2020) 117:30055-30062
[2] Jiang et al. Statistica Sinica (2017) 27:1595-1618
[3] Izbicki, Lee and Pospisil. Journal of Computational and Graphical Statistics (2019) 28:481-492
[4] Gonçalves et al. eLife (2020) 9:e56261

Thanks for looking into this and the references; I'll try to have a closer look soon. Since this is non-trivial (but very interesting!), I agree that we should focus on user-provided summary features first. Interesting note about applying the network directly to the high-dimensional data; this might be something that we can try out easily rather soon.

Best regards

I am trying to compare the incremental versions of dimensionality-reduction methods. Could someone tell me where I can find the code for Incremental Locally Linear Embedding, Incremental Multidimensional Scaling, or Incremental Laplacian Eigenmaps?

Thank you.

Hi @endimeon777,

Regarding the code for the techniques you mentioned, I really have no idea. However, I am not sure that those methods are even applicable to the data we are handling here.

Best,
Ante

Hi @mstimberg .

I've decided to play around with the idea of using raw data directly, without feature extraction (see my first comment):

On the other hand, there is an interesting, brief discussion in [4] where the authors claim that simulation-based inference algorithms such as sequential neural posterior estimation (SNPE) can be applied directly to raw data or to high-dimensional summary features.

The authors of the mentioned study [Gonçalves et al., 2020] state the following:

SNPE can be applied to, and might benefit from the use of summary features, but it also makes use of the ability of neural networks to automatically learn informative features in high-dimensional data. Thus, SNPE can also be applied directly to raw data (e.g. using recurrent neural networks [Lueckmann et al., 2017]), or to high-dimensional summary features which are challenging for ABC approaches (...).

The procedure is as follows: when SNPE is used in combination with a mixture density network (MDN), the MDN is augmented with a recurrent neural network (RNN) that runs along the recorded voltage trace to learn appropriate features and thus constrain the model parameters. I am not sure whether this happens automatically when the output dimension is large, but it works.
Check out this notebook, where I compared the inference procedure for g_Na and g_K using summary features (the same thing you did in the official brian2 examples here) with inference on the raw output.
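For concreteness, here is a minimal sketch (my own illustration, not the notebook's code) of such a recurrent embedding: a GRU that runs along the raw voltage trace and compresses it into a handful of learned features:

```python
# Illustrative sketch of an RNN embedding that runs along a raw voltage
# trace and outputs a small learned feature vector (not the notebook's code).
import torch
from torch import nn

class GRUEmbedding(nn.Module):
    def __init__(self, n_features=8, hidden_size=32):
        super().__init__()
        self.gru = nn.GRU(input_size=1, hidden_size=hidden_size,
                          batch_first=True)
        self.linear = nn.Linear(hidden_size, n_features)

    def forward(self, x):
        # x has shape (batch, n_timesteps); the GRU expects a channel dim
        _, h = self.gru(x.unsqueeze(-1))
        # project the final hidden state down to a few learned features
        return self.linear(h[-1])
```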

In the model fitting toolbox, we could probably check whether the neural posterior class is SNPE and whether the density estimator is an MDN. If so, users would not have to provide a list (or dictionary) of features with respect to which the inference is performed, as long as they are only interested in "fitting" the parameters and no special features are of interest to them.
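A hypothetical sketch of that check; all names here are illustrative, not brian2modelfitting's actual internals:

```python
def prepare_x(raw_output, features, inference_method, density_estimator):
    """Hypothetical helper: decide whether raw traces can be used directly.

    Argument names are illustrative, not brian2modelfitting's internals.
    """
    if (inference_method == 'SNPE' and density_estimator == 'mdn'
            and not features):
        # SNPE + MDN can learn informative features from the raw data
        return raw_output
    # otherwise, reduce each trace to the user-provided features (callables)
    return [[f(trace) for f in features] for trace in raw_output]
```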

Continuation of the discussion started in the previous comment: sbi-dev/sbi#527 + some additional info.

Since I will deal with #53, I can also enable an empty list/dict for the features argument in the constructor of the Inferencer in the same pull request. If the list/dict of features is empty or set to None, SNPE will extract features automatically, either by using a user-provided embedding network or the simple MLP provided in sbi by default. For simple problems, like the one showcased here, the default MLP does more than a good job. For more complex problems, the user will probably have to utilize a more complex recurrent neural network, e.g., an LSTM or a GRU.
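As a minimal sketch of the wiring on the sbi side (posterior_nn and SNPE are sbi's API; the prior bounds and the embedding module below are placeholders):

```python
# Sketch: plugging a custom embedding network into SNPE via sbi's
# posterior_nn; prior bounds and the embedding module are placeholders.
import torch
from sbi.inference import SNPE
from sbi.utils import BoxUniform, posterior_nn

prior = BoxUniform(low=torch.zeros(2), high=torch.ones(2))

# stand-in for a recurrent embedding (e.g., the GRU sketched earlier)
embedding_net = torch.nn.Sequential(
    torch.nn.Linear(10_000, 64), torch.nn.ReLU(), torch.nn.Linear(64, 8))

density_estimator = posterior_nn(model='mdn', embedding_net=embedding_net)
inference = SNPE(prior=prior, density_estimator=density_estimator)
```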

With this last PR merged into sbi_support, we are able to do automatic feature extraction without providing a list of features to the Inferencer. Feature extraction happens automatically by training an MLP, if no other embedding network is provided to the infer method.
I will close this issue now.