ManchesterBioinference/BranchedGP

Error in in GetInitialConditionsAndPrior

Closed this issue · 8 comments

Hi,

I'm trying out the package on a dataset where we are expecting a bifurcation.
However, I am getting an error in the FitModel function I cannot diagnose myself.
It would be great if you could point me to the cause.

Here is my code:

import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import BranchedGP
import time

Y = pd.read_table('countsCenteredBranchedGP.txt', index_col=0, sep=' ')
monocle = pd.read_table('monocleDataBranchedGP.txt', index_col=0, sep=' ')
genelist = ['G17339','G4231','G112']
f, ax = plt.subplots(1, len(genelist), figsize=(10, 5), sharex=True, sharey=True)

g='G112'
Bsearch = list(np.linspace(0.05, 0.95, 5)) + [1.1]  # set of candidate branching points
GPy = (Y[g])[:, None]
GPt = monocle['StretchedPseudotime']
globalBranching = monocle['State']
d = BranchedGP.FitBranchingModel.FitModel(Bsearch, GPt, GPy, globalBranching)

Which produces the following error:

>>> d = BranchedGP.FitBranchingModel.FitModel(Bsearch, GPt, GPy, globalBranching)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Applications/miniconda3/lib/python3.6/site-packages/BranchedGP/FitBranchingModel.py", line 38, in FitModel
    phiInitial, phiPrior = GetInitialConditionsAndPrior(globalBranching, priorConfidence, infPriorPhi=True)
  File "/Applications/miniconda3/lib/python3.6/site-packages/BranchedGP/FitBranchingModel.py", line 175, in GetInitialConditionsAndPrior
    phiPrior[i, int(iBranch)] = v

You can download the data here:
https://www.dropbox.com/sh/furudfp37d5f1o6/AAC0VFV5OFZcFUTpWoZ7HBY9a?dl=1

Best,
Koen

@sumonahmedUoM can you reproduce this issue?

@koenvandenberge thanks for trying it our code and sorry you hit an issue. Could you send the full error trace?

Hi,

Thank you for the reply.
Unless I misinterpreted 'full error trace', the error stack is as mentioned above:

>>> traceback.print_exc(BranchedGP.FitBranchingModel.FitModel(Bsearch, GPt, GPy, globalBranching))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Applications/miniconda3/lib/python3.6/site-packages/BranchedGP/FitBranchingModel.py", line 38, in FitModel
    phiInitial, phiPrior = GetInitialConditionsAndPrior(globalBranching, priorConfidence, infPriorPhi=True)
  File "/Applications/miniconda3/lib/python3.6/site-packages/BranchedGP/FitBranchingModel.py", line 175, in GetInitialConditionsAndPrior
    phiPrior[i, int(iBranch)] = v
IndexError: index 3 is out of bounds for axis 1 with size 2

Best,
Koen

Hi,
I found two issues:

  1. Right now, this pakcage works only for a single branching point. Thus np.unique(globalBranching) should give you array([1, 2, 3]).
  2. Pseudotime (GPt) should be in the range [0, 1].

After fixing this with your data, the code should work. I thied the following and it worked.

g='G729'
Bsearch = [0.05] # set of candidate branching points
ns = 1
GPy = (Y[g].iloc[::ns].values - Y[g].iloc[::ns].values.mean())[:, None]
GPt = monocle['StretchedPseudotime'].values[::ns]
globalBranching = monocle['State'].values[::ns].astype(int)
d = BranchedGP.FitBranchingModel.FitModel(Bsearch, GPt, GPy, globalBranching)

INFO:tensorflow:Optimization terminated with:
Message: b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH'
Objective function value: 558438.672746
Number of iterations: 85
Number of functions evaluations: 113

You may face numerical issues and find the following error:
InvalidArgumentError (see above for traceback): Input is not invertible.
[[Node: 222/AssignGPSparse/f1f2/invKbb = MatrixInverseT=DT_DOUBLE, adjoint=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

To solve this, try to use a higher jitter level,
from gpflow import settings
settings.numerics.jitter_level = 1e-3

Hope it helps.

Hi,

Many thanks for the help.
I have now updated the monocle States to reflect the bifurcation. The updated file can be found with the Dropbox link above.
I have also updated the estimated peudotime, GPt = monocle['StretchedPseudotime'] / monocle['StretchedPseudotime'].max().
However, I am getting another error that is hard to interpret:

>>> d = BranchedGP.FitBranchingModel.FitModel(Bsearch, GPt, GPy, globalBranching)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Applications/miniconda3/lib/python3.6/site-packages/BranchedGP/FitBranchingModel.py", line 60, in FitModel
    np.ones((1, 1)) * ptb, ZExpanded, phiInitial=phiInitial, phiPrior=phiPrior)
  File "/Applications/miniconda3/lib/python3.6/site-packages/gpflow/core/compilable.py", line 86, in __init__
    origin_init(self, *args, **kwargs)
  File "/Applications/miniconda3/lib/python3.6/site-packages/BranchedGP/assigngp_denseSparse.py", line 34, in __init__
    phiInitial=phiInitial, phiPrior=phiPrior)
  File "/Applications/miniconda3/lib/python3.6/site-packages/gpflow/core/compilable.py", line 86, in __init__
    origin_init(self, *args, **kwargs)
  File "/Applications/miniconda3/lib/python3.6/site-packages/BranchedGP/assigngp_dense.py", line 55, in __init__
    self.UpdateBranchingPoint(b, phiInitial, prior=phiPrior)
  File "/Applications/miniconda3/lib/python3.6/site-packages/BranchedGP/assigngp_dense.py", line 72, in UpdateBranchingPoint
    self.pZ = pZ_construction_singleBP.expand_pZ0PureNumpyZeros(self.eZ0, b, self.t)
  File "/Applications/miniconda3/lib/python3.6/site-packages/BranchedGP/pZ_construction_singleBP.py", line 26, in expand_pZ0PureNumpyZeros
    i = np.flatnonzero(X <= BP)
  File "/Applications/miniconda3/lib/python3.6/site-packages/pandas/core/ops.py", line 846, in wrapper
    raise ValueError('Lengths must match to compare')
ValueError: Lengths must match to compare

It looks like some objects don't have the same length, but the input looks ok:

>>> GPt.shape
(925,)
>>> GPy.shape
(925, 1)
>>> globalBranching.shape
(925,)

Let me know if you would require any additional information.

Best,
Koen

Hi,
Please replace the following two lines of code

GPt = monocle['StretchedPseudotime']
globalBranching = monocle['State']

with

GPt = monocle['StretchedPseudotime'].values
globalBranching = monocle['State'].values

and it will work. I have just run your code and attached an image :simple_smile:
g729

The type should be:
numpy.ndarray

Great, this seems to work.
Thanks so much for the help.
Best,
Koen

Hi Koen,

Small point - you need many more than 5 times to get a decent approximation to the posterior, e.g. 20 would be more sensible.

Magnus.