davidhallac/TICC

Warning and Can't converge problem

Closed this issue · 8 comments

Hi David,

I applied your algorithm to my data set and found two issues:

  1. I sometimes get a warning like this:
    /usr/lib/python2.7/dist-packages/numpy/lib/function_base.py:2476: RuntimeWarning: Degrees of freedom <= 0 for slice
    warnings.warn("Degrees of freedom <= 0 for slice", RuntimeWarning)
    Is this a serious warning? Should I stop the run, or can I just ignore it?

  2. Oscillation between solutions
    ('\n\n\nITERATION ###', 15)
    ('OPTIMIZATION for Cluster #', 0, 'DONE!!!')
    ('OPTIMIZATION for Cluster #', 2, 'DONE!!!')
    ('OPTIMIZATION for Cluster #', 3, 'DONE!!!')
    ('length of the cluster ', 0, '------>', 739)
    ('length of the cluster ', 1, '------>', 0)
    ('length of the cluster ', 2, '------>', 20)
    ('length of the cluster ', 3, '------>', 20)
    ('length of the cluster ', 4, '------>', 0)
    UPDATED THE OLD COVARIANCE
    beginning the smoothening ALGORITHM
    ('cluster that is zero is:', 1, 'selected cluster instead is:', 0)
    ('cluster that is zero is:', 4, 'selected cluster instead is:', 3)
    ('length of cluster #', 0, '-------->', 739)
    ('length of cluster #', 1, '-------->', 20)
    ('length of cluster #', 2, '-------->', 0)
    ('length of cluster #', 3, '-------->', 0)
    ('length of cluster #', 4, '-------->', 20)
    Done writing the figure

    ('\n\n\nITERATION ###', 16)
    ('OPTIMIZATION for Cluster #', 0, 'DONE!!!')
    ('OPTIMIZATION for Cluster #', 1, 'DONE!!!')
    ('OPTIMIZATION for Cluster #', 4, 'DONE!!!')
    ('length of the cluster ', 0, '------>', 739)
    ('length of the cluster ', 1, '------>', 20)
    ('length of the cluster ', 2, '------>', 0)
    ('length of the cluster ', 3, '------>', 0)
    ('length of the cluster ', 4, '------>', 20)
    UPDATED THE OLD COVARIANCE
    beginning the smoothening ALGORITHM
    ('cluster that is zero is:', 2, 'selected cluster instead is:', 0)
    ('cluster that is zero is:', 3, 'selected cluster instead is:', 4)
    ('length of cluster #', 0, '-------->', 739)
    ('length of cluster #', 1, '-------->', 0)
    ('length of cluster #', 2, '-------->', 20)
    ('length of cluster #', 3, '-------->', 20)
    ('length of cluster #', 4, '-------->', 0)
    Done writing the figure

    The run oscillates continuously between two outcomes, (739, 0, 20, 20, 0) and (739, 20, 0, 0, 20).
    Should I stop the run and pick one of them? What is happening here, and how can I avoid it?
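    A minimal sketch of how this two-cycle can be flagged mechanically (not part of the TICC codebase; `detect_cycle` is a hypothetical helper): record the tuple of cluster sizes each iteration and stop as soon as a tuple repeats.

    ```python
    def detect_cycle(size_history):
        """Return the first iteration index at which a previously seen
        cluster-size tuple reappears, or None if no repeat is observed."""
        seen = set()
        for i, sizes in enumerate(size_history):
            key = tuple(sizes)
            if key in seen:
                return i
            seen.add(key)
        return None

    # Cluster sizes from the log above, alternating between two outcomes:
    history = [(739, 0, 20, 20, 0), (739, 20, 0, 0, 20),
               (739, 0, 20, 20, 0), (739, 20, 0, 0, 20)]
    print(detect_cycle(history))  # 2
    ```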

Thanks,

  1. Are you sure you’ve pulled the newest version of the solver on GitHub? We’ve pushed some fixes in the past month or two for bugs that may have caused this.

  2. I think you can ignore the warning

  3. That oscillation is because the TICC algorithm is not converging (since it’s non-convex, there is no guarantee of convergence in certain scenarios). Typically, I’ve seen it occur with poorly selected hyperparameters. Can you try increasing lambda and decreasing beta, and let us know if that fixes it?

It's still not converging after I lowered beta and increased lambda. The last values I tried were beta = 10 and lambda = 0.1.

Are you still getting the same oscillations? Depending on your dataset, it's possible that you'll need to lower beta even further (such as 0.1) to get it to converge.

I lowered the beta to 0.1 and it still does not converge.

Hmm, would it be possible to share the dataset you're running it on? It would be easier to debug if I could play with the exact data to recreate the error. If not, though, can you provide details on the dataset size/dimensions/etc., and the exact command you're running (with all the hyperparameter values)?

Also, does it converge when beta = 0?

No, it does not converge when beta = 0. I've also emailed you the dataset I'm working on.
Thanks

Just in case anyone else is running into this same issue, after speaking with Vu, it turns out the problem was that the different sensors had extremely different scales (from 0.1 to 1e16), which caused the solver to run into numerical issues. Normalizing the data fixed this error!
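For anyone else hitting this, a minimal sketch of the fix, assuming plain per-column z-scoring with NumPy (the two synthetic "sensors" below just mimic the 0.1-vs-1e16 scale mismatch described above):

```python
import numpy as np

rng = np.random.default_rng(0)

# Rows are time steps, columns are sensors with wildly different scales,
# mimicking the 0.1 .. 1e16 spread that broke the solver.
raw = np.column_stack([
    0.1 * rng.standard_normal(500),   # small-scale sensor
    1e16 * rng.standard_normal(500),  # huge-scale sensor
])

# Z-score each column so every sensor has mean 0 and unit variance
# before handing the matrix to the TICC solver.
normalized = (raw - raw.mean(axis=0)) / raw.std(axis=0)

print(np.allclose(normalized.mean(axis=0), 0.0, atol=1e-9))  # True
print(np.allclose(normalized.std(axis=0), 1.0))              # True
```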

I'm getting different results when I use normalized data compared to when I use the unnormalized data. What can that be attributed to?