davidhallac/TICC

Finding Optimal Parameters?


Hi,

Could you describe how you find good values of lambda, beta, window size, number of clusters, etc. to pass into TICC for an arbitrary dataset? Currently we run a brute-force grid search over a large set of parameter values, compute the BIC for each combination, and pick the one with the minimum BIC, but this doesn't seem ideal. How did you choose the parameters for your dataset, and how would you recommend choosing them for an arbitrary dataset?

Best,
Thushan

Hi Thushan,

Yes, a grid search over the parameters is the most principled way of selecting the optimal values. However, as a tip, we've found empirically that TICC is very robust to the choice of lambda and the window size. The parameters that matter most are beta and the number of clusters. For practical purposes, I would recommend fixing lambda and the window size to constants, and then doing a 2D grid search over the number of clusters and beta (which is a lot more manageable) to find a pseudo-optimal set.
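To make the 2D search concrete, here is a minimal sketch in Python. It is not taken from the TICC code: `fit_and_score` is a hypothetical callable that you would write yourself to run TICC once with lambda and the window size held fixed and return a score to minimize (for example the BIC). The toy scorer below is only a stand-in so the snippet runs on its own.

```python
from itertools import product


def grid_search_2d(fit_and_score, cluster_counts, betas):
    """Try every (number of clusters, beta) pair and keep the lowest score.

    fit_and_score is a hypothetical user-supplied function: it should fit the
    model for one (k, beta) pair and return a number where lower is better.
    Returns a tuple (best_score, best_k, best_beta).
    """
    best = None
    for k, beta in product(cluster_counts, betas):
        score = fit_and_score(k, beta)
        # Strict "<" keeps the first pair seen when two scores tie.
        if best is None or score < best[0]:
            best = (score, k, beta)
    return best


def toy_score(k, beta):
    """Stand-in for a real model fit; its minimum is at k=3, beta=100."""
    return (k - 3) ** 2 + abs(beta - 100) / 100


best = grid_search_2d(toy_score, [2, 3, 4, 5], [50, 100, 200, 400])
```

With lambda and the window size fixed, the cost is one model fit per (clusters, beta) pair, so a 4 x 4 grid like the one above means 16 fits instead of a full 4D sweep.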

Hope that helps!

Hi David,

Thanks for the reply. A quick follow-up on the same question: so far we have been using BIC, as in the paper, as the criterion for picking the best solution. Are there other criteria like BIC that we should look at, or is there another way to approach the problem?
The reason we ask is that even with maxIters = 1000, TICC does not converge (even by iteration 999) as the number of clusters or beta increases. Thanks

Best,
Vignesh

You can also use the Akaike Information Criterion (AIC) or cross-validation for model selection.
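For reference, the standard textbook definitions of the two criteria can be sketched as below. This is a generic illustration, not the computation in the TICC code: how you obtain the log-likelihood and how you count free parameters for a fitted TICC model are assumptions you would need to settle yourself, and the argument names here are made up for the example.

```python
import math


def aic(log_likelihood, num_params):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * num_params - 2 * log_likelihood


def bic(log_likelihood, num_params, num_samples):
    """Bayesian Information Criterion: k ln(n) - 2 ln(L). Lower is better."""
    return num_params * math.log(num_samples) - 2 * log_likelihood
```

The only difference is the penalty per parameter: 2 for AIC versus ln(n) for BIC, so BIC penalizes model size more heavily once n exceeds about 7 samples and tends to pick fewer clusters.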

If it hasn't converged after that many iterations, you are likely stuck in a locally suboptimal solution (one where every point is assigned to a single cluster). Our code tries to "jump" out of those solutions, which is why it hasn't converged yet. If it still hasn't converged by that point, that set of parameters is likely unsuitable, so you can skip it in your selection.
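Skipping those parameter sets can be sketched as a small filter over the grid-search results. This is an illustration under assumptions, not part of TICC: the record layout (`params`, `score`, `converged`, `assignments`) is hypothetical, and you would have to track yourself whether a run hit the iteration limit.

```python
def is_degenerate(assignments):
    """True when every point landed in the same cluster (or there are no points)."""
    return len(set(assignments)) <= 1


def select_best(results):
    """Pick the lowest-score run among converged, non-degenerate runs.

    results is a list of dicts with hypothetical keys:
      'params'      -> the parameter set that was tried
      'score'       -> value to minimize, e.g. BIC
      'converged'   -> False if the run hit the iteration limit
      'assignments' -> cluster label for each time point
    Returns the chosen dict, or None if every run was skipped.
    """
    usable = [
        r for r in results
        if r["converged"] and not is_degenerate(r["assignments"])
    ]
    if not usable:
        return None
    return min(usable, key=lambda r: r["score"])
```

Filtering before taking the minimum matters because a run that never converged, or that collapsed to one cluster, can still report a low score and would otherwise win the selection.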