tomlof/OnPLS

Some questions

Closed this issue · 10 comments

Hi, Tommy,
(1) I ran the example from resampling.grid_search() under Python 3.5.2:


np.random.seed(42)
n, p_1, p_2, p_3 = 4, 3, 4, 5
t = np.sort(np.random.randn(n, 1), axis=0)
p1 = np.sort(np.random.randn(p_1, 1), axis=0)
p2 = np.sort(np.random.randn(p_2, 1), axis=0)
p3 = np.sort(np.random.randn(p_3, 1), axis=0)
X1 = np.dot(t, p1.T) + 0.1 * np.random.randn(n, p_1)
X2 = np.dot(t, p2.T) + 0.1 * np.random.randn(n, p_2)
X3 = np.dot(t, p3.T) + 0.1 * np.random.randn(n, p_3)
X = [X1, X2, X3]

predComp = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
orthComp = [1, 1, 1]
onpls = OnPLS.estimators.OnPLS(predComp, orthComp)

params_grid = OnPLS.utils.list_product([0, 0, 0], [3, 3, 3])
OnPLS.resampling.grid_search(onpls, X,{"orthComp": params_grid})

It gave the following error:

Traceback (most recent call last):

  File "<ipython-input-12-6849f3f121c7>", line 17, in <module>
    OnPLS.resampling.grid_search(onpls, X,{"orthComp": params_grid})

  File "E:/Hai Windows/work/softwares/python_package/OnPLS-master\OnPLS\resampling.py", line 181, in grid_search
    name = names[i]

TypeError: 'dict_keys' object does not support indexing

(2) The same example as in (1) works under Python 2.7.11 and gives the result below:
(<OnPLS.estimators.OnPLS at 0xa489128>, 0.88410078722573182, {'orthComp': [2, 2, 1]})

How can I tune predComp and orthComp together with resampling.grid_search()?

Thanks.

Great catch, thank you!

I've pushed a fix for this bug. In Python 3, dict.keys() returns a dict_keys view instead of a list, and a view does not support indexing. Let me know if it works for you now, and I'll close this issue.
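
The difference is easy to reproduce in isolation. The sketch below uses a made-up parameter dictionary (it is not the code in resampling.py) and shows that wrapping the keys in list() makes indexing work on both Python 2 and Python 3:

```python
# Hypothetical parameter grid, shaped like the third argument to grid_search.
params = {"orthComp": [[0, 0, 0], [1, 1, 1]]}

names = params.keys()
try:
    first = names[0]  # fine on Python 2 (a list), TypeError on Python 3 (a view)
except TypeError as exc:
    first = None
    print("indexing the keys view failed:", exc)

# Portable form: materialise the keys as a list before indexing.
names = list(params.keys())
first = names[0]
print(first)
```

Iterating directly over the dictionary (for name in params) avoids the index altogether when the position is not needed.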

How about the second question? How can I tune predComp via resampling.grid_search() together with orthComp? It returned errors when I tried tuning predComp.

Sorry, I missed the question. I have added a second example to the doctest of resampling.grid_search that shows how to search over predComp. Essentially, you just pass another dictionary entry in the third argument to grid_search. For predComp, though, you must be careful to adhere to some constraints, in particular that each block is connected to at least one other block.

Finding good values of predComp is much more complicated than finding good values for orthComp, though, so be careful.

I will try to find the time to add more informative error messages for this in the near future. The problem is these constraints: the package could warn or raise an informative error as soon as a violation is detected, instead of crashing without any information.
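
Until such messages exist, the connectivity constraint can be checked by hand before calling grid_search. The helper below is hypothetical and not part of OnPLS. It assumes predComp is a square, symmetric matrix as in the examples in this thread, where a positive off-diagonal entry means two blocks share predictive components. It checks only the one constraint named above:

```python
import numpy as np


def check_pred_comp(pred_comp):
    """Hypothetical helper (not in OnPLS): every block must link to another."""
    C = np.asarray(pred_comp)
    if C.ndim != 2 or C.shape[0] != C.shape[1]:
        raise ValueError("predComp must be square, got shape %s" % (C.shape,))
    for i in range(C.shape[0]):
        # Drop the diagonal entry: a block's link to itself does not count.
        others = np.delete(C[i], i)
        if not np.any(others > 0):
            raise ValueError("block %d is not connected to any other block" % i)
    return True


# The matrix from the first example passes.
print(check_pred_comp([[0, 1, 1], [1, 0, 1], [1, 1, 0]]))

# Here the third block is isolated, so the check raises ValueError.
try:
    check_pred_comp([[0, 1, 0], [1, 0, 0], [0, 0, 0]])
except ValueError as exc:
    print(exc)
```

Filtering the candidate predComp matrices with a check like this before building the grid keeps invalid combinations out of the search.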

Hi, Tommy,
Does the function resampling.cross_validation() allow the number of non-global components for one data block to be zero, e.g., orthComp=[0,1]?
It gave me the following error when I used predComp = [[0, 2], [2, 0]] and orthComp = [0, 1]:

File "realdata_analyze_OnPLS_below90.py", line 45, in <module>
   cv_scores2 = OnPLS.resampling.cross_validation(onpls, [Y_1, Y_2], cv_rounds=5,random_state=0)
 File "/hshu/OnPLS/resampling.py", line 95, in cross_valid$
   score = estimator.score(Xtest)
 File "/hshu/OnPLS/estimators.py", line 1125, in score
   Xhat = self.predict(X)
 File "/hshu/OnPLS/estimators.py", line 1094, in predict
   Xhatw = Xhatw + np.dot(Thatwk, Pw.T)
ValueError: shapes (132,1) and (2,1195) not aligned: 1 (dim 1) != 2 (dim 0)

I've taken a first glance at this, but have not found the problem.

I can tell you right away, though, that the error is not caused by having zero orthogonal components.

The problem is with the joint variation, though I don't yet see what the problem is. I would benefit from having your data, but I guess that is not realistic ;-)

Maybe if you tell me the shapes and ranks of your Y_1 and Y_2, I'll be able to make some more tests.

I found a tiny possible bug that I've corrected and pushed to the repo. You could pull the latest and see if it makes any difference for you.

In lines 597 and 1095 of estimators.py, should it be Xhatw = Xhatw + np.dot(Thatwk, Pw[:,[k]].T)? See line 569. When I made that change, the error disappeared.
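
The shape mismatch and the proposed indexing can be reproduced with random data. This is only a sketch and not the code in estimators.py. The names Thatwk, Pw and Xhatw are taken from the traceback, the small sizes stand in for 132, 1195 and 2, and the layout of Pw (one column per joint component) is an assumption inferred from the error message:

```python
import numpy as np

n, p, n_comp = 5, 7, 2        # stand-ins for 132 samples, 1195 variables, 2 components
rng = np.random.RandomState(0)
Pw = rng.randn(p, n_comp)     # assumed layout: one loading column per component
Xhatw = np.zeros((n, p))

for k in range(n_comp):
    Thatwk = rng.randn(n, 1)  # score vector for component k, shape (n, 1)
    # np.dot(Thatwk, Pw.T) fails here: shapes (5, 1) and (2, 7) are not aligned.
    # Indexing with a list, Pw[:, [k]], keeps a 2-D (p, 1) column, so the
    # product is a rank-one (n, p) update for component k only.
    Xhatw = Xhatw + np.dot(Thatwk, Pw[:, [k]].T)

print(Xhatw.shape)
```

Note that Pw[:, k] (without the inner brackets) would return a 1-D array of length p, whose .T is a no-op, so the list index is what preserves the column shape.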

I've found and corrected a major bug related to having multiple joint components. Please have a look and see if the latest commit works for you.

It works now. Thank you.

Perfect! Great that we could solve that bug!