brandmaier/semtree

Different cur.type for metric and ordinal variables

manuelarnold opened this issue · 8 comments

Currently, cur.type is 1 for categorical variables and 2 for both metric and ordinal variables. Since the distinction between ordinal and metric variables matters for both the maxLR test statistics and the score-based tests, it would make sense to use a separate cur.type value for each type of variable:
1: categorical
2: ordinal
3: metric

This is now a work in progress.

Please see versions from 6e01466 onward. We now have pseudo-constants that can be returned to define the scale of measurement. Please return the respective types from the score tests back to growTree(). The constants are defined in semtree-package.R as:

.SCALE_METRIC = 2
.SCALE_ORDINAL = 3
.SCALE_CATEGORICAL = 1
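
To illustrate how these constants could be assigned, here is a minimal sketch. The helper scale_of() and the example data frame are hypothetical and not part of semtree; only the three constant values are taken from the definitions above. The sketch assumes that ordered factors are ordinal, unordered factors are categorical, and numeric columns are metric.

```r
# Constants as quoted above from semtree-package.R
.SCALE_CATEGORICAL <- 1
.SCALE_METRIC <- 2
.SCALE_ORDINAL <- 3

# Hypothetical helper (not part of semtree): classify one covariate column.
scale_of <- function(x) {
  if (is.ordered(x)) {
    # Ordered factors also satisfy is.factor(), so check them first.
    .SCALE_ORDINAL
  } else if (is.factor(x)) {
    .SCALE_CATEGORICAL
  } else if (is.numeric(x)) {
    .SCALE_METRIC
  } else {
    stop("unsupported covariate type: ", class(x)[1])
  }
}

# Made-up example covariates, one per scale of measurement.
covariates <- data.frame(
  group = factor(c("a", "b", "c")),
  grade = factor(c("low", "mid", "high"),
                 levels = c("low", "mid", "high"), ordered = TRUE),
  age = c(21.5, 34, 47)
)

# Named numeric vector with one scale constant per covariate.
types <- vapply(covariates, scale_of, numeric(1))
```

The order of the checks matters: testing is.factor() before is.ordered() would label every ordered factor as categorical.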

semtree now properly handles unordered and ordered factors, but these changes broke the score tests for ordinal variables. I identified one possible problem in your code (2d813e8), but the score test still fails. Let me know what you need to know to fix this, @manuelarnold.

I tried to fix the issue in d7b1247. I hope this is all that is needed. Please confirm.

@manuelarnold, could you please confirm that this is OK and then close the issue?

There are some new changes related to this topic that we could discuss here:
In my fork, I also distinguish between dummy variables (categorical variables with two levels) and multinomial variables (categorical variables with more than two levels). So, I would be in favor of splitting .SCALE_CATEGORICAL into .SCALE_MULTINOMIAL and .SCALE_DUMMY.
By the way, testing of multinomial variables is now fully score-based and should be faster than the testing in the main branch.
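
A rough sketch of what the dummy/multinomial split could look like. The numeric values of the two constants and the helper categorical_scale() are assumptions made for illustration only; the thread does not state them, and the fork may count levels differently (for example, observed rather than declared levels).

```r
# Hypothetical values; the actual constants in the fork are not given here.
.SCALE_DUMMY <- 4
.SCALE_MULTINOMIAL <- 5

# Hypothetical helper: classify an unordered factor by its number of levels.
categorical_scale <- function(x) {
  stopifnot(is.factor(x), !is.ordered(x))
  k <- nlevels(x)  # declared levels, including any that are unobserved
  if (k < 2) {
    stop("a categorical covariate needs at least two levels")
  }
  if (k == 2) .SCALE_DUMMY else .SCALE_MULTINOMIAL
}

# Made-up example covariates.
sex <- factor(c("f", "m", "f"))
region <- factor(c("north", "south", "west"))

sex_type <- categorical_scale(sex)        # two levels: dummy
region_type <- categorical_scale(region)  # three levels: multinomial
```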

@manuelarnold, how should we proceed with these changes? Would you want to prepare a pull request, so that I can check your proposed changes?

I think these changes are already in the main branch. I will try to resolve some conflicts in the coming weeks, and then we can start syncing the branches.