braverock/FactorAnalytics

FA attached packages and packages loaded via a namespace

martinrd3D opened this issue Β· 21 comments

Among the current FA attached packages, I tentatively believe that rrcov and robustbase should not be needed, at least not needed as attached packages. But likely the RobStatTM should be an attached package because its lmrobdetMM() function, among other, are needed as robust regression options for fitTsfm and fitFfm. Consquently, Justin will delete them as attached packages and we will see what happens.

As for packages loaded via a namespace, there are currently 118, which seems way over the top. So we need to review which ones of these we really need, and pair the list down as much as possible.

it's relatively easy to look for instances of require or library in the code, and make those optional via Suggests, only if specific options are chosen by the user.

If those options are defaults, then the package should be listed in either Depends or Imports. The distinction between Depends and Imports is mostly about whether you believe that the package should also be available in the user namespace. e.g. xts and data.table probably need to be available to the user most of the time, so they probably belong in Depends, while RobStatTM may be required by our default arguments in FA, but the user isn't expected to know how to use the RobStatTM functions directly, so it likely belongs in Imports (if used in default FA function options) or Suggests (if the need for RobStatTM is not truly necessary for the default data/function flow for main functions in the package).

I have found the source of why fit.models, rrcov, robust and robustbase are all required.
As for fit.models it is because it is used by RobStatTM, along with Shiny, and we will remove that functionality from RobStatTM quite quickly now. This may also remove the rrcov dependency, but I will have to check on that.

One of the main reasons that robust and robustbase are required is because fitFfm and hence fitFfmDT compute robust covariance matrices with functions from those packages. So Mido, everywhere that a robust covariance matrix is used in fitFfm and fitFtmDT, the choice should be covRob from RobStatTM. FYI, covRob in RobStatTM automatically switches from using covRobMM to covRobRocke when the dimension is great or equal to 10 (which will typically be the case for the factor mode residuals, sometimes but infrequently for factor returns.

As for robustbase, at least one function scaleTau2 is from robustbase (don't ask my why, same for the above covariance matrix choices - just different people working it over time, and I couldn't keep track of everything they were doing, not using github). scaleTau2 is a robust scale estimate that we originally had in the robust package which we gave to the robustbase team, but I don't think we really need it, and can instead use scaleM from RobStatTM. I will check on this and add more comments to this issue tomorrow.
SUMMARY: (1) I will make sure that fit.models and Shiny go away, and (2) Mido, please do the robust covariance matrix replacements I mention above.

Hi @martinrd3D , I moved packages lmtest, strucchange, and HH to suggests in the method @braverock suggested. These include updating documentation to let users know they need to load these packages if they choose certain options. I also added additional examples to make it explicit.

Additionally, I removed MASS altogether, as its only used in a function we moved to the sandbox.

To answer your question, lmtest and strucchange, and HH were present for optional plotting approaches. If you look at batch of commits made prior to my previous message, you will see exactly where they were used.

yes @martinrd3D, this is going to be a monster issue to tackle. FactorAnalytics has 132 recursive dependencies at present. This needs to be reduced before pushing to CRAN or it will be tough to get published and/or maintain it.

See my recent commit above, 3c32e48, where I added FA-dependency-analysis.R to the sandbox. If you run it, it will add a pdf report in your sandbox and/or launch an HTML report in your browser.

These reports will help us identify the most wasteful packages. RobStatTM alone has about 77 recursive package dependencies, which you are working on. Due to the overlapping nature of packages, even if we did remove RobStatTM it won't necessarily reduce the FA dependencies by 77...but we need to address this.

Hi @martinrd3D,
Looks like we've settled on RobStatTM as the primary robust estimator package. We can start removing dependencies to others, if needed and moving RobStatTM to the SUGGESTS in the description file.

@martinrd3D I removed rrcov, lets discuss next time

@martinrd3D made progress on this issue tonight, and got most of the easy stuff. See related issue #60

Status update, the RobStatTM package, removed shiny and other dependencies for use in a separate package. This cut the packages recursive dependencies down to 14 from 77, a large improvement.

Next we shall move rugarch to SUGGESTS, as it contains 38 dependencies which gives us a single (yet very helpful option!) in the function fitFfm. This needs to be documented for users, with tests added, but can live in suggests.

Need to add an example to fitFfm of loading rugarch, and then running a model with the argument resid.scaleType = "GARCH" to show users how it works

btw.. what happened with fitffm wrapper idea?

@midoshammaa I still want to have a single fitFfm wrapper.

It would be more work, but I think we should also discuss adding generics for fit() and predict() which would take your idea from the data.table version of having a separate spec object.

From my understanding of the current workflow, this is something that we don't need to add now, i.e. adding it later wouldn't break backwards compatability with code that already works (something I am deeply sensitive to). Thoughts on this would be greatly appreciated.

Through general discussion and solving issues related to #32 #29 #31 #44 I think we can close this now