neurodata/ProgLearn

Remove duplicate imports from __init__.py

bstraus1 opened this issue · 6 comments

My issue is about ...
__init__.py imports {forest, network}.py which imports duplicates. This results in duplicate and missed imports.

Example of missed imports: ProgressiveLearner

Possible fixes for duplicates:

  • Specify specific packages to import (instead of *).
  • Create an "all variable" to track imported packages.

Possible fixes for missed packages:

  • Import from more than {network, forest}.py

Reproducing code example:

N/A

Version information

  • OS: macOS

@jdey4 Sambit asked that I tag you in this because it is a critical issue.

@bstraus1 @sampan501 Why would the missing ProgressiveLearner import be a problem? It's always wrapped inside ClassificationProgressiveLearner and should not be called directly, just like other base classes.

Perhaps @sampan501 can provide a better answer, but with regard to documentation, Sphinx cannot find it when generating the API reference page.

This is less a problem about documentation and more a problem of user access and potential circular inputs. I'm not sure if everything in those files need to be added, and defining the specific ones or at least defining an all variable is desirable. Any private functions in those files can be accessed by the user

@sampan501 In my knowledge, at least all the classes in base.py would not be directly imported by anyone. ProgressiveLearner is in a similar situation, but all functions and attributes are still implicitly accessible through inheritance. Thus hopefully, nothing in the current package is isolated or unused.

As for the problem of duplicate imports, I suppose the only way to prevent that is to combine all classes into one file. I personally do not consider it an issue at all.

I personally think the reference documentation should include mentions of helper classes since the people who will be using it will likely be adding functionality to the package. I think that scikit-learn does a good job of this. Otherwise, it is not clear presently how to add feature and contributions without doing an in-depth dive within the package.

The easiest thing for duplicate imports is what I mentioned before and creating an __all__ variable in the __init__.py that is simply a list of the names of the classes that you want the user to access. The problem with duplicate imports is that the package could act funny in the future and it may be difficult to diagnose why that is. Currently, if I import ProgLearn, I can import everything including base classes, but I think we want to just import Lifelong Classification Forest and LifelongClassificationNetwork since those are the only things currently documented. The "bad" I mentioned earlier is the possibility of circular imports, and that will break the package.