Migrating module dependencies to proper PIP packages
karlicoss opened this issue · 5 comments
Just figured I should document this. At first, I tried to keep data access layers as minimal as possible (ideally in a single file), but it really seems to cause more trouble than it's worth:
- somewhat annoying to keep track of in the config
- dependencies need to be installed separately via `requirements.txt`
- non-transparent to static analysis; to check it with mypy, you still need a proper environment
With a proper `setup.py`:
- one can simply `pip install git+<github repo url>`, without cloning anything into a temporary location
- you can use a virtualenv, if you prefer, to avoid mixing HPI dependencies with the rest of your packages
- with an `--editable` install, you can develop as if it were a symlink
- and you can still manually symlink the code if you prefer, for some reason
Basically, the only downside is maintaining `setup.py`. I keep it very minimal: just the package name, the `py.typed` file for mypy, and the dependencies, since I'm not planning to upload to PyPI (and no one really looks at the classifiers or reads the documentation on PyPI anyway).
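For illustration, here's a minimal sketch of what such a `setup.py` can look like (the package name, layout and dependency below are placeholders, not the actual contents of any of the repos mentioned here):

```python
# setup.py -- minimal sketch; name/layout/dependencies are placeholders
from setuptools import setup, find_packages

setup(
    name='someexport',                           # hypothetical package name
    package_dir={'': 'src'},                     # assuming a src/ layout
    packages=find_packages('src'),
    package_data={'someexport': ['py.typed']},   # PEP 561 marker so mypy picks up the types
    install_requires=[
        'requests',                              # whatever the DAL actually needs
    ],
)
```

With something like that in place, both `pip install git+<github repo url>` and `pip install --editable .` work as described above.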
UPD: also it seems that for proper mypy integration it's necessary to have an `__init__.py` (see my comment here: https://github.com/karlicoss/endoexport/blob/be5084aa45aaac206ff86624244f40c08b439340/src/endoexport/__init__.py#L1-L5). If anyone knows how to get around this, please let me know!
Related discussions:
Migrated:
- https://github.com/karlicoss/emfitexport
- https://github.com/karlicoss/endoexport
- https://github.com/karlicoss/rescuexport
- @seanbreckenridge started with a pip package structure straight away, for example: https://github.com/seanbreckenridge/pushshift_comment_export
Tagging @seanbreckenridge as you were interested in that too, let me know if you have some thoughts!
great!
I don't think you were planning to, but since these aren't being pushed to PyPI, I don't think any of the modules should be automatically installed in HPI's `setup.py` (installing from git using `setuptools` is a bit annoying in any case), despite that being a possibility now.
Similar to the comment here ("If you only have few modules set up, lots of them will error for you, which is expected, so check the ones you expect to work."), if you haven't run `pip install git+https://github.com/karlicoss/...` for `rexport`, you shouldn't expect the module (`my.reddit`) to work 'out of the box'.
Some of the docs here will probably have to be modified, to clarify that modules (e.g. `my.<module>`) can either be:
- just individual `.py` files parsing data exports from other applications (like `my.smscalls`)
- private modules specified as ordinary symlinks in the repos directory (like `~/.config/my/my/config/repos/endoexport`) (though I think the only reason someone would do this is if they wanted a private module, but didn't want to go through the effort of writing the minimal `setup.py` file)
- packages hosted on your/someone else's GitHub, installed using `setuptools`
- (at least for now) modules like `ghexport`/`rexport` currently, which use `importlib`
I don't think it'd be common for someone to use `importlib` for their own private modules, so once `ghexport`/`rexport` are converted, that approach (and the corresponding pattern of describing the cloned location as an attr in `my.config`) could be removed.
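For reference, the `importlib` pattern boils down to something like the following sketch (the helper name and paths are made up for illustration; the actual code in HPI and the modules differs):

```python
# sketch: load a DAL from a cloned repo without it being pip-installed
import importlib.util
from pathlib import Path


def import_dal(repo: Path, name: str = 'dal'):
    """Import <repo>/dal.py as a standalone module."""
    spec = importlib.util.spec_from_file_location(name, repo / 'dal.py')
    assert spec is not None and spec.loader is not None
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# the repo path would typically come from an attribute on my.config,
# e.g. pointing at a local clone of rexport
dal = import_dal(Path('~/code/rexport').expanduser())
```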
In any case, I think it's good that all these choices exist, but adding a minimal `setup.py` is nice for the reasons you mentioned above.
> I don't think any of the modules should be automatically installed in HPI's setup.py
Yep, absolutely! I might add them as `extras_require` at first, merely for convenience (e.g. like in promnesia), but ultimately I feel like the modules should "declare" which extra pip packages they need (+ this would allow for nicer `hpi doctor` integration, as discussed).
And also, considering the plan is to eventually figure out the core and split out the third-party modules, better not to pollute the default dependencies anyway!
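For illustration, extras in HPI's own `setup.py` could look roughly like this (the extra name, URL and dependencies below are placeholders, not a commitment to any particular layout):

```python
# sketch of optional extras in HPI's setup.py -- names are illustrative only
from setuptools import setup, find_namespace_packages

setup(
    name='HPI',
    packages=find_namespace_packages('.', include=['my*']),  # simplified
    install_requires=['appdirs'],  # core dependencies only (illustrative)
    extras_require={
        # installed via e.g. `pip install 'HPI[endomondo]'`
        'endomondo': ['endoexport @ git+https://github.com/karlicoss/endoexport'],
    },
)
```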
> Private modules specified as ordinary symlinks in repos directory
I think I'll just get rid of this for the sake of simplicity (+ keep some backwards compatibility for those who already set it up like that). If someone wants it, they can always symlink directly into their virtualenv, or the user package directory, or pass it via `PYTHONPATH` -- lots of ways!
> I think I'll just get rid of this for the sake of simplicity
>
> they can always symlink directly into their virtualenv
Yeah, if I had to pick, this was the one I was leaning towards removing as well. Once `rexport`/`ghexport` exist, one can always use those as a template for their own personal `setup.py` as well. Perhaps you could create and link to a cookiecutter template (like the one I use for small libraries) for creating an 'HPI setuptools module', as that's the direction you're moving towards for modules. That could include some instructions on how to install a personal module that someone creates for themselves using `setuptools` into their global/virtualenv environment.
All right, I converted a few more modules:
- karlicoss/instapexport#3
- karlicoss/rexport#11
- karlicoss/pockexport#4
- karlicoss/ghexport#5
- karlicoss/hypexport#6
I also added HPI support with a backwards-compatible fallback: #83
I've tested it for a bit and it seems fine, but I'll leave it for a couple of days just to be more sure, then update the docs and merge everything in one go.
Right! So, I added a thing to HPI that can parse the requirements (via the `ast` module, so there won't be any import errors or anything). So now it's possible to use something like `hpi module install my.endomondo`, and it will install the required dependencies. Or run `hpi module requires my.endomondo`, which will dump the dependencies to stdout (in case the user has some custom install process for dependencies). Seems like a reasonable compromise without forcing any special plugin architecture, so I guess it's okay to close this now.
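For the record, the `ast`-based requirements parsing amounts to roughly the following sketch (assuming a module declares its dependencies in a module-level `REQUIRES` list; the real HPI implementation differs in the details):

```python
# sketch: extract a module-level REQUIRES list without importing the module,
# so missing dependencies can't cause import errors
import ast
from pathlib import Path
from typing import List


def parse_requires(path: Path) -> List[str]:
    tree = ast.parse(path.read_text())
    for node in tree.body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == 'REQUIRES':
                    assert isinstance(node.value, (ast.List, ast.Tuple))
                    return [ast.literal_eval(elt) for elt in node.value.elts]
    return []


# the resulting list can then be passed to pip, which is roughly
# what `hpi module install my.endomondo` does
print(parse_requires(Path('my/endomondo.py')))
```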