MovingPandas: Software Submission for Review
anitagraser opened this issue · 83 comments
Submitting Author: Anita Graser (@anitagraser)
All current maintainers: Anita Graser (@anitagraser)
Package Name: MovingPandas
One-Line Description of Package: Trajectory classes and functions built on top of GeoPandas
Repository Link: https://github.com/movingpandas/movingpandas
Version submitted: 0.2
Editor: Jenny Palomino (@jlpalomino)
Reviewer 1: Ivan Ogasawara (@xmnlab)
Reviewer 2: Martin Fleischmann (@martinfleis)
Archive:
JOSS DOI: N/A
Version accepted: v 0.3.rc1
Date accepted (month/day/year): 03/19/2020
Description
- Include a brief paragraph describing what your package does:
MovingPandas is a package for dealing with movement data. MovingPandas implements a Trajectory class and corresponding methods based on GeoPandas. A trajectory has a time-ordered series of point geometries. These points and associated attributes are stored in a GeoDataFrame. MovingPandas implements spatial and temporal data access and analysis functions (covered in the open access publication [0]) as well as plotting functions.
A usage example is available at http://exploration.movingpandas.org,
[0] Graser, A. (2019). MovingPandas: Efficient Structures for Movement Data in Python. GI_Forum ‒ Journal of Geographic Information Science 2019, 1-2019, 54-68. doi:10.1553/giscience2019_01_s54. URL: https://www.austriaca.at/rootcollection?arp=0x003aba2b
Scope
- Please indicate which category or categories this package falls under:
- Data retrieval
- Data extraction
- Data munging
- Data deposition
- Reproducibility
- Geospatial
- Education
- Data visualization*
* Please fill out a pre-submission inquiry before submitting a data visualization package. For more info, see this section of our guidebook.
- Explain how the and why the package falls under these categories (briefly, 1-2 sentences):
Geospatial (primary): The MovingPandas Trajectory class implements is a spatio-temporal data model for movement data.
Data visualization (secondary): The implemented plot functions enable straight-forward movement data exploration that goes beyond plotting the individual point locations by ensuring that trajectories are represented by linear segments between consecutive points.
- Who is the target audience and what are scientific applications of this package?
Movement data / trajectories appear in many different scientific domains, including physics, biology, ecology, chemistry, transport and logistics, astrophysics, remote sensing, and more.
For example, the provided tutorials cover the analysis of migrating birds as well as the analysis of ship movement within a port.
- Are there other Python packages that accomplish the same thing? If so, how does yours differ?
scikit-mobility is a similar package which is also in an early development stage and also deals with movement data. They implement TrajectoryDataFrames and FlowDataFrames on top of Pandas instead of GeoPandas. There is little overlap in the covered use cases and implemented functionality (comparing MovingPandas tutorials and scikit-mobility tutorials). MovingPandas focuses on spatio-temporal data exploration with corresponding functions for data manipulation and analysis. scikit-mobility on the other hand focuses on computing human mobility metrics, generating synthetic trajectories and assessing privacy risks.
- If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or
@tagthe editor you contacted:
Technical checks
For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:
- does not violate the Terms of Service of any service it interacts with.
- has an OSI approved license
- contains a README with instructions for installing the development version.
- includes documentation with examples for all functions.
- contains a vignette (notebook) with examples of its essential functions and uses.
- has a test suite.
- has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.
Publication options
- Do you wish to automatically submit to the Journal of Open Source Software? If so:
JOSS Checks
- The package has an obvious research application according to JOSS's definition in their submission requirements. Be aware that completing the pyOpenSci review process does not guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
- The package is not a "minor utility" as defined by JOSS's submission requirements: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
- The package contains a
paper.mdmatching JOSS's requirements with a high-level description in the package root or ininst/. - The package is deposited in a long-term repository with the DOI:
Note: Do not submit your package separately to JOSS
Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?
This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.
- Yes I am OK with reviewers submitting requested changes as issues to my repo. Reviewers will then link to the issues in their submitted review.
Code of conduct
- I agree to abide by pyOpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.
P.S. Have feedback/comments about our review process? Leave a comment here
Editor and Review Templates
hi @anitagraser !! thank you again for this submission. it will be on our discussion list for this thursday's pyopensci meeting! can you think of any folks who might be well suited to review this package? we will need 2 people.
Thank you @lwasser! I think GeoPandas developers would be a good fit.
@jlpalomino will be the fearless editor for this submission !! And @xmnlab will be our first reviewer. We will reach out to the geopandas folks. @martinfleis would you be interested in being a second reviewer for moving pandas? please let us know!
@lwasser I would love to do that, but not sure how fast I'd be. What is the timeframe?
hey @martinfleis we understand. we typically ask for a 3 week turn around on reviews. Would that timeframe work for you or is that too quick? Many thanks for responding so quickly!
@lwasser that seems to be doable. Count me in.
awesome!! Thanks @martinfleis for doing this!!
Editor checks:
- Fit: The package meets criteria for fit and overlap.
- Automated tests: Package has a testing suite and is tested via Travis-CI or another CI service.
- License: The package has an OSI accepted license
- Repository: The repository link resolves correctly
- Archive (JOSS only, may be post-review): The repository DOI resolves correctly
- Version (JOSS only, may be post-review): Does the release version given match the GitHub release (v1.0.0)?
Editor comments
Thanks @xmnlab and @martinfleis for agreeing to review MovingPandas. Please use the following resources to submit your review:
The submitting author is open to receiving issues and PRs if you want to create a review using that approach (e.g. include links to the issue and/or PR in your review).
Feel free to reach out with any questions about the review process.
Reviewers: Ivan Ogasawara (@xmnlab) and Martin Fleischmann (@martinfleis)
Due date: February 7th, 2020
Package Review
- As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).
Documentation
The package includes all the following forms of documentation:
-
A statement of need clearly stating problems the software is designed to solve and its target audience in README
-
Installation instructions: for the development version of package and any non-standard dependencies in README
- See the discussion in movingpandas/movingpandas#49.
-
Vignette(s) demonstrating major functionality that runs successfully locally
-
Function Documentation: for all user-facing functions
- The documentation is not ideal as there are often missing explanations of options for string inputs of available methods, like in
Trajectory.get_position_at()orTrajectory.generalize(). I would personally also prefer having one place with all documentation sources, or at least links between them. Some parts are in Readme, some in examples and other on RTD.
- The documentation is not ideal as there are often missing explanations of options for string inputs of available methods, like in
-
Examples for all user-facing functions
-
The repository contains extraordinary Jupyter Notebooks working as a user guide and new are being created. All run locally as well as on mybinder if one wants to play with the data quickly. However, there is no link to all the examples apart from my binder badge. One has to find them in the repository.
-
Minor typo in
3_horse_collar.ipynb: ha is 10 000 not 1 000 m (which is correct at another place).:total_area = total_area[collar_id]/1000
-
-
Community guidelines including contribution guidelines in the README or CONTRIBUTING.
Contribution guidelines seem to be missing. I haven't found them anywhere.
-
Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a
setup.pyfile or elsewhere.
Readme requirements
The package meets the readme requirements below:
- Package has a README.md file in the root directory.
The README should include, from top to bottom:
-
The package name
-
Badges for continuous integration and test coverage, the badge for pyOpenSci peer-review once it has started (see below), a repostatus.org badge, and any other badges. If the README has many more badges, you might want to consider using a table for badges, see this example, that one and that one. Such a table should be more wide than high.
Code coverage is missing, pyOpenSci peer-review is missing, repostatus.org badge is missing.
-
Short description of goals of package, with descriptive links to all vignettes (rendered, i.e. readable, cf the documentation website section) unless the package is small and there’s only one vignette repeating the README.
-
Installation instructions
-
Any additional setup required (authentication tokens, etc)
-
Brief demonstration usage
-
Direction to more detailed documentation (e.g. your documentation files or website).
-
If applicable, how the package compares to other similar packages and/or how it relates to other packages
-
Citation information
There are references to related papers, but it is not clear how should bemovingpandasitself cited.
Functionality
-
Installation: Installation succeeds as documented.
-
Functionality: Any functional claims of the software been confirmed.
-
Performance: Any performance claims of the software been confirmed.
-
Automated tests: Tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
- I'd say that the essential functions are covered, but the overall coverage could be higher. E.g.
movingpandas/trajectory_collection.pyis covered only from 63%. I am aware thatmovingpandas/trajectory_aggregator.pyis new so I assume that its tests are still TBD.
- I'd say that the essential functions are covered, but the overall coverage could be higher. E.g.
Name Stmts Miss Cover
----------------------------------------------------------------------
movingpandas/__init__.py 6 0 100%
movingpandas/geometry_utils.py 45 4 91%
movingpandas/overlay.py 152 12 92%
movingpandas/tests/__init__.py 0 0 100%
movingpandas/tests/test_geometry_utils.py 41 0 100%
movingpandas/tests/test_overlay.py 78 0 100%
movingpandas/tests/test_trajectory.py 208 0 100%
movingpandas/tests/test_trajectory_collection.py 54 0 100%
movingpandas/trajectory.py 298 29 90%
movingpandas/trajectory_aggregator.py 229 192 16%
movingpandas/trajectory_collection.py 113 42 63%
movingpandas/trajectory_plotter.py 82 13 84%
----------------------------------------------------------------------
TOTAL 1306 292 78%
-
Continuous Integration: Has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.
- I would recommend adding one testing environment with dev versions of key packages (geopandas, shapely) to check for the potential issues with new versions soon enough to fix them. Testing under Windows (e.g. AppVeyor) could also be helpful as there are differences in the behaviour of Python geospatial stack between OS.
-
Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines.
For packages co-submitting to JOSS
- The package has an obvious research application according to JOSS's definition in their submission requirements.
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.
The package contains a paper.md matching JOSS's requirements with:
- A short summary describing the high-level functionality of the software
- Authors: A list of authors with their affiliations
- A statement of need clearly stating problems the software is designed to solve and its target audience.
- References: with DOIs for all those that have one (e.g. papers, datasets, software).
Final approval (post-review)
- The author has responded to my review and made changes to my satisfaction. I recommend approving this package.
Estimated hours spent reviewing: 5
Review Comments
MovingPandas is a valuable addition to python geospatial stack. Being built on top of GeoPandas GeoDataFrames, its main classes are easy to understand, and the whole work with MovingPandas is very natural and straightforward. I had almost no issues in using it with my data, and everything works as advertised.
Initially, I was a bit confused by released versions of MovingPandas as when I started there was no release on GitHub and PyPI had different version than conda-forge. I would recommend following JOSS recommendation here and trying to keep these 3 (GitHub, PyPI, conda-forge) in sync as GitHub releases automatically send a notification to watching users.
During the review process, I have opened a couple of issues/PRs in the original repository, all linked above this post.
I am excited to see the further development of it as the latest addition of trajectory aggregator looks brilliant. I will certainly follow new releases, and once I have to work with movement data, MovingPandas will be the first choice.
Thanks a lot for the thorough review and great feedback, Martin! I'll work on the open issues.
I've been looking for the badge for pyOpenSci peer-review but haven't been able to locate one for ongoing peer review.
I've been looking for the badge for pyOpenSci peer-review but haven't been able to locate one for ongoing peer review.
That is more the question for @lwasser and @jlpalomino, I just copied review template.
Thanks @martinfleis for your review.
@anitagraser I also was not able to find the badge details in our review guide, so I have made a note to look into where we provide this info.
Here is the badge for pyOpenSci peer review, with the second link being the URL to this issue:
[](https://github.com/pyOpenSci/software-review/issues/18)
If fixed the remaining README issues: badges movingpandas/movingpandas#53 and citation information movingpandas/movingpandas@56ef608
The last open issue from Martin's review should be the Contribution guidelines.
I am planning to review MovingPandas today :)
Contribution guidelines are now available at https://github.com/anitagraser/movingpandas/blob/master/CONTRIBUTING.md
@anitagraser when your package has fully passed both reviews and both reviewers are happy with your addressing their requested changes, we will ask you to add the badge to the readme!! please get in touch with any other questions. @martinfleis THANK YOU for this review!!
one other question @anitagraser are you interested in JOSS? i see you didn't check the box. Joss only requires you to write a very short paper about the package (i can show you the earthpy example) . They accept the pyopensci technical review by default. no worries if you are not interested... but it's a nice citation to have if you are (linked to your orcid id and such).
@lwasser Thank you. Do I understand correctly that I should remove the pyopensci badge from the MovingPandas readme again? Is there a different badge to fulfill the requirement in the review template "the badge for pyOpenSci peer-review once it has started"?
Concerning JOSS, I have been thinking about it but wasn't sure if JOSS sees prior publications as an obstacle:
Graser, A. (2019). MovingPandas: Efficient Structures for Movement Data in Python. GI_Forum ‒ Journal of Geographic Information Science 2019, 1-2019, 54-68. doi:10.1553/giscience2019_01_s54. URL: https://www.austriaca.at/rootcollection?arp=0x003aba2b
@lwasser After looking at the earthpy paper you mentioned, I think there should be minimal overlap with the existing MovingPandas paper in GI_Forum. So yes, I'd like to try a JOSS submission.
Work in progress: https://github.com/anitagraser/movingpandas/tree/joss-paper
I will finish my review today :)
@anitagraser that's great to hear!! ok so the process is pretty straight forward. Once the review is done and moving pandas is accepted here, I will ping our JOSS friends and they will help you through the JOSS process. this all looks great. the only challenge i had with JOSS was small formatting issues when it published to PDF. There is a nice tool however that is helpful. so when we get there ... I can help!
Also you are the second package maintainer that has asked about other publications. i will email arfon tonight!! Also let me look into the badge specifics. it's not a huge deal but typically we've had people add the badge post review. I'll get back to you anita!!
ok @anitagraser @arfon actually responded here: #16 (comment) on another review. he did say that if you have another citation already that you want to use to cite your software, then you probably don't need to worry about joss. i know you have already begun work on the joss paper... but see his comment and let us know what you think! thank you for your patience. we are still working through some of the details of our review process in these early stages
Package Review
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
- As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).
Documentation
The package includes all the following forms of documentation:
-
A statement of need clearly stating problems the software is designed to solve and its target audience in README
-
Installation instructions: for the development version of package and any non-standard dependencies in README
- Minor issue: For the development installation, comment also how to install
movingpandasfor development, e.g.:python setup.py developorpip install --no-deps -e .
- Minor issue: For the development installation, comment also how to install
-
Vignette(s) demonstrating major functionality that runs successfully locally
- it uses jupyter notebooks.
- in the tutorial 0, maybe "Option 2: Creating trajectories with TrajectoryManager" should be "Option 2: Creating trajectories with TrajectoryCollection"
-
Function Documentation: for all user-facing functions
- There are some public functions/methods that don't have docstring. For example:
- https://github.com/anitagraser/movingpandas/blob/fb97166309d8bd7b6f578f060a50e691c1b65cc0/movingpandas/trajectory_collection.py#L127
- https://github.com/anitagraser/movingpandas/blob/fb97166309d8bd7b6f578f060a50e691c1b65cc0/movingpandas/trajectory_plotter.py#L65
- https://github.com/anitagraser/movingpandas/blob/fb97166309d8bd7b6f578f060a50e691c1b65cc0/movingpandas/trajectory_plotter.py#L73
- https://github.com/anitagraser/movingpandas/blob/fb97166309d8bd7b6f578f060a50e691c1b65cc0/movingpandas/trajectory_plotter.py#L85
- https://github.com/anitagraser/movingpandas/blob/fb97166309d8bd7b6f578f060a50e691c1b65cc0/movingpandas/trajectory_plotter.py#L96
You can use some tool to check the docstrings, e.g.: pydocstyle ( pydocstyle --convention=numpy ) or maybe numpydoc (https://numpydoc.readthedocs.io/en/latest/validation.html).
- There are some public functions/methods that don't have docstring. For example:
-
Examples for all user-facing functionsExample section inside docstring function is a very good way to explain how to use it, also it will be available inside the documentation web-page. Currently, all the examples provided are inside the tutorials.
-
Community guidelines including contribution guidelines in the README or CONTRIBUTING.
-
Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a
setup.pyfile or elsewhere.
Readme requirements
The package meets the readme requirements below:
- Package has a README.md file in the root directory.
The README should include, from top to bottom:
- The package name
- Badges for continuous integration and test coverage, the badge for pyOpenSci peer-review once it has started (see below), a repostatus.org badge, and any other badges. If the README has many more badges, you might want to consider using a table for badges, see this example, that one and that one. Such a table should be more wide than high.
- Short description of goals of package, with descriptive links to all vignettes (rendered, i.e. readable, cf the documentation website section) unless the package is small and there’s only one vignette repeating the README.
- Installation instructions
- Any additional setup required (authentication tokens, etc)
- Brief demonstration usage
- Direction to more detailed documentation (e.g. your documentation files or website).
- If applicable, how the package compares to other similar packages and/or how it relates to other packages
- Not found in the README file.
- Citation information
Functionality
- Installation: Installation succeeds as documented.
pip install movingpandasdidn't work
- Functionality: Any functional claims of the software been confirmed.
- Performance: Any performance claims of the software been confirmed.
- Automated tests: Tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
- Continuous Integration: Has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.
- Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines.
For packages co-submitting to JOSS
- The package has an obvious research application according to JOSS's definition in their submission requirements.
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.
The package contains a paper.md matching JOSS's requirements with:
- A short summary describing the high-level functionality of the software
- Authors: A list of authors with their affiliations
- A statement of need clearly stating problems the software is designed to solve and its target audience.
- References: with DOIs for all those that have one (e.g. papers, datasets, software).
Final approval (post-review)
- The author has responded to my review and made changes to my satisfaction. I recommend approving this package.
Estimated hours spent reviewing: 5h30
Review Comments
Review based on the version with the latest commit fb97166
- In the README file, maybe it would be good if
[0]and[1]would be linked to thePublicationssection. - in the "Tutorial 0", section "Continue exploring MovingPandas", maybe it would be better to have just a link to the tutorial (1).
- in the "Tutorial 0", probably "Option 2: Creating trajectories with TrajectoryManager" should be "Option 2: Creating trajectories with TrajectoryCollection"
- the "Tutorial 3", doesn't have a section "Continue exploring MovingPandas".
- the installation using pip failed.
- explain how to install
movingpandasin development mode. -
add missing docstrings for user-facing functions/methods. -
addExamplessection in the user-facing functions/methods docstrings. - add into README file how
movingpandascompares to other similar packages - the title of the tutorials 3 and 4 doesn't match to the name provided in the tutorial list in the "Tutorial 0"
- check why
pip install movingpandasdidn't work
Suggestions (optional):
- on CI, check if the code fit in the PEP 8 style (for example, using flake8).
- it would be good to have all tutorials statically inside the documentation page (eg. https://pandas.pydata.org/pandas-docs/stable/getting_started/basics.html, https://docs.ibis-project.org/tutorial.html)
- also each static tutorial page could have a link to mybinder (eg. https://sunpy.org/posts/2018/2018-07-21-coronal-loop-coordinates)
- as it is still common to have Python 3.6 environment and it is not supported by
movingpandas, it would be good to comment in the documenation the Python versions supported. - CI maybe could cover the MacOS and MS Windows. Also it could cover all Python version supported.
- in the tutorials, add to the charts a title, a description for each axis and its respectively unit and legends.
- in the "Tutorial 4", add the description about this notebook.
PS/1: Examples for all user-facing functions is not required anymore.
PS/2: new note: that it is NOT recommended to install MovingPandas from PyPI
@anitagraser it is a interesting project!
I have added the result of my review above. Let me know if you need any additional explanation or detail!
Thank you very much for the extensive feedback @xmnlab! It's very clear. I have just one question so far concerning "explain how to install movingpandas in development mode.": the README already has a section titled Development Installation with the necessary conda instructions. Are you missing something specific?
Status of improvements:
- In the README file, maybe it would be good if [0] and [1] would be linked to the Publications section. movingpandas/movingpandas@f1da795
- in the "Tutorial 0", section "Continue exploring MovingPandas", maybe it would be better to have just a link to the tutorial (1). movingpandas/movingpandas@1b1ee2e
- in the "Tutorial 0", probably "Option 2: Creating trajectories with TrajectoryManager" should be "Option 2: Creating trajectories with TrajectoryCollection" movingpandas/movingpandas@f3b44ad
- the "Tutorial 3", doesn't have a section "Continue exploring MovingPandas". movingpandas/movingpandas@1b1ee2e
- the installation using pip failed. movingpandas/movingpandas@eb33d4d
- explain how to install movingpandas in development mode. movingpandas/movingpandas@9128ea0
- add missing docstrings for user-facing functions/methods. movingpandas/movingpandas@a0607d1
- add Examples section in the user-facing functions/methods docstrings.
- add into README file how movingpandas compares to other similar packages movingpandas/movingpandas@1f58c11
- the title of the tutorials 3 and 4 doesn't match to the name provided in the tutorial list in the "Tutorial 0" movingpandas/movingpandas@1b1ee2e
- check why pip install movingpandas didn't work movingpandas/movingpandas@eb33d4d
@lwasser I've removed the review badge for now (after reading the last pyOpenSci meeting notes). Concerning JOSS: I think it would still be worthwhile to publish in JOSS since the current main paper is in a very specific venue with a very different audience.
@anitagraser thanks for checking in about this and for your action! We are now working on a PR to the dev guide (pyOpenSci/software-peer-review#42) to address the timing of when to add the badge as well as to include the badge information for when the package is accepted.
@anitagraser about the "Development Installation" it explains how to prepare the environment but I didn't find the command line for install movingpandas in development mode, it could be, for example, python setup.py develop or pip install --no-deps -e . or if you want to just point to some nice documentation/tutorial that explains more about development mode you can add just a link to that page (eg.: https://python-packaging-tutorial.readthedocs.io/en/latest/setup_py.html#develop-mode) or also you can do both
@xmnlab Thank you! I wasn't aware of this develop mode. That's what caused the confusion, since my dev setup so far is finished when the environment is ready.
I've added usage examples to all user-facing functions except those that are self-explanatory (because they have either no parameters or have only simple parameters that are sufficiently described in the corresponding parameters section). I've also added direct links to the tutorials to https://movingpandas.readthedocs.io/en/latest/index.html
@anitagraser it looks very nice!
related to your point about the methods self-explanatory, I would add examples for each function/method including for these cases because:
- if you want you can also integrate it with doctest (https://docs.python.org/3/library/doctest.html) in the future and it will help to test your docstring examples and
- for some functions that return more complex data (like dataframe or series) it helps to understand better the data returned (also maybe it would be good to add more details about the data structure (eg: columns names and data types) for functions that return dataframes in the
Returnssection.
@lwasser do you have any thoughts about this point?
actually the editor for this submission is @jlpalomino. do you have any thoughts about this?
If I may add my thoughts here, I would say that there should be a distinction in the pyOpenSci review process between what is considered required change and what is a recommendation only. For example, I did not point out examples in docstrings as I did not feel it is a substantial issue. @xmnlab has a different approach, requiring complete examples ready for doctest.
This is a more general comment, so this issue might not be the right place for such discussion, but I believe that pyOpenSci needs to find a balance in the review process and maybe set some standards (maybe rOpenSci already deals with this?). The review should be valuable and thorough, but not onerous. I know it is hard to find the balance, especially if the submitted package would be of low quality (which is definitely not this case), but we are still in the open source world and we should acknowledge the nature of (largely free) work.
I believe that all the issues @xmnlab are valuable and will help @anitagraser. But it is a bit unclear to me if pyOpenSci review require this detail or if they should actually be recommendations, which are not in a way of publishing MovingPandas on pyOpenSci.
It may sounds as critics of @xmnlab's review, but it is not meant to be one :). I am genuinely curious about the expected level of the review as this is my first experience with pyOpenSci.
@martinfleis actually I am very curious about this too :) as the guideline says that all facing-user function should have examples (https://www.pyopensci.org/dev_guide/packaging/packaging_guide.html#Documentation) so using just tutorials for the examples is hard to check if all functions have an example inside. also from the user's perspective, If I want to see how to use a method, generally I will go to the API documentation not the tutorials.
examples inside the docstring will be published automatically inside the documentation (if the project generate the documentation from the code).
maybe I didn't express me well about the doctest ... I didn't want to suggest to use that .. I just want to comment that also it could help in the future .. but it is not required. the main point about the examples was the item 2.
But as required by https://www.pyopensci.org/dev_guide/packaging/packaging_guide.html#Documentation my first guess is that all public functions should have an example inside the docstrings or on the API documentation.
@jlpalomino could you guide us about this topic?
@xmnlab good point. I thank that after reading Good/Better/Best in this section I interpreted it as if it is covered in tutorial it is fine. But you are probably right.
Thank you all for working on clarifying this! I mostly looked at previously accepted projects to guide my work. For example, Pandera has usage examples but not for all user-facing functions: https://pandera.readthedocs.io/en/latest/API.html
@anitagraser @martinfleis @xmnlab This is a great discussion, and I want to encourage you to continue it. I am interjecting here to say that I will discuss these comments further with @lwasser and get back to you with feedback from the editor perspective.
When I started my review I tried to understand what pyOpenSci expects. Based on the review guide, I felt that it tries to be rather inclusive than strict. After reading various Good/Better/Best explanations, my understanding was following:
- The package does not have to be perfect. Various Good options are far from perfect (e.g.
Manually updated documentation as text files that ship with your package.), but they are good enough to send the package to the world. - As a reviewer, I should try to find issues. Those which normal user might face and those developer might face. If they are significant, they should be fixed. If they are not significant or will become significant in a year (deprecation etc.), it is fine to point them out and leave developer to fix them later.
- I should make sure that every aspect of the package is at least at a Good level. Then I can make recommendations how to get to Better or Best level, but these, again, should not be in a way of acceptance.
- If everything is at least at a Good level and I gave my recommendations how to make it even better, I should approve the package for pyOpenSci publication.
These were (roughly) the rules I applied to my review. I would be happy to hear from @lwasser and @jlpalomino if they fit the aim of pyOpenSci. How strict do you want to be in the review?
Hi @anitagraser @martinfleis @xmnlab thanks for your patience in this process.
@lwasser and I have decided that this is a good topic of conversation during the next pyOpenSci meeting, which will occur next Fri 2/28/20 at 11am MDT (details on discourse). We invite you all to attend if you are able.
Hi @anitagraser @martinfleis @xmnlab - an update that I have posted on discourse ahead of the community meeting tomorrow at 11am MDT.
You have all brought up great points, and as the editor, I am looking forward to getting input from the community regarding "Good, "Better", "Best" and recommended vs required changes.
hi @anitagraser @martinfleis @xmnlab i have added some language to the post based upon our meeting today. how much documentation is good enough? https://pyopensci.discourse.group/t/good-better-best-required-versus-recommended-changes-by-reviewers-your-feedback-is-requested/160/2 same link as the one jenny provided above but with a new post from our community call today. Do you guys have any thoughts on this? i really appreciate the feedback as we define what our standards are for the community!
Thank you for the update, @lwasser ! If I understand the intent of the discussion on discourse, having examples for all user-facing functions is not going to be a must.
Besides trying to find the best wording for the guidelines, how should we proceed on this review?
If I haven't missed anything, my work on the requested improvements is finished.
@anitagraser what about JOSS submission? Is it happening in the end? If you want to submit to JOSS we’ll have to review the paper and tick couple of other boxes.
@martinfleis Right! For JOSS, I'll still have to get a DOI.
I might bump the MovingPandas version to 0.3. (Movingpandas 0.2 works for geopandas 0.6.3. MovingPandas 0.3 works with GeoPandas 0.7)
The paper is in a dedicated brach for now: https://github.com/anitagraser/movingpandas/blob/joss-paper/paper.md
Is there anything else?
I've checked the paper and there does not seem to be any issue re JOSS requirements. Just few very minor notes.
- matplotlib reference is missing
- after MyBinder link is missing dot
- JOSS apparently makes a new line in the middle of
code, which in some cases breaks movingpandas, geopandas, and python words. See linked pdf - paper.pdf. You can check how it renders on https://whedon.theoj.org.
I'll formally update my review post to include JOSS boxes, but I am happy with the package and all responses and recommend it for a publication on both pyOpenSci and JOSS.
@martinfleis Thank you for reviewing the paper! I have fixed the first two issues. I don't see how to affect the line break (third issue you mentioned) and found that this issue also appears in recently published papers, such as https://joss.theoj.org/papers/10.21105/joss.02075
Hi @anitagraser, it is correct that based on the recent pyOpenSci guideline discussion, examples for all user-facing functions would not be considered a requirement under the OK/Better/Best framework. For the overall review, if the reviewers felt that the OK requirements have been met and recommend accepting the package, then a package could be accepted to pyOpenSci.
For everyone - we welcome all additional comments on the OK/Better/Best framework and/or the overall acceptance process.
Thanks @martinfleis for expanding your review to include the JOSS tick boxes.
@xmnlab Can you please expand your review to address the tick boxes for the JOSS submission? The paper can be found at: https://github.com/anitagraser/movingpandas/blob/joss-paper/paper.md
Thanks!
@jlpalomino I am updating my review today and I will expand my review for JOSS submission section.
@anitagraser thanks for having working hard on all the recommendations.
MovingPandas documenation looks very good. I just found some minor issues. Some methods' docstring don't have the parameters section:
MovingPandas.TrajectoryCollection:
TrajectoryCollection.add_speedhttps://movingpandas.readthedocs.io/en/latest/trajectorycollection.html#movingpandas.TrajectoryCollection.add_speed
MovingPandas.Trajectory
Trajectory.add_directionhttps://movingpandas.readthedocs.io/en/latest/trajectory.html#movingpandas.Trajectory.add_directionTrajectory.add_speedhttps://movingpandas.readthedocs.io/en/latest/trajectory.html#movingpandas.Trajectory.add_speedTrajectory.to_crshttps://movingpandas.readthedocs.io/en/latest/trajectory.html#movingpandas.Trajectory.to_crs
I am starting to review JOSS section now. thanks again!
Some suggestions related to docstrings (not required):
- Inside
Parameterssection, you can useoptionalanddefaultinformation to add more details about the parameter, for example:
x : int, optional
Some description of parameter `x` (the default is -1, which implies summation over all axes).
ref: https://numpydoc.readthedocs.io/en/latest/format.html
- When a parameter can only assume one of a fixed set of values, those values can be listed in braces, with the default appearing first, for example:
order : {'C', 'F', 'A'}
JOSS Review section:
It looks very good! I just noticed a minor issue, also mentioned by @martinfleis , related to some word break like GeoPandas, MovingPandas.
thanks for working on that @anitagraser , let me know if I can help in anyway.
MovingPandas documenation looks very good. I just found some minor issues. Some methods' docstring don't have the
parameterssection:
Thank you, @xmnlab! I've added the missing parameter sections: movingpandas/movingpandas@b98b404
v0.3rc1 has landed on conda-forge https://anaconda.org/conda-forge/movingpandas
awesome! @anitagraser thanks for letting us know about the new release :)
Final approval (post-review)
- The author has responded to my review and made changes to my satisfaction. I recommend approving this package.
Thanks @anitagraser for working on the recommendations and congrats for the hard work you are doing on MovingPandas!
Thank you @martinfleis and @xmnlab for your final reviews of MovingPandas. We also really appreciate all of your feedback on the review process.
Great work @anitagraser! Both reviewers have provided their final approvals for acceptance to pyOpenSci. I will follow up with next steps shortly. Thanks.
Thank you for all you feedback @xmnlab, much appreciated!
Hi @anitagraser you may have seen that I have followed up with @martinfleis to tie up loose ends related to two open issues from his review.
One item is for you is about versions. The last JOSS check for the editor is: Version (JOSS only, may be post-review): Does the release version given match the GitHub release (v1.0.0)?
It seems that there is a slight difference between the GitHub version 0.3.rc2 versus Zenodo version v0.3.rc1 and conda forge version v0.3.rc1. Please let me know if I am misinterpreting this information. Thanks!
Thank you @jlpalomino!
Concerning versions: in my experience, it is common practice to increase the version in Github immediately after a release. Think of it as a necessary step for starting work on the next release.
@jlpalomino I am happy with all changes. I thought I indicated it clearly above, but apparently not enough. There is nothing to be resolved from my side.
@anitagraser versions should be ideally in sync between GitHub, PyPI, conda-forge and zenodo.
I've reverted the version number to rc1 movingpandas/movingpandas@bba76fe
Thanks @martinfleis for your prompt responses about those issues (for final documentation purposes) and for your response related to the versions.
@anitagraser thanks for your action on the version. I checked in with @lwasser about the version difference as well. We have typically followed the version sync suggested by @martinfleis, that the GitHub repository remains the same version and only changes when there is a new release. However, we understand that there could be commits happening before the new release. She has suggested that we open a discussion on discourse to see what others think is best practice for versions.
That said, let's start moving forward with closing this review, while we wait for the community to weigh in. I will post a new comment with the next steps.
MovingPandas has been approved for pyOpenSci! Thanks @anitagraser for this submission and @martinfleis and @xmnlab for your detailed reviews!
@anitagraser Here are the next steps:
- Add the badge for pyOpenSci peer review, with the second link being the URL to this issue:
- Submit a PR to
pyopensci.github.ioto add MovingPandas to the pyOpenSci package list and to add yourself as a contributor using:
contributor_type:
- package-maintainer
and
packages-submitted: ["movingpandas"]
If you are interested (not required), you can write a blog post for the pyOpenSci website about MovingPandas (see blog post about pandera) to promote your package.
The last action item is for me to get the process started with JOSS, who will provide more information on their process. I will do that in a new comment.
Please feel free to let me know if you have any questions, and congrats again.
Hi @arfon pyOpenSci has approved MovingPandas. @anitagraser is interested in JOSS publication, and the draft paper has been reviewed by the pyOpenSci reviewers.
@anitagraser has another publication that she feels demonstrates minimal overlap:
Graser, A. (2019). MovingPandas: Efficient Structures for Movement Data in Python. GI_Forum ‒ Journal of Geographic Information Science 2019, 1-2019, 54-68. doi:10.1553/giscience2019_01_s54.
Please feel free to contact @anitagraser (or myself if needed) for any additional information needed for the JOSS review process. Thanks!
@anitagraser - feel free to open an issue on the JOSS repository about this paper. On initial inspection I would say that JOSS would not accept a paper about MovingPandas as the earlier publication looks to be describing essentially the same software.
Thank you for your feedback, @arfon!
As mentioned in #18 (comment), I wasn't sure if prior publications would be an obstacle. Graser (June 2019) describes the concepts underlying MovingPandas and presents an early unreleased version of the library. (The first release was published later in Sept 2019 https://pypi.org/project/movingpandas/#history.) MovingPandas has evolved considerably since then.
If you think that this is clearly against JOSS requirements, I won't pursue it further.
@anitagraser - could you summarize the major changes between MovingPandas when Graser (2019) was submitted compared to how it is today?
We do allow for multiple papers for the same piece of software but would expect at least a major release of the software to warrant an additional paper.
@arfon Thank you for the clarification!
Graser (June 2019) only describes the Trajectory class and it's data handling functions.
The key improvements since then are:
- New data structure
- TrajectoryCollection class: handles sets of Trajectories and enables convenient exploration and easier access to individual trajectories
- New functionality
- Plotting of Trajectories and TrajectoryCollections
- Static plots using Matplotlib
- Interactive plots using Holoviews
- Aggregations for TrajectoryCollections
- Plotting of Trajectories and TrajectoryCollections
- New project infrastructure
- Releases on pypi and conda-forge
- Docs on readthedocs.io
- Extensive tutorial notebooks
- Code coverage monitoring
- Automatic testing with Travis CI
@arfon Do you suggest moving this over to the JOSS repository?
This is interesting because we need to define what is a major improvement. It seems to me that infrastructure improvements are excellent but they wouldn't warrant a new publication. Releases on pypi and conda forge, etc would not warrant a new publication nor would docs altho we do want to encourage docs for all packages in the review!!
so the question becomes is the plotting and aggregation functionality listed above enough to justify a new publication ? @arfon do we need to bring anyone else in here to help with the decision for JOSS or should i chat with the ropensci folks? We understand if this is not enough of a new release to justify a new publication. We want to ensure that software publication via JOSS is robust and there is not duplication of ideas published. We also want to continue a healthy working relationship with JOSS!!
Hi @anitagraser and @lwasser, apologies for the slow reply here. @anitagraser - thanks for documenting the changes that you've made to this package since the last paper was published.
After giving this some thought, I don't think we would accept this as a JOSS submission. Partly because of the reason that @lwasser mentions (infrastructure improvements are obviously useful/important but not actually new functionality in the software) but also that the changes in the TrajectoryCollection and supporting classes looks to be relatively modest and should they be part of a standalone package would likely fall into our 'minor utility' category.
This is interesting because we need to define what is a major improvement.
I completely agree. Unfortunately JOSS doesn't have docs to help with this. I'd be open to having a broader discussion to try and define this.
We want to ensure that software publication via JOSS is robust and there is not duplication of ideas published.
Thank you, it's much appreciated and sorry again for the delayed response here.
@arfon Thank you for the clarification!
@arfon thank you . we understand. and we'd be happy to participate in a discussion surrounding what defines a major improvement. It will make things simpler for future reviews as we can provide that information to the maintainer at the beginning if they are interested in JOSS as well!
@anitagraser i think we can close this issue. Are you interested / do you have the time to write a blog post - similar to what Niels did for pandera:
https://www.pyopensci.org/blog/pandera-python-pandas-dataframe-validation
no pressure as i know there is a lot going on now AND you are welcome to take your time doing it. We will then put the word about about your package and link to the post which describes it in more detail! let me know what you think!
@martinfleis We noticed that you are not listed on our contributors page. When you get a chance, can you please submit a PR to pyopensci.github.io to add yourself as a contributor using:
contributor_type:
- reviewer
Thanks everyone! I am officially closing this issue.
@anitagraser feel free to reach out about a blog post on MovingPandas if you are interested!
yea!! great work everyone getting another package through review!!
hey 👋 @jlpalomino @martinfleis ! I hope that you are all well. I am reaching out here to all reviewers and maintainers about pyOpenSci now that i am working full time on the project (read more here). We have a survey that we'd like for you to fill out so we can:
- invite you to our slack channel to participate in our community (if you wish to join - no worries if that is not how you prefer to communicate / participate).
- Collect information from you about how we can improve our review process and also better serve maintainers.
The survey should take about 10 minutes to complete depending upon how much you decide to write. This information will help us greatly as we make decisions about how pyOpenSci grows and serves the community. Thank you so much in advance for filling it out.
NOTE: this is different from the form designed for reviewers to sign up to review.
If there are other maintainers for this project, please ping them here and ask them to fill out the survey as well. It is important that we ensure packages are supported long term or sunsetted with sufficient communication to users. Thus we will check in with maintainers annually about maintenance.
Thank you in advance for doing this and supporting pyOpenSci.
@xmnlab i pinged you on another issue. you can just type in two packages that you have reviewed or we can sort this out later as well as you are now an editor too!! :)
@anitagraser my apologies i'm copy/ paste efficient today. can you kindly read the above issue ^^^ and fill out the survey 🔗 HERE IS THE SURVEY LINK 🔗
hey there @jlpalomino @xmnlab 👋 Just a friendly reminder to take 5-10 minutes to fill out our survey . We really appreciate it. Thank you in advance for helping us by filling out the survey!! 🙌
✨ Martin and Anita -- thank you so much for taking the time to fill it out 🙌