pyOpenSci/software-submission

PyGMT: A Python interface for the Generic Mapping Tools

weiji14 opened this issue Β· 73 comments

Submitting Author: Wei Ji Leong (@weiji14)
All current maintainers: (@weiji14, @seisman, @maxrjones, @michaelgrund, @willschlitzer, @yvonnefroehlich)
Package Name: PyGMT
One-Line Description of Package: A Python interface for the Generic Mapping Tools
Repository Link: https://github.com/GenericMappingTools/pygmt
Version submitted:
Editor: @lwasser
Reviewer 1: @jbusecke
Reviewer 2: @SimonMolinsky
Archive: Zenodo Archive
Version accepted: V 0.7.0
Date accepted (month/day/year): 9/1/2022


Description

  • Include a brief paragraph describing what your package does:

    PyGMT is a library for processing geospatial and geophysical data and making publication quality maps and figures. It provides a Pythonic interface for the Generic Mapping Tools (GMT), a command-line program widely used in the Earth Sciences.

Scope

  • Please indicate which category or categories this package falls under:

    • Data retrieval
    • Data extraction
    • Data munging
    • Data deposition
    • Data visualization
    • Reproducibility
    • Geospatial
    • Education
    • Unsure/Other (explain below)
  • Explain how the and why the package falls under these categories (briefly, 1-2 sentences). Please note any areas you are unsure of:

    • Data munging βš™οΈ: Several data processing modules are available to work on tabular and grid data (https://www.pygmt.org/v0.4.0/api/index.html#data-processing) including functions like blockmedian, surface, grdtrack, etc
    • Data visualization πŸ—ΊοΈ: Plotting publication quality maps and figures is a strong point of PyGMT (https://www.pygmt.org/v0.4.0/api/index.html#plotting). There is a comprehensive set of tools for plotting 2D and 3D data, both geospatial and non-geospatial. It is also possible to add standard map elements like a colorbar or legend, and make multi-panel subplot figures.
    • Geospatial 🌐: PyGMT is inherently a geospatial package that can handle both vector and raster data. Several map projections are supported (https://www.pygmt.org/v0.4.0/projections/index.html), and there is support for plotting shapefiles and GeoTiff/NetCDF files.
  • Who is the target audience and what are scientific applications of this package?

    • The primary target audience of PyGMT is the geoscience community, including anybody working on fields like Earth Observation, Geophysics, Oceanography, Seismology, Planetary Sciences, etc. The package offers a way for users to perform general geoprocessing tasks and create high quality illustrations of their results for posters or publications.
  • Are there other Python packages that accomplish the same thing? If so, how does yours differ?

    Yes, there are several geospatial Python packages such as:

    These tools are typically focused on one thing only (e.g. plotting maps, vector data, raster data). PyGMT allows users to mix vector and raster data easily, so that users can:

    • Produce a map with points/lines/polygons and raster grids in the same plot, e.g. plot weather station points on top of a raster Digital Elevation Model, while including map elements (e.g. scalebar, compass arrow, colorbar, legend, inset overview maps etc)
    • Extract a series of elevation values along a transect line (vector) from a NetCDF/GeoTiff grid (raster) using grdtrack
    • Convert a table of xyz points (vector) to a 2D grid (raster) using surface via a spline-based interpolation method.

    PyGMT integrates with the PyData ecosystem. It allows users to process and plot data stored in NumPy arrays, Pandas DataFrames, Xarray DataArray/Datasets, GeoPandas GeoDataFrames, as well as from standard file formats like CSV and GeoTiff/NetCDF files. There are also plans to integrate with other scientific libraries like ObsPy (GenericMappingTools/pygmt#967) in the future.

    However, PyGMT can only produce static plots unlike packages like Geoviews which allow for interactive plotting (e.g. panning and zooming). It is also non-trivial to scale out data processing tasks and/or plotting of big datasets (>10GB) that don't fit in RAM to multiple processors/clusters as with libraries like dask-geopandas because of limitations in the GMT C package that PyGMT wraps around.

  • Any other questions or issues we should be aware of?:

    First off, thanks for reading this! We're keen to get some independent reviewers to have a look at the PyGMT package, sort of as a first step before going for a software publication at JOSS/G3 (the team is still deciding). Linking to original discussion at GenericMappingTools/pygmt#677 (comment), cc @leouieda @seisman @meghanrjones.

P.S. *Have feedback/comments about our review process? Leave a comment here

hey there @weiji14 Welcome to pyopensci and please excuse the delayed response! We will review this submission and will get back to you with next steps soon. I should be able to find some folks in the geospatial space to review.
I can also walk you through the JOSS piece. I'd encourage it because once our review is done, JOSS simply accepts our review as theirs so the work you have to do in order to get a cross-ref enabled citation is small. but of course the decision is yours. happy to chat more if you have questions.

Hi @lwasser, no worries on the delay. We've been keen to get some good independent reviews on our package, so take your time with the search!

As for the JOSS piece, I'll need to check with the rest of the team on what they prefer. A few of us have actually edited/reviewed JOSS papers, and I've seen and heard many great things about the review process too. We'll definitely have a good think about this once the review gets underway.

hey there @weiji14 and all involved with this package! i started to go through the editor checks here and notice a disclaimer on your readme that says the package is currently undergoing rapid development. pyOpenSci reviews can be compared to JOSS reviews. Ideally the review happens when the tool has achieved some level of development stability. While we may add a process for a second review if you say added a significant chunk of functionality in the future that wasn't on the radar now, this review can be seen similar to a paper review where the paper is not a draft, it's a fully edited paper. Unlike a paper however software evolves and that is ok. but we wouldn't want version .5 here in pyopensci as an archive and then it moves to JOSS for version 1.0. We'd rather review version 1.x and then push it through JOSS as version 1.x given we collaborate with them closely. Can you kindly let me know what your goals are for this review and what stage this tool is in. I'd love it to go through our process but want to ensure it's ready.
Many thanks.

@weiji14 @leouieda we've been discussing this in an issue. I just want to be transparent and careful about when we review packages. The one specific question i have is related to this statement in your readme:

All of the API (functions/classes/interfaces) is subject to change until we reach v1.0.0 as per the semantic versioning specification. There may be non-backward compatible changes as we experiment with new design ideas and implement new features. This is not a finished product, use with caution.

do you have any planned significant API changes at this point? Or do you feel like the package is relatively stable and ready for a solid review which may help you get to 1.x. many thanks! Once we have this resolved I will kick start the review. I appreciate your patience.

Hi @lwasser, thank you opening the discussion at pyOpenSci/software-peer-review#101. The goals for the PyGMT team at this stage are really to see if there are any major issues in terms of our API design, and to ensure that we are following established best practices in software design. Essentially, have a fresh set of eyes look at our documentation and see if there are any glaring issues we may have missed.

All of the API (functions/classes/interfaces) is subject to change until we reach v1.0.0 as per the semantic versioning specification. There may be non-backward compatible changes as we experiment with new design ideas and implement new features. This is not a finished product, use with caution.

do you have any planned significant API changes at this point? Or do you feel like the package is relatively stable and ready for a solid review which may help you get to 1.x. many thanks! Once we have this resolved I will kick start the review. I appreciate your patience.

To be honest, that disclaimer has been sitting there for a year and a half now since Apr 2020 or v0.1.0. We're currently working on a v0.5.0 release for next week (29 Oct 2021), which included a lot of work on API standardization (see e.g. GenericMappingTools/pygmt#1479), so that disclaimer could probably be toned down a little. I'll make a point to update this disclaimer for the next release.

In terms of making backward incompatible changes, we actually have a deprecation policy for a while now since Apr 2021 or v0.4.0 (see GenericMappingTools/pygmt#1160). This ensures that users are given time (typically 2 minor releases or about 6 months) and fair warning when things might break in the future. We've actually just reached 400 stars on GitHub this week, and judging from activity on our forum, we're increasingly aware that there's a significant userbase and things are definitely getting to a stage where we have to be very careful with breaking other people's code.

So in short, I do think PyGMT is stable enough for review, and that going through the pyOpenSci review process will help us get to v1.x.x in a structured manner. Let me know if you have any other concerns, and I can discuss this with the rest of the PyGMT team πŸ˜ƒ

hi @weiji14 ! thank you for the thoughtful response. I will move forward with the review! When we get reviews it also allows us to reflect on our process and guidelines and so this review will help us adjust our language around "under development" packages (which all packages are technically always iteratively improving). Do you have a suggestion for a reviewer?

I will also ping the community. We often select one reviewer that is suggested and another will be selected by us. If you don't have any suggestions then I will look for two. Again thank you for your patience!

Editor checks:

  • Fit: The package meets criteria for fit and overlap.
  • Automated tests: Package has a testing suite and is tested using CI (e.g. GitHub Actions, Travis-CI, CircleCI, etc).
    • Github actions
  • Documentation: Package has documentation setup (ReadTheDocs, JupyterBook, website,etc.).
  • License: The package has an OSI accepted license
    • BSD 3-Clause "New" or "Revised" License
  • Repository: The repository link resolves correctly
  • Archive Zenodo. (may be post-review): The repository is archived somewhere outside of GitHub. We suggest using Zenodo and created releases for this purpose.
  • Version (may be post-review) Does the release version given match the GitHub release (v1.0.0)?

Due Date: Dec 27 2021(this is a holiday week in the US)
Reviewers: @SimonMolinsky @jbusecke
More to come when we find reviewers.

Thanks @lwasser, I don't have a specific reviewer in mind for now, but perhaps someone involved in the Pangeo community would be a good fit? Also if possible, it would be better to start the review process after PyGMT releases v0.5.0 next Friday to capture some of the API standardization changes we've made recently. I suppose it might take time for you to find a reviewer or two, but starting the review sometime early Nov 2021 would be good.

ok @weiji14 let's shoot for early november. i can work on finding reviewers and let them know that we will wait until that release / early november to begin. I also can reach out to the pangeo community. All sounds good!

I would be happy to review this after I am back from vacation (mid-late november). Please let me know if that fits into your timeline.

Thanks @jbusecke !! that is great.
hi there @weiji14 would a slightly later timeline work for you if we started the review in say mid november but provided @jbusecke with a bit of additional time if they are not ready until later in the month? i have one other reviewer that sounds game as well who i will ping shortly.

Hi @weiji14 , Hi @lwasser ,

I'm available as a reviewer and I may start as soon as it will be possible :)

Thanks @jbusecke !! that is great. hi there @weiji14 would a slightly later timeline work for you if we started the review in say mid november but provided @jbusecke with a bit of additional time if they are not ready until later in the month? i have one other reviewer that sounds game as well who i will ping shortly.

Cool, thanks for offering to review @jbusecke! Mid to late November sounds good, we're not in a hurry πŸ˜€

ok wonderful. it looks like we have two reviewers @jbusecke @SimonMolinsky thank you both for volunteering! How does Monday November 15 sound for a start date for this review? The review would then be due on December 6. Does that work for everyone? i can remind everyone in ~two weeks as well about the review period beginning and will add links to our review template as well.

Sounds good to me.

Hi, I'm fine with this timeline :)

Awesome. Thank you, All! i'll check back in on the 15th with specific next steps.

@lwasser, what are the next steps? Now I have a window to review the package. I assume that December will be rather busy so it's better to start now :)

absolutely. @SimonMolinsky thanks for the ping. For some it is a holiday week so I was assuming that may impact the review but let's do this. Simon and @jbusecke thank you for being willing to review. let's set the deadline for 4 weeks which is one week longer than usual. i am doing this because of the holiday week, the christmas holiday for some and AGU for some to ensure everyone has time. Let's set the deadline for December 27th. However if you have time to do it now, please work on it and get it in when you can. that week of the 27th is also a holiday for us so we can be flexible with this review.

feel free to submit feedback as issues (or pr's) to pygmt if you wish and reference this issue here. This has worked for other reviews . @weiji14 are you comfortable with issues from reviewers? When you finish the review, use the review template to submit here . please get in touch here with any questions as you proceed.

I will check back in around the 27th but please know that is a holiday week where I work.

Thank you all!

Yep, we're ok with people making issues on the repo. The 4 week timing sounds good too.

Thanks for the update. I will try to work on this in the coming week, but appreciate the longer deadline just in case 😁.

Quick question: Should I post my partially filled review here and edit it as issues are responded to/resolved? Or should I keep that to myself until the deadline?

hi @jbusecke ! that it totally up to you. In some cases while someone is reviewing they open issues and refer to it here in their review. In that case it might make sense to open a comment now and edit as you go adding issues, etc . in other cases people just post the full review when it's complete. you don't have to wait until the deadline regardless. when there are things the authors could be thinking about now it could be useful to post earlier. so in short - it is your call! thank you for asking!

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README.
    • I noted here that installing with mamba is much faster and perhaps could be added to the install docs.
  • Vignette(s) demonstrating major functionality that runs successfully locally
    • Some issues with the method signature when doing slight modifications example (for more see comments below)
  • Function Documentation: for all user-facing functions
  • Examples for all user-facing functions.
    • See below.

    BTW this seems somewhat redundant with the point in 'Useability' below? Or is there a difference I did overlook @lwasser

  • Community guidelines including contribution guidelines in the README or CONTRIBUTING.
  • Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a setup.py file or elsewhere.

Readme requirements
The package meets the readme requirements below:

  • Package has a README.md file in the root directory.

The README should include, from top to bottom:

  • The package name
  • Badges for continuous integration and test coverage, a repostatus.org badge, and any other badges. If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the badge for pyOpenSci peer-review will be provided upon acceptance.)
  • Short description of goals of package, with descriptive links to all vignettes (rendered, i.e. readable, cf the documentation website section) unless the package is small and there’s only one vignette repeating the README.
  • Installation instructions
  • Any additional setup required (authentication tokens, etc)
  • Brief demonstration usage
    • Not quite "brief", but the video demo is awesome!
  • Direction to more detailed documentation (e.g. your documentation files or website).
  • If applicable, how the package compares to other similar packages and/or how it relates to other packages
  • Citation information

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
Package structure should follow general community best-practices. In general please consider:

  • The documentation is easy to find and understand
  • The need for the package is clear
  • All functions have documentation and associated examples for use

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • [ ] Performance: Any performance claims of the software been confirmed.
    • (I didnt find any particular statements about performance and my examples generally ran zippy.
  • Automated tests: Tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Continuous Integration: Has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.
  • Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines.

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 3-4


Review Comments

My general assesment of pygmt is very positive. The code seems very well organized, and the documentation is overall excellent (the examples really helped me understand the capabilities of GMT and pygmt)! So overall I would happily recommend this package to be accepted. I gave a few minor suggestions (see above), but none of these is really a 'major' revision from my end.

The one major friction point to me was the (I guess GMT native?) syntax for e.g. setting titles. In several places I was wondering if it wouldnt be possible to (optionally) hide this aspect of GMT even more from the user? Most of the other syntax is very easy to understand for someone like me (who uses matplotlib very often), but the whole time I wanted a fig.set_title() or fig.title() or even fig.basemap(frame='a', title='...') instead of fig.basemap(frame=["a", '+t"Trinidad and Tobago"']). Providing these rather complex string lists makes this package much harder to use in my eyes. I do not at all regard this as an issue that needs to be implemented for the review, but I still wanted to point this out.

Thanks to all the maintainers for this great work!

A couple of random comments:

  • You say "GMT shines when it comes to plotting data on a map." Should the first figure section of the docs maybe include a simple arrow from one location to another to show this off?

hi everyone! @jbusecke thank you for your review. @SimonMolinsky we look forward to your review when you are ready to submit. Do you have any questions or need any help?

The next week is a holiday in the US and we are completely off starting this afternoon (december 24) through the new year. I will be back online on January 3 2022!

In the meantime, @SimonMolinsky if you can wrap up your review in the next week that would be wonderful. @weiji14 feel free to begin working on the reviewer comments and and addressing them. Please reference this issue as you submit pr's and open issues to address so we have some documentation of how the package was improved / modified through this process.

All - happy early new year to you! If you need anything today please reach out. otherwise i will check back in - in January 2022 and we can work on wrapping up this review!

Hi @lwasser , Thanks for the reminder! I don't have any questions at all and my review process is in progress so I'll fit into a deadline :)

Happy New Year and the calming Holidays for all of you!

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s) demonstrating major functionality that runs successfully locally

Not provided in a textual form but linked as a tutorial video, so for me, it's ok. However, I think that the short copy & paste example below the clip could be applicable for more advanced users who would like to test the package quickly. For example, it could look like:

  • Youtube lecture,

  • Short installation,

  • Copy/Paste example.

  • Function Documentation: for all user-facing functions.

  • Examples for all user-facing functions

The package has a vast number of examples and tutorials.

  • Community guidelines including contribution guidelines in the README or CONTRIBUTING.
  • Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a setup.py file or elsewhere.

Readme requirements
The package meets the readme requirements below:

  • Package has a README.md file in the root directory.

The README should include, from top to bottom:

  • The package name
  • Badges for continuous integration and test coverage, a repostatus.org badge, and any other badges. If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the badge for pyOpenSci peer-review will be provided upon acceptance.)
  • Short description of goals of package, with descriptive links to all vignettes (rendered, i.e. readable, cf the documentation website section) unless the package is small and there’s only one vignette repeating the README.
  • Installation instructions
  • Any additional setup required (authentication tokens, etc)
  • Brief demonstration usage
  • Direction to more detailed documentation (e.g. your documentation files or website).
  • If applicable, how the package compares to other similar packages and/or how it relates to other packages
  • Citation information

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
Package structure should follow general community best-practices. In general please consider:

  • The documentation is easy to find and understand
  • The need for the package is clear
  • All functions have documentation and associated examples for use

Functionality

  • Installation: Installation succeeds as documented.

I've installed the package on the macOS (Intel setup) without any problems.

  • Functionality: Any functional claims of the software been confirmed.

I've run all the example tutorials with my datasets and everything has worked fine, but I have a few suggestions (at the bottom of a review).

  • Performance: Any performance claims of the software been confirmed.

The package is fast and the download of package datasets runs smoothly.

  • Automated tests: Tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.

  • Continuous Integration: Has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.

Yes, the package has Github Actions defined for testing and CI for the leading Operating Systems: Ubuntu (Linux), macOS, Windows.

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 8


Review Comments

The general feel is very good! I really liked the package but more important is ecosystem around it. The great resources and documentation makes use of the pyGMT very simple and I can create desired maps very fast. I'll definitively move from the old matplotlib into pyGMT.

However, I have few comments about the usability that are not falling into issues, rather suggestions or recommendations:

  1. I don't understand why examples in the package repository are presented as Python files. Why not Notebooks? Is this related to the automatic document conversion? You can transform Notebooks with nbconvert into the documentation. I rather use Github than package website to explore its capabilities and it will be much easier to follow stylized Notebook. I'm not a cartographer so there is additional overhead to understand what's going on within the script and explanations written as a Python comments are very unfriendly for a reader. And for cartographers the images within examples may be more useful than the code itself. If there is a big block of text then I start to get lost within it. If you are into the change I can create those notebooks and check if documentation converts appropriately because I understand that this take time. On the other hand, presentation as a script forces user to code, not only download notebook and run cells. For education it's a good thing. But from the perspective of Data Scientist that need your package for work it complicates matter a lot.
  2. There are cryptic errors which are coming from GMT. The example - I'll change borders of the region bounding box that lower coordinate comes after the higher:
# This is ok
fig.coast(region=[20, 31, 59, 71], shorelines=True)

# This isn't
fig.coast(region=[31, 20, 59, 71], shorelines=True)

And error:

---------------------------------------------------------------------------
GMTCLibError                              Traceback (most recent call last)
/var/folders/72/m9ty2jns5m96fc2_2ll55vyh0000gn/T/ipykernel_45878/1371627446.py in <module>
----> 1 fig.coast(region=[31, 20, 59, 71], shorelines=True)

~/miniconda3/envs/pygmt/lib/python3.10/site-packages/pygmt/helpers/decorators.py in new_module(*args, **kwargs)
    584                     )
    585                     warnings.warn(msg, category=SyntaxWarning, stacklevel=2)
--> 586             return module_func(*args, **kwargs)
    587 
    588         new_module.aliases = aliases

~/miniconda3/envs/pygmt/lib/python3.10/site-packages/pygmt/helpers/decorators.py in new_module(*args, **kwargs)
    724                         kwargs[arg] = separators[fmt].join(f"{item}" for item in value)
    725             # Execute the original function and return its output
--> 726             return module_func(*args, **kwargs)
    727 
    728         return new_module

~/miniconda3/envs/pygmt/lib/python3.10/site-packages/pygmt/src/coast.py in coast(self, **kwargs)
    190         )
    191     with Session() as lib:
--> 192         lib.call_module("coast", build_arg_string(kwargs))

~/miniconda3/envs/pygmt/lib/python3.10/site-packages/pygmt/clib/session.py in call_module(self, module, args)
    498         )
    499         if status != 0:
--> 500             raise GMTCLibError(
    501                 f"Module '{module}' failed with status code {status}:\n{self._error_message}"
    502             )

GMTCLibError: Module 'coast' failed with status code 72:
coast [ERROR]: Option -R parsing failure. Correct syntax:
coast [ERROR]: Offending option -R31/20/59/71

I understand that it is a lot of work to catch those exceptions and make them more readable. At the same time I highly recommend to do so. From my perspective, errors and exceptions from the wrapped C/C++ libraries sometimes may generate more noise than help, and it may affect the decision if package will be used or not. (The good example are GDAL errors, or TensorFlow errors - we are not happy if we need to set software based on those packages in production). This could be done because:

  1. The code needs to be more Pythonic. If I'm forced to pass strings like that `shorelines='1/0.5p,gray' I would rather go for the GMT - why to bother with Python if syntax is not Pythonic? I understand that it is a process under way and I saw your development stages and new aliases build upon some parameters and functions and I really appreciate that. But I'd like to emphasize that this syntax should be available along with Python dictionaries to set image parameters so don't leave loose ends. String syntax make code automation and debugging much harder, and I even don't talk about the readability.

Anyway, thanks a lot! I'm into pyGMT and I love your work and time put into the package.

@SimonMolinsky @jbusecke thank you both for these very detailed reviews.
@weiji14 - there are now two reviews that you can address. Ideally you can reference this review when submitting issues / pull requests as it makes sense. If you have questions, comments or concerns please let us know. When you are done addressing the reviews, please come back here and tell us what has been changed to address them and we can work towards wrapping up the review!

Happy Happy New Year all!

Happy New Year everyone! 🎊 Thank you for the reviews and very positive comments!

Just wanted to chime in on the use of notebooks for examples since I set this up way back when we first started the package:

  • We use sphinx-gallery to generate both the gallery examples and the tutorials from the Python files. They generate the notebooks automatically when building the docs (see links at the end of each tutorial page).
  • Notebooks are great for users but reviewing changes to them in Pull Requests is a nightmare for maintainers since diffs are unreadable without using some third party tool like ReviewNB.
  • Notebooks that have images can get very large, which causes active repositories to quickly become huge (~100s-1000s Mb).
  • It's relatively easy to fix a typo or make a small change in a script, less so in a notebook (need to open jupyter locally).
  • Enforcing style and formatting (flake8/black/etc) in notebooks is not trivial.

I had initially tried using notebooks in previous projects but always came to regret it after a while because of these problems.

hey there @leouieda πŸ‘‹ !!
I think that all of your comments above are reasonable. Sphinx gallery is great because it does provide that download link in both notebook and .py format at the bottom of each gallery item so i think it addresses the issue of anyone who might want a notebook without the issues you mentioned above. Thank you so much for the thoughtful reply. I've also thought about building a documentation working group so we can have more information about options for documentation such as sphinx gallery and some of the potential pitfalls such as those you mentioned above!

When you and the pygmt team have time, please give us a general sense of how much effort those review comments will take to work through. As with any review, you are fully welcome to respond to comments and address issues as the team sees fit.

Thank you again @jbusecke and @SimonMolinsky for your reviews. Once the pygmt team has addressed both reviews, I will ask you both to address whether the package updates address the items that you highlight sufficiently.

Hi all, thanks for all the great feedback. I will provide a separate response to all the longer review comments, but the ones below seem like the key issues the PyGMT team will need to work on in the coming weeks:

My estimate that we can complete the above in 1-2 weeks, or by the end of January at the very latest. As for this issue:

This might take a bit longer as it covers >10 functions, and there are some technical issues to resolve on the CI side of things.

Other than than, let me know if I missed anything! Again, I'll provide a full response to the longer review comments below.

Responses to Review Comments

@jbusecke - The one major friction point to me was the (I guess GMT native?) syntax for e.g. setting titles. In several places I was wondering if it wouldnt be possible to (optionally) hide this aspect of GMT even more from the user? Most of the other syntax is very easy to understand for someone like me (who uses matplotlib very often), but the whole time I wanted a fig.set_title() or fig.title() or even fig.basemap(frame='a', title='...') instead of fig.basemap(frame=["a", '+t"Trinidad and Tobago"']). Providing these rather complex string lists makes this package much harder to use in my eyes. I do not at all regard this as an issue that needs to be implemented for the review, but I still wanted to point this out.

@SimonMolinsky - 3. The code needs to be more Pythonic. If I'm forced to pass strings like that `shorelines='1/0.5p,gray' I would rather go for the GMT - why to bother with Python if syntax is not Pythonic? I understand that it is a process under way and I saw your development stages and new aliases build upon some parameters and functions and I really appreciate that. But I'd like to emphasize that this syntax should be available along with Python dictionaries to set image parameters so don't leave loose ends. String syntax make code automation and debugging much harder, and I even don't talk about the readability.

Having Pythonic GMT arguments is something that has came up time and time again, and is a long term goal for PyGMT (see https://github.com/GenericMappingTools/pygmt/projects/6) we want to get done by v1.0. There is some debate on how the syntax should look like (see GenericMappingTools/pygmt#1082), and it will take time to fully implement all the possible GMT shortcodes. At this stage, we need to balance our developer resources on adding new features (functionality) and better API design (usability), but it is good to hear from both reviewers that the UX aspect is something we need to spend more time on! 😁

A couple of random comments:

  • You say "GMT shines when it comes to plotting data on a map." Should the first figure section of the docs maybe include a simple arrow from one location to another to show this off?

There's currently work going on to revamp the first figure tutorial (see GenericMappingTools/pygmt#770). Ping @willschlitzer so that he's aware of this suggestion :)

@SimonMolinsky - 1. I don't understand why examples in the package repository are presented as Python files. Why not Notebooks? Is this related to the automatic document conversion? You can transform Notebooks with nbconvert into the documentation. I rather use Github than package website to explore its capabilities and it will be much easier to follow stylized Notebook. I'm not a cartographer so there is additional overhead to understand what's going on within the script and explanations written as a Python comments are very unfriendly for a reader. And for cartographers the images within examples may be more useful than the code itself. If there is a big block of text then I start to get lost within it. If you are into the change I can create those notebooks and check if documentation converts appropriately because I understand that this take time. On the other hand, presentation as a script forces user to code, not only download notebook and run cells. For education it's a good thing. But from the perspective of Data Scientist that need your package for work it complicates matter a lot.

I don't have much to add to @leouieda's comment at #43 (comment), but I do use notebooks a lot myself and usually use Jupytext to keep a synced .ipynb and .py file while doing exploratory data analysis. Do check out this blog post https://devblog.pytorchlightning.ai/publishing-lightning-tutorials-cbea3eaa4b2c which mentions a similar pipeline on turning Python scripts into rendered Jupyter notebook files (not what PyGMT uses, but the core concept of keeping things lightweight is similar).

@SimonMolinsky - 2. There are cryptic errors which are coming from GMT. The example - I'll change borders of the region bounding box that lower coordinate comes after the higher:

I understand that it is a lot of work to catch those exceptions and make them more readable. At the same time I highly recommend to do so. From my perspective, errors and exceptions from the wrapped C/C++ libraries sometimes may generate more noise than help, and it may affect the decision if package will be used or not. (The good example are GDAL errors, or TensorFlow errors - we are not happy if we need to set software based on those packages in production).

A little similar to the first comment above, but on GMT error messages. Again, a long term issue (see e.g. GenericMappingTools/pygmt#256) which has its own set of technical challenges. Off the top of my head, we'll need to either 1) parse the user's inputs and make sure it is correct; and/or 2) parse the GMT error message and return a more user-friendly error message to the user. Both of these impose extra computational costs on the user (in terms of extra if-then checks), but then again, speed-orientated users would probably go with pure GMT anyway. That, and the fact that GMT has thousands of possible error messages from its accumulated three decades of development will make this a tricky beast to handle πŸ™‚

Anyways, thanks again to both reviewers and @lwasser for your time! Give us a couple of weeks to work out the issues highlighted in #43 (comment) and I'll give you all a ping once those tasks are done.

Hi @leouieda , Hi @weiji14

I suspected that tutorials written as Python scripts must make some sense but I didn't consider their memory size. It's invaluable knowledge for me too! I'll consider the same action within my projects.

I must say, that your package is like a spider's web. I've used it once when I was checking its capabilities and I will gradually move all my projects into it. It saves me a lot of time with GUI apps :) I hope that pyGTM will gain a lot of deserved attention.

Hi all, apologies for the late reply. We've made a new PyGMT v0.6.0 release last week πŸŽ‰ that has addressed the main items in #43 (comment). You can find the updated documentation page at https://www.pygmt.org/v0.6.0. The key changes relevant to this PyOpenSci review include:

@lwasser, please let us know what else we'll need to do to complete the review process. Thanks again for your time πŸ˜„

No worries Leah, I saw the news about the fire and your tweet, so I totally understand if you need more time on this. Just focus on getting to a safe space first, the rest can come later.

hey there @weiji14 i just wanted to check in as i know i've been stereo silent for too long here. i am currently moving this project to a new home and taking time off. i'm going to check in with the reviewers here to see if the changes made to pygmt satisfy their reviews... i am so sorry that this submission has gotten stuck in my job transition . Please know that in the fall (in about 2 months) i will be back here working full time on this project so this lack of progress on this review doesn't reflect the organization it just reflects this change i'm making in where the project lives and where i work!

hi there @SimonMolinsky @jbusecke i am checking in on this review. do you have time to have a look at the changes made in response to your review as posted by @weiji14 above? many thanks and my sincere apologies for the delay.

Hi @lwasser, I anticipate a larger than usual workload in anticipation of Scipy2022. Would a response after July 18th be acceptable? Very sorry to slow down the review.

Sure! We'll have a PyGMT v0.7.0 release in early July, and a SciPy 2022 talk by @maxrjones, so ok with a response around mid to end of July. Edit: v0.7.0 release is out at https://github.com/GenericMappingTools/pygmt/releases/tag/v0.7.0

Hi @lwasser , hi @weiji14 - I'll take a look into changes in the next week :)

I finally found some time (and sleep) after Scipy, and have reviewed the changes here.
Big thanks for all the developers for the many changes made in response to my review.
The only thing I noticed was that there are still a few checkmarks missing in GenericMappingTools/pygmt#1686, but given the record of development I have seen here, I am confident these will be addressed soon (so no need to request another review period from my end).
I did check the packaging guide and found that this package exceeds all the requirements.

So from my end this is a go on publication πŸš€.

PS: This was such a pleasure to review, I wish science journals would enable a workflow even close to this!

Hi @weiji14 , @leouieda , @lwasser , @jbusecke!

I had mostly propositions of improvements (more Python, less GMT), and I've checked all linked resources (projects and discussions) - I think that we are on the same page, and this is what I was looking for. I agree, that classes could be better data structures to pass "Pythonic" parameters into drawing functions (instead of dictionaries).

Your new training materials from here are excellent, and it is good to see that the package is developed as a project with the future and you build a healthy ecosystem around it.

I've updated my review points too, and from my perspective, you are ready for the next publication steps! Thank you for your tremendous work and a lot of effort put into the promotion :)

@jbusecke @SimonMolinsky wonderful - thank you both SO VERY MUCH for your thoughtful replies. And MY APOLOGIES for how delayed my response was. I've been offline but am not working full time on this project so you can expect future review processes to move forward much more smoothly. Based upon this all, @weiji14 @leouieda I wanted to congratulate you on being accepted by the pyOpenSci review process. And here comes out stock congratulations + next steps:

@weiji14 i will update below once you let me know if you plan to submit to JOSS as well. Once you've submitted to use your JOSS acceptance is fast tracked. This will give you a cross-ref accessible DOI which is nice - you don't have to go through another review. Please just let me know if you wish to do this. If not, please follow the steps below and i'll check of and close this issue once we see the badge on your readme, etc. Get in touch with any questions!!


πŸŽ‰ pyGMT has been approved by pyOpenSci! Thank you @weiji14 for submitting pyGMT and many thanks to @SimonMolinsky @jbusecke for reviewing this package! 😸

There are a few things left to do to wrap up this submission:

  • Activate Zenodo watching the repo if you haven't already done so.
  • Tag and create a release to create a Zenodo version and DOI.
  • Add the badge for pyOpenSci peer-review to the README.md of pyGMT. The badge should be [![pyOpenSci](https://tinyurl.com/y22nb8up)](https://github.com/pyOpenSci/software-review/issues/issue-number)
  • Add to the pyOpenSci website. @weiji14 , please open a pr to update this file: to add your package and name to the list of contributors
  • if you have time and are open to being listed on our website, please add yourselves to this file via a pr so we can list you on our website as contributors!

NOTE I WILL REMOVE THIS IS WE DON"T MOVE FORWARD WITH JOSS

It looks like you would like to submit this package to JOSS. Here are the next steps:
  • Login to the JOSS website and fill out the JOSS submission form using your Zenodo DOI. When you fill out the form, be sure to mention and link to the approved pyOpenSci review. JOSS will tag your package for expedited review if it is already pyOpenSci approved.
  • Wait for a JOSS editor to approve the presubmission (which includes a scope check)
  • Once the package is approved by JOSS, you will be given instructions by JOSS about updating the citation information in your README file.
  • When the JOSS review is complete, add a comment to your review in the pyOpenSci software-review repo that it has been approved by JOSS.

πŸŽ‰ Congratulations! You are now published with both JOSS and pyOpenSci! πŸŽ‰

All -- if you have any feedback for us about the review process please feel free to share it here. We are always looking to improve our process and our documentation in the contributing-guide. We have also been updating our documentation to improve the process so all feedback is appreciated!

Thanks @lwasser and all the reviewers! Just want to quickly acknowledge that we're delighted with the news, and it's been a pleasure to have such a nice open review process πŸ˜ƒ

In terms of next steps, we've got Zenodo set-up already at https://doi.org/10.5281/zenodo.3781524 (since v0.1.0 in fact), so you can tick the first 2 boxes. I'll submit PRs for the other 3 checkboxes soon (once I get some free capacity).

For the JOSS submission part, let me get back to you on that once the team has a chat on GenericMappingTools/pygmt#677 (comment). Just to understand the expedited process though (as I'm a little new to JOSS), would the JOSS review just be focused on the software paper itself and not the code/documentation? I had a look at https://www.pyopensci.org/contributing-guide/open-source-software-submissions/author-guide.html#journal-of-open-source-software-joss-submission which mentions writing a paper.md file, but not sure if there's anything else needed.

Hi @weiji14 πŸ‘‹

Regarding JOSS - that step is actually straight forward.

  1. First you will need to create the paper.md file which is probably the most time consuming part of the process. I can have a look at that before you submit if you wish but they will review it and use their whedon bot to double check format, etc.
  2. As far as the review, because we are partners with JOSS they accept our review for their process. Thus you do NOT have to go through another review. But you WILL get the cross-ref enabled DOI from them which if nice for your team to have as it will show up automagically in Orcid, etc. When / if you go through their review process please include a link to this issue / review and tell them you can be fast tracked as you've been accepted already to pyOpenSci. Ping arfon if you wish on this step. That's it.

So in short what you read is just it - you write the paper and the rest of the review process is super speedy as you've already been peer reviewed here.

Please let me know if you have any other questions about the process. And welcome again to pyOpenSci!

@weiji14 are you on twitter? i'm going to post tomorrow about this review being accepted and would love to mention you.

@weiji14 are you on twitter? i'm going to post tomorrow about this review being accepted and would love to mention you.

Yes, I'm on Twitter, but you can just tag @gmt_dev/#PyGMT and that'd be great. Prefer to give credit to the whole PyGMT team in the spirit of open source 😁

Yes, I'm on Twitter, but you can just tag @gmt_dev/#PyGMT and that'd be great. Prefer to give credit to the whole PyGMT team in the spirit of open source 😁

ok wonderful :) thanks!

Wonderful news! Thanks for letting me be part of this (by miles best I have ever had) reviewer experience. Also very delighted to learn about the overlap with JOSS.

All scientific reviews should work like this!

@jbusecke thank you!! and that's really wonderful to hear that you thought the process was good as well. In the future the process won't be as delayed (i've been offline for the past 6 months so that was my fault!).

The last step for reviewers is adding @SimonMolinsky and @jbusecke to our website as contributors, here . Are you comfortable with adding yourselves via a PR to that repo and that file?

Many thanks!

hey πŸ‘‹ @weiji14 @jbusecke @SimonMolinsky ! I hope that you are all well. I am reaching out here to all reviewers and maintainers about pyOpenSci now that i am working full time on the project (read more here). We have a survey that we'd like for you to fill out so we can:

πŸ”— HERE IS THE SURVEY LINK πŸ”—

  1. invite you to our slack channel to participate in our community (if you wish to join - no worries if that is not how you prefer to communicate / participate).
  2. Collect information from you about how we can improve our review process and also better serve maintainers.
    The survey should take about 10 minutes to complete depending upon how much you decide to write. This information will help us greatly as we make decisions about how pyOpenSci grows and serves the community. Thank you so much in advance for filling it out.

NOTE: this is different from the form designed for reviewers to sign up to review.
IMPORTANT: If there are other maintainers for this project, please ping them here and ask them to fill out the survey as well. It is important that we ensure packages are supported long term or sunsetted with sufficient communication to users. Thus we will check in with maintainers annually about maintenance.

Thank you in advance for doing this and supporting pyOpenSci.

hey there @weiji14 @jbusecke @SimonMolinsky πŸ‘‹ Just a friendly reminder to take 5-10 minutes to fill out our survey . We really appreciate it. Thank you in advance for helping us by filling out the survey!! πŸ™Œ

Wei, it's really important for us to collect information from our maintainers so that we can both stay in touch with you regarding package maintenance and also support you through time. We really appreciate your time in filling this out. I know that you have a team working on pyGMT. Please have your co-maintainers also fill it out and please list them here as well. πŸ™Œ Many thanks in advance! πŸ™Œ

πŸ”— HERE IS THE SURVEY LINK πŸ”—

hey there @weiji14 @jbusecke @SimonMolinsky πŸ‘‹ Just a friendly reminder to take 5-10 minutes to fill out our survey . We really appreciate it. Thank you in advance for helping us by filling out the survey!! πŸ™Œ

Wei, it's really important for us to collect information from our maintainers so that we can both stay in touch with you regarding package maintenance and also support you through time. We really appreciate your time in filling this out. I know that you have a team working on pyGMT. Please have your co-maintainers also fill it out and please list them here as well. πŸ™Œ Many thanks in advance! πŸ™Œ

πŸ”— HERE IS THE SURVEY LINK πŸ”—

Done! 😊

Cool, thanks for the reminder Leah, just completed the survey!

Done!

Done

Thank you so so much @weiji14 @michaelgrund @jbusecke !!! πŸŽ‰ πŸŽ‰ So appreciative of your time.

hey y'all - i am going to close this issue. i suspect the JOSS submission after our review might now happen at this point. but if you'd like it to please reach out and we can reopen!! Hope you are all well!!

Hi @weiji14, I’m currently working for @lwasser on community outreach for pyOpenSci, and as part of this I’m doing some social media posts to promote the packages accepted through our peer review process like this one: https://twitter.com/pyOpenSci/status/1666155502877671461?s=20

I wanted to ask you about PyGMT's logo. I didn’t find a logo in the package’s GitHub repo so I was wondering if you have one somewhere or are planning to work on one.

For packages that don’t have a logo I was thinking about writing the name of the package in a nice font/color. I will like to know if you have any ideas of colors/fonts you would like to use since I’ll like to make sure the people involved in the package like the way it it is promoted. I'm happy to send you some ideas and work on this with you while you have your own logo :).

Hi @juanis2112, thanks for reaching out! A PyGMT logo is something we've been meaning to make for years (see https://forum.generic-mapping-tools.org/t/design-for-pygmt-logo/1346), but there's no official version yet. If you want to start a new thread somewhere, we can discuss this in more detail πŸ˜„

I opened an issue on PyGMT's repo GenericMappingTools/pygmt#2579 to start a discussion with the ideas people have posted and maybe do some voting to decide which direction to explore more.

hi there @weiji14 / colleagues πŸ‘‹ we are doing a bit of housekeeping. i have the names of all maintainers (i think) for this package but am missing GitHub usernames. Could someone please reply here and provide the GitHub handles for everyone? We want to credit all maintainers on our website! and to do that i need to add the GH handles to this issue metadata at the top! Many thanks in advance for doing this!

Hi @lwasser, sorry for the late reply! Our list of maintainers are listed at https://www.pygmt.org/v0.9.0/team.html:

  • @seisman
  • @maxrjones
  • @michaelgrund
  • @weiji14
  • @willschlitzer

Could you post the link on the website when this is updated? Thanks!

@weiji14 no worries! thank you for replying! i just wanted to be sure we are recognizing all of the maintainers on our package listing page. i've added everyone to this issue so the data will get updated in a few weeks (i have a bug in our code to fix and then it can be updated!!). i hope all is well!!

once i perform that update (i could do it now if that bug didn't exist) i'll let you know and add a screen shot here! but i did update the issue which the code parses so everyone gets recognized!!

Screen Shot 2023-09-11 at 4 09 46 PM

@weiji14 i just updated the website!! all maintainers are now credited on the pygmt package page!! And they should be listed on our contributor page too!!

in the future you will also be able to filter by maintainer on our contributor page - that is a feature i plan to add!

thank you for helping me flesh out the maintainer list here. we really want to make sure all of the authors / maintainers get credit for the great work done on these tools!

Thanks @lwasser, looks great! Would it be possible to update the list to add @yvonnefroehlich who recently got added as a maintainer? If you point me to the right page, we can update the list with a PR.

@weiji14 absolutely. you can actually update this issue here if you wish at the top (i think because you opened it you can update it - if you can't let me know).

I have a ci action that parses these reviews and grabs maintainers. we've been updating all of the issues. it runs on a cron job but we can also trigger it to update things at any time.

Screen Shot 2023-09-12 at 4 49 55 PM

Cool, I've updated the maintainer list above, and see that the cron-job runs on the 1st and 15th of each month, which is tomorrow, so should be ok to wait for a bit. Nice work on automating all of this!

Yes! you got it! so it should open a pr and then everything will get updated.

i forget if we already talked about this - but do you (or anyone else involved in this review including other maintainers!!) have any interest in joining our slack? if so i can send invites to anyone who's interested.

I'll personally pass for the Slack channel (have more than a dozen channels already...), but maybe someone else would be interested.

I am in the same boat, but please feel free to ping me for future reviews on github!