pysal/libpysal

libpysal.graph roadmap to release

Opened this issue · 28 comments

I tried to outline what is missing before we could cut a release with the new graph stuff. Open to a discussion.

testing

We have decent coverage of base, but the constructors in particular are not always tested for correctness. We test the API, but not that the adjacency captures what it should. Missing tests:

  • test correctness of contiguity builders #566
  • test correctness of triangulation builders
  • test correctness of kernel builders
  • test all kernel parameters
  • test all query options (both sklearn and scipy)
  • test precomputed distance matrix support (indirectly via triangulation and distance band already covered)
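
To illustrate the difference between testing the API and testing correctness, here is a minimal sketch. It does not use libpysal: `rook_lattice` is a hypothetical stand-in builder written only for this example, and the expected adjacency is derived by hand for a tiny lattice. A real test would call the actual constructor and compare its adjacency against a hand-derived answer in the same way.

```python
def rook_lattice(nrows, ncols):
    """Hypothetical stand-in builder: rook contiguity on a regular lattice.

    Cells are numbered row by row starting at 0. Returns a dict mapping
    each cell to the set of cells sharing an edge with it.
    """
    adjacency = {}
    for r in range(nrows):
        for c in range(ncols):
            neighbors = set()
            # Rook contiguity: up, down, left, right (no diagonals).
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < nrows and 0 <= cc < ncols:
                    neighbors.add(rr * ncols + cc)
            adjacency[r * ncols + c] = neighbors
    return adjacency


def test_rook_lattice_correctness():
    # Expected neighbors written out by hand for a 2x3 lattice:
    #   0 1 2
    #   3 4 5
    expected = {
        0: {1, 3},
        1: {0, 2, 4},
        2: {1, 5},
        3: {0, 4},
        4: {1, 3, 5},
        5: {2, 4},
    }
    result = rook_lattice(2, 3)

    # Correctness: the builder reproduces the hand-derived adjacency.
    assert result == expected

    # Structural property: contiguity must be symmetric.
    for focal, neighbors in result.items():
        for neighbor in neighbors:
            assert focal in result[neighbor]


test_rook_lattice_correctness()
```

An API-only test would stop at checking that the builder returns an object of the right type and size; the comparison against `expected` is what catches a builder that runs fine but links the wrong cells.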

implementation/checks

Some bits are still waiting for implementation (I have likely missed stuff here) or a double check that the current implementation works as intended.

  • isolates in kernel and knn
  • co-located in knn
  • knn haversine checks
  • fix cliques
  • set operations
  • draft API and raise NotImplementedError?

discussion

A few things I'd like to discuss (during the dev call in 3 minutes?)

  • default kernel - W has triangular, we now have gaussian following scipy model - do we want to warn? how?
  • expose kernels in transform? kernels are essentially transformations. Shall we allow using them post-creation?
  • how should set operations work? See #561
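
For context on why the default kernel matters, here is a rough sketch comparing the two kernels on the same distances. The formulas are the common textbook forms and are an assumption here; they are not copied from either the W or the Graph implementation, and the function names are illustrative only.

```python
import math


def triangular(distance, bandwidth):
    """Textbook triangular kernel: linear decay, zero at and beyond the bandwidth."""
    u = distance / bandwidth
    return max(0.0, 1.0 - abs(u))


def gaussian(distance, bandwidth):
    """Textbook Gaussian kernel: smooth decay that never reaches exactly zero."""
    u = distance / bandwidth
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)


# Same distances, same bandwidth, different weights.
for d in (0.0, 0.5, 1.0, 1.5):
    print(f"d={d:.1f}  triangular={triangular(d, 1.0):.4f}  gaussian={gaussian(d, 1.0):.4f}")
```

Under these assumed forms, the weights differ at every distance and the triangular kernel drops to zero at the bandwidth while the Gaussian does not, so code that relied on the old default would silently get different weights. That is the case a warning would be meant to flag.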

follow-up

Stuff that is not implemented or started but that can likely wait for subsequent release, not the initial one. Below is the list of stuff I'd like to see soon but there's more to be done.

  • lag_categorical (inside lag()?)
  • classic I/O (we have custom Parquet now)
  • plotting #593
  • fuzzy contiguity #564
  • other utils - any priorities?
  • optimal sparse formats (coo may be slow) #697
  • IDs handling cleanup #626
  • symmetrize (see #664)
  • fill diagonal #668
  • higher-order neighbors
  • trace

documentation

API reference should be in the docs for the first release. user guide can follow later.

  • API reference
  • user guide
  • migration guide

Some of the follow-up could be piped via W for the time being if we want them available.

Can you drop a note if you are planning to work on any of the topics mentioned above so we don't overlap?

ljwolf commented

I am unlikely to have time until sept 27

I think I'll go back to ensuring the constructors can consume sparse next

With #577, we're ready for the initial experimental release of Graph.

  • cliques are not fixed, but they raise NotImplementedError at the moment.
  • precomputed matrix is not tested explicitly but a few of our internal functions use it, so it is tested via those.
  • API is not yet drafted, but that does not need to be for a release
  • The default kernel is still Gaussian
  • set operations are implemented with some restrictions ensuring the validity of the Graph and its alignment with original data. See #575 for details.

Anything else you'd like to get in?

I would target the end of the week at the latest to cut 4.8.0 (need it alive on conda on Tuesday)

Anything blocking the release now that #577 is merged?

@jGaboardi any idea if the current infrastructure works as intended? It has not been updated here.

I am not sure. Let's give it a try; perhaps as a 4.7.0.post1 release?

More like 4.8.0rc1 given the status of current main

Documentation is okay - https://pysal.org/libpysal/

Changelog is completely broken

Yeah, looks like the action still uses tools/gitcount.ipynb. This will need to be looked at if we still want to use it.

spaghetti has been using actions/github-script@v6 with success, if we want to try here.

let's update the root and get rid of versioneer etc. before doing a real release

@knaaptime Are you on it or shall I?

i'll take the first pass

Unless anyone does that first, I'm happy to cut 4.8.0 later tonight (Prague time) or tomorrow morning.

Seem to have a strange merge problem with upstream/main vs. my main. Both docs build and publish failed. Going to delete the v4.8.0 tag and try to figure it out.

Seems like two key pieces slipped through the cracks:

  1. Updated method for installing deps when creating a release.
  2. This line seems to be missing in libpysal/docs/conf.py

I'll put in the necessary PR shortly.

we don't need the sys path hack

i think we're using an old recipe for the docs. It should have an install line before making the docs (so you get the real version without doing the sys/path hack). I think there's a current version in tobler

Seems to fail without it. Is there something I am missing? Yes, there was. LOL

@knaaptime @martinfleis

Can confirm docs build locally, but with the following warnings:

libpysal/libpysal/cg/shapes.py:docstring of libpysal.cg.LineSegment.bounding_box:1:<autosummary>:1: WARNING: Inline strong start-string without end-string.
libpysal/libpysal/cg/shapes.py:docstring of libpysal.cg.LineSegment.bounding_box:1:<autosummary>:1: WARNING: Inline strong start-string without end-string.
libpysal/docs/notebooks/fetch.ipynb:4: WARNING: Each notebook should have at least one section title
libpysal/docs/notebooks/io.ipynb:4: WARNING: Each notebook should have at least one section title
libpysal/docs/notebooks/Raster_awareness_API.ipynb: WARNING: document isn't included in any toctree
libpysal/docs/notebooks/fetch.ipynb: WARNING: document isn't included in any toctree
libpysal/docs/notebooks/io.ipynb: WARNING: document isn't included in any toctree

  • For the first two, I can't find anything to actually fix, even after looking into some possible solutions.
  • The second two are "whatevers".
  • For the final three, are we not including those in tutorial.rst on purpose? If so, then nothing to fix.

If we are OK with that, I will push up the fixes immediately.

Release failed, I think because there is no README.md in libpysal... it's README.rst.

@knaaptime @martinfleis

v4.8.0rc2 is tagged and up on PyPI with docs built successfully. Please give a once over. If everything looks kosher, I will cut v4.8.0.

All looking good to me!

Do you or Eli want to do the honors since yall put in that hard work? Or shall I go ahead and cut it?

Go ahead 😉

The decision on ID API has been to allow passing an array of IDs when the input is a sparse or dense precomputed array and raise a ValueError when it is passed alongside a data frame.
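
A minimal sketch of that rule follows. The names `FrameLike` and `resolve_ids` are hypothetical and exist only for this illustration; the fallback to positional IDs and the length check are assumptions as well, not something decided in this thread.

```python
class FrameLike:
    """Stand-in for a data frame: an input that carries its own index."""

    def __init__(self, index):
        self.index = list(index)


def resolve_ids(data, ids=None):
    """Hypothetical helper deciding which IDs a constructor would use."""
    if isinstance(data, FrameLike):
        # A data frame already defines the IDs via its index, so an
        # explicit `ids` argument is ambiguous and is rejected.
        if ids is not None:
            raise ValueError(
                "ids cannot be passed alongside a data frame; "
                "the index of the data frame is used instead."
            )
        return list(data.index)

    # Precomputed array (dense or sparse): IDs may be supplied explicitly.
    n = len(data)
    if ids is None:
        # Assumption: fall back to positional IDs.
        return list(range(n))
    ids = list(ids)
    if len(ids) != n:
        # Assumption: IDs must align one-to-one with the rows.
        raise ValueError(f"expected {n} ids, got {len(ids)}")
    return ids


# A precomputed 3x3 distance matrix with explicit IDs is accepted.
matrix = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
print(resolve_ids(matrix, ids=["a", "b", "c"]))

# A data frame with explicit IDs is rejected.
try:
    resolve_ids(FrameLike(["x", "y", "z"]), ids=["a", "b", "c"])
except ValueError as err:
    print(err)
```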