GIScience/openpoiservice

temp: collect observations

nilsnolde opened this issue · 2 comments

Let's create separate issues from this list once we discussed them a little bit @zephylac. If you'd like to be involved too, we can see which ones who tackles:

  • support newest Python versions 3.7 & 3.8. Problem I could figure out so far: gunicorn needs to be updated to >=19.9.0, maybe ,<20.0.0, as there's been conflicts from 3.7 on (#79)
  • move away from imposm since deprecated since a long while (#80)
  • Update project to work on newest packages, most notably Flask>=1.0, which also means Werkzeug>=1.0 (#81)
  • Dockerfile and template docker-compose.yml can be improved:
    • Use a Python base image
    • has a lot of unnecessary RUN statements making the final image huge
    • Use configs and OSM file as build ARG and keep exposing via volumes, so that we could host the image on dockerhub/other registry, see ORS docker as example (#82)
  • take tests out of the main package: can have very undesirable side effects otherwise as the root __init__.py is executed when running the tests (#83)
  • more proper error codes: right now all is 500, even when most are user input errors, i.e. rather the 4xx family (#84)
  • add test for MulitPolygon API request
  • make README more readable in terms of structure etc
  • area checks to determine if the request exceeds the server limits are done in EPSG:3857, which should ideally be a global equal-area projection like Mollweide
  • the response type had a breaking change when you introduced MultiPolygon: now they're all lists of GeoJSON FeatureCollection where I don't really understand the point of. It's all the same geometry.type, so a list of FeatureCollection is pretty much an anti-pattern. Why not wrap all features for all polygons (if MultiPolygon) in a single FeatureCollection? If you really need to relate the polygons of a MultiPolygon to the index in the response list (only reason I can think of) then doing a multipart to singlepart conversion and requesting with single Polygons makes more sense to me. Can we please revert that again?
  • remove all user-settable files from git and provide maintained template versions of them: so we don't accidentally commit our passwords and stuff
  • slight changes to make the project more accessible: move ops_settings.yml to the root of the main package, remove server folder (redundant when tests are moved)

Optional:

  • use a proper dependency manager: I'm in love with Poetry. Super easy to use and million times better than any other.
  • Try to incorporate tox to test multiple Python versions easily
  • Introduce linter, e.g. yapf
  • improve testing framework: right now, every single test will create dbs, import data and drop dbs (#96)
  • lots of data duplication in the request handlers in views.py, which could be avoided, as that's a potential performance drain

I started a project: https://github.com/GIScience/openpoiservice/projects/1 and started to take out a few issues from above list.

The high priority column is what I think should be tackled first as it defines the whole project, ordered with in descending priority.

I created a new branch v1.0-release where we can collect all the improvements over time.. Tests are not necessarily expected to pass before we finally merge into master