openstreetmap/chef

Planning database reload

kocio-pl opened this issue · 17 comments

After deploying OSM Carto v4.0.0, which introduced Lua processing, we never tried a database reload synchronized with a release that changes some Lua definitions. However, it seems that using hstore is not enough for tags that are meant to be standalone (like healthcare); it only works for tags that are added to some "main" tag (like iata/icao added to airports). As a result, some valid objects are not rendered at all.

When could you consider deploying a new OSM Carto release combined with a database reload, so that the changed Lua transformations are used?

@kocio-pl This has to do with the polygon/linestring choice, right?

Yes, I guess it does, but if I understand correctly it also involves filtering some tags.
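
To illustrate the polygon/linestring point, here is a minimal Python sketch. It is not the actual OSM Carto Lua transform: the function name `classify_closed_way` and the key list are hypothetical and chosen only for illustration. It assumes the transform decides a closed way's geometry from a fixed list of "area" keys, so a standalone tag that is missing from that list ends up as a linestring even though the tag itself is still available in hstore.

```python
# Hypothetical, simplified stand-in for a Lua tag transform.
# The key list below is illustrative only, not the real OSM Carto list.
POLYGON_KEYS = {"building", "landuse", "amenity", "aeroway"}


def classify_closed_way(tags, polygon_keys=POLYGON_KEYS):
    """Return 'polygon' if the closed way should be imported as an area,
    otherwise 'linestring'."""
    # An explicit area=* tag overrides the key list.
    if tags.get("area") == "yes":
        return "polygon"
    if tags.get("area") == "no":
        return "linestring"
    # Otherwise the geometry depends on whether any key is in the list.
    return "polygon" if any(key in polygon_keys for key in tags) else "linestring"


# A standalone healthcare=* way before and after the key is added to the list.
before = classify_closed_way({"healthcare": "clinic"})
after = classify_closed_way({"healthcare": "clinic"}, POLYGON_KEYS | {"healthcare"})

# A secondary tag such as iata rides along with a "main" key that is already listed.
airport = classify_closed_way({"aeroway": "aerodrome", "iata": "WAW"})

print(before, after, airport)
```

Under these assumptions, changing the key list only affects objects imported after the change, which is why already-imported data would need a reload to pick it up.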

I think the question for now should not be so much a concrete request to do a database reload. First, we would like to know whether OWG has a database reload scheduled for other reasons (in which case we would like to know about it), and second, how much effort a database reload would be.

Also, if it is a lot of effort, we should probably present concretely what the gains from such an update would be, so OWG can judge whether a reload is worth the effort :).

This is not a request, it's a question with a bit of background. I treat it as an open question to start communicating about plans and needs.

Ok, then I think we are on the same page!

It's only been about six months since the last reload and I'd rather not do them too often if we can avoid it as it's a lot of work.

Ideally we should be looking to batch up things that need a reload and do them all in one go...

I see. I think we already have enough things to put in a batch, but it will also need some preparation work on our side.

Would it be reasonable to plan a database reload about 6 months from now, i.e. 12 months after the last reload, or is this too soon?

We have a new rendering server installed in Amsterdam as a replacement for orm, so this is probably a good time to think about making schema changes. That way we can include them when doing the initial load on that machine, and then think about doing a reload on the other machines over the next month or two.

I don't think we should hold anything up. We don't have any code ready, so we need to write the code, test it, review it, and iterate on that. This process will involve multiple DB loads in testing, so it will take longer than getting the new system running.

We haven't discussed a timeline, but I want 4.21 to be released before starting a new branch.

We now have a number of changes ready, but there are still a few more to discuss. We could be ready in a month or two, I believe.

Perhaps a database reload could be planned for the end of the year in December 2019, or in January 2020?

Thoughts on planning a database load for early January 2020? Would that be a good time?

I think the planning should be the other way around: we should release a new MAJOR branch of osm-carto when we feel it's ready, and then osm.org can reload at an appropriate time.

Do we have enough disk space to load into a separate db schema, and flip schemas later on? osm2pgsql doesn't support different schemas yet, but it would be a good use case to add it.

Depends on the machine but at least some of them won't have enough space.

It's not necessary anyway, as there's no problem taking the machines out one at a time.

Schemas also aren't necessary as you could just load into a different database - schemas are only advantageous if you want to query across tables from both.

The reload will be done into postgres 12 anyway so it will definitely be a different database, indeed a different cluster.

It appears we are nearly finished merging and discussing the various PRs related to this issue at Openstreetmap-carto, so it is possible that the next release, in a month, could contain the changes requiring a database reload (https://github.com/gravitystorm/openstreetmap-carto/tree/schema_changes).

We got the answers to the questions, and 5.0.0 has been released so I think this can be closed.