PublicMapping/districtbuilder

Make TopologyService refuse to load archived RegionConfigs

KlaasH opened this issue · 0 comments

The server is still doing this:

[Nest] 28  - 06/22/2023, 1:04:35 PM   ERROR [HealthCheckService] Health Check has failed! 
{
    "topology": {
        "status": "down",
        "total": 62,
        "complete": 61,
        "pending": [],
        "loading": [
            "s3://districtbuilder-production-data-us-east-1/regions/US/PA/2021-10-27T03:18:31.591Z/"
        ]
    }
}

That RegionConfig is known to be broken, and it shouldn't get loaded because it's marked archived. But something obviously still has a reference to it and is pulling it in. That can happen because TopologyService.get() allows for adding a RegionConfig to the list.

I'm assuming that's right and useful in some cases, to reload ones that got evicted from the cache, but in this case it's bad. Adding a broken layer to the list causes the workers to fail health checks and get constantly cycled.

There shouldn't be anything trying to load that RegionConfig, and in fact we fixed an issue like this before, but something is still doing it. It would be good to find out what, but in the meantime I think it makes sense to downgrade this from a "brings down the site" error to a logged one by making TopologyService.get() refuse to add archived layers to the list. As a bonus, having it log an error and return undefined instead might make it easier to track down the code that's still relying on live loading of archived configs.
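
To make the proposal concrete, here's a rough sketch of the guard. It is not the actual TopologyService code: the class name, the `archived` and `s3URI` fields, the injected `load` and `logError` functions, and the `Topology` stand-in type are all assumptions for illustration, and the real service almost certainly tracks more state than a single map.

```typescript
// Hypothetical, simplified shape; the real RegionConfig entity has more fields.
interface RegionConfig {
  id: string;
  s3URI: string;
  archived: boolean;
}

// Stand-in for whatever the real service returns once a layer is loaded.
type Topology = { s3URI: string };

class TopologyServiceSketch {
  // Layers that are loaded or loading, keyed by S3 URI.
  private readonly layers = new Map<string, Promise<Topology>>();
  private readonly load: (s3URI: string) => Promise<Topology>;
  private readonly logError: (message: string) => void;

  constructor(
    load: (s3URI: string) => Promise<Topology>,
    logError: (message: string) => void = console.error
  ) {
    this.load = load;
    this.logError = logError;
  }

  get(config: RegionConfig): Promise<Topology> | undefined {
    // Anything already in the list is returned as before.
    const existing = this.layers.get(config.s3URI);
    if (existing !== undefined) {
      return existing;
    }
    // The new guard: never add an archived config to the list.
    // Log so the caller that still relies on live loading can be found.
    if (config.archived) {
      this.logError(
        `Refusing to load archived RegionConfig ${config.id} (${config.s3URI})`
      );
      return undefined;
    }
    // Non-archived configs can still be (re)loaded on demand,
    // e.g. after being evicted from the cache.
    const pending = this.load(config.s3URI);
    this.layers.set(config.s3URI, pending);
    return pending;
  }

  // Number of layers the service is tracking (what a health check would count).
  trackedCount(): number {
    return this.layers.size;
  }
}
```

With something like this, a stray request for an archived config produces an error in the logs and an undefined result for that one caller, but the layer never enters the "loading" list, so it can't hold the health check down. Callers would need to handle the undefined case.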