BayAreaMetro/bayarea_urbansim

Preprocessing issue when jobs exist but no buildings

Opened this issue · 4 comments

This creates an Issue for @smmaurer's findings in this PR: #142, copied below:
Due to changes in the base data, I had to loosen a requirement in the allocate_jobs() helper function that's used in the preproc_jobs step.
The baseyear_taz_controls.csv file lists job counts for each TAZ, which are allocated to specific buildings using this code. If there are jobs but no buildings for a particular TAZ, allocate_jobs() raises an error and crashes.
This is now happening for a couple of TAZs. I modified the code to log the problem and move on, which lets the remainder of the preprocessing complete. The mismatches will need to be resolved on the data side (see separate discussion in Slack).
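
For readers who haven't looked at the helper, here is a minimal sketch of the "log and move on" behavior described above. It is not the actual allocate_jobs() code: the function name, argument shapes, log message format, and the building weights are all assumptions made for illustration, and it uses only the standard library rather than the pandas tables the real step reads.

```python
import logging
import random

logger = logging.getLogger("preproc_jobs_sketch")


def allocate_jobs_sketch(taz_job_counts, buildings_by_taz, seed=0):
    """Assign each TAZ's jobs to buildings in that TAZ by weighted sampling.

    taz_job_counts:   {taz_id: job_count}             (hypothetical shape)
    buildings_by_taz: {taz_id: {building_id: weight}} (hypothetical shape)
    Returns (assignments, skipped): a list of (taz_id, building_id) pairs,
    one per job, and a {taz_id: job_count} dict of jobs that had nowhere to go.
    """
    rng = random.Random(seed)
    assignments = []
    skipped = {}
    for taz, n_jobs in taz_job_counts.items():
        if n_jobs == 0:
            continue
        # Only buildings with a positive weight count as potential locations.
        candidates = {
            b: w for b, w in buildings_by_taz.get(taz, {}).items() if w > 0
        }
        if not candidates:
            # Jobs but no buildings: record the problem and keep going
            # instead of raising, so the rest of preprocessing can finish.
            logger.error(
                "ERROR in TAZ %s: %d jobs, 0 potential locations", taz, n_jobs
            )
            skipped[taz] = n_jobs
            continue
        ids = list(candidates)
        weights = [candidates[b] for b in ids]
        chosen = rng.choices(ids, weights=weights, k=n_jobs)
        assignments.extend((taz, b) for b in chosen)
    return assignments, skipped


# Made-up inputs: TAZ 1 has two usable buildings, TAZ 353 has only a
# zero-weight building, and TAZ 1439 has no buildings at all.
counts = {1: 5, 353: 190, 1439: 954}
buildings = {1: {"b1": 2.0, "b2": 1.0}, 353: {"b9": 0.0}}
assignments, skipped = allocate_jobs_sketch(counts, buildings)
```

In this toy run the five jobs in TAZ 1 are placed and the other two TAZs end up in `skipped`, which is the information that would need to be reconciled on the data side.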

The temporary fix prevents preprocessing from crashing, but we'll need to understand these cases where jobs exist but there are no buildings for them, so that the jobs can be allocated if necessary.

@smmaurer do you remember what changes in the base data you were referring to here? Our jobs/buildings/households data hasn't been updated yet, but it might be useful for us to track when this preprocessing step broke (i.e., what changes occurred), since it did work before.

I also don't seem to be getting any returns of ERROR in TAZ when I run preprocessing, so I wonder if our data was different for some reason. I might propose keeping the else functionality that you added so that the code still runs in these cases, but re-enabling the assertion so that it crashes.
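
One possible reading of that proposal, as a sketch only: keep logging every problem TAZ, then fail once at the end so the mismatch can't go unnoticed. The function name, the `strict` flag, and the log message format are hypothetical and not part of the existing code.

```python
import logging

logger = logging.getLogger("preproc_jobs_sketch")


def check_unallocated(skipped, strict=True):
    """Log every TAZ whose jobs could not be placed, then optionally fail.

    skipped: {taz_id: unallocated_job_count} (hypothetical shape)
    Returns the number of problem TAZs when it does not raise.
    """
    for taz, n_jobs in sorted(skipped.items()):
        logger.error(
            "ERROR in TAZ %s: %d jobs, 0 potential locations", taz, n_jobs
        )
    if strict and skipped:
        # Raised explicitly (rather than via `assert`) so the check still
        # runs when Python is started with -O.
        raise AssertionError(
            f"{len(skipped)} TAZ(s) have jobs but no buildings: {sorted(skipped)}"
        )
    return len(skipped)
```

With `strict=False` this behaves like the current log-and-continue fix; with `strict=True` the run still crashes, but only after all problem TAZs have been reported.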

Here's the log with the original error I was getting in the preproc_jobs step: run23.log

And here's the log after the fix in PR #142: run24.log

I was running python baus.py --mode preprocessing with a clean copy of the current files on Box.

The thing I wrote about "changes in the base data" was speculative -- after I looked into it more, the issue seemed to stem from baseyear_taz_controls.csv, which hasn’t changed since 2016.

Quoting from Slack: "Looking at that file, TAZ 1439 has 0 housing units and all the jobs are in the same sector, so makes sense that this would be San Quentin as Mike said on the call. TAZ 353 has 1843 units, though, and jobs in a mix of sectors, so it’s more of a mystery why these can’t be allocated."

So based on what you're seeing, maybe it's some kind of environment-specific issue?

(Impressed that you were able to find these logs).
When I roll back to this commit, right before your changes, I also get the same error that you posted in run23.log. The TAZ info looks like this, which generates the weights error due to there being no weights:
TAZ 353: 190 jobs, 0 potential locations

However, using the current code, all jobs are assigned both when I use your change and when I remove it. The TAZ info looks like this:
TAZ working 353: 190 jobs, 1 potential locations
TAZ working 1439: 954 jobs, 1 potential locations

I haven't figured out why there's now a building available, since this script should be reading from the same H5 buildings table. When I pull a fresh H5 file, it also assigns all of the jobs. I'll re-implement the assertion, but we should come back to this.