osmlab/labuildings

Split import workflow into two stages: building footprints and address

Closed this issue · 6 comments

This is a great test case on how to execute a 'proper' building and address import by conflating with existing OSM data.

Looking into the data there are mainly two features:

  • Building footprints (changes over time with new developments)
  • Addresses (do not change unless lots are merged or split)

Since the main blocker is conflating overlapping address points in the dataset (#9), how about proceeding with the building footprints as the first stage and bringing in the addresses as the next stage?

Since the data is 8 years old, it is possible that a number of buildings have changed compared to the imagery, but the addresses have likely stayed constant. The manual effort required is substantial, and breaking it down into simpler stages will help the mappers.

cc @almccon

Tried out manually importing the building outlines next to UCLA a week ago:

[screenshot 2015-12-18 19 13 47]

  • Preparation (a scripted sketch follows this list):
      • Reproject the shapefile to EPSG:4326 and open it in JOSM using the OpenData plugin
      • The entire dataset had to be shifted slightly, as there was a consistent offset against the Bing/Mapbox imagery
  • Define the import area: I chose the area shown above.
  • Pick import candidates: analyze each footprint against imagery and keep those that are over a 95% match. Only 1 in 40 were dropped, as the quality is pretty high.
      • 2 in 5 dropped buildings were slivers

[screenshot 2015-12-18 19 29 40]

      • 2 in 5 dropped buildings had a new development in the imagery
      • 1 in 5 dropped buildings had no visible structure in the imagery
  • Copy the import candidates to the active OSM layer
  • Run the validator to check for overlapping buildings
  • Inspect overlaps and replace the geometry or drop the candidate
  • Inspect the import area as a visual check and trace any missing buildings
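The preparation step above can be scripted; here is a minimal sketch assuming geopandas, with hypothetical filenames and placeholder offset values (the actual shift was judged by eye against imagery in JOSM):

```python
import geopandas as gpd

# Load the county footprints (hypothetical filename) and reproject to WGS84
# so JOSM and the OpenData plugin can consume them directly.
footprints = gpd.read_file("buildings_sample.shp").to_crs(epsg=4326)

# Correct the consistent offset against Bing/Mapbox imagery with a constant
# shift; the values below are placeholders, not the measured offset.
footprints["geometry"] = footprints.geometry.translate(xoff=0.00002, yoff=-0.00001)

footprints.to_file("buildings_sample_4326.shp")
```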

It took around 12 minutes to add 220 buildings. With better tooling for selecting import candidates and conflating data, this can be significantly improved. Only the building outlines were copied; all other metadata was scrubbed.

https://www.openstreetmap.org/#map=17/34.07190/-118.43579

I like the idea of splitting it up. This project really needs to get rolling. But I think the assessor data and meta tags are important, like building=apartments. That'll be nice down the road. You also have start_date and units.
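As a rough illustration of that mapping (field names and use-code values below are hypothetical, not the actual assessor schema), the tag translation could look something like:

```python
# Sketch of turning assessor attributes into OSM tags; every field name and
# use-code value here is an assumption standing in for the real schema.
def assessor_to_tags(props):
    tags = {"building": "yes"}
    if props.get("UseCode") == "apartment":      # hypothetical use-code value
        tags["building"] = "apartments"
    if props.get("YearBuilt"):                   # hypothetical field name
        tags["start_date"] = str(props["YearBuilt"])
    if props.get("Units"):                       # hypothetical field name
        tags["building:units"] = str(props["Units"])
    return tags
```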

Also meant to say that the county won't release 2015 building footprints until sometime in 2016. They have some weird contract thing that prevents them from making it public until then. But there will likely be a change file.

Based on observations from sample data, there are a bunch of issues with importing the addresses cleanly as an attribute of the footprint:

  • Addresses match a garage or a building part #26 (comment)
  • Addresses mostly don't match a building #26 (comment)
  • Address matching is random when there are multiple legitimate addresses for a single footprint #26 (comment)
  • Multiple addresses at a single point #26 (comment)

It does not seem like there is an easy way to sort out these issues in the short term without a lot of manual work. If we don't want this to block us, we would need to punt on addresses for now and focus just on the footprints + assessor data.
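For reference, a minimal point-in-polygon sketch (hypothetical filenames, assuming geopandas with address points and footprint polygons in the same CRS) shows how the zero/one/many matches surface:

```python
import geopandas as gpd

footprints = gpd.read_file("footprints.geojson")
addresses = gpd.read_file("addresses.geojson")

# Join each address point to the footprint polygon that contains it
joined = gpd.sjoin(addresses, footprints, how="inner", predicate="within")
counts = joined.groupby("index_right").size()

print("footprints with exactly one address:", (counts == 1).sum())
print("footprints with multiple addresses:", (counts > 1).sum())
print("addresses not inside any footprint:", len(addresses) - joined.index.nunique())
```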

@jschleuss @almccon @talllguy @maning thoughts?

Does it make sense to separate the GeoJSONs for buildings and addresses during the chunk step? This means we skip the merge and convert straight to GeoJSON instead of shapefile. We can then process the address data with convert.py if we import it at a later stage.
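A minimal sketch of what that could look like, assuming geopandas and hypothetical source filenames for the two layers:

```python
import geopandas as gpd

# Convert each source layer straight to GeoJSON without merging them, so the
# address data can be processed separately (e.g. by convert.py) later on.
for name in ("buildings", "addresses"):
    layer = gpd.read_file(f"{name}.shp").to_crs(epsg=4326)
    layer.to_file(f"{name}.geojson", driver="GeoJSON")
```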

Nothing actionable. Let's reopen if we decide to do addresses at a later stage.