osmlab/labuildings

How to divide up the tasks?

almccon opened this issue · 16 comments

It looks like the DC buildings import uses block groups, and NYC used city election districts. What do we think about using census tracts? Too large?

LA County has a link to download census tracts: http://egis3.lacounty.gov/dataportal/2011/07/19/census-tracts-2010/

The county also has census blocks, but I think those are far too small. So, if we want census block groups (which are a size somewhere between blocks and tracts), we'd have to download them elsewhere.

Here is what the census tracts and buildings look like in Venice, CA:
[Screenshot, 2014-10-11]

Census block groups are a nice size, especially when it comes to validation. Blocks are too granular and tracts can be quite large, especially where densities are low. Download them directly from the Census Bureau's TIGER page: http://www.census.gov/geo/maps-data/data/tiger-line.html

Agreed. Block groups were very easy to work with in the DC import. I would add that you could consider importing tracts if there are very few existing buildings or POIs in OSM to conflate against and you don't have a lot of manpower.

The block groups could be grouped within the 88 city boundaries, except for LA City. Within LA City we can group them by the 35 community plans. Unincorporated county areas could be grouped within the supervisor groups. This way a team can be tasked to work on a city or a part of LA City.

Of note, I went to MaptimeLA's event yesterday, and a gentleman from County GIS was there announcing the release of the 2014 Building Outlines. Unfortunately, it's not available to the public due to cost issues. The price has not been announced yet.

Good plan using city boundaries in addition to block groups. I was thinking about issue #2 and the (apparent) low quality of address points, but maybe the quality varies by city. I haven't checked yet. But if some cities have better address points, we might want to proceed with them first.

Do you have a link for the LA City community plans?

Also, bummer about the non-free 2014 buildings. Did he give a sense of how much better the 2014 data is? Are we just talking about new buildings, or improved outlines for existing buildings?

Regarding the addresses, we can look to the county assessor, which has primary addresses for the whole county and its cities. The problem is that you have to purchase the data, but luckily UCLA has the 2011 files for download: UCLA GIS. You can link the data, which includes addresses and use types (i.e. residence, store, theater, church). Both the LCLROLLA file that UCLA has and the 2008 Building Outline files can be linked by APN (Assessor's Parcel Number). That's what I did to make the LA Building Age Map.
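
For illustration only, the APN link described above might look roughly like the sketch below. The field names (`apn`, `address`, `use_type`), the sample values, and the hyphen-stripping normalization are all assumptions made for this example; they are not taken from the actual LCLROLLA or Building Outline schemas.

```python
def normalize_apn(apn):
    """Keep only the digits so '4238-001-002' and '4238001002' compare equal.
    (Assumed normalization; the real files may already use one format.)"""
    return "".join(ch for ch in str(apn) if ch.isdigit())


def link_by_apn(buildings, assessor_rows):
    """Attach assessor address and use type to each building with a matching APN."""
    # Index the assessor rows by normalized APN for constant-time lookup.
    by_apn = {normalize_apn(row["apn"]): row for row in assessor_rows}

    linked = []
    for building in buildings:
        match = by_apn.get(normalize_apn(building["apn"]))
        merged = dict(building)  # copy so the input is left untouched
        merged["address"] = match["address"] if match else None
        merged["use_type"] = match["use_type"] if match else None
        linked.append(merged)
    return linked


# Hypothetical sample records, not real assessor data.
buildings = [{"id": 1, "apn": "4238-001-002"}, {"id": 2, "apn": "9999-999-999"}]
assessor_rows = [
    {"apn": "4238001002", "address": "123 Example Ave", "use_type": "residence"}
]
linked = link_by_apn(buildings, assessor_rows)
```

Buildings with no matching assessor row keep `None` for both fields, so they can be reviewed separately rather than silently dropped.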

data.lacity.org has them up: Community Plans

On LA County's GIS page, apparently 141k buildings have been modified, 42k new, 14k replaced.
GIS Portal: Building Outlines 2014

Here are the census block groups for the same area in Venice, CA that I screenshotted above. I agree that these smaller areas will be more manageable.

[Screenshot, 2014-11-15]

But there are 6423 of them in LA County. That seems like a lot to keep track of. How many individual tasks did other recent buildings imports have?

I like @cityhubla's idea that these block groups could be organized by city or by community plan (in LA city). Would that necessitate separate projects in the tasking manager, or is there some way to have groupings within a single project?

@almccon NYC had 5300.

Okay then!

@almccon perhaps we should start with LA City as a separate project in the tasking manager. Each project after that could be done by County District. I could draw up a proposal detailing this. I'm pushing for LA City first because of the timing of its open data initiative this year. Since LA County already has the 2014 dataset, we should attribute this upload to the 2008 dataset we have now, so that if and when the 2014 one is released, updating will be streamlined.

@cityhubla I am chunking the data based on block groups right now. That's the first step, so we'll have addresses and buildings divided up in small pieces, and then we can work on the next steps of merging the addresses and buildings together, and converting them from .shp to .osm.
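
As a rough sketch of what chunking by block group can mean, the example below assigns each building to the block group that contains its centroid. This is one possible assignment rule, chosen here because it needs only the standard library; it is not necessarily what the repository's chunking script does. The function names, the vertex-average centroid, and the toy coordinates are all assumptions for illustration.

```python
from collections import defaultdict


def centroid(ring):
    """Average of the vertices: a rough stand-in for the true polygon centroid."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def point_in_ring(pt, ring):
    """Ray-casting test: True if pt lies inside the ring of (x, y) vertices."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def chunk_by_block_group(buildings, block_groups):
    """Return {block_group_id: [building_id, ...]} using centroid containment."""
    chunks = defaultdict(list)
    for building_id, ring in buildings.items():
        c = centroid(ring)
        for geoid, bg_ring in block_groups.items():
            if point_in_ring(c, bg_ring):
                chunks[geoid].append(building_id)
                break  # each building lands in at most one chunk
    return dict(chunks)


# Toy data: two adjacent square "block groups" and three buildings.
block_groups = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
buildings = {
    "b1": [(1, 1), (3, 1), (3, 3), (1, 3)],
    "b2": [(9, 4), (13, 4), (13, 6), (9, 6)],   # straddles the A/B boundary
    "b3": [(15, 5), (17, 5), (17, 7), (15, 7)],
}
chunks = chunk_by_block_group(buildings, block_groups)
```

With real shapefiles a GIS library would replace the hand-written geometry, and a building whose centroid falls outside every block group would need separate handling.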

Once we have all that figured out, we can prioritize the tasking however we like. If we want to prioritize the block groups in LA City first, that's totally fine. Maybe discussing the priorities can be a separate issue.

And this import will definitely be tagged with the date, but I was thinking we'd put the date tag on the changeset. We should discuss that too, maybe in issue #3.

I just noticed that many of the block group boundaries are messy, and cross right through lots of buildings. This means that the chunked files sometimes include duplicate buildings.

Here is a screenshot of two chunks overlapping each other (the darker buildings and parcels are in both chunked files):
[Screenshot, 2014-12-15]

I assume this is going to be fine, since the people doing the manual imports will have to verify that they aren't overlapping any existing buildings anyway.
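
If it helps importers know where to be careful, the buildings that appear in more than one chunk could be listed up front. The sketch below assumes each building carries some stable identifier and that each chunk can be read as a list of those identifiers; both the identifiers and the chunk names here are hypothetical.

```python
from collections import defaultdict


def find_shared_buildings(chunks):
    """Map each building ID found in more than one chunk to the chunks holding it."""
    seen = defaultdict(list)
    for chunk_name, building_ids in chunks.items():
        for building_id in building_ids:
            seen[building_id].append(chunk_name)
    # Keep only the buildings that were seen in two or more chunks.
    return {bid: names for bid, names in seen.items() if len(names) > 1}


# Hypothetical chunk contents: building "b2" sits on the shared boundary.
chunks = {
    "bg_1": ["b1", "b2"],
    "bg_2": ["b2", "b3"],
}
shared = find_shared_buildings(chunks)
```

A list like this could be attached to each task so that whoever imports the second of two neighboring chunks knows which buildings may already be in OSM.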

Also, here's a tileset I generated that allows browsing of the block groups (thick gray lines), parcels (thin lines), buildings, and address points. It's a lot of fun to browse around:

https://api.tiles.mapbox.com/v3/stamen.labuildings/page.html#17/33.99024/-118.46586

Yeah, that looks messy. If we go with this, it should be highlighted in the upload strategy.

Some sample data chunked according to block groups is here: https://github.com/osmlab/labuildings/blob/master/venice_chunks.zip

So, I've been moving ahead with census block groups, but there's still nothing stopping us from switching to tracts. We should decide this soon, though, before we get too far along.

The advantage of tracts is that they seem to be well aligned to streets, and therefore don't cross through buildings very often.

The advantage of block groups is that they are smaller and easier to work with. But their shapes are messier and they frequently cross through buildings. This will mean the community will have to do more work to avoid conflicts when importing adjacent areas.

As @geobrando pointed out months ago, since we don't have very many addresses or buildings already in OSM, there won't be many conflicts with existing data. So maybe tracts are an acceptable size? I'd like a few more opinions before we make a decision and close this issue.

Okay everybody, it was a false alarm about the crappy shapes of the census block groups. It turns out I was using an overly generalized file. In 44a719f I switched the Makefile to download the correct block group shapefile, and now it looks great.

[Two screenshots, 2014-12-30]

Now I see no good reason not to choose block groups, so I'm closing this issue. Let's move forward with block groups. I will re-chunk the data and upload a new version of the venice_chunks sample file. I'll also upload the entire chunked dataset to S3.

Updated venice_chunks.zip in da95024