Los Angeles County building and address import
Generates an OSM file of buildings with addresses per census block group, ready to be used in JOSM for manual review and upload to OpenStreetMap. This repository is based heavily on the NYC building import.
This README is about data conversion. See also the page on the OSM wiki.
You may want to browse the issues and/or "watch" this repo (see button at the top of this page) to follow along with the discussion.
Sample .osm files (not ready for import yet) are in this zip file.
Browse a slippy map of the data here
Python 2.7.x
pip
virtualenv
libxml2
libxslt
spatialindex
GDAL
# install brew http://brew.sh
brew install libxml2
brew install libxslt
brew install spatialindex
brew install gdal
apt-get install python-pip
apt-get install python-virtualenv
apt-get install gdal-bin
apt-get install libgdal-dev
apt-get install libxml2-dev
apt-get install libxslt-dev
apt-get install python-lxml
apt-get install python-dev
apt-get install libspatialindex-dev
apt-get install unzip
# may need to easy_install pip and pip install virtualenv
virtualenv ~/venvs/labuildings
source ~/venvs/labuildings/bin/activate
pip install -r requirements.txt
Run all stages:
# Download all files and process them into a building
# and address .osm file per census block group.
make
You can run stages separately, like so:
# Download and expand all files, reproject
make download
# Chunk address and building files by census block group
# (this will take a long time)
make chunks
# Generate importable .osm files.
# This will populate the osm/ directory with one .osm file per
# census block group.
make osm
# Clean up all intermediary files:
make clean
# For testing it's useful to convert just a single block group.
# For instance, convert block group 060372735024:
make chunks # will take a while
python merge.py 060372735024 # Should be fast
python convert.py merged/buildings-addresses-060372735024.geojson # Fast
- Cleans address names
- Exports one OSM XML building and address file per LA County block group
- Conflates buildings and addresses (only when there is one address point inside a building polygon)
- Exports remaining addresses as points (for buildings with more than one address, or addresses not on a building)
- Handles multipolygons
- Simplifies building shapes
See the convert.py script for the implementation of these transformations.
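The conflation rule above (merge an address into a building only when it is the single address point inside that polygon) can be sketched as follows. This is a minimal illustration, not the code in convert.py: the function names `point_in_ring` and `conflate`, the dict layout, and the plain ray-casting test are all assumptions made for the example, and holes and multipolygons are left out.

```python
from __future__ import division  # keep true division under Python 2.7


def point_in_ring(point, ring):
    """Ray-casting test: is (x, y) inside a ring given as a list of (x, y)?"""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def conflate(buildings, addresses):
    """Hypothetical helper: merge an address into a building only when it is
    the sole address point inside that building.

    buildings: list of {"ring": [(x, y), ...], "tags": {...}}
    addresses: list of {"point": (x, y), "tags": {...}}
    Returns (merged_buildings, leftover_addresses).
    """
    hits = [[] for _ in buildings]
    leftovers = []
    for address in addresses:
        containing = [i for i, b in enumerate(buildings)
                      if point_in_ring(address["point"], b["ring"])]
        if containing:
            hits[containing[0]].append(address)
        else:
            leftovers.append(address)  # address not on any building

    merged = []
    for building, matched in zip(buildings, hits):
        tags = dict(building["tags"])
        if len(matched) == 1:
            tags.update(matched[0]["tags"])  # exactly one address: conflate
        else:
            leftovers.extend(matched)  # several addresses: keep them as points
        merged.append({"ring": building["ring"], "tags": tags})
    return merged, leftovers
```

Addresses returned in the second list correspond to the "remaining addresses" exported as points.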
- AIN - Parcel this address falls inside
  - Ignore (although merge.py uses this to map stray addresses to buildings)
- NumPrefix - Number prefix
  - Note: These are extremely rare (only 33 of them), mainly showing up for addresses in Lakewood Center Mall, Lakewood CA. Handling these would require treating addr:housenumber as a string, not an integer. However, the OSM wiki says this is permitted.
  - Prepend to addr:housenumber using the formatHousenumber() function
- Number - House Number
  - Map to addr:housenumber using the formatHousenumber() function
- NumSuffix - House Number Suffix (1/2, 3/4 etc)
  - Append to addr:housenumber using the formatHousenumber() function
- PreMod - Prefix Modifier
  - Examples: OLD RANCH ROAD, LOWER AZUSA ROAD
  - Change to titlecase
  - Prepend to addr:street
- PreDir - Prefix Direction (E, S, W, N)
  - Examples: SOUTH RANCH ROAD
  - Note: the data is already expanded into "NORTH", "SOUTH", etc. We do not condense into "N", "S".
  - Change to titlecase
  - Prepend to addr:street
- PreType - Prefix Type (Ave, Avenida, etc)
  - Examples: NORTH VIA SORRENTO, RUE DE LA PIERRE
  - Change to titlecase
  - Prepend to addr:street
- StArticle - Street Article (de la, les, etc)
  - Examples: RUE DE LA PIERRE
  - Change to titlecase
  - Prepend to addr:street
- StreetName - Street Name
  - Change to titlecase (but full lowercase on numeral suffixes "st", "nd", "rd", "th")
  - Map to addr:street
- PostType - Post Type (Ave, St, Dr, Blvd, etc)
  - Change to titlecase
  - Append to addr:street
- PostDir - Post Direction (N, S, E, W)
  - Examples: MARINA DRIVE SOUTH
  - Note: the data is already expanded into "NORTH", "SOUTH", etc. We do not condense into "N", "S".
  - Change to titlecase
  - Append to addr:street
- PostMod - Post Modifier (OLD, etc)
  - Note: this is always null in the current data. Treat like PreMod for consistency.
  - Change to titlecase
  - Append to addr:street
- UnitType - Unit Type (#, Apt, etc) - where these are known
  - Ignore?
- UnitName - Unit Name (A, 1, 100, etc)
  - Map to addr:unit
- Zipcode - Zipcode
  - Map to addr:postcode
- Zip4 - Not currently filled out in source data
  - Ignore
- LegalComm - Legal City or primary postal city in Unincorporated Areas
  - Fall back to this if PCITY1 is null.
  - Potentially could map this to is_in:city, but given that OSM already has good city boundaries this seems unnecessary.
- Source - source of the address point, one of: Assessor, LACity, Regional Planning, other
  - Ignore: this generally corresponds to whichever city the address falls within
- SourceID - ID of the Address in the source system
  - Ignore
- MADrank - Method Accuracy Description (MAD) provides a number between 1 and 100 detailing the accuracy of the location.
  - Ignore
- PCITY1 - 1st postal city (from the USPS)
  - Note: this is null for 1469 records. When null, fall back to LegalComm.
  - Change to titlecase
  - Map to addr:city
- PCITY2 - 2nd postal city (from the USPS)
  - Ignore: mostly null or same as LegalComm. Always null when PCITY1 is null.
- PCITY3 - 3rd postal city (from the USPS)
  - Ignore: mostly null. Always null when PCITY1 is null.
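The address field mappings above can be put together roughly as in the sketch below. It is an illustration under stated assumptions, not the formatHousenumber() implementation from convert.py: the helper names (`clean`, `titlecase`, `format_housenumber`, `address_tags`) are hypothetical, the street parts are assumed to be joined with single spaces in the order the fields are listed, and the house number suffix is assumed to be separated from the number by a space.

```python
import re

# Assumed join order for addr:street, following the field order listed above.
STREET_FIELDS = ["PreMod", "PreDir", "PreType", "StArticle",
                 "StreetName", "PostType", "PostDir", "PostMod"]


def clean(value):
    """Return a stripped string, or '' when the source value is null."""
    return str(value).strip() if value is not None else ""


def titlecase(text):
    """Titlecase, but keep numeral suffixes lowercase (101ST -> 101st)."""
    titled = text.title()
    return re.sub(r"(\d+)(St|Nd|Rd|Th)\b",
                  lambda m: m.group(1) + m.group(2).lower(), titled)


def format_housenumber(props):
    """Hypothetical stand-in for formatHousenumber(): prefix + number,
    then the suffix after a space. The result is always a string."""
    number = clean(props.get("NumPrefix")) + clean(props.get("Number"))
    suffix = clean(props.get("NumSuffix"))
    return (number + " " + suffix).strip() if suffix else number


def address_tags(props):
    """Build addr:* tags from one address record (a dict of source fields)."""
    tags = {}
    housenumber = format_housenumber(props)
    if housenumber:
        tags["addr:housenumber"] = housenumber
    parts = [clean(props.get(field)) for field in STREET_FIELDS]
    street = " ".join(titlecase(part) for part in parts if part)
    if street:
        tags["addr:street"] = street
    unit = clean(props.get("UnitName"))
    if unit:
        tags["addr:unit"] = unit
    postcode = clean(props.get("Zipcode"))
    if postcode:
        tags["addr:postcode"] = postcode
    # PCITY1 is null for some records; fall back to LegalComm.
    city = clean(props.get("PCITY1")) or clean(props.get("LegalComm"))
    if city:
        tags["addr:city"] = titlecase(city)
    return tags
```

Fields marked "Ignore" above (AIN, UnitType, Zip4, Source, SourceID, MADrank, PCITY2, PCITY3) are simply never read by this sketch.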
- CODE - Building type (either Building or Courtyard).
  - Ignore
  - Note: only CODE='Building' is used for this import. We ignore CODE='Courtyard'. This filtering step happens in the Makefile
- BLD_ID - unique building ID
  - Ignore ???
  - Or, map to a special OSM tag like lacounty:bld_id
- HEIGHT - the height of the highest major feature of the building (not including roof objects like antennas and chimneys)
  - Convert from inches to meters
  - Round to one decimal place
  - Map to height tag, only if height > 0
- ELEV - the elevation of the building
  - Convert from inches to meters
  - Round to one decimal place
  - Map to elevation tag, only if elevation > 0
- AREA - the Roof area
  - Ignore: mostly null
- SOURCE - the data source (either LARIAC2, Pasadena, Palmdale, or Glendale)
  - Ignore
- DATE - Date Captured (2006, 2008, or blank)
  - Ignore
- AIN - the Parcel ID number.
  - Ignore (although merge.py uses this to map stray addresses to buildings)
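The HEIGHT and ELEV handling above amounts to a unit conversion, a rounding step, and a positivity check. The sketch below assumes the source values are in inches, as stated in the list, and uses the tag names given there; `building_tags` and `TAG_FOR_FIELD` are hypothetical names for this example, not identifiers from convert.py. OSM's usual key for elevation is ele, so the mapping may need adjusting before upload.

```python
INCHES_TO_METERS = 0.0254

# Source field -> tag name, as described in the list above (assumption:
# swap "elevation" for "ele" if the import follows the usual OSM key).
TAG_FOR_FIELD = {"HEIGHT": "height", "ELEV": "elevation"}


def building_tags(props):
    """Convert HEIGHT and ELEV to meters, rounded to one decimal place.

    A tag is written only when the converted value is greater than zero.
    """
    tags = {}
    for field, tag in TAG_FOR_FIELD.items():
        value = props.get(field)
        if value is None:
            continue
        meters = round(value * INCHES_TO_METERS, 1)
        if meters > 0:
            tags[tag] = str(meters)
    return tags
```

All other building fields (CODE, AREA, SOURCE, DATE, AIN) are ignored here, matching the list.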