Catch incorrect unit totals for condos
Opened this issue · 6 comments
Can we programmatically change incorrect condo unit count data without erroneously changing valid data? If not, we can create a report with candidates for correction and add manual corrections.
The pseudo code below might catch the worst offenders, hopefully without picking up false positives.
For unit BBLs having the same billing BBL, if more than 5 unit BBLs have the same value for coop_apts, and that value is not a 0 or 1, the resunits for the billing BBL should be set equal to that value.
The number 5 is arbitrary and could be adjusted (we didn't need to find all the issues, just the ones that most impact the totals). We'd need to create and review a report with condos that get corrected.
Determined cannot make programmatic change. Need to output QAQC tables to help with manual research.
-
All PLUTO records where there is a match in Housing Database and the PLUTO residential units value does not match the housing database certificates of occupancy value. Have flag indicating if bbl has residential unit correction in manual corrections table.
-
All PTS condo records where:
- The units and coop_apts values are greater than one for the billing bbl AND
- Two or more unit bbl records have units and coop_apts values greater than 1
Include the CO value from the Housing Database if there is a match. This if matched on "prime bbl" value will be duplicated if matched on "PTS bbl" value will only appear for '75' tax lot.
Include a flag indicating if bbl has residential unit correction in manual corrections table.
Logic for addressing first report described above
-- select PLUTO records that have a match in the HousingDB subset where unitsres does not equal units co
WITH base as(
SELECT DISTINCT round(a.bbl::numeric,0)::text as bbl,b.job_number,a.unitsres,round(b.units_co::numeric,0)::text as units_co
FROM pluto_22v3 a, housingdb_post2010 b
WHERE b.bbl||b.datelstupd IN (
-- get the most recent DOB record for a BBL based on date of last update field
SELECT bbl||max(datelstupd::date) maxDate
FROM housingdb_post2010
GROUP BY bbl)
AND round(a.bbl::numeric,0)::text=b.bbl
AND a.unitsres<>round(b.units_co::numeric,0)::text),
-- select only corrections to unitsres field
corrections_subset as (
SELECT *
FROM pluto_corrections_22v3
WHERE field='unitsres')
-- combine PLUTO, DOB, and corrections data for final output
SELECT a.*, c.new_value, c.old_value
FROM base a
LEFT JOIN corrections_subset c
ON a.bbl=c.bbl;
LS: join on the PTS bbl. It will make the output easier to read.
Logic for creating second report
-- get PTS records that meet the criteria of
-- The units and coop_apts values are greater than one for the billing bbl AND
-- Two or more unit bbl records have units and coop_apts values greater than 1
WITH pts_subset as (
SELECT primebbl, bbl, units, coop_apts
FROM pluto_rpad_geo
WHERE primebbl IN (
SELECT primebbl FROM (
SELECT primebbl, COUNT(*)
FROM pluto_rpad_geo
WHERE tl NOT LIKE '75%'
AND RIGHT(primebbl,4) LIKE '75%'
AND units::integer > 1
AND coop_apts::integer > 1
GROUP BY primebbl, units, coop_apts) as badbases
WHERE count>1)
AND primebbl IN (
SELECT primebbl FROM (
SELECT primebbl
FROM pluto_rpad_geo
WHERE tl LIKE '75%'
AND units::integer > 1
AND coop_apts::integer > 1) as badbillings)),
-- get the most recent DOB record for a BBL based on date of last update field
dob_subset as (
SELECT * FROM housingdb_post2010 b
WHERE b.bbl||b.datelstupd IN (
SELECT bbl||max(datelstupd::date) maxDate
FROM housingdb_post2010
GROUP BY bbl)),
-- select only corrections to unitsres field
corrections_subset as (
SELECT *
FROM pluto_corrections_22v3
WHERE field='unitsres'),
-- Join PTS and DOB subsets, preserving all PTS records
pts_dob as (
SELECT a.*, b.job_number, round(b.units_co::numeric,0)::text as units_co, b.datelstupd
FROM pts_subset a
LEFT JOIN dob_subset b
ON a.bbl=b.bbl
AND a.coop_apts::text<>round(b.units_co::numeric,0)::text)
-- Join on corrections to produce final output
SELECT a.*, c.new_value, c.old_value
FROM pts_dob a
LEFT JOIN corrections_subset c
ON a.bbl=c.bbl;
@AmandaDoyle where is the housingdb_post2010
getting pulled in from? Is that just dcp_housing
?
Noting potential enhancements based on LS feedback. If greenlit, these will become separate issues.
- For records with a large difference between resunits and co_units in PLUTO vs DevDB, create report that outputs PTS records.
- Add column to PLUTO vs DevDB that computes the difference between resunits and co_units
- Add field to PLUTO vs DevDB to indicate if it's a condo
- PLUTO vs DevDB - report all PLUTO records that have a BBL in DevDB and discuss with Housing and EDM the use cases of this report