dolthub/data-analysis

Overcrowded Prisons Analysis

timsehn opened this issue · 0 comments

We collected prisons data in this bounty database:

https://www.dolthub.com/repositories/dolthub/us-jails

The jails table has a num_inmates_rated_for column, and the inmate_population_snapshots table holds population snapshots over time.

This query is interesting:

$ dolt sql -q "select jails.facility_name, count(*) as periods_over_capacity from inmate_population_snapshots join jails on jails.id=inmate_population_snapshots.id where inmate_population_snapshots.total > jails.num_inmates_rated_for group by jails.id order by periods_over_capacity desc limit 20"  
+--------------------------------------------+-----------------------+
| facility_name                              | periods_over_capacity |
+--------------------------------------------+-----------------------+
| York Correctional Institution              | 269                   |
| Robinson Correctional Institution          | 269                   |
| New Haven Correctional Center              | 269                   |
| Hartford Correctional Center               | 269                   |
| Osborn Correctional Institution            | 269                   |
| MacDougall-Walker Correctional Institution | 269                   |
| Cheshire Correctional Institution          | 269                   |
| Brooklyn Correctional Institution          | 269                   |
| Corrigan Correctional Center               | 269                   |
| Bell County Central Jail                   | 268                   |
| Irving City Jail                           | 268                   |
| Hurst City Jail                            | 268                   |
| Garner Correctional Institution            | 268                   |
| Aransas County Detention Center            | 267                   |
| Willard-Cybulski Correctional Institution  | 267                   |
| Anderson County Jail                       | 267                   |
| Knox County Jail                           | 262                   |
| Radgowski Correctional Institution         | 262                   |
| Northern Correctional Institution          | 258                   |
| El Paso County Detention Facility - Annex  | 255                   |
+--------------------------------------------+-----------------------+

We know there are some data quality issues in this data. For instance, one of the bounty hunters concatenated the male and female inmate counts instead of summing them to produce total, which is why the top of this list is all 269 periods over capacity. Still, there's a kernel of cool analysis here, and it would make a good blog post.
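
One way to find the concatenated totals before running the analysis is to check whether total matches the male count's digits followed by the female count's digits. Here is a minimal sketch of that check. The field names male and female are hypothetical (only total is confirmed above), and the sample rows are made up for illustration, not taken from the database.

```python
def looks_concatenated(male: int, female: int, total: int) -> bool:
    """Return True when total equals male's digits followed by female's digits
    and differs from the real sum male + female."""
    # Negative counts are invalid input and cannot form a digit string.
    if male < 0 or female < 0:
        return False
    concatenated = int(f"{male}{female}")
    return total == concatenated and total != male + female


# Made-up sample rows; "male" and "female" are assumed column names.
rows = [
    {"male": 120, "female": 15, "total": 12015},  # concatenated: "120" + "15"
    {"male": 120, "female": 15, "total": 135},    # correctly summed
]

flagged = [
    r for r in rows
    if looks_concatenated(r["male"], r["female"], r["total"])
]
print(flagged)
```

Rows flagged this way could be repaired by replacing total with the sum, or excluded from the over-capacity count. The check skips cases where the concatenation and the sum happen to agree (for example, a male count of 0), since those totals are already correct.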