chekos/RIPA-2018-datasette

Create unique_id

Closed this issue · 0 comments

The original dataset includes DOJ_RECORD_ID and PERSON_NUMBER, which uniquely identify the event and the people involved, respectively.

At the moment, for ripa-2018-db.herokuapp.com, a UNIQUE_INDEX was created by combining DOJ_RECORD_ID and PERSON_NUMBER. However, this is a 22-character string.

A numeric unique id would be much less costly in terms of memory. Both DOJ_RECORD_ID and PERSON_NUMBER would remain in the base table (to be renamed). The unique id would be used only to connect tables, uniquely identifying each row (event-person pair).

It could potentially be as simple as enumerating each row.
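A minimal sketch of the enumeration idea, using SQLite through Python's standard `sqlite3` module. The table name `base`, the column name `unique_id`, the column types, and the sample values are all assumptions made for illustration; they are not taken from the actual dataset.

```python
import sqlite3

# Hypothetical sample rows: (DOJ_RECORD_ID, PERSON_NUMBER). Values are made up.
rows = [
    ("A000000000001", 1),
    ("A000000000001", 2),
    ("B000000000002", 1),
]

conn = sqlite3.connect(":memory:")

# "base" is a placeholder table name. The original columns are kept,
# and the pair stays unique so each row is still one event-person pair.
conn.execute(
    "CREATE TABLE base ("
    " unique_id INTEGER PRIMARY KEY,"
    " DOJ_RECORD_ID TEXT NOT NULL,"
    " PERSON_NUMBER INTEGER NOT NULL,"
    " UNIQUE (DOJ_RECORD_ID, PERSON_NUMBER))"
)

# Enumerate the rows, starting at 1, to assign the numeric id.
conn.executemany(
    "INSERT INTO base (unique_id, DOJ_RECORD_ID, PERSON_NUMBER) VALUES (?, ?, ?)",
    [(i, record_id, person) for i, (record_id, person) in enumerate(rows, start=1)],
)

ids = [r[0] for r in conn.execute("SELECT unique_id FROM base ORDER BY unique_id")]

# Look up the numeric id for one event-person pair.
lookup = conn.execute(
    "SELECT unique_id FROM base WHERE DOJ_RECORD_ID = ? AND PERSON_NUMBER = ?",
    ("A000000000001", 2),
).fetchone()[0]
```

Other tables would then carry only the integer `unique_id` rather than the 22-character combined string, while the base table keeps both original columns for lookups.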