toddwschneider/agency-loan-level

duplicate key value violates unique constraint

Closed this issue · 3 comments

Ran your script with all the data available from both sources and received the following error:

psql:d:/analytics/db_scripts/populate_freddie_from_raw.sql:78:
ERROR: duplicate key value violates unique constraint "index_loans_on_seq"
DETAIL: Key (loan_sequence_number, agency)=(F100Q1024447, 1) already exists.

I'm not well versed enough in SQL to get this running. Does anyone have an idea on a solution?

You are likely referring to a transaction table which holds the monthly payment and amortization history of each loan. By indexing on the loan sequence number alone you create duplicates. I would suggest indexing on the loan sequence number as well as the reporting period date; that will create a unique record.
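As a sketch of that suggestion (the table and column names below are hypothetical; check them against the actual schema in the repo), a composite unique index covering the period would look like:

```sql
-- Hypothetical table/column names; adjust to match the actual schema.
-- In a monthly-history table a loan appears once per reporting period,
-- so uniqueness must include the period, not just the loan identifiers.
CREATE UNIQUE INDEX index_monthly_on_seq_and_period
  ON monthly_observations (loan_sequence_number, agency, reporting_period);
```

Note, though, that the index in the error message is on the loans table, which should hold one row per loan, so the composite-index fix applies only if the constraint was mistakenly placed on a monthly table.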


@wiitendo that's strange. It means that you're trying to load the same loan into the loans table twice, which should never happen unless you're processing the same file twice or the input file from Freddie Mac has an error. When I loaded the data I did not get this error, and I've confirmed that F100Q1024447 is in my dataset.
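One way to check whether the duplicate is really in the data rather than in the script is to count repeats in the raw staging table before the populate step runs. A sketch, assuming the raw Freddie Mac rows land in a table called loans_raw (hypothetical name; use whatever the import script actually creates):

```sql
-- List any loan sequence numbers that appear more than once in the raw data.
-- A non-empty result means the source file (or a double import) is at fault.
SELECT loan_sequence_number, COUNT(*) AS occurrences
FROM loans_raw
GROUP BY loan_sequence_number
HAVING COUNT(*) > 1;
```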

As a short-term fix, you can remove the unique index, either by commenting out this line and rerunning the script, or by running DROP INDEX index_loans_on_seq; and picking up where you left off. You might end up with some duplicate records in the loans table, though, so you should probably check when you're done and remove dupes. You also might want to contact Freddie Mac to ask them to check whether they've accidentally included dupes.
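If you do drop the index and load with duplicates, a dedupe pass afterwards might look like the sketch below. It assumes the loans table has a surrogate id primary-key column (an assumption; adapt it to the real schema):

```sql
-- Keep the row with the lowest id for each (loan_sequence_number, agency)
-- pair and delete the rest; PostgreSQL's DELETE ... USING self-join form.
DELETE FROM loans a
USING loans b
WHERE a.id > b.id
  AND a.loan_sequence_number = b.loan_sequence_number
  AND a.agency = b.agency;

-- Then recreate the unique index so future loads fail fast on dupes.
CREATE UNIQUE INDEX index_loans_on_seq
  ON loans (loan_sequence_number, agency);
```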

Thanks gents!

I was able to get past the data import issues with the color you provided. In the end, the error was in a batch file I wrote that wasn't looping properly.