walmartlabs/cookie-cutter

Bigquery support insertId for best effort deduplication

elpenao opened this issue · 1 comment

BigQuery accepts an optional insertId on each row, which it uses for best-effort deduplication. Can we add support for this in the BigQuery sink?

//-
// Insert a row according to the specification.
//-
const row = {
  insertId: '1',
  json: {
    INSTNM: 'Motion Picture Institute of Michigan',
    CITY: 'Troy',
    STABBR: 'MI'
  }
};

// raw: true tells the client that the row is already in { insertId, json } form.
const options = {
  raw: true
};

table.insert(row, options, insertHandler);

The insertAll request body supports more options as well:
https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/insertAll#request-body

{
  "kind": string,
  "skipInvalidRows": boolean,
  "ignoreUnknownValues": boolean,
  "templateSuffix": string,
  "rows": [
    {
      "insertId": string,
      "json": {
        object
      }
    }
  ],
  "traceId": string
}
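
As a rough illustration of how a sink might fill in these fields, here is a minimal sketch that wraps plain records into the raw { insertId, json } row shape shown above. The helper names toRawRow and buildInsertAllBody are hypothetical and are not part of cookie-cutter or the BigQuery client. Deriving the insertId from a SHA-256 hash of the record is also an assumption made for this example; any stable, per-row identifier would serve the same purpose.

```javascript
const crypto = require('crypto');

// Hypothetical helper: derive a stable insertId from the record contents so
// that a retried insert of the same record carries the same id.
// Note: JSON.stringify depends on key insertion order, so two records with the
// same fields in a different order would get different ids.
function toRawRow(record) {
  const insertId = crypto
    .createHash('sha256')
    .update(JSON.stringify(record))
    .digest('hex');
  return { insertId, json: record };
}

// Hypothetical helper: build an object shaped like the insertAll request body
// above, covering only a subset of its fields.
function buildInsertAllBody(records, options = {}) {
  return {
    skipInvalidRows: Boolean(options.skipInvalidRows),
    ignoreUnknownValues: Boolean(options.ignoreUnknownValues),
    rows: records.map(toRawRow)
  };
}

const body = buildInsertAllBody([
  { INSTNM: 'Motion Picture Institute of Michigan', CITY: 'Troy', STABBR: 'MI' }
]);
```

The entries in body.rows have the same shape as the row in the snippet above, so they could be passed to table.insert together with the raw: true option.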

No longer needed, as per conversation with poster