Duplicate key errors are silently swallowed and drop subsequent records
Closed this issue · 2 comments
schambon commented
From MongoBulkWriter lines 90 and following:
catch (com.mongodb.MongoBulkWriteException err) {
    // Duplicate inserts are not an error if retrying
    for (BulkWriteError bwerror : err.getWriteErrors()) {
        if (bwerror.getCategory() != ErrorCategory.DUPLICATE_KEY) {
            logger.error(bwerror.getMessage());
            fatalerror = true;
            break;
        }
    }
    if (!fatalerror) {
    }
}
This means that duplicate key errors are swallowed without even being logged, which can cause hard-to-track bugs in ETL processes.
This is made much worse because bulk writes are performed in ordered mode: on a duplicate key error, MongoSyphon ignores the error and silently skips all subsequent documents in the current batch, leading the user to believe the process succeeded when in fact only a small portion of the documents were actually inserted.
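A minimal sketch of what the triage loop could do instead: inspect every write error (no early `break`), log duplicates at WARN level, and collect only non-duplicate errors as fatal. The `WriteError` record here is a hypothetical stand-in for the driver's `BulkWriteError`, and 11000 is MongoDB's duplicate-key error code; this is an illustration of the error-handling logic, not MongoSyphon's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for com.mongodb.bulk.BulkWriteError:
// a server error code plus a message.
record WriteError(int code, String message) {}

public class BulkErrorTriage {
    // MongoDB's duplicate-key error code.
    static final int DUPLICATE_KEY = 11000;

    // Returns the non-duplicate (fatal) errors; duplicates are logged
    // at WARN level instead of being silently dropped.
    static List<WriteError> triage(List<WriteError> errors) {
        List<WriteError> fatal = new ArrayList<>();
        for (WriteError e : errors) {
            if (e.code() == DUPLICATE_KEY) {
                System.err.println("WARN duplicate key skipped: " + e.message());
            } else {
                fatal.add(e); // no early break: every error gets inspected
            }
        }
        return fatal;
    }

    public static void main(String[] args) {
        List<WriteError> errs = List.of(
                new WriteError(11000, "E11000 duplicate key: _id 42"),
                new WriteError(121, "Document failed validation"));
        System.out.println("fatal=" + triage(errs).size()); // prints "fatal=1"
    }
}
```

Unordered bulk writes (so one duplicate does not abort the rest of the batch) would address the skipping half of the bug; the loop above addresses the silent-swallowing half.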
johnlpage commented
That definitely needs a fix of some kind, even if just logging.
johnlpage commented
Changed to load unordered and to warn in the log on duplicates.