PostgreSQL BulkMerge / BulkUpdate always returns all rows as update
cupper1 opened this issue · 6 comments
Description
This seems very simple, but somehow I am lost, so sorry if this is basic!
I have a simple table with 2 fields: ID and FirstName.
I have a JSON file that has 500 records of ID and FirstName.
I use JsonConvert to pull it into a list of objects.
I then do this:
try
{
    string filecontents = File.ReadAllText(filenametopull);
    List<LARCObjects.Patient> patients = JsonConvert.DeserializeObject<List<LARCObjects.Patient>>(filecontents);

    if (patients.Count > 0)
    {
        List<AuditEntry> auditEntries = new List<AuditEntry>();

        // Global mapping: map the Patient entity to the "Patients2" table,
        // keyed on PatientID, with auditing enabled.
        DapperPlusManager.Entity<LARCObjects.Patient>().Table("Patients2")
            // .Identity(x => x.PatientID)
            .Key(x => x.PatientID)
            .UseBulkOptions(x =>
            {
                x.AuditEntries = auditEntries;
                x.UseAudit = true;
                x.InsertIfNotExists = true;
            });

        atrun.totalrows = patients.Count;

        using (var connection = new NpgsqlConnection(configuration["pconnect"]))
        {
            connection.BulkMerge(patients);
            //connection.BulkUpdate(patients);
        }
    }
}
catch (Exception ex)
{
    // error handling omitted in the original snippet
}
On the first run, all goes well: reviewing auditEntries shows 500 rows, all with "Insert", so that makes sense.
On the second run, with the exact same file, everything runs the same with no errors, but reviewing auditEntries shows 500 rows, all with "Update", and no difference between NewValue and OldValue.
There is, however, an additional field called "xmin", which seems to be an internal field used by PostgreSQL.
For every record, this field shows a slight difference (like OldValue: 1403, NewValue: 1502).
Is this the expected behavior? I would have expected 0 rows in the auditEntries, as nothing has changed.
Further technical details
- Dapper version: Dapper v2.1.28
- Dapper Plus version: Dapper Plus v7.2.2
- Database Server version: PostgreSQL 16.0 (Debian 16.0-1.pgdg120+1)
- Database Provider version (NuGet): Npgsql 8.0.1
Hello @cupper1 ,
The BulkMerge inserts data that doesn't exist and updates data that already exists.
So it's expected that 500 rows are returned on the second run, since all 500 rows were updated (they already existed).
However, there are 2 small errors in your code:
- The InsertIfNotExists option only works with BulkInsert.
- The mapping should only be done once. In your case, you are probably better off using a mapping context (see the sketch after this list): https://dapper-plus.net/getting-started-mapping#instance-context-mapping (you can also map the same entity more than once using a mapper key).
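As a minimal sketch of instance-context mapping, assuming the DapperPlusContext API from the linked page and reusing the table, key, and types from your snippet:

    // Instance-context mapping: the mapping lives on this context object
    // instead of the global DapperPlusManager, so repeating it on every
    // import run is safe.
    var context = new DapperPlusContext(connection);
    context.Entity<LARCObjects.Patient>().Table("Patients2").Key(x => x.PatientID);
    context.BulkMerge(patients);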
Perhaps you were looking more for a BulkInsert with the InsertIfNotExists option?
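If so, a minimal sketch (reusing the connection and patients list from your snippet):

    // Insert only rows whose key doesn't already exist; existing rows are
    // left untouched, so no "Update" audit entries are produced for them.
    connection.UseBulkOptions(options => options.InsertIfNotExists = true)
              .BulkInsert(patients);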
NOTE: The xmin is an internal system column in PostgreSQL.
Let me know if I answered your question correctly.
Best Regards,
Jon
Hi @JonathanMagnan, thanks for the response. A follow-up:
So, on the second run, where all the data is exactly the same, no data was actually updated, so I would have expected the audit log to show 0 entries, as nothing was changed.
Maybe I am going down the wrong path, so it is best to explain the use case:
I get the file every day; it has 500 (or 15k) rows in it, and almost all the data is exactly the same. Right now I query the database for each row, bring it back, check it against the row in the file, and if there is a difference, I update it; if not, I move to the next row.
I was expecting that this is what BulkMerge would do for me, giving me an audit log of any rows that have changed. Is it capable of that?
For example: 500 rows from the exact same file have no changes, so the audit log would show 0 rows.
Tomorrow, 1 row is updated with a new first name, so the audit log shows 1 row with OldValue / NewValue.
If not, is there a way of filtering the audit log where OldValue != NewValue?
Thank you again for the response. I am hoping this product can do this, as it's much faster than going 1 by 1 :)
Hello @cupper1 ,
If I understand your scenario, we indeed support it. You can use either the option MergeMatchedAndOneNotConditionExpression or the option IgnoreOnMergeMatchedAndOneNotConditionExpression to achieve this behavior. You can learn more about those options here (the documentation is about Entity Framework Extensions, but it's the same behavior for Dapper Plus).
See the following online Fiddle: https://dotnetfiddle.net/3lyezo
DapperPlusManager.Entity<Patient>("MergeMatchedAndOneNot").Identity(x => x.PatientID)
    .UseBulkOptions(options =>
    {
        // Use this line if you want to specify all properties; at least one
        // of them must be different for the "Update" part to be executed:
        //options.MergeMatchedAndOneNotConditionExpression = x => new { x.Name, x.Description };

        // Use this line if you want to include all properties by default
        // (we list the key here to exclude it, since it will never be different):
        options.IgnoreOnMergeMatchedAndOneNotConditionExpression = x => new { x.PatientID };
    });
You will notice in my Fiddle:
- I used a mapper key, as I don't want this behavior to be the default behavior when I save a Patient.
- I added the audit options in the UseBulkOptions on the connection instead, so I can separate the global mapping from this custom logic (see the sketch after this list).
- I used the mapper key previously created to tell the BulkMerge to use this mapping.
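For clarity, the connection-side call from the Fiddle looks roughly like this (a sketch; auditEntries is a List<AuditEntry> created before the call, and the mapper key matches the mapping above):

    var auditEntries = new List<AuditEntry>();

    // The audit options are set per call on the connection rather than in
    // the global mapping, and the mapper key selects the custom mapping above.
    connection.UseBulkOptions(options =>
    {
        options.UseAudit = true;
        options.AuditEntries = auditEntries;
    }).BulkMerge("MergeMatchedAndOneNot", patients);

    // auditEntries now only contains rows that were actually inserted or updated.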
As you can see in the result:
- On the first run, 2 audit entries were returned (2 rows inserted).
- On the second run, 0 audit entries were returned (nothing was updated since nothing is different).
- On the third run, 1 audit entry was returned, as I modified the description for 1 row.
Let me know if that's what you were looking for.
Best Regards,
Jon
THANK YOU! Yes, this is (almost) exactly what I am looking for, and it works great. I would just like to ask one more question: in the example, the auditEntries shows 1 row, but still with all fields. So, for instance, if my schema had 10 columns and only one changed, it returns all 10 columns with old/new values. I am just curious if you have another magic option to return only the fields that changed instead?
Thank you again for this!
Hello @cupper1 ,
You currently have the OldValue and the NewValue, so for this one, you will have to filter on your side.
Something like this:
auditEntries.ForEach(x => x.Values = x.Values.Where(y => y.OldValue?.ToString() != y.NewValue?.ToString()).ToList());
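An expanded, null-safe variant that prints only the changed columns (the ?. guards matter because OldValue is null for inserted rows; the ColumnName / OldValue / NewValue property names follow the audit values used earlier in this thread):

    foreach (var entry in auditEntries)
    {
        // Keep only the values whose old and new representations differ.
        var changed = entry.Values
            .Where(v => !Equals(v.OldValue?.ToString(), v.NewValue?.ToString()))
            .ToList();

        foreach (var value in changed)
        {
            Console.WriteLine($"{value.ColumnName}: {value.OldValue} -> {value.NewValue}");
        }
    }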
Let me know if that works on your side.
Best Regards,
Jon
Thank you again @JonathanMagnan really appreciate you!