Add upsert ability to SaveOperations
Closed this issue · 10 comments
Sometimes you want to save a record, sometimes you want to create a record. But there's also some cases when you're not sure which one you need, so an upsert
.
SaveUser.upsert do |op, user|
#...
end
The SQL would end up looking something like this using ON CONFLICT
INSERT INTO users (id, name) VALUES (1, 'Billy') ON CONFLICT DO UPDATE SET name='Billy';
I had an idea to get this 99% of the way there and maintain type safety! Here's the rough sketch for how I think it would work:
Add conflict_keys
macro to SaveOperation
The conflict_keys
macro would generate a upsert/upsert! and
find_or_create/find_or_create!` methods.
Here is roughly how you'd use it:
class SaveUser < User::SaveOperation
conflict_keys email
end
# If user is found with matching email then update, otherwise create
SaveUser.upsert!(params)
SaveUser.upsert(params) do |operation, user|
# same thing but yields like the create/update does
end
SaveUser.find_or_create!(params) # Finds (but does not update) or creates
SaveUser.find_or_create(params) do |operation, user|
# 'user' will be nil if not found *or* if the params are invalid and therefore could not create a new one
end
Here's roughly how it would be implemented
# Not sure about naming. This is just a rough idea
module Avram::ConflictKeys
macro conflict_keys(*attribute_names)
def self.upsert!(*args, **named_args)
operation = new(*args, **named_args)
existing_record = BaseQuery.new(
{% for attribute in attribute_names %}
{{ attribute.id }}: operation.{{ attribute.id }}.value,
{% end %}
).first?
if existing_record
operation.update!
else
operation.create!
end
end
# And then generate a similar method for `find_or_create`
end
# Add a method for documentation and for a nice compile time error if used before calling `conflic_keys`
def self.upsert!(*args, **named_args)
{% raise "Please call 'conflict_keys' on your operation before using upsert. <link to docs> %}
end
end
Some gotchas
This does have some race conditions :( The upside is that we can add this to Avram pretty easily without implementing ON CONFLICT quite yet. I think the race condition is extremely rare in most cases, and if you have a unique index on these your data won't ever get into a bad state anyway. The most you'll get is an error saying you had a non-unique column. So I think this is a great way to get started
Later on we could change the generated method to use ON CONFLICT
but that can come at a later point
Also worth noting that this does lock you in to one set of conflict keys per operation. After much research and thought I think that's ok. Typically you only use upsert/find_or_create in a few spots and the keys are typically the same. Secondly you could/should create a new operation if you want something with different conflict keys. Something like ImportUserFromCsv
would be a great candidate for using conflict_keys
I love this and would simplify some stuff I am working on. I like the interface and how it mimics other interfaces used.
Super small think is conflict_keys
is a domain specific word there might be something more clear. I like the use of needs
in other parts of the API.
@wontruefree Glad to hear that this would be helpful!
I am definitely not sold on conflict_keys
but not sure what else... Here's some ideas, but I'm open to more!
unique_keys
target_keys
Honestly that's it. I'm having trouble thinking of anything else that better describes what this does
After reading a little it seems like there is more then a 'uniqueness' constraint you can put on a key. Which is why I am guessing why postrgres used the term conflict.
I like the idea of using conflicting
but that seems pedantic. Also I think the the other APIs in SaveOperation
use the terms columns
and keys
Yeah good point on columns. We tend to use that more so conflict_columns
sounds a bit clearer
I think conflicting
could be strange if you have multiple columns: conflicting_columns organization_id, email
. It almost looks as if you're saying those two columns conflict with each other. Thoughts?
I think the interface has the potential to be confusing if you leave the word conflict in it.
conflicting_columns organization_id, email
conflicting_keys organization_id, email
Also I like columns better now that I see them.
Hmm maybe upsert_keys
? Technically it'd generate the find_or_create
as well, but I don't think that is a big deal. Thoughts?
The more I think about it and the reasons behind on conflict I feel like the term conflict makes sense.
But for the base case the one that people will most likely use is uniqueness. So having unique_columns
makes sense.
Probably create_or_find
instead of find_or_create
like in Rails 6 would be better?