Feature Request: Upsert

Question

nscimerical opened this issue 5 years ago · 20 comments

Is it possible to have atomic upsert functionality similar to PostgreSQL's INSERT ... ON CONFLICT or MySQL's INSERT ... ON DUPLICATE KEY UPDATE statement? The use case is to save round trips to remote databases and to minimize lines of code.

Answer 1 · 2019-11-04T08:09:46.000Z

This issue depends somewhat on #127, although it's possible to define the conflict on any constraint.

I don't see any problem with adding this once #127 is addressed.

Answer 2 · 2020-01-07T08:06:27.000Z

@a8m do you already have a syntax for upsert in mind? Since upsert is different in Postgres/SQLite and MySQL, Ent should probably abstract that common case.

Just a draft of my current thinking:

// Draft API only: UpsertOneID, Create(...) and Update(...) do not exist yet.
pedro, err := client.Pet.
    UpsertOneID(id).
    Create(
        pet.Create().SetName("pedro"),
    ).
    Update(
        pet.Update().SetOwnerID(owner),
    ).
    Save(ctx)

This would allow us to reuse the setters for Create and Update and ensure that Create and Update can include different properties.
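To make the dialect difference mentioned above concrete, here is a minimal sketch of how one abstraction could render the two statement shapes. buildUpsert is a hypothetical helper for illustration, not part of Ent, and it uses "?" placeholders for brevity (a real Postgres driver would expect $1, $2, ...).

```go
package main

import (
	"fmt"
	"strings"
)

// buildUpsert is a hypothetical helper (not part of ent) that renders an
// upsert statement for the given dialect. conflictCols is only used by
// Postgres/SQLite; MySQL infers the conflict from any unique key.
func buildUpsert(dialect, table string, cols, conflictCols, updateCols []string) (string, error) {
	if len(cols) == 0 || len(updateCols) == 0 {
		return "", fmt.Errorf("upsert: cols and updateCols must be non-empty")
	}
	placeholders := strings.TrimSuffix(strings.Repeat("?, ", len(cols)), ", ")
	insert := fmt.Sprintf("INSERT INTO %s (%s) VALUES (%s)",
		table, strings.Join(cols, ", "), placeholders)

	sets := make([]string, len(updateCols))
	switch dialect {
	case "postgres", "sqlite":
		if len(conflictCols) == 0 {
			return "", fmt.Errorf("upsert: %s needs at least one conflict column", dialect)
		}
		// "excluded" refers to the row that failed to insert.
		for i, c := range updateCols {
			sets[i] = fmt.Sprintf("%s = excluded.%s", c, c)
		}
		return fmt.Sprintf("%s ON CONFLICT (%s) DO UPDATE SET %s",
			insert, strings.Join(conflictCols, ", "), strings.Join(sets, ", ")), nil
	case "mysql":
		// VALUES(col) is deprecated in newer MySQL releases but still accepted.
		for i, c := range updateCols {
			sets[i] = fmt.Sprintf("%s = VALUES(%s)", c, c)
		}
		return fmt.Sprintf("%s ON DUPLICATE KEY UPDATE %s",
			insert, strings.Join(sets, ", ")), nil
	default:
		return "", fmt.Errorf("upsert: unsupported dialect %q", dialect)
	}
}

func main() {
	for _, d := range []string{"postgres", "mysql"} {
		q, err := buildUpsert(d, "pets",
			[]string{"name", "owner_id"}, []string{"name"}, []string{"owner_id"})
		if err != nil {
			panic(err)
		}
		fmt.Println(q)
	}
}
```

The point of the sketch is only that the caller's input (columns, conflict target, columns to update) can stay the same while the generated SQL differs per dialect.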

Answer 3 · 2020-05-20T07:01:21.000Z

This looks rather promising.

Answer 4 · 2020-09-14T17:52:49.000Z

Any ETA on this? :)

Answer 5 · 2020-10-01T03:03:21.000Z

I want one.

Answer 6 · 2020-10-26T07:46:06.000Z

I need it.

Answer 7 · 2020-11-09T21:47:03.000Z

yes please

Answer 8 · 2020-12-09T20:14:34.000Z

Sorry to bug but is this being worked on? 🤦🏾‍♂️

Answer 9 · 2021-01-03T04:45:40.000Z

Has anyone made a start on this? I'd like to contribute!

Answer 10 · 2021-01-03T06:21:31.000Z

A few thoughts on this, at least from a Postgres standpoint: using ON CONFLICT requires either an annotation indicating that a field is unique (already supported in the schema) or a named constraint (also generated by using index.Fields(...).Unique()). Beyond that, we need to update the graph inserter with the necessary pieces to write the conflict statement. Assuming something like what @chris-rock suggested above, I've started to put this together:

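package schema

import (
	"entgo.io/ent"
	"entgo.io/ent/schema/field"
	"entgo.io/ent/schema/index"
)

// Pet holds the schema definition for the Pet entity.
type Pet struct {
	ent.Schema
}

// Fields of the Pet.
func (Pet) Fields() []ent.Field {
	return []ent.Field{
		field.String("name"),
	}
}

// Indexes of the Pet. The unique index on "name" is what the
// conflict clause would target.
func (Pet) Indexes() []ent.Index {
	return []ent.Index{
		index.Fields("name").Unique(),
	}
}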

// Usage:
pedro, err := client.Pet.
    Upsert().
    Create().SetName("pedro").
    Update().SetName("pedro-2").
    Save(ctx)

WIP #1121

Answer 11 · 2021-01-03T06:34:48.000Z

Has anyone made a start on this? I'd like to contribute!

I started an experiment for handling this; the only open question is whether we want to add a new hook/policy type for UpsertOne and Upsert (bulk).

Regarding the API, I used something as follows:

client.Pet.
    Create().
    // Setters...
    OnConflict(...).
    Update().
    // Setters...
    Save(ctx)

Answer 12 · 2021-01-03T06:43:23.000Z

The explicit OnConflict API is nice — would it take a list of columns or options of some sort?

IIRC, Postgres requires the conflict columns to be columns with a unique constraint and throws an error otherwise. Would this be handled by Indexes/Field annotations?

Answer 13 · 2021-03-06T20:28:37.000Z

Also interested in bulk upsert. Right now, inserting or updating multiple objects can only be done one by one, which is very slow on large sets.

Answer 14 · 2021-03-12T06:48:23.000Z

@DanielTitkov I've been slowly chipping away at this. It's quite complex when introducing Bulk changes, but I'm pretty close to having a PoC with Postgres to shake out how the API feels.

Answer 15 · 2021-06-15T19:22:37.000Z

Hey all, and thanks @ivanvanderbyl for working on #1325.

We really want to push this feature forward, but I have some concerns/questions regarding the API (as mentioned in our Slack channel). I'd appreciate any feedback/help/suggestions. Thanks 🙏

We prefer to start with UpsertOne and then add support for BulkUpsert (when we feel comfortable with the new API).
What about Hooks and Privacy, do we need to provide a new operation for it? If so, do we need to provide an API that indicates whether an entity was created or updated (post hook/privacy-rule)?
MySQL doesn't support the RETURNING clause, and SQLite added support for it 2-3 months ago. What should be the return value of the Upsert function? Do we need to provide a few alternatives like Exec/Save and fail at runtime if the driver doesn't support getting the rows back? Or maybe we can handle this using SELECT FOR UPDATE?

@sneakywombat proposed a new API for it here - #1325 (comment). I actually like it, but using Where(...) is not possible with @ivanvanderbyl's implementation (right?). What about supporting both APIs?

// #1
client.User.Create().
    SetEmail("boring@entgo.io").
    OnConflict(user.FieldEmail).
    Update().
    SetEmail("boring2@entgo.io").
    Save(ctx)

// #2 (Executes 2 statements against the database: one to query the records, and another to create or update them.)
client.User.Upsert().
    Where(user.Email("boring@entgo.io")).
    Create(... /* ent.UserMutation */).
    Update(... /* ent.UserMutation */).
    Save(ctx)

cc @rotemtam, @alexsn

Answer 16 · 2021-06-19T18:52:44.000Z

Hey @a8m ,
My $0.02 on the issue:

We prefer to start with UpsertOne

I agree, start with UpsertOne. From my (limited) perspective, this is the common use case.

What about Hooks and Privacy, do we need to provide a new operation for it?

In general, as discussed, there are two approaches to upsert-like behavior:

INSERT .. ON DUPLICATE KEY UPDATE
SELECT .. FOR UPDATE (SfU henceforth)
I think it will be much simpler to support upserts in ent using SfU, since it fits very nicely with the current hooks/privacy definitions. For a mutation to succeed, it must pass both privacy checks, so there is no need to add a new, somewhat weird "upsert" notion.
It has the additional advantage that, as you mentioned,

MySQL doesn't support the RETURNING clause

Using SfU simplifies this because we can then return the final value for any dialect.

The downside, of course, is that SfU requires 2 round-trips to the database and holds a lock on a resource, which in some use cases may cause substantial performance issues. I think those can be solved using something that was discussed in another issue (can't find it now), which would provide a generic API for modifying SQL queries in place in those edge cases where specific tweaking is required.

Answer 17 · 2021-07-05T07:29:56.000Z

Summarizing a discussion with @a8m, @ivanvanderbyl, and @arielitovsky from June 22nd

INSERT .. ON DUPLICATE does not map nicely to ent's current hooks/privacy infrastructure. To keep those in place, upsert APIs will be available out of the box for working with a single record and will be implemented with SELECT .. FOR UPDATE.
INSERT .. ON DUPLICATE is useful for bulk upserts, mostly in data-pipeline-style jobs, where it improves performance by reducing the number of calls to the database.

These will either be implemented by adding an extension/feature flag to generate bulk upsert methods or by adding a "query mutation" modifier to make low-level modifications to queries before they are executed. Either way, the API and documentation should make it clear to the user that this path bypasses hooks and privacy checks and should be used with caution.
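As a rough illustration of why the bulk path saves calls, the sketch below renders a single Postgres/SQLite-style statement covering n rows. bulkUpsertSQL is a hypothetical helper, not the API that was eventually generated, and it uses "?" placeholders for brevity.

```go
package main

import (
	"fmt"
	"strings"
)

// bulkUpsertSQL is a hypothetical helper that renders one statement for n
// rows, so a whole batch costs a single round trip to the database.
func bulkUpsertSQL(table string, cols []string, conflictCol string, n int) string {
	// One "(?, ?, ...)" tuple per row.
	tuple := "(" + strings.TrimSuffix(strings.Repeat("?, ", len(cols)), ", ") + ")"
	tuples := make([]string, n)
	for i := range tuples {
		tuples[i] = tuple
	}

	// On conflict, overwrite every column except the conflict target.
	sets := make([]string, 0, len(cols))
	for _, c := range cols {
		if c != conflictCol {
			sets = append(sets, c+" = excluded."+c)
		}
	}
	action := "DO NOTHING"
	if len(sets) > 0 {
		action = "DO UPDATE SET " + strings.Join(sets, ", ")
	}

	return fmt.Sprintf("INSERT INTO %s (%s) VALUES %s ON CONFLICT (%s) %s",
		table, strings.Join(cols, ", "), strings.Join(tuples, ", "), conflictCol, action)
}

func main() {
	fmt.Println(bulkUpsertSQL("users", []string{"email", "name"}, "email", 3))
}
```

Note that nothing in this statement gives per-row hooks or privacy rules a chance to run, which is the caveat raised above.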

Answer 18 · 2021-08-02T16:53:50.000Z

Hey, the upsert and upsert-bulk APIs were added in #1793. Please feel free to share your thoughts on the PR.

Answer 19 · 2021-11-07T19:20:56.000Z

Closing, as it's been used in production for more than 3 months and has worked without any issues. Thanks all for the feedback ❤️

Answer 20 · 2022-03-24T20:04:16.000Z

Can we unpin this issue? :D