Will there be a reset-to-snapshot feature someday?
I came across a problem: each time I start my Postgres test container, I need it to be cleaned up after each test, which is what Respawn specializes in. But what if I want to roll back to some snapshot of the database, so I can keep working with the same seeded data?
So, following up on #53 (comment), the question is: are there any plans to implement some kind of reset-to-snapshot feature?
How would that work?
I have an idea of using the memento pattern. A caretaker would store all DB snapshots, where a snapshot could be, for example, an instance of a database fixture.
What do you think?
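To make the idea a bit more concrete, here is a minimal, hypothetical sketch of what a memento-style caretaker around a database fixture might look like. None of these types exist in Respawn, and how a snapshot is actually captured and restored is deliberately left open.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch only; none of these types exist in Respawn.
// A "memento" is whatever restorable representation of the database you choose
// (a dump file, a file-level snapshot, recorded PK offsets, ...); the caretaker
// just stores mementos by name and hands them back when a reset is requested.
public interface IDatabaseMemento
{
    Task RestoreAsync(CancellationToken ct = default);
}

public sealed class DatabaseCaretaker
{
    private readonly Dictionary<string, IDatabaseMemento> _snapshots = new();

    public void Save(string name, IDatabaseMemento snapshot) => _snapshots[name] = snapshot;

    public Task ResetToAsync(string name, CancellationToken ct = default) =>
        _snapshots[name].RestoreAsync(ct);
}

// Usage inside a test fixture (illustrative names):
//   caretaker.Save("seeded", await fixture.CaptureSnapshotAsync());
//   ...run a test...
//   await caretaker.ResetToAsync("seeded");   // instead of wiping everything
```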
One thing to keep in mind is that Respawn was built with speed in mind because often I'll have 100s of integration tests. You'll need to carefully design how this snapshotting would work in practice, because if it's much slower than re-inserting data, I wouldn't try to use it. This design of Respawn was the end result of benchmarking many different strategies for wiping/recreating a database.
Sure, I'll keep you posted here on any progress.
I've implemented a simplistic kind of "snapshotting" for scenarios where we can assume that primary keys on all tables are monotonically increasing (it will not work as-is on, for example, GUID primary keys).
In this case we can record the initial PK offset (the maximum PK value at the snapshot point) for all included tables and extend the delete statements with a PK offset condition. My solution worked for composite keys as well. The performance impact should be OK, as the condition can most likely use indexes.
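For illustration, a rough sketch of the idea for the single-column case. This is not how Respawn builds its statements internally; the helper names are made up, and it assumes numeric, monotonically increasing keys and that deletes still run in an order that respects foreign keys.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Common;
using System.Threading.Tasks;

// Illustrative sketch: capture the max PK per table once the seed data is in place,
// then only delete rows whose PK is above that recorded offset.
public static class PkOffsetReset
{
    public static async Task<Dictionary<string, long>> CaptureOffsetsAsync(
        DbConnection connection, IEnumerable<(string Table, string PkColumn)> tables)
    {
        var offsets = new Dictionary<string, long>();
        foreach (var (table, pk) in tables)
        {
            await using var cmd = connection.CreateCommand();
            cmd.CommandText = $"SELECT COALESCE(MAX({pk}), 0) FROM {table}";
            offsets[table] = Convert.ToInt64(await cmd.ExecuteScalarAsync());
        }
        return offsets;
    }

    // The extra WHERE clause is what keeps the pre-seeded rows alive.
    public static string BuildDeleteSql(string table, string pkColumn, long offset) =>
        $"DELETE FROM {table} WHERE {pkColumn} > {offset}";
}
```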
@jbogard if you're open to the idea as an opt-in feature, I can try to put it together on a branch and we can see what we end up with.
Again, it's important to look at the overall performance impact. Some solutions have 100s or 1000s of integration tests and even a small percentage increase can have a huge impact. I would take a much closer look at improving the tests themselves before going bigger to this solution.
(I'm same as 'zoltantamasi-ps' above, sorry for the confusion, just using my personal account now)
Thank you for your points.
> I would take a much closer look at improving the tests themselves before going bigger to this solution.
In our case it wasn't needed because of poor test quality or anything like that, but because we have quite a lot of pre-seeded test data in the database (loaded by a separate console app built for that purpose, outside the tests' context). We needed a way to keep this pre-seeded data while still having a fast way to reset back to that point.
> Again, it's important to look at the overall performance impact. Some solutions have 100s or 1000s of integration tests and even a small percentage increase can have a huge impact.
Yes, I definitely agree, this is why my suggestion was to introduce it as an opt-in feature.
An option to reset to a snapshot version of data could be nice.
E.g. in my tests I spin up a fresh DB container and migrate the database using EF Core. This includes `EntityTypeBuilder<T>.HasData(...)` calls in the model-building context.
When using `Respawner.ResetDatabase()`, the database is reset, and the data that was seeded through those `EntityTypeBuilder<T>.HasData(...)` calls is now gone.
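For context, the scenario looks roughly like this (the entity and context names are made up for the example): the `HasData` rows are inserted by the migration, not by the tests, so a full reset wipes them along with everything else.

```csharp
using Microsoft.EntityFrameworkCore;

// Illustrative model only.
public class Country
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public class AppDbContext : DbContext
{
    public AppDbContext(DbContextOptions<AppDbContext> options) : base(options) { }

    public DbSet<Country> Countries => Set<Country>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Seed data baked into the model / migration, outside the tests' control.
        modelBuilder.Entity<Country>().HasData(
            new Country { Id = 1, Name = "Denmark" },
            new Country { Id = 2, Name = "Sweden" });
    }
}
```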
No one has come up with a viable solution to that request that's remotely close in speed to the current solution.
I'm in the same situation, needing to reset the database to a snapshot state between integration tests. I appreciate that Respawn isn't intended for this, and is optimised for speed by deleting all data.
I'm wondering what a good strategy to achieve this would be? In my case I want to do this in a DevOps pipeline against a Docker-containerised SQL Server database where the data is initially seeded using a SQL Server Integration Services package that pulls a data subset from a copy of a production database. Probably a SQL Server backup or snapshot approach, but that makes me think it could be very slow?!
@chrisrlewis why do you need data from a production DB? What scenarios can't you just set up from scratch? The only thing I have is pre-populated tables that are then ignored by Respawner, plus a bunch of utilities to quickly set up scenarios.
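If the pre-populated tables are known up front, something along these lines is one way to keep them out of the reset. The table names are placeholders, and you should double-check `TablesToIgnore` against the Respawn version you're using.

```csharp
using Respawn;
using Respawn.Graph;

// Sketch: exclude the pre-populated reference tables from the reset so their rows survive.
var connectionString = "...";   // your test database connection string

var respawner = await Respawner.CreateAsync(connectionString, new RespawnerOptions
{
    TablesToIgnore = new Table[] { "Countries", "Currencies" }   // placeholder names
});

await respawner.ResetAsync(connectionString);
```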
It's data from an anonymised clone of production, and we do it this way as the closest approximation to real-world testing - we have processes that depend on the state of existing data.
You're right, in an ideal world we would build up scenarios for everything from scratch. Resources and timescales though...
I'm experimenting with database snapshots, restored between each test - and so far that works and is surprisingly quick. How this scales is TBD.
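For anyone trying the same thing, a rough sketch of the snapshot/revert approach, wrapped in a small test helper. The database, logical file, and path names are placeholders, the snapshot directory must exist inside the container, and reverting needs exclusive access to the database.

```csharp
using System.Threading.Tasks;
using Microsoft.Data.SqlClient;

// Sketch only; names, paths, and credentials are placeholders.
const string master =
    "Server=localhost;Database=master;User Id=sa;Password=<password>;TrustServerCertificate=true";

static async Task ExecAsync(string sql)
{
    await using var conn = new SqlConnection(master);
    await conn.OpenAsync();
    await using var cmd = new SqlCommand(sql, conn);
    await cmd.ExecuteNonQueryAsync();
}

// Once, after the SSIS seed has run: create a database snapshot of the seeded state.
await ExecAsync(@"CREATE DATABASE AppDb_Seeded ON
    (NAME = AppDb, FILENAME = '/var/opt/mssql/snapshots/AppDb_Seeded.ss')
    AS SNAPSHOT OF AppDb;");

// Between tests: drop other connections, revert to the snapshot, reopen.
await ExecAsync("ALTER DATABASE AppDb SET SINGLE_USER WITH ROLLBACK IMMEDIATE;");
await ExecAsync("RESTORE DATABASE AppDb FROM DATABASE_SNAPSHOT = 'AppDb_Seeded';");
await ExecAsync("ALTER DATABASE AppDb SET MULTI_USER;");
```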