elementor/wp2static-addon-s3

Remove deleted pages from S3

john-shaffer opened this issue · 5 comments

Currently, deleted pages remain on S3 unless the user manually removes them. We should detect 404'd pages and delete them from the bucket.

Reported at https://staticword.press/t/deleted-pages-dont-disappear-from-s3/477

I didn't use to care much about this, and I still don't like the idea of the plugin trying to decide what needs to be deleted. That means maintaining state, which we can't do reliably unless we created the S3 bucket ourselves and no other process has ever modified it.

To cover clients who require this behaviour, I would rather offer an option to "Empty bucket before deployment".

As they'll have CloudFront in front of the bucket, this shouldn't cause site issues, thanks to caching. The CloudFront invalidation request comes at the end of the deployment, at which point the site is in its expected state.

If that's not enough for a user's requirements, I'd suggest they are enough of a power/enterprise user that they could either:

  • set up a blue/green deployment script that publishes to a new bucket, then updates the CloudFront distribution

or

  • employ the WP2Static crew to set up an Enterprise solution for them.
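For the blue/green route, the switch-over step could look roughly like the sketch below. It only covers repointing one origin inside a CloudFront `DistributionConfig` dict; the function name, origin ID, and bucket domains are made up for illustration. In a real script the config would presumably be fetched with boto3's `get_distribution_config` and sent back with `update_distribution`, passing the returned `ETag` as `IfMatch`.

```python
import copy


def point_origin_at(distribution_config, origin_id, new_domain):
    """Return a copy of a DistributionConfig dict with one origin repointed.

    Hypothetical helper: the input is left untouched so the old ("blue")
    config stays available for a rollback.
    """
    updated = copy.deepcopy(distribution_config)
    for origin in updated["Origins"]["Items"]:
        if origin["Id"] == origin_id:
            origin["DomainName"] = new_domain
            return updated
    raise KeyError(f"origin {origin_id!r} not found in distribution config")


# Trimmed-down example config; a real one has many more fields.
blue_config = {
    "Origins": {
        "Quantity": 1,
        "Items": [
            # Assumed origin ID and bucket domain, for illustration only.
            {"Id": "static-site", "DomainName": "site-blue.s3.amazonaws.com"},
        ],
    },
}

green_config = point_origin_at(
    blue_config, "static-site", "site-green.s3.amazonaws.com"
)
```

Because the new bucket is fully populated before the distribution is updated, visitors never hit a half-empty origin, and deleted pages simply don't exist in the new bucket.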

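Emptying the bucket won't work because CloudFront's caching is "lazy": an edge location only caches an object after it has been requested there. Most of the site will not be cached at most edge locations, so those requests will go to the empty bucket and result in 404s.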

I think it's okay for WP2Static to maintain the state of the objects that it created. To me this is basic functionality that really should be there, but I don't think there is currently a way to track 404s well enough to do it properly. That may change with the advanced-crawling features, and then this will be pretty simple to implement.

Blue/green would be ideal though.

Oh, bummer about the CF caching.

OK, sounds totally reasonable, thanks @john-shaffer!

Could there be an option:

  • delete files we know should be deleted
  • delete any files not from our deployment

Or am I over-complicating things further?

Yeah, we could do that.
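A rough sketch of how the two behaviours could differ, assuming the add-on keeps a manifest of the keys it uploaded on earlier deployments. The function name, the mode names `tracked` and `mirror`, and the example keys are all hypothetical, not existing WP2Static options.

```python
def plan_deletions(mode, bucket_keys, current_keys, previous_manifest=()):
    """Return the sorted list of S3 keys a deployment would delete.

    mode "tracked": only keys we uploaded before (previous_manifest) that
                    the current deployment no longer produces.
    mode "mirror":  every key in the bucket that is not part of the
                    current deployment, whoever put it there.
    """
    bucket = set(bucket_keys)
    current = set(current_keys)
    if mode == "tracked":
        candidates = set(previous_manifest) & bucket
    elif mode == "mirror":
        candidates = bucket
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return sorted(candidates - current)


# Example data, invented for illustration.
bucket = ["index.html", "promo/summer/index.html", "uploads/manual.pdf"]
previous = ["index.html", "promo/summer/index.html"]
current = ["index.html"]

print(plan_deletions("tracked", bucket, current, previous))
# ['promo/summer/index.html']
print(plan_deletions("mirror", bucket, current))
# ['promo/summer/index.html', 'uploads/manual.pdf']
```

The `tracked` mode leaves files added by other processes alone, which addresses the concern about not owning the bucket; `mirror` is the destructive variant and would probably need to be opt-in.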

Hi, just checking whether there has been any progress on this feature request.

I agree with @john-shaffer that this is basic functionality. Let me share a few examples where it's essential for us:

We have a few brands in the tourism industry. There are several situations where we need to remove a page. The most frequent ones are:

  • When a promo is no longer active: we run promotions at certain times of the year, which are also referenced from social media or partner websites. We need to be able to remove those promo pages when they expire.
  • When a tour/product is on hold or permanently removed
  • On the Jobs section, when a position is no longer available

Our current workaround is to create a redirect to the homepage, but in certain situations (usually for security or privacy reasons) we need the websites to return a 404 error. In those cases, the content managers have to contact our hosting provider (in a different timezone) and coordinate a time when they can empty the S3 bucket while we export a new version of the site.

I hope these comments give more context and ideas as to why some of us consider this an important feature.