support for not using trailing slashes
esped-dfds opened this issue · 8 comments
We have a website today where we don't use trailing slashes. We implemented this in a CloudFront Lambda that appends "/index.html" to every request without a file extension.
I would like to use this plugin to fix caching, clean up unused files, and serve 301 redirects.
The plugin seems promising. I'm using the generateRedirectObjectsForPermanentRedirects option since we have many redirects, but I can't find a way to make this work without going back to trailing slashes.
One idea I had was to generate the redirect object as {redirect-name}/index.html. Would you consider accepting such an option if I send a PR?
Hi,
What's your use case for removing trailing slashes? Trailing slashes are standard on S3 Static Website Hosting, is it really worth fighting it?
Since you're using Lambda@Edge, my suggestion would be to use an Origin Response Lambda to remove trailing slashes from the Location header.
If this solution doesn't work for you, let us know. We can consider adding an option to do what you're asking for, but given that the requirement seems quite niche I'm not sure it is worth us supporting.
One idea I had was to generate the redirect object as {redirect-name}/index.html
Wouldn't it be better to just generate the redirect object without the trailing slash? If you add index.html to it then that will appear to the end user, which I think is a bit ugly.
Thanks!
Hi,
What's your use case for removing trailing slashes? Trailing slashes are standard on S3 Static Website Hosting, is it really worth fighting it?
Good question. It is something we want in my company, primarily because the URLs look nicer.
Since you're using Lambda@Edge, my suggestion would be to use an Origin Response Lambda to remove trailing slashes from the Location header.
I don't think that would work. I think it would introduce an infinite loop.
If this solution doesn't work for you, let us know. We can consider adding an option to do what you're asking for, but given that the requirement seems quite niche I'm not sure it is worth us supporting.
I think it would be really great. Googling trailing slashes yields a lot of results, so it seems like something people care about.
One idea I had was to generate the redirect object as {redirect-name}/index.html
Wouldn't it be better to just generate the redirect object without the trailing slash? If you add index.html to it then that will appear to the end user, which I think is a bit ugly.
This is how the plugin works today, as I understand it. The user would not see this, as it only happens between CloudFront and S3.
Thanks for your response. I think I misunderstood your question.
First, let's make sure we're using the same terminology and clarify the difference between rewrites and redirects:
- Rewrites are achieved by modifying the uri in a Lambda@Edge Request Lambda (either Viewer Request or Origin Request, preferably Origin Request because then it can be cached).
- Redirects are achieved by specifying a 301/302/307/308 Status Code and a Location header. This can be done with S3 Static Website Hosting Redirection Rules, the S3 Static Website Hosting x-amz-website-redirect-location metadata (301 redirects only), or through Lambda@Edge.
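To make the distinction concrete, here is a minimal sketch of both in Lambda@Edge terms. The function names `rewrite` and `redirect` and the example paths are illustrative assumptions, not part of the plugin; only the request and response shapes follow the CloudFront event structure.

```javascript
// Rewrite: change which object is fetched; the viewer's URL stays the same.
// `request` is the CloudFront request object from a request trigger.
function rewrite(request, newUri) {
  request.uri = newUri;
  return request; // returning the request lets CloudFront continue to the origin
}

// Standard reason phrases for the redirect status codes mentioned above.
const DESCRIPTIONS = {
  '301': 'Moved Permanently',
  '302': 'Found',
  '307': 'Temporary Redirect',
  '308': 'Permanent Redirect',
};

// Redirect: answer with a 3xx status and a Location header; the viewer's URL changes.
function redirect(location, status = '301') {
  return {
    status, // CloudFront expects the status code as a string
    statusDescription: DESCRIPTIONS[status],
    headers: {
      location: [{ key: 'Location', value: location }],
    },
  };
}
```

A request trigger would return either the modified request (rewrite) or the generated response (redirect), never both.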
One idea I had was to generate the redirect object as {redirect-name}/index.html
I interpreted this as meaning you wanted to set the redirect destination as {redirect-destination}/index.html. I now think you mean you want to create the redirect object with the key {redirect-source}/index.html. Is this correct?
We already do this, but only if the fromPath passed to createRedirect has a trailing slash. This is likely the cause of your issue. You're right that this can't be resolved with just Lambda@Edge, at least not in any performant way.
If this is the case, I'd be happy to review a PR that adds an option to add /index.html to object key when dealing with any extensionless fromPath value.
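As a rough illustration of the behaviour described above, the mapping from fromPath to an S3 object key might look like the sketch below. This is not the plugin's actual code: `redirectObjectKey` and the `alwaysIndexHtml` option name are hypothetical, chosen only to show the current rule and the proposed one side by side.

```javascript
// Hypothetical helper, NOT taken from the plugin's source.
function redirectObjectKey(fromPath, { alwaysIndexHtml = false } = {}) {
  const key = fromPath.replace(/^\/+/, ''); // S3 keys carry no leading slash

  // Behaviour described in the thread: a trailing slash yields {source}/index.html.
  if (fromPath.endsWith('/')) {
    return key + 'index.html';
  }

  // Proposed option: treat any extensionless path the same way.
  const hasExtension = /\.[a-z0-9]+$/i.test(key);
  if (alwaysIndexHtml && !hasExtension) {
    return key + '/index.html';
  }
  return key;
}
```

With the option off, `/old` maps to the key `old`; with it on, `/old` and `/old/` both map to `old/index.html`.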
I interpreted this as meaning you wanted to set the redirect destination as {redirect-destination}/index.html. I now think you mean you want to create the redirect object with the key {redirect-source}/index.html. Is this correct?
Exactly
We already do this, but only if the fromPath passed to createRedirect has a trailing slash. This is likely the cause of your issue. You're right that this can't be resolved with just Lambda@Edge, at least not in any performant way.
Great - initial tests seem to solve my issue. I'll test it a bit more
If this is the case, I'd be happy to review a PR that adds an option to add /index.html to object key when dealing with any extensionless fromPath value.
Yes, it would be nice to have the option to always generate with /index.html to make things a bit more explicit. For now, adding the slash should be sufficient. Thanks for helping.
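For anyone landing here later, the workaround of adding the slash to fromPath could be sketched in gatsby-node.js roughly as follows. The `redirects` list and the `withTrailingSlash` helper are made up for illustration; `createRedirect` is Gatsby's own action.

```javascript
// gatsby-node.js (sketch)
const redirects = [
  { from: '/old-page', to: '/new-page' }, // hypothetical example data
];

// Ensure the source path ends with a slash so the redirect object is
// written under {redirect-source}/index.html.
function withTrailingSlash(path) {
  return path.endsWith('/') ? path : path + '/';
}

const createPages = ({ actions }) => {
  const { createRedirect } = actions;
  for (const { from, to } of redirects) {
    createRedirect({
      fromPath: withTrailingSlash(from),
      toPath: to,
      isPermanent: true, // 301
    });
  }
};

exports.createPages = createPages;
```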
Hi guys,
I am not sure this issue is all that niche; I would also like a feature to have no trailing slash at the end of my URLs.
My current use case is that I made a new version of a website using Gatsby. On the old website, all URLs have no trailing slash at the end, so for SEO purposes I would like to have the same behaviour.
This option will behave like this:
- create pagename with the correct metadata: Content-Type: text/html
- create pagename/index.html with the redirect metadata toward pagename
I think for SEO this should be a must-have, and I would go even further with the possibility to choose which paths get a trailing slash and which do not (via a list, or an option inside the page).
For example, blog/ needs the trailing slash because it is technically a folder, but blog/myarticle shouldn't have one because it is a file.
What do you think?
There's a lot of misinformation about SEO. Low quality blogs regurgitate information, but often misunderstand it or deliberately change it to be sensationalist/clickbait. It's best to go straight to the source. On the topic of trailing slashes, the source is here and here. (Notice on that tweet that John Mueller specifically calls out some SEO "experts" for an article that is "misleading and wrong" and "unnecessarily scaremongering")
The gist is, it doesn't matter whether you use a trailing slash or not. Google treats URLs with/without trailing slashes separately, but equally. You should be consistent about using/not-using a trailing slash in order to avoid being penalised for duplicate content, but it doesn't matter which one you choose.
The folder/file distinction refers to pagename/ (folder) or pagename/index.html (file). Considering pagename as an extensionless file is also valid, but there's no reason to consider listing pages as folders and content pages as files. On many websites the line between listing and content is blurred. It's more intuitive for viewers if you either consistently have a trailing slash, or consistently don't have one. From an SEO perspective it doesn't matter at all, because Google treats pages with/without a trailing slash "separately but equally". In other words, Google doesn't care if something is a folder or a file, it just treats everything as a page.
My current use case is that I made a new version of a website using Gatsby. On the old website, all URLs have no trailing slash at the end, so for SEO purposes I would like to have the same behaviour.
There is a valid point here. If you have historically not had a trailing slash, your page's SEO value is stored on the path without a slash. If you 301 redirect from the non-slash variant to the slash-variant then the value will be transferred, but S3 Static Website Hosting for some reason adds trailing slashes via a 302 redirect instead. This could result in a short-term loss in search rank after you change to using trailing slashes. I would suggest using a Lambda@Edge Origin Response function to detect redirects that add a trailing slash and change them to a 301 redirect.
If you really want to avoid trailing slashes, I think the best option is to use Lambda@Edge in the same way esped-dfds is. This plugin is designed to work with S3 Static Website Hosting, and S3 Static Website Hosting uses trailing slashes. If you don't want trailing slashes, submit a feature request to Amazon, or use Lambda@Edge to implement the behaviour yourself, but I don't see fighting this behaviour as being in scope for this plugin.
- create pagename with the correct metadata: Content-Type: text/html
- create pagename/index.html with the redirect metadata toward pagename
I don't think it's possible to create those objects in S3 because pagename would be a folder. How can that have content type text/html?
It's better to sidestep the issue completely by handling it in Lambda@Edge on your CloudFront distribution. Put this in an Origin Request Lambda (a Viewer Request Lambda may also work) so that it explicitly asks for files (and not folders) when requesting from S3:
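// Rewrite extensionless URIs so the origin is asked for the index.html object.
if (!request.uri.match(/\.[a-z0-9]+$/i)) {
  // The replace collapses the "//" produced when the URI already ends in "/".
  request.uri = (request.uri + '/index.html').replace('//', '/');
}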
I don't think it's possible to create those objects in S3 because pagename would be a folder. How can that have content type text/html?
I just did a quick test, and it is possible. S3 doesn't really have a concept of folders and files, only of objects and prefixes. You can have an object with the key pagename and an object with the key pagename/index.html, it doesn't stop you. The S3 console will happily show you a "folder" and a "file" with the exact same name, side by side.
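To illustrate, the two objects proposed earlier could be described as PutObject-style parameters like this. It is only a sketch: `noSlashObjects` and the bucket name are placeholders, and the parameter objects are built but not uploaded. In practice each one could be passed to `PutObjectCommand` from `@aws-sdk/client-s3`.

```javascript
// Builds parameters for the two coexisting objects: "pagename" and "pagename/index.html".
function noSlashObjects(bucket, pageName, html) {
  return [
    // The page itself, stored under an extensionless key.
    { Bucket: bucket, Key: pageName, Body: html, ContentType: 'text/html' },
    // An empty object that redirects the slash variant back to the page.
    {
      Bucket: bucket,
      Key: pageName + '/index.html',
      Body: '',
      // Stored by S3 as the x-amz-website-redirect-location metadata.
      WebsiteRedirectLocation: '/' + pageName,
    },
  ];
}

const [page, slashRedirect] = noSlashObjects('example-bucket', 'pagename', '<h1>Hi</h1>');
console.log(page.Key, slashRedirect.Key);
```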
But I agree that Lambda@Edge is the better approach for this. For one thing, Lambda@Edge allows for a more generalised solution which works on non-Gatsby sites as well, or Gatsby sites that don't use this plugin for deployment. (E.g. Gatsby Cloud)