jekyll/jekyll-seo-tag

Strange /index postfix after URLs

sasadangelo opened this issue · 6 comments

Hi,
This is my website: https://www.code4projects.net/
I use Jekyll SEO Tag and noticed that the canonical URLs for category and blog pages have an additional /index at the end, for example on these pages:
https://www.code4projects.net/category/cloud/
https://www.code4projects.net/blog/

I migrated my blog from WordPress, where I used permalinks with trailing slashes.

The source code of my blog is here:
https://github.com/sasadangelo/code4projects

The templates for the blog and categories are here:
https://raw.githubusercontent.com/sasadangelo/code4projects/main/blog/index.md
https://raw.githubusercontent.com/sasadangelo/code4projects/main/category/cloud.md

In both cases the permalink should be:

<website>/blog/
<website>/category/cloud/

but in the canonical link Jekyll SEO Tag adds an additional /index, and in the next/prev links it adds /index.html.
This causes a lot of problems in Google Search Console.
What can I do to fix these issues?

I suggest trying permalink: pretty in the config file.

According to the docs:
https://jekyllrb.com/docs/permalinks/#builtinpermalinkstyles

permalink: pretty

means:

permalink: /:categories/:year/:month/:day/:title/

I migrated a WordPress blog that used the permalink structure mentioned above (no year, no month, no day). I don't want to change the permalink structure, otherwise my pages indexed on Google will break. Moreover, GitHub Pages doesn't allow 301 redirects, so my only option is to maintain my old link structure.

Going back to my question: why does Jekyll SEO Tag add this extra /index in the blog canonical and link next/prev, and why does it add an extra /index.html for categories? I specified this permalink structure for the blog and for each category:

permalink: /blog/
permalink: /category/media/

Why doesn't Jekyll SEO Tag respect the declared permalink?
Can you elaborate a bit on your answer?

This is how the plugin implements canonical_url:

      def canonical_url
        @canonical_url ||= begin
          if page["canonical_url"].to_s.empty?
            filters.absolute_url(page["url"]).to_s.gsub(%r!/index\.html$!, "/")
          else
            page["canonical_url"]
          end
        end
      end

Your site has a custom implementation of the same method via a local plugin:

        def canonical_url
          @canonical_url ||= begin
            if page["canonical_url"].to_s.empty?
              filters.absolute_url(page["url"]).to_s.gsub(%r!\.html$!, "")
            else
              page["canonical_url"]
            end
          end
        end

I don't see any explanation behind your custom implementation in the commit message. So, I would try deleting the custom code and see if that fixes the issue.
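
The difference between the two regular expressions can be seen in isolation. The sketch below is only an illustration: it assumes the page URL ends in /index.html, and the sample URL is hypothetical input rather than output captured from the site.

```ruby
# Hypothetical input: a URL of the shape Jekyll might report for blog/index.md.
url = "https://www.code4projects.net/blog/index.html"

# Upstream jekyll-seo-tag rule: collapse a trailing "/index.html" into "/".
upstream = url.gsub(%r!/index\.html$!, "/")

# Local plugin rule: only drop the ".html" extension, which leaves "/index".
custom = url.gsub(%r!\.html$!, "")

puts upstream # https://www.code4projects.net/blog/
puts custom   # https://www.code4projects.net/blog/index
```

Under that assumption, the local override would account for the stray /index in the canonical URL.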

Custom implementation? Oh, you're right. Now I remember.
Initially, I wasn't aware that the canonical URL was managed by Jekyll SEO Tag, so I added the canonical link in my template. When I published my website and checked its SEO with an online tool (PageSpeed, if I remember correctly), I realized I had two canonical URLs: one implemented by me and one by Jekyll SEO Tag. So I removed mine, but I completely forgot about the custom implementation I had copied from somewhere (probably Stack Overflow).
I have now fixed the problem following your suggestion and verified all the links on my website with Screaming Frog, and everything seems fine.

The only problem I see (which Screaming Frog doesn't detect) is these link elements in the blog and category pages:

<link rel="prev" href="https://www.code4projects.net/blog/index.html" />
<link rel="next" href="https://www.code4projects.net/blog/page/3/index.html" />

They use a format different from the canonical URL, with index.html added at the end. Is this line missing for these items in your code?

filters.absolute_url(page["url"]).to_s.gsub(%r!/index\.html$!, "/")

On this page you can see the problem:
https://www.code4projects.net/blog/page/2/
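
To illustrate the mismatch, here is a minimal sketch of what the two href values would look like if the same substitution were applied to them. The helper name strip_index_html is hypothetical and is not part of jekyll-seo-tag or any pagination plugin; it only reuses the regular expression quoted above.

```ruby
# Hypothetical helper: reuses the canonical_url substitution on any URL string.
def strip_index_html(url)
  url.to_s.gsub(%r!/index\.html$!, "/")
end

# The two href values currently emitted on the paginated blog page.
prev_href = strip_index_html("https://www.code4projects.net/blog/index.html")
next_href = strip_index_html("https://www.code4projects.net/blog/page/3/index.html")

puts prev_href # https://www.code4projects.net/blog/
puts next_href # https://www.code4projects.net/blog/page/3/
```

With that substitution the prev/next links would use the same trailing-slash format as the canonical URL.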

Yes, I see the issue on the pagination pages. The reason for that is:

{% if paginator.previous_page %}
<link rel="prev" href="{{ paginator.previous_page_path | absolute_url }}" />
{% endif %}
{% if paginator.next_page %}
<link rel="next" href="{{ paginator.next_page_path | absolute_url }}" />
{% endif %}

Complaints about the above code should go in a new issue ticket. The current ticket seems to be resolved at this point.

OK, thanks. I am going to open one. Thank you for your help.