flyerhzm/seo_checker

Naive detection of sitemap.xml

grimen opened this issue · 3 comments

seo_checker looks for sitemap.xml without respecting the Sitemap value in robots.txt. See the Google docs on this.

I do take the Sitemap directive in robots.txt into account. You can check lines 44-51 in lib/seo_checker.rb:


  def check_robot
    uri = URI.parse(@url)
    uri.path = '/robots.txt'
    response = get_response(uri)
    if response.is_a? Net::HTTPSuccess and response.body =~ /Sitemap:\s*(.*)/
      check_sitemap($1)
    end
  end

Could you give me some more information?

The thing is that it complained that my site didn't have a "sitemap.xml", but it actually does. I use the enterprise standard (which supports many sitemaps): sitemap-index.xml.gz (gzipped). It works in the browser too.
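
For reference, a gzipped sitemap index has to be decompressed before its entries can be read. This thread does not show whether seo_checker does that, so the following is only a minimal sketch of how a checker could handle it. The helpers `sitemap_xml` and `sitemap_locs` are hypothetical names, not part of the gem, and the regex-based `<loc>` extraction is a simplification rather than a full XML parse.

```ruby
require "zlib"

# First two bytes of every gzip stream.
GZIP_MAGIC = [0x1F, 0x8B].pack("C*")

# Hypothetical helper, not part of seo_checker: return the XML text of a
# sitemap response body, gunzipping it when it starts with the gzip magic bytes.
def sitemap_xml(body)
  bytes = body.b
  return body unless bytes.start_with?(GZIP_MAGIC)

  Zlib.gunzip(bytes).force_encoding(Encoding::UTF_8)
end

# Hypothetical helper: pull the <loc> values out of a sitemap or sitemap index.
# A regex scan is a simplification; a real checker would likely use an XML parser.
def sitemap_locs(xml)
  xml.scan(%r{<loc>\s*(.*?)\s*</loc>}m).flatten
end

# Example sitemap index with made-up URLs, compressed in memory for the demo.
index = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <sitemap><loc>http://test.com/sitemap-1.xml.gz</loc></sitemap>
    <sitemap><loc>http://test.com/sitemap-2.xml.gz</loc></sitemap>
  </sitemapindex>
XML

p sitemap_locs(sitemap_xml(Zlib.gzip(index)))
```

Checking the magic bytes instead of the `.gz` file extension means the same code path works whether or not the server has already decompressed the body.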

I tested your issue but I can't reproduce the problem. Let's confirm two things.

  1. Are you using the latest seo_checker gem? The newest version is 0.2.5.
  2. Have you defined "Sitemap" in robots.txt? Here is an example:
    
    Sitemap: http://test.com/sitemap-index.xml.gz
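
To see what a robots.txt file declares, the Sitemap lines can be extracted on their own. The `check_robot` method quoted above only uses the first match, and its `(.*)` capture keeps any trailing carriage return from a CRLF file. The sketch below is a hypothetical variant, not the gem's code: `sitemap_urls` is an illustrative name, and it assumes one URL per Sitemap line with no embedded whitespace.

```ruby
# Hypothetical helper, not part of seo_checker: collect every Sitemap URL
# declared in a robots.txt body.
def sitemap_urls(robots_body)
  robots_body.each_line.map do |line|
    # Match the directive name case-insensitively and capture the URL up to
    # the first whitespace, so a trailing "\r" or "\n" is not included.
    line[/\A\s*sitemap:\s*(\S+)/i, 1]
  end.compact
end

# Example robots.txt content; the second Sitemap URL is made up.
robots = <<~ROBOTS
  User-agent: *
  Disallow: /admin
  Sitemap: http://test.com/sitemap-index.xml.gz
  sitemap: http://test.com/sitemap-news.xml
ROBOTS

p sitemap_urls(robots)
```

If this returns an empty array for your robots.txt, the checker has no Sitemap directive to follow.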