scala/scala-lang

Google results for https://www.scala-lang.org/ are marked with a "13 Feb 2015" timestamp

Closed this issue · 11 comments

[Image: Screenshot from 2025-01-10 11-40-39]

Usually only "blog posts" or "threads" get a timestamp in the results, but it seems the Scala landing page was wrongly indexed as one.

This could put off users who see it in the search results for the first time, giving the false impression that the Scala language or website has not been maintained for a while.

One potential explanation could be that the landing page used to be a blog and the timestamp of "the newest article" was never removed by Google afterwards.

PS: someone noticed that "Feb 13, 2015" could be a wild misreading of the newest Scala 2 version, "2.13.15".

I don't see a sitemap in the rsync output (https://scala-webapps.epfl.ch/jenkins/view/All/job/production_scala-lang.org-builder/8287/console). I wonder if the sitemap isn't being generated for some reason.

@fsalvi is this an area where you have any insight?

Well, I didn't see anything wrong server-side that could lead to this result.
The date is properly set (Last-Modified: Mon, 13 Jan 2025 13:45:09 GMT).
I would suggest trying the solution proposed by Google:
https://developers.google.com/search/docs/appearance/publication-dates?hl=en
by adding it to the headers (_includes/headertop.html).
We would quickly see whether it makes a difference.
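For illustration, here is a minimal sketch of what the publication-dates approach could look like: a JSON-LD snippet carrying `dateModified`, which could be pasted into the page header. This is only an assumption about how it might be done. The helper name `json_ld_date_tag` is hypothetical, the actual contents of _includes/headertop.html are not shown in this thread, and the date is simply the Last-Modified value quoted above.

```python
import json
from datetime import datetime, timezone


def json_ld_date_tag(url: str, modified: datetime) -> str:
    """Build a JSON-LD <script> tag that states the page's modification date.

    Hypothetical helper for illustration; not taken from the scala-lang repo.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "url": url,
        # ISO 8601 timestamp with an explicit UTC offset
        "dateModified": modified.isoformat(),
    }
    return '<script type="application/ld+json">' + json.dumps(data) + "</script>"


# The Last-Modified value reported by the server in this thread.
tag = json_ld_date_tag(
    "https://www.scala-lang.org/",
    datetime(2025, 1, 13, 13, 45, 9, tzinfo=timezone.utc),
)
print(tag)
```

Whether Google would then prefer this date over whatever it guesses from the page text is something only a trial would show.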

Could we use Google's site to request a refresh?

OK, there's no doubt it's because of the Scala version shown on the page.
Now the Google result shows 13 Feb 2016!
Maybe it's the use of AI in the Google search engine... :-D
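To make the suspected misreading concrete, here is a small sketch that reads a version string as month.day.two-digit-year. This is purely a guess at the kind of heuristic that would produce the observed results, not a description of what Google actually does; the function name `misread_version_as_date` is made up for illustration.

```python
from datetime import date
from typing import Optional


def misread_version_as_date(version: str) -> Optional[date]:
    """Read 'major.minor.patch' as 'month.day.two-digit-year'.

    Returns None when the string does not have three numeric parts
    or when the parts do not form a valid calendar date.
    """
    parts = version.split(".")
    if len(parts) != 3:
        return None
    try:
        month, day, year = (int(p) for p in parts)
        return date(2000 + year, month, day)
    except ValueError:
        return None


# The two Scala 2 versions mentioned in this thread.
for v in ("2.13.15", "2.13.16"):
    print(v, "->", misread_version_as_date(v))
```

Under this assumption, "2.13.15" becomes 13 Feb 2015 and "2.13.16" becomes 13 Feb 2016, which matches the two timestamps seen in the search results.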

The HTML source code only contains: "Scala 2.13.16 and older releases"
We could try using "Scala version 2.13.16 and older releases" instead, to see whether the Google engine can tell the difference between a date and a software version.

@fsalvi is there no way to make this deterministic? Do we have no choice but to try to outwit the weird guessing they're doing?

As suggested, we could either change the text a bit (e.g. add "Version 2...") or try the publication-dates approach:
https://developers.google.com/search/docs/appearance/publication-dates?hl=en

I guess the easiest would be to just add "Version 2..." and see if it changes anything.

FYI - sbt's website generates a sitemap during the website build process (e.g. sbt/website#412), which records the last-modified date in sitemap_index.xml like this:

<?xml version='1.0' encoding='UTF-8'?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <sitemap>
        <loc>https://www.scala-sbt.org/sitemap.xml.gz</loc>
        <lastmod>2025-02-03</lastmod>
    </sitemap>
    <sitemap>
        <loc>https://www.scala-sbt.org/1.x/sitemap.xml.gz</loc>
        <lastmod>2025-02-03</lastmod>
    </sitemap>
</sitemapindex>

which in turn points to more sitemaps.
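As a rough sketch of how such an index could be produced at build time, the following uses only Python's standard library. It is not the code sbt's website uses (that lives in sbt/website and is not shown here); the function name `build_sitemap_index` is hypothetical, and the URLs and date are copied from the example above.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls, last_modified: date) -> str:
    """Return a sitemap index document listing each sitemap with a <lastmod>."""
    # The namespace is written as a plain xmlns attribute on the root element.
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = url
        # <lastmod> as a plain YYYY-MM-DD date
        ET.SubElement(entry, "lastmod").text = last_modified.isoformat()
    ET.indent(root, space="    ")  # requires Python 3.9+
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body + "\n"


index_xml = build_sitemap_index(
    [
        "https://www.scala-sbt.org/sitemap.xml.gz",
        "https://www.scala-sbt.org/1.x/sitemap.xml.gz",
    ],
    date(2025, 2, 3),
)
print(index_xml)
```

Passing the build date as `last_modified` means every rebuild refreshes `<lastmod>`, which gives crawlers an explicit freshness signal instead of leaving them to infer one from the page text.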

As an easy first step, let's try not including the Scala 2 version number on the front page: #1762

It worked!!