Account for redirects when selecting secondary page URL
rviscomi opened this issue · 2 comments
rviscomi commented
The secondary page for the Web Almanac site is the home page itself:
SELECT
is_root_page,
page
FROM
`httparchive.all.pages`
WHERE
date = '2022-08-01'
AND client = 'desktop'
AND root_page = 'https://almanac.httparchive.org/'

The root page redirects to the 2021 edition, and the largest anchor seems to be the logo, which points back to the 2021 home page, so we seem to have a duplicate page with pre- and post-redirect URLs.
pmeenan commented
It currently checks only against the URL that the test originally navigated to. It should be trivial to modify the custom metric to also exclude candidates that match the current location. I think the change is all within the custom metric, since it is about the final URL, not the initial URL.
One thing we won't be able to detect, though, is other URLs that redirect back to the same URL (e.g. if /en/2021/ redirected to /).
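
A rough sketch of what that change could look like, not the actual custom metric code. The names `normalize`, `pickSecondaryPage`, and `Candidate` are hypothetical, and it assumes the metric already has a list of anchor hrefs with their rendered areas, the URL the test originally navigated to, and the post-redirect URL (for instance `location.href` in the page):

```typescript
// Hypothetical sketch: none of these names come from the real custom metric.

// An anchor found on the page, with its rendered area in square pixels.
interface Candidate {
  href: string;
  area: number;
}

// Resolve a possibly relative URL against a base and drop the fragment,
// so that "/en/2021/#top" and "/en/2021/" compare as equal.
function normalize(url: string, base: string): string | null {
  try {
    const parsed = new URL(url, base);
    parsed.hash = "";
    return parsed.href;
  } catch {
    return null; // unparseable href, ignore it
  }
}

// Pick the largest anchor whose target is neither the initial URL
// nor the final (post-redirect) URL.
function pickSecondaryPage(
  candidates: Candidate[],
  initialUrl: string,
  finalUrl: string
): string | null {
  const excluded = new Set<string>();
  for (const url of [initialUrl, finalUrl]) {
    const normalized = normalize(url, finalUrl);
    if (normalized !== null) excluded.add(normalized);
  }

  let bestHref: string | null = null;
  let bestArea = -1;
  for (const candidate of candidates) {
    const href = normalize(candidate.href, finalUrl);
    if (href === null || excluded.has(href)) continue;
    if (candidate.area > bestArea) {
      bestArea = candidate.area;
      bestHref = href;
    }
  }
  return bestHref;
}

// Example modeled on the Almanac case: the logo is the largest anchor
// and points at the post-redirect home page, so it is skipped.
const secondary = pickSecondaryPage(
  [
    { href: "/en/2021/", area: 5000 }, // logo
    { href: "/", area: 100 },
    { href: "/en/2021/table-of-contents", area: 800 },
  ],
  "https://almanac.httparchive.org/",
  "https://almanac.httparchive.org/en/2021/"
);
```

The limitation above still applies to this sketch: if the test started and ended on `/` and an anchor pointed to `/en/2021/`, which then redirected back to `/`, the href would not match either excluded URL and would still be selected. Catching that would require following the link, which the metric cannot do from inside the page.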
