Website 404 errors and styling funk
rushgeo opened this issue · 4 comments
Most pages of the website are giving 404 errors where the main content column would be:
- homepage
- blog
- downloads
- issues
- corrections (all white page)
The features and about pages work for me.
I can replicate the problem, investigating...
In the meantime the S3 downloads and GitHub downloads are still up.
Basic functionality is restored... working on restoring the custom site styling next.
Download requests without a Referer header have been returning 500s from WordPress since this afternoon.
To reproduce, send a request without a Referer header:
curl -v https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_1_states_provinces.zip
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 50.87.253.14:443...
* Connected to www.naturalearthdata.com (50.87.253.14) port 443
* ALPN: curl offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
} [329 bytes data]
* CAfile: /etc/ssl/cert.pem
* CApath: none
* (304) (IN), TLS handshake, Server hello (2):
{ [122 bytes data]
* (304) (IN), TLS handshake, Unknown (8):
{ [19 bytes data]
* (304) (IN), TLS handshake, Certificate (11):
{ [4035 bytes data]
* (304) (IN), TLS handshake, CERT verify (15):
{ [264 bytes data]
* (304) (IN), TLS handshake, Finished (20):
{ [52 bytes data]
* (304) (OUT), TLS handshake, Finished (20):
} [52 bytes data]
* SSL connection using TLSv1.3 / AEAD-AES256-GCM-SHA384
* ALPN: server accepted h2
* Server certificate:
* subject: CN=naturalearthdata.com
* start date: Jan 3 11:59:47 2024 GMT
* expire date: Apr 2 11:59:46 2024 GMT
* subjectAltName: host "www.naturalearthdata.com" matched cert's "www.naturalearthdata.com"
* issuer: C=US; O=Let's Encrypt; CN=R3
* SSL certificate verify ok.
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_1_states_provinces.zip
* [HTTP/2] [1] [:method: GET]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: www.naturalearthdata.com]
* [HTTP/2] [1] [:path: /http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_1_states_provinces.zip]
* [HTTP/2] [1] [user-agent: curl/8.4.0]
* [HTTP/2] [1] [accept: */*]
> GET /http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_1_states_provinces.zip HTTP/2
> Host: www.naturalearthdata.com
> User-Agent: curl/8.4.0
> Accept: */*
>
< HTTP/2 500
< vary: Accept-Encoding,Cookie
< expires: Wed, 11 Jan 1984 05:00:00 GMT
< cache-control: no-cache, must-revalidate, max-age=0
< host-header: c2hhcmVkLmJsdWVob3N0LmNvbQ==
< x-endurance-cache-level: 2
< x-nginx-cache: WordPress
< content-type: text/html; charset=UTF-8
< date: Tue, 27 Feb 2024 21:36:20 GMT
< server: Apache
<
{ [2799 bytes data]
100 2799 0 2799 0 0 7001 0 --:--:-- --:--:-- --:--:-- 7015
* Connection #0 to host www.naturalearthdata.com left intact
Supplying an arbitrary Referer header results in the expected redirect:
curl -v -H "Referer:http://google.com" https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_1_states_provinces.zip | pbcopy
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 50.87.253.14:443...
* Connected to www.naturalearthdata.com (50.87.253.14) port 443
* ALPN: curl offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
} [329 bytes data]
* CAfile: /etc/ssl/cert.pem
* CApath: none
* (304) (IN), TLS handshake, Server hello (2):
{ [122 bytes data]
* (304) (IN), TLS handshake, Unknown (8):
{ [19 bytes data]
* (304) (IN), TLS handshake, Certificate (11):
{ [4035 bytes data]
* (304) (IN), TLS handshake, CERT verify (15):
{ [264 bytes data]
* (304) (IN), TLS handshake, Finished (20):
{ [52 bytes data]
* (304) (OUT), TLS handshake, Finished (20):
} [52 bytes data]
* SSL connection using TLSv1.3 / AEAD-AES256-GCM-SHA384
* ALPN: server accepted h2
* Server certificate:
* subject: CN=naturalearthdata.com
* start date: Jan 3 11:59:47 2024 GMT
* expire date: Apr 2 11:59:46 2024 GMT
* subjectAltName: host "www.naturalearthdata.com" matched cert's "www.naturalearthdata.com"
* issuer: C=US; O=Let's Encrypt; CN=R3
* SSL certificate verify ok.
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_1_states_provinces.zip
* [HTTP/2] [1] [:method: GET]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: www.naturalearthdata.com]
* [HTTP/2] [1] [:path: /http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_1_states_provinces.zip]
* [HTTP/2] [1] [user-agent: curl/8.4.0]
* [HTTP/2] [1] [accept: */*]
* [HTTP/2] [1] [referer: http://google.com]
> GET /http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_1_states_provinces.zip HTTP/2
> Host: www.naturalearthdata.com
> User-Agent: curl/8.4.0
> Accept: */*
> Referer:http://google.com
>
< HTTP/2 302
< vary: Accept-Encoding,Cookie
< location: https://naciscdn.org/naturalearth/10m/cultural/ne_10m_admin_1_states_provinces.zip
< host-header: c2hhcmVkLmJsdWVob3N0LmNvbQ==
< x-endurance-cache-level: 2
< x-nginx-cache: WordPress
< content-length: 0
< content-type: text/html; charset=UTF-8
< date: Tue, 27 Feb 2024 21:38:30 GMT
< server: Apache
<
{ [0 bytes data]
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
* Connection #0 to host www.naturalearthdata.com left intact
This impacts script-based downloads, which generally do not send the headers a browser adds implicitly, such as Referer.
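As a stopgap only, a script could send an explicit Referer header, assuming the server keeps behaving as in the two curl runs above. This is a minimal sketch using Python's standard library. The helper names `build_request` and `download` and the default Referer value are illustrative assumptions, not part of any Natural Earth tooling.

```python
import urllib.request

# Legacy download link copied from the curl commands above.
LEGACY_URL = (
    "https://www.naturalearthdata.com/http//www.naturalearthdata.com/"
    "download/10m/cultural/ne_10m_admin_1_states_provinces.zip"
)


def build_request(
    url: str,
    referer: str = "https://www.naturalearthdata.com/",  # assumed value; any Referer appeared to work
) -> urllib.request.Request:
    """Return a GET request that carries an explicit Referer header."""
    return urllib.request.Request(url, headers={"Referer": referer})


def download(url: str, dest: str) -> None:
    """Fetch url (urllib follows the 302 redirect) and write the body to dest."""
    with urllib.request.urlopen(build_request(url)) as resp, open(dest, "wb") as out:
        out.write(resp.read())
```

Calling `download(LEGACY_URL, "ne_10m_admin_1_states_provinces.zip")` would perform the actual fetch. Whether it succeeds depends on the server still redirecting requests that carry a Referer.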
Uff, thanks for the report. I'll work to debug, but in the meantime...
Per #581, any automated downloads and scripts should switch over to accessing the naciscdn.org files directly instead of going through the WordPress site.
Here are some file resource patterns in the meantime:
Official S3 public data: (EVERYONE SHOULD MIGRATE HERE, ESPECIALLY IF YOU USE CI)
https://naturalearth.s3.amazonaws.com/10m_cultural/10m_cultural.zip
(Note that S3 uses 10m_cultural where naturalearthdata.com uses 10m/cultural, with _ instead of /, so you'll need to update more than your domain.)
With the equivalent naciscdn.org pretty domain:
https://s3.us-west-2.amazonaws.com/naciscdn.org/naturalearth/10m/cultural/10m_cultural.zip
(same file structure as naturalearthdata.com)
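For scripts that hold a list of old download links, the migration might be sketched as a pure URL rewrite. The sketch below assumes every legacy link contains `/download/` followed by the scale/theme/file path, as in the curl example in this thread, and it takes the CDN base from the 302 `location` header shown there. The function names are hypothetical, and the S3 helper only covers the whole-theme archive pattern listed above.

```python
from urllib.parse import urlsplit

# Marker preceding the scale/theme/file part of a legacy download link
# (assumption based on the request path shown earlier in this thread).
LEGACY_MARKER = "/download/"

# Base taken from the 302 Location header shown earlier in this thread.
CDN_BASE = "https://naciscdn.org/naturalearth/"


def to_cdn_url(legacy_url: str) -> str:
    """Rewrite a legacy naturalearthdata.com download link to naciscdn.org."""
    path = urlsplit(legacy_url).path
    _, found, tail = path.partition(LEGACY_MARKER)
    if not found or not tail:
        raise ValueError(f"not a recognised download link: {legacy_url}")
    return CDN_BASE + tail


def s3_theme_archive_url(scale: str, theme: str) -> str:
    """Build the S3 URL of a whole-theme archive, e.g. scale '10m', theme 'cultural'."""
    name = f"{scale}_{theme}"  # S3 joins scale and theme with '_' rather than '/'
    return f"https://naturalearth.s3.amazonaws.com/{name}/{name}.zip"
```

For example, `to_cdn_url` applied to the link from the curl commands yields the same address the server redirected to, and `s3_theme_archive_url("10m", "cultural")` reproduces the S3 link above.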