nvkelso/natural-earth-vector

Website 404 errors and styling funk

rushgeo opened this issue · 4 comments

Most pages of the website are giving 404 errors where the main content column would be:

  • homepage
  • blog
  • downloads
  • issues
  • corrections (all white page)

The features and about pages work for me.

I can replicate the problem, investigating...

In the meantime the S3 downloads and GitHub downloads are still up.

Basic functionality is restored... working on restoring the custom site styling next.

Download requests without a Referer header started returning 500s from WordPress this afternoon.

To reproduce, send a request without a Referer header:

 curl -v https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_1_states_provinces.zip
 
   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 50.87.253.14:443...
* Connected to www.naturalearthdata.com (50.87.253.14) port 443
* ALPN: curl offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
} [329 bytes data]
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* (304) (IN), TLS handshake, Server hello (2):
{ [122 bytes data]
* (304) (IN), TLS handshake, Unknown (8):
{ [19 bytes data]
* (304) (IN), TLS handshake, Certificate (11):
{ [4035 bytes data]
* (304) (IN), TLS handshake, CERT verify (15):
{ [264 bytes data]
* (304) (IN), TLS handshake, Finished (20):
{ [52 bytes data]
* (304) (OUT), TLS handshake, Finished (20):
} [52 bytes data]
* SSL connection using TLSv1.3 / AEAD-AES256-GCM-SHA384
* ALPN: server accepted h2
* Server certificate:
*  subject: CN=naturalearthdata.com
*  start date: Jan  3 11:59:47 2024 GMT
*  expire date: Apr  2 11:59:46 2024 GMT
*  subjectAltName: host "www.naturalearthdata.com" matched cert's "www.naturalearthdata.com"
*  issuer: C=US; O=Let's Encrypt; CN=R3
*  SSL certificate verify ok.
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_1_states_provinces.zip
* [HTTP/2] [1] [:method: GET]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: www.naturalearthdata.com]
* [HTTP/2] [1] [:path: /http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_1_states_provinces.zip]
* [HTTP/2] [1] [user-agent: curl/8.4.0]
* [HTTP/2] [1] [accept: */*]
> GET /http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_1_states_provinces.zip HTTP/2
> Host: www.naturalearthdata.com
> User-Agent: curl/8.4.0
> Accept: */*
> 
< HTTP/2 500 
< vary: Accept-Encoding,Cookie
< expires: Wed, 11 Jan 1984 05:00:00 GMT
< cache-control: no-cache, must-revalidate, max-age=0
< host-header: c2hhcmVkLmJsdWVob3N0LmNvbQ==
< x-endurance-cache-level: 2
< x-nginx-cache: WordPress
< content-type: text/html; charset=UTF-8
< date: Tue, 27 Feb 2024 21:36:20 GMT
< server: Apache
< 
{ [2799 bytes data]
100  2799    0  2799    0     0   7001      0 --:--:-- --:--:-- --:--:--  7015
* Connection #0 to host www.naturalearthdata.com left intact

Supplying an arbitrary Referer header results in the redirect:

curl -v -H "Referer:http://google.com" https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_1_states_provinces.zip | pbcopy
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 50.87.253.14:443...
* Connected to www.naturalearthdata.com (50.87.253.14) port 443
* ALPN: curl offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
} [329 bytes data]
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* (304) (IN), TLS handshake, Server hello (2):
{ [122 bytes data]
* (304) (IN), TLS handshake, Unknown (8):
{ [19 bytes data]
* (304) (IN), TLS handshake, Certificate (11):
{ [4035 bytes data]
* (304) (IN), TLS handshake, CERT verify (15):
{ [264 bytes data]
* (304) (IN), TLS handshake, Finished (20):
{ [52 bytes data]
* (304) (OUT), TLS handshake, Finished (20):
} [52 bytes data]
* SSL connection using TLSv1.3 / AEAD-AES256-GCM-SHA384
* ALPN: server accepted h2
* Server certificate:
*  subject: CN=naturalearthdata.com
*  start date: Jan  3 11:59:47 2024 GMT
*  expire date: Apr  2 11:59:46 2024 GMT
*  subjectAltName: host "www.naturalearthdata.com" matched cert's "www.naturalearthdata.com"
*  issuer: C=US; O=Let's Encrypt; CN=R3
*  SSL certificate verify ok.
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_1_states_provinces.zip
* [HTTP/2] [1] [:method: GET]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: www.naturalearthdata.com]
* [HTTP/2] [1] [:path: /http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_1_states_provinces.zip]
* [HTTP/2] [1] [user-agent: curl/8.4.0]
* [HTTP/2] [1] [accept: */*]
* [HTTP/2] [1] [referer: http://google.com]
> GET /http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_1_states_provinces.zip HTTP/2
> Host: www.naturalearthdata.com
> User-Agent: curl/8.4.0
> Accept: */*
> Referer:http://google.com
> 
< HTTP/2 302 
< vary: Accept-Encoding,Cookie
< location: https://naciscdn.org/naturalearth/10m/cultural/ne_10m_admin_1_states_provinces.zip
< host-header: c2hhcmVkLmJsdWVob3N0LmNvbQ==
< x-endurance-cache-level: 2
< x-nginx-cache: WordPress
< content-length: 0
< content-type: text/html; charset=UTF-8
< date: Tue, 27 Feb 2024 21:38:30 GMT
< server: Apache
< 
{ [0 bytes data]
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
* Connection #0 to host www.naturalearthdata.com left intact

This impacts script-based downloads, which generally do not send the headers a browser adds implicitly, such as Referer.

Uff, thanks for the report. I'll work to debug, but in the meantime...

Per #581, any automated downloads or scripts should switch to accessing the naciscdn.org files directly instead of going through the WordPress site.
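For scripts that still hold naturalearthdata.com download URLs, a minimal sketch of the rewrite might look like the following. The helper name `to_cdn_url` is hypothetical, and the mapping rests on an assumption: that everything after `/download/` carries over unchanged to `https://naciscdn.org/naturalearth/`, as the 302 `location` header in the report above suggests for one file.

```shell
#!/bin/sh
# Hypothetical helper: rewrite a naturalearthdata.com download URL to naciscdn.org.
# Assumption: the path after "/download/" maps one-to-one onto
# https://naciscdn.org/naturalearth/ (inferred from the single redirect shown above).
to_cdn_url() {
  case "$1" in
    */download/*)
      # Strip everything up to and including the last "/download/".
      printf 'https://naciscdn.org/naturalearth/%s\n' "${1##*/download/}"
      ;;
    *)
      # Not a download URL: pass it through unchanged.
      printf '%s\n' "$1"
      ;;
  esac
}

to_cdn_url "https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_1_states_provinces.zip"
```

The result can then be handed to the download tool already in use, for example `curl -fsSLO "$(to_cdn_url "$old_url")"`, where `$old_url` is whatever URL the script held before.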

Here are some file resource patterns in the meantime:

Official S3 public data: (EVERYONE SHOULD MIGRATE HERE, ESPECIALLY IF YOU USE CI)
https://naturalearth.s3.amazonaws.com/10m_cultural/10m_cultural.zip
(Note the 10m_cultural directory: naturalearthdata.com uses 10m/cultural, with / rather than _, so you'll need to update more than just the domain.)

With the equivalent naciscdn.org pretty domain:
https://s3.us-west-2.amazonaws.com/naciscdn.org/naturalearth/10m/cultural/10m_cultural.zip
(same file structure as naturalearthdata.com)
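The difference between the two layouts can be sketched as a small path conversion. This is only an illustration: the helper name `to_s3_url` is hypothetical, and the rule "join the first two path segments with `_`" is an assumption drawn from the single `10m_cultural/10m_cultural.zip` example above, so it should be checked against the bucket before being relied on for other files.

```shell
#!/bin/sh
# Hypothetical helper: convert a naturalearthdata.com-style relative path
# (scale/theme/file) into the S3 bucket layout (scale_theme/file).
# Assumption: only the first "/" becomes "_", as in the one example given above.
to_s3_url() {
  case "$1" in
    */*)
      scale="${1%%/*}"   # first segment, e.g. 10m
      rest="${1#*/}"     # remainder, e.g. cultural/10m_cultural.zip
      printf 'https://naturalearth.s3.amazonaws.com/%s_%s\n' "$scale" "$rest"
      ;;
    *)
      # No directory component to convert; report the problem and fail.
      printf 'unexpected path: %s\n' "$1" >&2
      return 1
      ;;
  esac
}

to_s3_url "10m/cultural/10m_cultural.zip"
```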