envoyproxy/envoy

How to fetch a wasm module from GitHub? Getting 503 on remote fetch.

rahulanand16nov opened this issue · 12 comments

I am trying to fetch a wasm module from GitHub, but I get a 503 as soon as the request is sent from Envoy. Can you please point out what is wrong with my config?

admin:
  access_log_path: /dev/null
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 9000
static_resources:
  listeners:
  - address:
      socket_address:
        address: 0.0.0.0
        port_value: 9095
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          codec_type: auto
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: backend_service
          http_filters:
          - name: envoy.filters.preauth.wasm
            typed_config:
              "@type": type.googleapis.com/udpa.type.v1.TypedStruct
              type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
              value:
                config:
                  name: "preauth-wasm"
                  root_id: "preauth-wasm"
                  configuration: 
                    "@type": type.googleapis.com/google.protobuf.StringValue
                    value: |
                      {}
                  vm_config:
                    runtime: "envoy.wasm.runtime.v8"
                    vm_id: "my_vm_id"
                    code:
                      remote:
                        http_uri:
                          uri: https://raw.githubusercontent.com/rahulanand16nov/wasm-shim/new-api/deploy/wasm_shim.wasm
                          cluster: remote_wasm
                          timeout: 10s
                        sha256: "04aa04c9c16ba4557a05d2f3a356f1ff76676a5a9ee2dde41716db3919077df6"
                    configuration:
                      "@type": type.googleapis.com/google.protobuf.StringValue
                      value: {}
                    allow_precompiled: true
          - name: envoy.filters.http.router
            typed_config: {}
  clusters:
  - name: backend_service
    connect_timeout: 0.25s
    type: strict_dns
    lb_policy: round_robin
    load_assignment:
      cluster_name: backend_service
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: backend_service
                port_value: 8000
  - name: remote_wasm
    type: strict_dns
    connect_timeout: 1s
    dns_refresh_rate: 5s
    load_assignment:
      cluster_name: remote_wasm
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: raw.githubusercontent.com,
                port_value: 443

Here is a snippet of the debug logs:

proxy_1            | [2022-03-10 10:56:37.854][9][debug][config] [source/common/config/remote_data_fetcher.cc:35] fetch remote data from [uri = https://raw.githubusercontent.com/rahulanand16nov/wasm-shim/new-api/deploy/wasm_shim.wasm]: start
proxy_1            | [2022-03-10 10:56:37.854][9][debug][router] [source/common/router/router.cc:486] [C0][S12704383556817186832] cluster 'remote_wasm' match for URL '/rahulanand16nov/wasm-shim/new-api/deploy/wasm_shim.wasm'
proxy_1            | [2022-03-10 10:56:37.854][9][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:1488] no healthy host for HTTP connection pool
proxy_1            | [2022-03-10 10:56:37.854][9][debug][http] [source/common/http/async_client_impl.cc:101] async http request response headers (end_stream=false):
proxy_1            | ':status', '503'
proxy_1            | 'content-length', '19'
proxy_1            | 'content-type', 'text/plain'
proxy_1            | 
proxy_1            | [2022-03-10 10:56:37.854][9][debug][config] [source/common/config/remote_data_fetcher.cc:69] fetch remote data [uri = https://raw.githubusercontent.com/rahulanand16nov/wasm-shim/new-api/deploy/wasm_shim.wasm]: response status code 503

A 503 means the resource is not available. I cannot access the link either; is it private? If so, you cannot fetch from this link, since GitHub denies requests from anyone else visiting a private repo.

phlax commented

wfm https://raw.githubusercontent.com/rahulanand16nov/wasm-shim/new-api/deploy/wasm_shim.wasm

might have been a transient github issue - worth trying again - and if it fails, just confirm that you can access the url

Yep, now I can access it.

The issue is still there, and I cannot figure out the reason behind it. I assume it's related to TLS not being configured correctly.

Following is the cluster config:

{
     "version_info": "2022-03-14T08:15:59Z/15",
     "cluster": {
      "@type": "type.googleapis.com/envoy.config.cluster.v3.Cluster",
      "name": "outbound|443||raw.githubusercontent.com",
      "type": "STRICT_DNS",
      "connect_timeout": "10s",
      "circuit_breakers": {
       "thresholds": [
        {
         "max_connections": 4294967295,
         "max_pending_requests": 4294967295,
         "max_requests": 4294967295,
         "max_retries": 4294967295,
         "track_remaining": true
        }
       ]
      },
      "dns_refresh_rate": "5s",
      "dns_lookup_family": "V4_ONLY",
      "metadata": {
       "filter_metadata": {
        "istio": {
         "default_original_port": 443,
         "services": [
          {
           "host": "raw.githubusercontent.com",
           "namespace": "default",
           "name": "raw.githubusercontent.com"
          }
         ]
        }
       }
      },
      "common_lb_config": {
       "locality_weighted_lb_config": {}
      },
      "load_assignment": {
       "cluster_name": "outbound|443||raw.githubusercontent.com",
       "endpoints": [
        {
         "locality": {},
         "lb_endpoints": [
          {
           "endpoint": {
            "address": {
             "socket_address": {
              "address": "raw.githubusercontent.com",
              "port_value": 443
             }
            }
           },
           "metadata": {
            "filter_metadata": {
             "istio": {
              "workload": ";;;;"
             }
            }
           },
           "load_balancing_weight": 1
          }
         ],
         "load_balancing_weight": 1
        }
       ]
      },
      "respect_dns_ttl": true,
      "filters": [
       {
        "name": "istio.metadata_exchange",
        "typed_config": {
         "@type": "type.googleapis.com/envoy.tcp.metadataexchange.config.MetadataExchange",
         "protocol": "istio-peer-exchange"
        }
       }
      ]
     },
     "last_updated": "2022-03-14T08:15:59.982Z"
    },

Can you guys please share what's wrong with the config above?

phlax commented

i haven't used uri-based config before - so i can't comment directly on your config

that said, i reckon you are probably correct that the missing tls config is preventing it from working

take a look at the configuration examples in the tls sandbox - for example here:

clusters:
- name: service-https
  type: STRICT_DNS
  lb_policy: ROUND_ROBIN
  load_assignment:
    cluster_name: service-https
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: service-https
              port_value: 443
  transport_socket:
    name: envoy.transport_sockets.tls
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext

Are you sure that we don't need to provide more info than those last four lines in the config?

Here is the trace dump of the first start:
https://pastebin.com/E1KQshjj

cc @envoyproxy/wasm-dev

  1. As mentioned on Slack a few days ago, you're connecting to port 443, but not configuring a TLS transport socket. You need to add this to the cluster configuration:
transport_socket:
  name: envoy.transport_sockets.tls
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
    sni: "raw.githubusercontent.com"
  2. The cluster's address in your config is wrong. It's set to "raw.githubusercontent.com," (notice the trailing comma), which leads to DNS resolution failures (dns resolution for raw.githubusercontent.com, failed with c-ares status 4 in the logs). If you change it to "raw.githubusercontent.com" (without the trailing comma), then everything works (see the combined sketch below).
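
Putting both fixes together, the remote_wasm cluster would look roughly like this (a sketch assembled from the snippets in this thread; connect_timeout and dns_refresh_rate carried over from the original config):

- name: remote_wasm
  type: strict_dns
  connect_timeout: 1s
  dns_refresh_rate: 5s
  load_assignment:
    cluster_name: remote_wasm
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: raw.githubusercontent.com  # no trailing comma
              port_value: 443
  transport_socket:
    name: envoy.transport_sockets.tls
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
      sni: "raw.githubusercontent.com"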

Thanks, @PiotrSikora!

That's embarrassing: so much pain just because of a trailing comma. Regardless, the error still remains:

proxy_1            | [2022-03-15 07:57:07.475][9][debug][upstream] [source/common/upstream/upstream_impl.cc:1183] initializing Primary cluster remote_wasm completed
proxy_1            | [2022-03-15 07:57:07.475][9][debug][init] [source/common/init/manager_impl.cc:49] init manager Cluster remote_wasm contains no targets
proxy_1            | [2022-03-15 07:57:07.475][9][debug][init] [source/common/init/watcher_impl.cc:14] init manager Cluster remote_wasm initialized, notifying ClusterImplBase
proxy_1            | [2022-03-15 07:57:07.475][9][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:1053] adding TLS cluster remote_wasm
proxy_1            | [2022-03-15 07:57:07.476][9][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:1116] membership update for TLS cluster remote_wasm added 4 removed 0
proxy_1            | [2022-03-15 07:57:07.476][9][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:135] cm init: init complete: cluster=remote_wasm primary=1 secondary=0
proxy_1            | [2022-03-15 07:57:07.476][9][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:155] maybe finish initialize state: 1
proxy_1            | [2022-03-15 07:57:07.476][9][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:164] maybe finish initialize primary init clusters empty: false
proxy_1            | [2022-03-15 07:57:07.476][9][trace][dns] [source/extensions/network/dns_resolver/cares/dns_impl.cc:278] Setting DNS resolution timer for 5000 milliseconds
proxy_1            | [2022-03-15 07:57:07.476][9][debug][dns] [source/extensions/network/dns_resolver/cares/dns_impl.cc:130] dns resolution for backend_service failed with c-ares status 1
proxy_1            | [2022-03-15 07:57:07.476][9][trace][dns] [source/extensions/network/dns_resolver/cares/dns_impl.cc:278] Setting DNS resolution timer for 5000 milliseconds
proxy_1            | [2022-03-15 07:57:07.476][9][debug][dns] [source/extensions/network/dns_resolver/cares/dns_impl.cc:236] dns resolution for backend_service completed with status 0
proxy_1            | [2022-03-15 07:57:07.476][9][trace][upstream] [source/common/upstream/strict_dns_cluster.cc:113] async DNS resolution complete for backend_service
proxy_1            | [2022-03-15 07:57:07.476][9][debug][upstream] [source/common/upstream/upstream_impl.cc:256] transport socket match, socket default selected for host with address 172.19.0.2:8000
proxy_1            | [2022-03-15 07:57:07.476][9][debug][upstream] [source/common/upstream/strict_dns_cluster.cc:150] DNS hosts have changed for backend_service
proxy_1            | [2022-03-15 07:57:07.476][9][trace][upstream] [source/common/upstream/upstream_impl.cc:1494] Local locality: 
proxy_1            | [2022-03-15 07:57:07.476][9][debug][upstream] [source/common/upstream/strict_dns_cluster.cc:178] DNS refresh rate reset for backend_service, refresh rate 5000 ms
proxy_1            | [2022-03-15 07:57:07.476][9][debug][upstream] [source/common/upstream/upstream_impl.cc:1183] initializing Primary cluster backend_service completed
proxy_1            | [2022-03-15 07:57:07.476][9][debug][init] [source/common/init/manager_impl.cc:49] init manager Cluster backend_service contains no targets
proxy_1            | [2022-03-15 07:57:07.476][9][debug][init] [source/common/init/watcher_impl.cc:14] init manager Cluster backend_service initialized, notifying ClusterImplBase
proxy_1            | [2022-03-15 07:57:07.476][9][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:1053] adding TLS cluster backend_service
proxy_1            | [2022-03-15 07:57:07.476][9][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:1116] membership update for TLS cluster backend_service added 1 removed 0
proxy_1            | [2022-03-15 07:57:07.476][9][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:135] cm init: init complete: cluster=backend_service primary=0 secondary=0
proxy_1            | [2022-03-15 07:57:07.476][9][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:155] maybe finish initialize state: 1
proxy_1            | [2022-03-15 07:57:07.476][9][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:164] maybe finish initialize primary init clusters empty: true
proxy_1            | [2022-03-15 07:57:07.476][9][debug][init] [source/common/init/manager_impl.cc:49] init manager RTDS contains no targets
proxy_1            | [2022-03-15 07:57:07.476][9][debug][init] [source/common/init/watcher_impl.cc:14] init manager RTDS initialized, notifying RTDS
proxy_1            | [2022-03-15 07:57:07.476][9][info][runtime] [source/common/runtime/runtime_impl.cc:446] RTDS has finished initialization
proxy_1            | [2022-03-15 07:57:07.476][9][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:225] continue initializing secondary clusters
proxy_1            | [2022-03-15 07:57:07.476][9][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:155] maybe finish initialize state: 2
proxy_1            | [2022-03-15 07:57:07.476][9][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:164] maybe finish initialize primary init clusters empty: true
proxy_1            | [2022-03-15 07:57:07.476][9][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:179] maybe finish initialize secondary init clusters empty: true
proxy_1            | [2022-03-15 07:57:07.476][9][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:201] maybe finish initialize cds api ready: false
proxy_1            | [2022-03-15 07:57:07.476][9][info][upstream] [source/common/upstream/cluster_manager_impl.cc:207] cm init: all clusters initialized
proxy_1            | [2022-03-15 07:57:07.476][9][info][main] [source/server/server.cc:849] all clusters initialized. initializing init manager
proxy_1            | [2022-03-15 07:57:07.476][9][debug][init] [source/common/init/manager_impl.cc:53] init manager Server initializing
proxy_1            | [2022-03-15 07:57:07.476][9][debug][init] [source/common/init/target_impl.cc:15] init manager Server initializing target Listener-init-target 33eac52b-9894-4716-82c4-deba4484acba
proxy_1            | [2022-03-15 07:57:07.476][9][debug][init] [source/common/init/manager_impl.cc:53] init manager Listener-local-init-manager 33eac52b-9894-4716-82c4-deba4484acba 9804556647422129620 initializing
proxy_1            | [2022-03-15 07:57:07.476][9][debug][init] [source/common/init/target_impl.cc:15] init manager Listener-local-init-manager 33eac52b-9894-4716-82c4-deba4484acba 9804556647422129620 initializing target RemoteAsyncDataProvider
proxy_1            | [2022-03-15 07:57:07.476][9][debug][config] [source/common/config/remote_data_fetcher.cc:35] fetch remote data from [uri = https://raw.githubusercontent.com/rahulanand16nov/wasm-shim/new-api/deploy/wasm_shim.wasm]: start
proxy_1            | [2022-03-15 07:57:07.476][9][debug][router] [source/common/router/router.cc:486] [C0][S10333631843420410560] cluster 'remote_wasm' match for URL '/rahulanand16nov/wasm-shim/new-api/deploy/wasm_shim.wasm'
proxy_1            | [2022-03-15 07:57:07.476][9][debug][router] [source/common/router/router.cc:702] [C0][S10333631843420410560] router decoding headers:
proxy_1            | ':path', '/rahulanand16nov/wasm-shim/new-api/deploy/wasm_shim.wasm'
proxy_1            | ':authority', 'raw.githubusercontent.com'
proxy_1            | ':method', 'GET'
proxy_1            | ':scheme', 'http'
proxy_1            | 'x-envoy-internal', 'true'
proxy_1            | 'x-forwarded-for', '172.19.0.3'
proxy_1            | 'x-envoy-expected-rq-timeout-ms', '10000'
proxy_1            | 
proxy_1            | [2022-03-15 07:57:07.476][9][debug][pool] [source/common/http/conn_pool_base.cc:74] queueing stream due to no available connections
proxy_1            | [2022-03-15 07:57:07.476][9][debug][pool] [source/common/conn_pool/conn_pool_base.cc:267] trying to create new connection
proxy_1            | [2022-03-15 07:57:07.476][9][trace][pool] [source/common/conn_pool/conn_pool_base.cc:268] ConnPoolImplBase 0x5eeeff6a0190, ready_clients_.size(): 0, busy_clients_.size(): 0, connecting_clients_.size(): 0, connecting_stream_capacity_: 0, num_active_streams_: 0, pending_streams_.size(): 1 per upstream preconnect ratio: 1
proxy_1            | [2022-03-15 07:57:07.476][9][debug][pool] [source/common/conn_pool/conn_pool_base.cc:144] creating a new connection
proxy_1            | [2022-03-15 07:57:07.476][9][debug][client] [source/common/http/codec_client.cc:60] [C0] connecting
proxy_1            | [2022-03-15 07:57:07.476][9][debug][connection] [source/common/network/connection_impl.cc:896] [C0] connecting to [2606:50c0:8000::154]:443
proxy_1            | [2022-03-15 07:57:07.476][9][debug][connection] [source/common/network/connection_impl.cc:921] [C0] immediate connect error: 99
proxy_1            | [2022-03-15 07:57:07.476][9][trace][pool] [source/common/conn_pool/conn_pool_base.cc:130] not creating a new connection, shouldCreateNewConnection returned false.
proxy_1            | [2022-03-15 07:57:07.476][9][warning][main] [source/server/server.cc:747] there is no configured limit to the number of allowed active connections. Set a limit via the runtime key overload.global_downstream_max_connections
proxy_1            | [2022-03-15 07:57:07.476][9][trace][connection] [source/common/network/connection_impl.cc:562] [C0] socket event: 3
proxy_1            | [2022-03-15 07:57:07.477][9][debug][connection] [source/common/network/connection_impl.cc:573] [C0] raising immediate error
proxy_1            | [2022-03-15 07:57:07.477][9][debug][connection] [source/common/network/connection_impl.cc:249] [C0] closing socket: 0
proxy_1            | [2022-03-15 07:57:07.477][9][trace][connection] [source/common/network/connection_impl.cc:417] [C0] raising connection event 0
proxy_1            | [2022-03-15 07:57:07.477][9][debug][client] [source/common/http/codec_client.cc:110] [C0] disconnect. resetting 0 pending requests
proxy_1            | [2022-03-15 07:57:07.477][9][debug][pool] [source/common/conn_pool/conn_pool_base.cc:443] [C0] client disconnected, failure reason: immediate connect error: 99
proxy_1            | [2022-03-15 07:57:07.477][9][debug][router] [source/common/router/router.cc:1156] [C0][S10333631843420410560] upstream reset: reset reason: connection failure, transport failure reason: immediate connect error: 99
proxy_1            | [2022-03-15 07:57:07.477][9][debug][http] [source/common/http/async_client_impl.cc:101] async http request response headers (end_stream=false):
proxy_1            | ':status', '503'
proxy_1            | 'content-length', '146'
proxy_1            | 'content-type', 'text/plain'
proxy_1            | 
proxy_1            | [2022-03-15 07:57:07.477][9][trace][http] [source/common/http/async_client_impl.cc:118] async http request response data (length=146 end_stream=true)
proxy_1            | [2022-03-15 07:57:07.477][9][debug][config] [source/common/config/remote_data_fetcher.cc:69] fetch remote data [uri = https://raw.githubusercontent.com/rahulanand16nov/wasm-shim/new-api/deploy/wasm_shim.wasm]: response status code 503
proxy_1            | [2022-03-15 07:57:07.477][9][debug][config] [./source/common/config/datasource.h:87] Failed to fetch remote data, failure reason: 0
proxy_1            | [2022-03-15 07:57:07.477][9][debug][config] [./source/common/config/datasource.h:99] Remote data provider will retry in 608 ms.
proxy_1            | [2022-03-15 07:57:07.477][9][trace][main] [source/common/event/dispatcher_impl.cc:227] item added to deferred deletion list (size=1)
proxy_1            | [2022-03-15 07:57:07.477][9][trace][pool] [source/common/conn_pool/conn_pool_base.cc:130] not creating a new connection, shouldCreateNewConnection returned false.
proxy_1            | [2022-03-15 07:57:07.477][9][trace][main] [source/common/event/dispatcher_impl.cc:227] item added to deferred deletion list (size=2)
proxy_1            | [2022-03-15 07:57:07.477][9][debug][pool] [source/common/conn_pool/conn_pool_base.cc:410] invoking idle callbacks - is_draining_for_deletion_=false
proxy_1            | [2022-03-15 07:57:07.477][9][trace][upstream] [source/common/upstream/cluster_manager_impl.cc:1579] Erasing idle pool for host 0x5eeeffc544b0
proxy_1            | [2022-03-15 07:57:07.477][9][trace][main] [source/common/event/dispatcher_impl.cc:227] item added to deferred deletion list (size=3)
proxy_1            | [2022-03-15 07:57:07.477][9][trace][upstream] [source/common/upstream/cluster_manager_impl.cc:1586] Pool container empty for host 0x5eeeffc544b0, erasing host entry
proxy_1            | [2022-03-15 07:57:07.477][9][trace][main] [source/common/event/dispatcher_impl.cc:112] clearing deferred deletion list (size=3)

I suspect a Docker network issue this time around.

[debug][connection] [...] [C0] connecting to [2606:50c0:8000::154]:443
[debug][connection] [...] [C0] immediate connect error: 99

It looks like a failure to connect over IPv6. Try adding dns_lookup_family: v4_only to the cluster configuration.
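
For reference, dns_lookup_family is a top-level field on the cluster; a minimal sketch against the remote_wasm cluster from above (v4_only restricts the resolver to A records, so Envoy never dials the unreachable IPv6 endpoint):

- name: remote_wasm
  type: strict_dns
  connect_timeout: 1s
  dns_refresh_rate: 5s
  dns_lookup_family: v4_only  # IPv4 only; avoids connecting to [2606:50c0:8000::154]:443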

It works now!

For friends from the future, here is my final config:

    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          codec_type: auto
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: backend_service
          http_filters:
          - name: envoy.filters.preauth.wasm
            typed_config:
              "@type": type.googleapis.com/udpa.type.v1.TypedStruct
              type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
              value:
                config:
                  name: "preauth-wasm"
                  root_id: "preauth-wasm"
                  configuration: 
                    "@type": type.googleapis.com/google.protobuf.StringValue
                    value: {}
                  vm_config:
                    runtime: "envoy.wasm.runtime.v8"
                    vm_id: "my_vm_id"
                    code:
                      remote:
                        http_uri:
                          uri: https://raw.githubusercontent.com/rahulanand16nov/wasm-shim/new-api/deploy/wasm_shim.wasm
                          cluster: remote_wasm
                          timeout: 10s
                        sha256: "04aa04c9c16ba4557a05d2f3a356f1ff76676a5a9ee2dde41716db3919077df6"
                    configuration:
                      "@type": type.googleapis.com/google.protobuf.StringValue
                      value: {}
                    allow_precompiled: true
          - name: envoy.filters.http.router
            typed_config: {}
  clusters:
  - name: backend_service
    connect_timeout: 0.25s
    type: strict_dns
    lb_policy: round_robin
    load_assignment:
      cluster_name: backend_service
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: backend_service
                port_value: 8000
  - name: remote_wasm
    type: strict_dns
    connect_timeout: 1s
    dns_refresh_rate: 5s
    dns_lookup_family: v4_only
    load_assignment:
      cluster_name: remote_wasm
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: raw.githubusercontent.com
                port_value: 443
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
        sni: "raw.githubusercontent.com"

@PiotrSikora @phlax @ggreenway @daixiang0 You guys are the best! closing-issue-dopamine-hit is now available ;)