vmware/hamlet

ServiceID-SNI, ports and protocols

Closed this issue · 4 comments

Proposed changes

  • Change the name of the ServiceID field to SNI.
  • Make SNI optional so we can enable plaintext communications.
  • Add service endpoint ports. The goal is to be able to specify ports, one or more protocol per port, and an SNI per pair of [port, protocol] (mandatory if protocol==TLS). The SNI is still opaque and will encode the SNI header by the consumer SM to a particular owner SM. The proposed format is now

required Array Port { required int number, required string protocol, optional string SNI

Where
number. A valid non-negative integer port number.

protocol. Protocol exposed on the port. MUST BE one of HTTP|HTTPS|GRPC|HTTP2|MONGO|TCP|TLS. TLS implies the connection will be routed based on the SNI header to the destination.

SNI. The value of this field will be set as the SNI header by the federated service mesh consumer when creating a connection to the owner and has been originally set by the owner. Each vendor may possibly have its own SNI format, so this specification doesn't define a particular format to use for this field. This field is opaque for the consumer, it simply uses it.

Reasons

  • ServiceID is really an SNI. SNI only makes sense when there is TLS traffic. In the current version of the spec, traffic has to be TLS. However, there could be architectures where TLS is not needed or not even possible. This is a limitation of the spec and the main reason of this proposed change.

  • If ServiceID/SNI is made optional, we have to use either Name or FQDN as the primary key within the owner mesh instead of the ServiceID. Neither of these two alone can be really guaranteed to be unique no matter if we require it in the spec or not. However a combination of FQDN (if mandatory) and Port can be, or a combination of Name (if FQDN is optional) and Port. However, Port is not part of the current spec, but encoded in the ServiceID/SNI (which is currently mandatory but also opaque for the consumer mesh, as it does not know how to interpret its content).

  • Currently, there are no ports for the service endpoint but the port is encoded in the SNI. This has some limitations

  1. If the SNI is made optional and we enable plaintext communications, currently there is no other way to provide the ports. We would need to use metadata (labels), which is not the best way to overcome a limitation of current data model.
  2. Even with the SNI, a service owner can provide only one port per service. If a service is exposed in more than one port, the owner has to provide multiple FederatedService entries (with different SNIs), and the consumer has to figure out how to assemble them in the same local service entry if it can at all. Again, his is a limitation of the current data model.

Alternatives

  • With the current spec, only TLS communication is possible. There is no alternative. It is a limitation in the spec that I would like to remove.
  • With the current spec, assuming two services on different ports exposed by the same owner,
    there is no other way for a consumer than having two endpoints/entries and append the port to the name of the service to be able to differentiate them. This is not the best UX.
  • There can be cases where implementors (like Consul) where there are no explicit ports for services. In this case, its implementation would have to report "fake" port numbers to consumer meshes that have no real meaning in terms of ports anything is listening on (again, ports in Consul are random, non-predictable).

The proposed changes allow a compact and simple representation of remote service descriptions for

  • mTLS. SNI routing at the owner ingress
  • Cleartext. L7 routing (host based) at the owner ingress
  • Cleartext. L4 routing (ip/port based) at the owner ingress
  • Multi-protocol services, i.e. DNS which may listen for TCP and UDP on the same port of the same service

Change the name of the ServiceID field to SNI.

+1 to the idea, but would changing the name of the field be disruptive to implementations that want to upgrade from v1alpha1 to v1alpha2?

Make SNI optional so we can enable plaintext communications.

I think the more likely reason here is if a Hamlet implementation operates at L4-only and defers to the consumer to know the domain / SNI header of the service.

@sergiopozoh what's the relationship between the new Port field and the port in Endpoint? Do all Ports map to all Endpoints?

+1 to the idea, but would changing the name of the field be disruptive to implementations that want to upgrade from v1alpha1 to v1alpha2?

yes, all changes here are disruptive. For an alpha, I wasn't considering changing the major version, but we can do it for the sake of clarity. I assume that the underlying reason for the question. Was it?

Proposed changes (new revision)

  • Assuming Name is the primary key
  • Change the name of the ServiceID field to SNI.
  • Make SNI optional so we can enable plaintext communications.
  • Add service endpoint ports. The goal is to be able to specify ports, one or more protocol per port, and an SNI per pair of [port, protocol] (mandatory if protocol==TLS). The SNI is still opaque and will encode the SNI header by the consumer SM to a particular owner SM. The proposed format is now

Existing Endpoint type, with a new field EndpointSelector

Endpoint {
    string Address // External address associated with the network endpoint where the service is available. Valid values are host IP addresses and FQDN.
    Port // External port associated with the network endpoint where the service is available.
    [optional] array string EndpointLabels // Decouples IP from routing information. Matches with Port descriptions via Port.EndpointSelector.
}

New type representing the service instance information required for routing at the ingress of the service owner, when there are multiple backing instances (for example in different ports or with different protocols).

ServiceSelector {
    int Number // A valid non-negative integer port number.
    string Protocol // Protocol exposed on the port. MUST BE one of HTTP|HTTPS|GRPC|HTTP2|MONGO|TCP|TLS. TLS implies the connection will be routed based on the SNI header to the destination.
    [optional] string Host // FQDN. Not required if using mTLS/SNI, only required for cleartext host-based routing
    [optional] string SNI // If protocol==TLS, the value of this field will be set as the SNI header by the federated service mesh consumer when creating a connection to the owner and has been originally set by the owner. Each vendor may possibly have its own SNI format, so this specification doesn't define a particular format to use for this field. This field is opaque for the consumer, it simply uses it.
    [optional] array string EndpointSelector // Matches with one or more Endpoint via Endpoint.EndpointLabels.
}

The previous spec is leaking port information from the service owner. In fact, there is no need to share port information, but only for the service provider to simply export an identifier for the backing services for a FederatedService. This identifier has to be unique within that context. In this way, we are creating a hierarchy in a way that a FederatedService can contain multiple backing instances under the same FQDN. For flexibility, each instance can be exposed in a different ingress, or on the same ingress but in a different port.

This is the new alternative

Existing Endpoint type, with a new field EndpointLabels

Array Endpoint {
    string Address // External address associated with the network endpoint where the service is available. Valid values are host IP addresses and FQDN.
    Port // External port associated with the network endpoint where the service is available.
    [optional] array string EndpointLabels // Decouples IP from routing information. FederatedServiceInstance.spec.EndpointSelector matches with at least one Endpoint.spec.EndpointLabel here
}

New type representing the service instance information required for routing at the ingress of the service owner, when there are multiple backing instances. FederatedService will have at least one FederatedServiceInstance

Array FederatedServiceInstance {
    string instanceID // A unique identifier for the service instance within the group of instances for a FederatedService
    string Protocol // Protocol. MUST BE one of HTTP|HTTPS|GRPC|HTTP2|MONGO|TCP|TLS. TLS implies the connection will be routed based on the SNI header to the destination.
    [optional] string SNI // If protocol==TLS, the value of this field will be set as the SNI header by the federated service mesh consumer when creating a connection to the owner and has been originally set by the owner. Each vendor may possibly have its own SNI format, so this specification doesn't define a particular format to use for this field. This field is opaque for the consumer, it simply uses it.
    [optional] array string EndpointSelector // Matches with one or more Endpoint via Endpoint.EndpointLabels.
}

For a consumer agent compliant implementation, the agent must generate in the local consumer platform a FQDN for each of the instances with the format FederatedServiceInstance.spec.instanceID+FederatedService.spec.fqdn

In addition, for a consumer agent compliant implementation, the agent must generate in the local consumer platform an entry with FederatedService.spec.fqdn and all the backing FederatedServiceInstance.

Alternatives

  • This proposal is only a cosmetic change. This behavior can already be implemented with v1alpha1. Different instances may publish a different FederatedService, and the consumer may collapse them together based on the FQDN into a single logical service. Each of these instances will have a different Name.