TritonDataCenter/triton-cns

Better Support for SRV Records

Closed this issue · 8 comments

Currently from what I've read you can only set "weight" and "priority".

While attempting to use HAProxy's server-template via DNS, it assumes all SRV records following the following standard:

_service._proto.name. TTL class SRV priority weight port target

Where:

_service Standard network service name (taken from /etc/services) or a port number
_proto Standard protocol name ("tcp" or "udp")
name Name of the service, i.e. the name used in the query
TTL Validity period for the response (HAProxy ignores this field because it maintains its own expiry data defined in the configuration)
class DNS class ("IN")
SRV DNS record type ("SRV")
priority Priority of the target host. Lower value = higher preference (HAProxy ignores this field but may use it later to indicate active / backup state)
weight Relative weight in case of records with the same priority. Higher number = higher preference
port Port where the service is configured
target Hostname of the machine providing the service, ending in a dot

I think CNS should expose records in this formatting when users set any of the configurable properties.

Port, Weight and Priority are supported today via foo:1234:priority=20:weight=20. (https://github.com/joyent/node-triton-tags/blob/master/lib/index.js#L124)

I imagine adding the ability to set a TTL, Service, and Proto should be achievable. Enforcing the values in node-triton-tags could be inspired by RFC 2782

Service
The symbolic name of the desired service, as defined in Assigned
Numbers [STD 2] or locally. An underscore (_) is prepended to
the service identifier to avoid collisions with DNS labels that
occur in nature. Some widely used services, notably POP, don't have a single
universal name. If Assigned Numbers names the service
indicated, that name is the only name which is legal for SRV
lookups. The Service is case insensitive.

Proto
The symbolic name of the desired protocol, with an underscore
(_) prepended to prevent collisions with DNS labels that occur
in nature. _TCP and _UDP are at present the most useful values
for this field, though any name defined by Assigned Numbers or
locally may be used (as for Service). The Proto is case
insensitive.

TTL
Standard DNS meaning RFC 1035.

An example record

triton inst get example | jq .dns_names                                                                                                                                                               
  "_my-tcp-service._tcp.svc.my-account.my-datacenter.example.com"

I think the way I would implement this is to lighten the restriction on using _ and . in the tag value so that you could express it as triton.cns.services=_ldap._tcp:389. Something like if the tag value matches the regex /_[a-z0-9]+._(tcp|udp):[0-9]+/ then we'll special case it and allow it.

Then in cns-server, if it matches that regex we'll produce a properly formatted record for it. We're mostly concerned with the name:port (because the port is what triggers the SRV record). Behavior of priority and weight should be unchanged.

Overriding or specifying the TTL isn't something we want with CNS. It's 30s for everything because when an instance drops out, for whatever reason, we want the record to be withdrawn in a short amount of time. You don't want a dead instance sticking around in your SRV record for an hour. It would also be problematic because the TTL value is for the resource (i.e., name), not the record (i.e., value). You'd run into a conflict if two different zones had the same tag value and specified different TTLs.

After discussing with @bahamat it look's like the place to implement this is actually in the peg.js file located in node-triton-tags at https://github.com/joyent/node-triton-tags/blob/master/lib/cns-svc-tag.pegjs#L19.

From:

dnslabel "DNS name" = [a-zA-Z0-9] [a-zA-Z0-9-.]*
{
    return (text().toLowerCase());
}

To:

dnslabel "DNS name" = ( [a-zA-Z0-9-] [a-zA-Z0-9-.]* / servicelabel )
{
    return (text().toLowerCase());
}
servicelabel "SRV name" = "_" [a-zA-Z0-9-]* ( "._tcp" / "._udp" )

Since VMAPI and CNS both use this, this should allow for the intended behavior.

Created TritonDataCenter/node-triton-tags#6 which should fix this. Will also need to update CNS to use triton-tags 1.4.0.

Copying this from the node-triton-tags PR:

FWIW, This was not how SRV records were originally designed to work with CNS -- the original design was the the CNS names were used as targets of a CNAME.

So, the assumption was (at least on the public cloud), most people would not use the actual bare CNS name for anything. Instead they would be running their own domain somewhere else and would CNAME/DNAME things to the full CNS name with UUIDs.

Thus, you'd have _foo._tcp.thing.com CNAME'd to uuid.svc.uuid.dc.triton.host, so CNS generates the SRV records on that name without any _foo._tcp involved.

Allowing this kind of name mostly makes sense for private cloud installs where use_login/use_alias are on and people are using the CNS names directly. Just thought I'd note that this isn't really a bug fix, so much as a change in design centre.

Making the haproxy template work easily in private cloud setups seems like a reasonable justification for the change to me.

@arekinath Yeah, I went back and forth for a while. But given that Triton’s focus is primarily private cloud installs I figured it’s probably better off with than without.

Having installed all of the experimental builds:

$ triton inst tag set ldap0 triton.cns.services=ldap:389,_ldap._tcp:389     
{
    "triton.cns.services": "ldap:389,_ldap._tcp:389"
}
triton inst tag set ldap1 triton.cns.services=ldap:389,_ldap._tcp:389
{
    "triton.cns.services": "ldap:389,_ldap._tcp:389"
}

And checking it:

$ triton inst get ldap0 | json tags dns_names                          
{
  "triton.cns.services": "ldap:389,_ldap._tcp:389"
}
[
  "b28e6671-b2d4-cc5d-b596-e6f0811cdb0b.inst.f7b580d5-e282-4a9a-f581-9b4caa4c6cd3.barovia.cns.digitalelf.net",
  "b28e6671-b2d4-cc5d-b596-e6f0811cdb0b.inst.bbennett.barovia.cns.digitalelf.net",
  "ldap0.inst.f7b580d5-e282-4a9a-f581-9b4caa4c6cd3.barovia.cns.digitalelf.net",
  "ldap0.inst.bbennett.barovia.cns.digitalelf.net",
  "ldap.svc.f7b580d5-e282-4a9a-f581-9b4caa4c6cd3.barovia.cns.digitalelf.net",
  "ldap.svc.bbennett.barovia.cns.digitalelf.net",
  "_ldap._tcp.svc.f7b580d5-e282-4a9a-f581-9b4caa4c6cd3.barovia.cns.digitalelf.net",
  "_ldap._tcp.svc.bbennett.barovia.cns.digitalelf.net"
]
$ dig +noall +question +answer -t srv _ldap._tcp.svc.bbennett.barovia.cns.digitalelf.net
;_ldap._tcp.svc.bbennett.barovia.cns.digitalelf.net. IN SRV
_ldap._tcp.svc.bbennett.barovia.cns.digitalelf.net. 9 IN SRV 0 10 389 ldap1.inst.bbennett.barovia.cns.digitalelf.net.
_ldap._tcp.svc.bbennett.barovia.cns.digitalelf.net. 9 IN SRV 0 10 389 ldap0.inst.bbennett.barovia.cns.digitalelf.net.

All related PRs have been merged.