Support auto-assign of IP address for zerotier_member
rsyring opened this issue · 10 comments
I've been able to put together Terraform scripts modeled after the multicloud quickstart. One improvement I'd like to make is for the member to be auto-assigned an IP address, while still being able to use that IP address elsewhere in the scripts, such as when assigning a DNS name.
Right now, the two options seem to be:
- Manually assign an IP. Pro: can do all steps in Terraform
- Auto-assign an IP. Con: have to wait for the AWS instance to start up and join the network before an IP is assigned (I think?). By that time, zerotier_member has already run and its ip_assignments is empty.
The option I'm looking for is: zerotier_member gains the ability to force auto-assignment of an IP address from the network pool before it returns. That IP can then be used in other places like DNS.
If this already works somehow and I've missed it, please let me know.
Thanks.
This might be hard to do in advance, but I'll try to take a peek soon at how to accomplish it.
Thanks. The other option I was considering was some kind of delay/repeat on zerotier_member that would result in the resource not returning until it read an API result with a value. Maybe something like:
resource "zerotier_member" "this" {
  name       = var.name
  member_id  = zerotier_identity.this.id
  network_id = var.zt_network_id

  ip_assignments_wait {
    # this is essentially a shortcut for depends_on.
    for = aws_instance.this
    # retry the API call every five seconds
    read_every = "5s"
    # abort if ip assignments still not present after 120 seconds
    timeout = "120s"
  }
}
This could be combined with depends_on and the aws_instance to take some pressure off the API. That way the repeating read wouldn't start hitting the API until the instance was created.
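The retry behaviour proposed above can be sketched in a few lines. This is a language-neutral illustration in Python, not provider code: `wait_for_ip_assignments`, the `fetch_member` callback, and the `ip_assignments` dictionary key are all hypothetical names chosen to mirror the proposed `read_every` and `timeout` settings.

```python
import time


def wait_for_ip_assignments(fetch_member, read_every=5.0, timeout=120.0,
                            sleep=time.sleep, clock=time.monotonic):
    """Poll until the member has at least one IP assignment.

    fetch_member is a hypothetical callable returning a dict that stands in
    for one API read of the member, e.g. {"ip_assignments": ["10.0.0.5"]}.
    """
    deadline = clock() + timeout
    while True:
        member = fetch_member()
        ips = member.get("ip_assignments") or []
        if ips:
            return list(ips)
        # Give up if another full wait would run past the deadline.
        if clock() + read_every > deadline:
            raise TimeoutError(
                f"no IP assignment after {timeout} seconds; "
                "has the node joined and been authorized?"
            )
        sleep(read_every)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays. If the node never joins the network, this loop can only end in `TimeoutError`.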
This would be difficult without some core changes to the network controllers. The terraform provider allows you to create the member records in advance, whether the node is connected or trying to connect to the network right now, or completely offline. As it is now with auto-assign, the controllers won't assign an IP until the members are actively trying to join the network and have also been authorized on the network. The solution proposed above would always time out unless the member was actively trying to join the network.
@glimberg I was worried about this as well, but we could always use the same identity calculation method that zt-one uses, couldn't we? I can dig up the code if you're not sure what I'm talking about; we'd at least avoid most collisions this way.
@erikh Yeah that should be mostly possible. Might get tricky if/when a collision happens, but terraform should be able to figure out when that happens.
Yeah, I think it might be workable if we did it that way. Ok.
@rsyring I don't think I'm going to get to this soon. Just fair warning there's a lot on my plate right now. If you wanted to supply a patch, I can link you to the C++ sources we're discussing so you could port this yourself if you desired.
> The solution proposed above would always time out unless the member was actively trying to join the network.
The ip_assignments_wait proposal was with the consideration that the aws_instance would be getting created at the same time and that joining the zerotier network is part of the instance creation initialization. Here is a fuller example of what I had in mind:
resource "zerotier_identity" "this" {}

resource "zerotier_member" "this" {
  name       = var.name
  member_id  = zerotier_identity.this.id
  network_id = var.zt_network_id

  ip_assignments_wait {
    # wait on the aws instance to be completed before making API calls
    for = aws_instance.this
    # retry the API call every five seconds
    read_every = "5s"
    # abort if ip assignments still not present after 120 seconds
    timeout = "120s"
  }
}

resource "aws_route53_record" "zerotier" {
  zone_id = var.route53_zone_id
  type    = "A"
  ttl     = var.dns_ttl
  name    = "${var.name}.zt"
  records = [
    zerotier_member.this.ip_assignments[0]
  ]
}

resource "aws_instance" "this" {
  tags = { Name = var.name }

  ami                    = data.aws_ami.this.id
  instance_type          = var.instance_type
  key_name               = var.key_name
  source_dest_check      = false
  subnet_id              = var.subnet_id
  vpc_security_group_ids = var.security_group_ids
  user_data              = data.cloudinit_config.this.rendered
}

# Similar setup script to the multi-cloud example. Installs the identity files, installs
# zerotier-one, and joins the network.
data "cloudinit_config" "this" {
  gzip          = true
  base64_encode = true

  part {
    filename     = "init.sh"
    content_type = "text/x-shellscript"
    content = templatefile("${path.module}/init-server.tpl", {
      "hostname"      = var.name
      "zt_identity"   = zerotier_identity.this
      "zt_network_id" = var.zt_network_id
    })
  }
}
> @rsyring I don't think I'm going to get to this soon. Just fair warning there's a lot on my plate right now. If you wanted to supply a patch, I can link you to the C++ sources we're discussing so you could port this yourself if you desired.
I'm actually not sure how the "identity calculation method" applies here. Are you suggesting a change to the network controllers or the provider? If it's the controllers and that's C++, I probably can't help much. I haven't touched C++ for 20 years. :)
I might have a better chance at modifying the provider to add the wait logic, if you think that's an OK stopgap in the meantime. If not, no big deal. No expectations on timeline; I can make the manual assignment work for now. Thanks for the conversation.
> I'm actually not sure how the "identity calculation method" applies here. Are you suggesting a change to the network controllers or the provider?
The IP address assigned by the controller is derived from the node's identity. If you join multiple networks at once, you may notice the last octet is the same on each network. That's why.
> The ip_assignments_wait proposal was with the consideration that the aws_instance would be getting created at the same time and that joining the zerotier network is part of the instance creation initialization. Here is a fuller example of what I had in mind:
I get that. It's just without support at the controller level for auto assigning IPs for offline nodes, errors & timeouts will happen for people. Someone will definitely try to use it without bringing up nodes simultaneously and open tickets when there's an error. Better to head off those issues before they arise if possible.
FWIW, I had similar issues with the tailscale provider (I'm evaluating Tailscale vs ZT) and this is what they came up with: https://github.com/davidsbond/terraform-provider-tailscale/pull/72
Actually, it would be possible to just have the terraform provider assign a random unused IP within the auto-assign pool of the network if there's not already an IP address provided. The address doesn't need to be derived from anything in particular. It just can't already be used on the network. Theoretically, terraform knows the IPs it's already assigned and has access to the Central API to find out what IPs have been assigned as well.
This would get around having to wait for the controller to assign the IP automatically, and solves the issue with devices that will remain offline until $future_time.
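As a rough illustration of that idea, here is a minimal sketch in Python (not provider code) of picking a random unused address from a pool. The function name `pick_unused_ip` and its parameters are hypothetical. It assumes the pool is given as an inclusive start/end pair and that the caller has already gathered the set of addresses in use from Terraform state and the Central API.

```python
import ipaddress
import random


def pick_unused_ip(range_start, range_end, used, rng=random):
    """Return a random address in [range_start, range_end] not present in used.

    All names here are illustrative; `used` is whatever list of addresses the
    caller has collected as already assigned on the network.
    """
    start = ipaddress.ip_address(range_start)
    end = ipaddress.ip_address(range_end)
    if start.version != end.version or int(start) > int(end):
        raise ValueError("invalid pool range")

    # Only addresses of the same IP version can collide with this pool.
    taken = {
        int(addr)
        for addr in map(ipaddress.ip_address, used)
        if addr.version == start.version
    }
    # Enumerating the pool is fine for typical IPv4 ranges; a very large
    # (e.g. IPv6) pool would need random probing instead.
    free = [n for n in range(int(start), int(end) + 1) if n not in taken]
    if not free:
        raise RuntimeError("no unused addresses left in the pool")
    return str(type(start)(rng.choice(free)))
```

Because another client could claim the same address between the lookup and the write, a real implementation would presumably still need to detect a collision and retry.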
@erikh @someara This may be a good way forward to handle this one.