canonical/microcloud

microcloud init: failed to bind to any multicast udp port


Hello,
I am trying to set up MicroCloud with 3 virtual machines. Every machine has 2 network interfaces: one is assigned an IP address, the other has none:
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:1a:4a:16:01:96 brd ff:ff:ff:ff:ff:ff
altname enp0s3
3: ens9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:1a:4a:16:01:98 brd ff:ff:ff:ff:ff:ff
altname enp0s9
inet 10.216.50.200/24 brd 10.216.50.255 scope global noprefixroute ens9
valid_lft forever preferred_lft forever

What's going wrong?
Kind regards
Margit

Hi @myr4htw, can you please provide some more context and reproducer steps? At which point of the microcloud init process do you see this error?

Hi,
I am now trying microcloud init with the following preseed file:

lookup_subnet: 10.216.50.0/24
systems:
- name: stl-s-microcl1
  ovn_uplink_interface: ens3
  disks:
    local:
    - path: /dev/sdb
      wipe: false
- name: stl-s-microcl2
  ovn_uplink_interface: ens3
  disks:
    local:
    - path: /dev/sdb
      wipe: false
- name: stl-s-microcl3
  ovn_uplink_interface: ens3
  disks:
    local:
    - path: /dev/sdb
      wipe: false
ovn:
  ipv4_gateway: 10.216.50.1/24
  ipv4_range: 10.216.50.200-10.216.50.204
network:
  ethernets:
    ens3:
      dhcp4: false
      dhcp6: false
    ens9:
      addresses:
      - 10.216.50.0/24
      dhcp4: false
      dhcp6: false

I receive the error
"Scanning for eligible servers ...
Error: Failed lookup: write udp4 0.0.0.0:50381->224.0.0.251:5353: sendto: network is unreachable"

But multicast seems to be ok!?

Please see the output of the commands netstat -gn and nc:

root@stl-s-microcl1:~# netstat -gn
IPv6/IPv4 Group Memberships
Interface       RefCnt Group


lo 1 224.0.0.251
lo 1 224.0.0.1
ens3 1 224.0.0.1
ens9 2 224.0.0.251
ens9 1 224.0.0.1
lo 1 ff02::fb
lo 1 ff02::1
lo 1 ff01::1
ens3 1 ff02::1
ens3 1 ff01::1
ens9 2 ff02::fb
ens9 1 ff02::1:fff8:8ae0
ens9 1 ff02::1
ens9 1 ff01::1
The result is the same on the second server (microcl2) and the third server (microcl3).

Test connection to the second server (microcl2):
root@stl-s-microcl1:~# nc -u -v 10.216.50.202 5353
Connection to 10.216.50.202 5353 port [udp/mdns] succeeded!

Test connection to the third server (microcl3):
root@stl-s-microcl1:~# nc -u -v 10.216.50.204 5353
Connection to 10.216.50.204 5353 port [udp/mdns] succeeded!

Hope that helps.

Kind regards
Margit

Make sure multicast is enabled on your network. MicroCloud uses mDNS for discovery and in your case tries to send to the multicast address 224.0.0.251:5353 to discover its peers. See the comment about cloud providers here: https://canonical-microcloud.readthedocs-hosted.com/en/latest/explanation/initialisation/#automatic-server-detection.

Issue #134 also looks to be the same problem.
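
For anyone hitting the same error, the failing send can probably be reproduced outside MicroCloud. The following is a minimal Python sketch, not MicroCloud's actual discovery code: it builds a bare DNS PTR query and tries to send it to the mDNS group 224.0.0.251:5353 taken from the error message above. The function names and the queried name `_services._dns-sd._udp.local` are my own choices for illustration. If the kernel has no route for the multicast destination, the send should fail with the same "network is unreachable" text.

```python
import socket
import struct

MDNS_GROUP = "224.0.0.251"  # multicast address from the error message
MDNS_PORT = 5353


def build_query(name):
    """Build a minimal DNS query (one PTR question, class IN) for `name`."""
    # Header: ID=0, flags=0, QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    # Each label is encoded as a length byte followed by the label text.
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii") for part in name.split(".")
    )
    # Root label terminator, then QTYPE=12 (PTR) and QCLASS=1 (IN).
    return header + labels + b"\x00" + struct.pack("!2H", 12, 1)


def probe_multicast_send():
    """Return None if the kernel accepted the datagram, else the error text."""
    query = build_query("_services._dns-sd._udp.local")
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(query, (MDNS_GROUP, MDNS_PORT))
    except OSError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    error = probe_multicast_send()
    print("send accepted by kernel" if error is None else f"send failed: {error}")
```

A result of `None` only means the kernel accepted the datagram for sending. It does not prove that the other servers received it, so multicast forwarding on the network still has to be checked separately.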

Hi,
the problem is solved. It was a network issue: we had to add a router for our internal network.
The setup is now complete, but I can't ping the virtual router.
The network configuration of our servers also looks strange, with several interfaces in state DOWN:

micro1@stl-s-microcl1:~$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:1a:4a:16:01:96 brd ff:ff:ff:ff:ff:ff
altname enp0s3
inet 10.216.50.200/24 brd 10.216.50.255 scope global noprefixroute ens3
valid_lft forever preferred_lft forever
inet6 fe80::e42:9822:698:6b55/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: ens9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP group default qlen 1000
link/ether 00:1a:4a:16:01:98 brd ff:ff:ff:ff:ff:ff
altname enp0s9
4: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 4a:e1:c9:bb:5a:79 brd ff:ff:ff:ff:ff:ff
5: lxdovn1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 00:1a:4a:16:01:98 brd ff:ff:ff:ff:ff:ff
6: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 6a:21:1e:63:93:6c brd ff:ff:ff:ff:ff:ff
7: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
link/ether ba:e9:58:06:65:cf brd ff:ff:ff:ff:ff:ff
inet6 fe80::7857:e1ff:fe4a:618b/64 scope link
valid_lft forever preferred_lft forever

Do you have any ideas?

Kind regards
Margit

I guess you are trying to ping the OVN virtual router of one of the LXD networks, e.g. default? From the MicroCloud cluster nodes there is no route to this network. You can confirm this by running ip r on one of the cluster nodes and checking that the LXD network's range (ipv4.address in the output of lxc network show default) isn't listed.

Can you reach the virtual router/gateway from an instance connected to the respective LXD network?

  • lxc launch ubuntu:jammy c1 -n {network}
  • lxc exec c1 -- ping {gateway}

Hi,
unfortunately I don't understand your statement :-(
Please see my config. Does this look correct?

Kind regards
Margit
micro1@stl-s-microcl1:/etc/default$ ip r
default via 10.216.50.152 dev ens3 proto static metric 20100
10.216.50.0/24 dev ens3 proto kernel scope link src 10.216.50.200 metric 100
169.254.0.0/16 dev ens3 scope link metric 1000
micro1@stl-s-microcl1:/etc/default$ lxc network show default
config:
  bridge.mtu: "1442"
  ipv4.address: 10.183.27.1/24
  ipv4.nat: "true"
  ipv6.address: fd42:a418:68ab:5d4d::1/64
  ipv6.nat: "true"
  network: UPLINK
  volatile.network.ipv4.address: 134.96.216.206
description: ""
name: default
type: ovn
used_by:
- /1.0/profiles/default
managed: true
status: Created
locations:
- stl-s-microcl1
- stl-s-microcl2
- stl-s-microcl3

Could you please describe how I can test the network and how I can tell whether it is working?
Is there any documentation describing how these networks relate to each other?

In the output of ip r you can see there is no route to reach the virtual network 10.183.27.0/24 (default) in which 10.183.27.1/24 is the gateway as seen from LXD instances. Egress traffic from this network has the source IP 134.96.216.206 which comes from the range of addresses that you have specified during MicroCloud installation.

Now if you want to ping the public facing side of the virtual network you could try to ping 134.96.216.206. Make sure the network this address resides in is properly routed in your infrastructure.
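
To make the routing point concrete, here is a small Python sketch that takes the `ip r` output posted above and lists which routes match a given address, most specific first. It is a simplified illustration only: the helper name `matching_routes` is hypothetical, and the parsing assumes every line starts with either `default` or a prefix, which is true for this output but not for every routing table.

```python
import ipaddress

# Routing table copied from the `ip r` output earlier in this thread.
IP_R_OUTPUT = """\
default via 10.216.50.152 dev ens3 proto static metric 20100
10.216.50.0/24 dev ens3 proto kernel scope link src 10.216.50.200 metric 100
169.254.0.0/16 dev ens3 scope link metric 1000
"""


def matching_routes(ip_r_output, address):
    """Return the route prefixes that contain `address`, longest prefix first."""
    addr = ipaddress.ip_address(address)
    matches = []
    for line in ip_r_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        # `ip r` prints the catch-all route as the word "default".
        prefix = "0.0.0.0/0" if fields[0] == "default" else fields[0]
        network = ipaddress.ip_network(prefix, strict=False)
        if addr in network:
            matches.append(network)
    return sorted(matches, key=lambda net: net.prefixlen, reverse=True)


if __name__ == "__main__":
    for target in ("10.216.50.202", "10.183.27.1", "134.96.216.206"):
        routes = ", ".join(str(net) for net in matching_routes(IP_R_OUTPUT, target))
        print(f"{target}: {routes}")
```

With this table, 10.183.27.1 and 134.96.216.206 match only the default route, so packets for them are handed to 10.216.50.152. Whether they get any further depends on what that router knows about those networks.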

From your messages I still can't see the exact error you are facing. Please elaborate on this.

Hi,

134.96.216.206 is not reachable by ping.
I think the main problem is that the MicroCloud is not reachable - neither internally nor from the internet.
I will try to tell you the steps we have taken to build the MicroCloud. Perhaps you then see a mistake.

We've got 3 ubuntu servers. These are virtual machines in our RedHat Virtualization.
Every server has 2 network interfaces. One interface (ens3) has an IP address (10.216.50.200, 10.216.50.202 and 10.216.50.204 respectively). The second interface (ens9) has no IP address; it is connected to the network 134.96.216.0, which is our network with an internet connection.

micro1@stl-s-microcl1:/etc/default$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:1a:4a:16:01:96 brd ff:ff:ff:ff:ff:ff
altname enp0s3
inet 10.216.50.200/24 brd 10.216.50.255 scope global noprefixroute ens3
valid_lft forever preferred_lft forever
inet6 fe80::e42:9822:698:6b55/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: ens9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP group default qlen 1000
link/ether 00:1a:4a:16:01:98 brd ff:ff:ff:ff:ff:ff
altname enp0s9
4: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 4a:e1:c9:bb:5a:79 brd ff:ff:ff:ff:ff:ff
5: lxdovn1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 00:1a:4a:16:01:98 brd ff:ff:ff:ff:ff:ff
6: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 6a:21:1e:63:93:6c brd ff:ff:ff:ff:ff:ff
7: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
link/ether ba:e9:58:06:65:cf brd ff:ff:ff:ff:ff:ff
inet6 fe80::7857:e1ff:fe4a:618b/64 scope link
valid_lft forever preferred_lft forever

In the MicroCloud init process we selected the addresses 10.216.50.200 - 204 for internal traffic
and the interfaces ens9 for external connectivity.
IPv4 gateway: 134.96.216.200/24
first IP address in the range... 134.96.216.206
last IP address in the range... 134.96.216.208

Then the cluster was built successfully.
Perhaps this information tells you what is going wrong or what I still have to do to get a working MicroCloud.

Kind regards
Margit

Okay so from what I can see this configuration looks ok.
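
For reference, the kind of sanity checks I would apply to the uplink values can be sketched in Python as below. These are plausible checks of my own, not MicroCloud's or LXD's actual validation, and `check_uplink` is a hypothetical helper: it verifies that the range lies inside the gateway's subnet, that the gateway is not inside the range, and that none of the servers' own addresses fall inside it.

```python
import ipaddress


def check_uplink(gateway_cidr, first, last, host_addresses=()):
    """Return a list of problems found; an empty list means none were found."""
    gateway = ipaddress.ip_interface(gateway_cidr)
    start = ipaddress.ip_address(first)
    end = ipaddress.ip_address(last)
    problems = []
    if start > end:
        problems.append("range start is after range end")
    # Both ends of the range should sit in the same subnet as the gateway.
    for addr in (start, end):
        if addr not in gateway.network:
            problems.append(f"{addr} is outside {gateway.network}")
    # The gateway itself should not be handed out from the range.
    if start <= gateway.ip <= end:
        problems.append(f"gateway {gateway.ip} is inside the range")
    # Addresses already used by the servers should not be in the range either.
    for host in host_addresses:
        if start <= ipaddress.ip_address(host) <= end:
            problems.append(f"host address {host} is inside the range")
    return problems


if __name__ == "__main__":
    hosts = ["10.216.50.200", "10.216.50.202", "10.216.50.204"]
    # Values given during the interactive init.
    print(check_uplink("134.96.216.200/24", "134.96.216.206", "134.96.216.208", hosts))
    # Values from the earlier preseed file.
    print(check_uplink("10.216.50.1/24", "10.216.50.200", "10.216.50.204", hosts))
```

The values from the interactive init pass these checks. The values from the earlier preseed file would be flagged, because the range 10.216.50.200-10.216.50.204 contains the servers' own addresses.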

Maybe let's first check it the other way around. When you deploy a LXD instance within the MicroCloud, can it reach the internet and/or gateway?

lxc launch ubuntu:jammy c1
lxc exec c1 -- bash -c "apt update && apt install -y traceroute && traceroute 1.1.1.1"

This should print something like this:

traceroute to 1.1.1.1 (1.1.1.1), 30 hops max, 60 byte packets
 1  _gateway ({gateway of your LXD default network})  0.787 ms  0.822 ms  0.839 ms
 2  _gateway.lxd (134.96.216.200)  2.163 ms  2.164 ms  2.161 ms
...
10  one.one.one.one (1.1.1.1)  28.644 ms  23.253 ms  23.247 ms

Hi,
creation of an instance doesn't work, please see:

micro1@stl-s-microcl1:~$ lxc launch ubuntu:jammy c1
Creating c1
Error: Failed instance creation: Failed getting image: Failed parsing stream: Get "https://cloud-images.ubuntu.com/releases/streams/v1/index.json": lookup cloud-images.ubuntu.com on 127.0.0.53:53: server misbehaving

This looks to be an issue related to domain name resolution on your machine and not LXD.
What happens when you nslookup cloud-images.ubuntu.com?

root@stl-s-microcl1:~# nslookup cloud-images.ubuntu.com
Server: 127.0.0.53
Address: 127.0.0.53#53

** server can't find cloud-images.ubuntu.com: SERVFAIL
So there is no working nameserver on localhost.
And when I add our nameserver (e.g. 134.96.216.214) to /etc/resolv.conf, it cannot be contacted because there is no internet connection.
I am going around in circles.

Hello,
perhaps I can ask my question like that:
Which network and routing requirements must be met for MicroCloud to work?
We'd like to use network 10.216.50.0/24 (an internal network without internet connection) for internal communication and addresses from network 134.96.216.0/24 (this network has internet connection) for the uplink network.

Kind regards
Margit

You can find the networking requirements listed on this page https://canonical-microcloud.readthedocs-hosted.com/en/latest/explanation/microcloud/#explanation-networking.

Have you managed to get DNS working?

No, the problem is the lack of internet connection...

When launching an instance, the image is downloaded through 10.216.50.0/24 and not through the OVN network. Can you nslookup cloud-images.ubuntu.com from the host on which LXD/MicroCloud is running?

No, I can't.
I have no internet connection from network 10.216.50.1 - I think that is the main problem.
I thought the microcloud init process would establish the internet connection via the OVN router?

Yes, that seems to be the problem. Please check the link I posted earlier. The first network interface is used for creating the MicroCloud/LXD cluster and for downloading images.

The second network interface is dedicated to connecting OVN to the uplink network. It carries the egress traffic originating from LXD instances deployed in MicroCloud.
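
A quick way to check what the host itself can reach, independent of LXD and OVN, is sketched below in Python. It is only a rough check under my own assumptions: `can_resolve` and `can_connect` are hypothetical helper names, and port 443 is assumed because the image index in the error above is fetched over HTTPS.

```python
import socket

IMAGE_SERVER = "cloud-images.ubuntu.com"  # host name from the lxc launch error


def can_resolve(hostname):
    """Return True if the host's resolver can look up `hostname`."""
    try:
        socket.getaddrinfo(hostname, 443)
    except OSError:
        # socket.gaierror (lookup failure) is a subclass of OSError.
        return False
    return True


def can_connect(hostname, port=443, timeout=5.0):
    """Return True if a TCP connection to `hostname:port` can be opened."""
    try:
        with socket.create_connection((hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers lookup failures, timeouts, refused and unreachable errors.
        return False


if __name__ == "__main__":
    print(f"DNS lookup for {IMAGE_SERVER}: {can_resolve(IMAGE_SERVER)}")
    print(f"TCP connect to {IMAGE_SERVER}:443: {can_connect(IMAGE_SERVER)}")
```

Both checks need to succeed on the host before lxc launch can download an image, since that traffic leaves through the host's own routing table rather than through the OVN uplink.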

The network the first interface belongs to has no internet connection. My understanding was that it is only used for the communication between the MicroCloud servers. That is why we chose this internal network.
What does it mean now? What do we have to do to solve this problem?