ShubhamTatvamasi/magma-galaxy

failure with multiple NICs in server

Closed this issue · 10 comments

If there is more than one NIC in the system, the playbook always picks the first IP listed by hostname -I. That creates a problem if the public IP is not assigned to the first NIC. The offending line in deploy-orc8r.sh is ORC8R_IP=$(hostname -I | awk '{print $1}'). The problem showed up as the k8s load balancer being assigned the wrong IP.

Thanks for pointing out this issue @jblakley. I can think of a solution:

  1. The user has one NIC.
    • Select the default IP and move forward.
  2. The user has multiple NICs.
    • List all the IPs and ask the user to select one.

Do you think this will solve the issue?
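
A minimal sketch of that proposal, for illustration only. The helper name choose_ip is hypothetical and is not part of deploy-orc8r.sh; it assumes the candidate IPs are passed as arguments and that the user answers on stdin.

```shell
# choose_ip: hypothetical helper, not taken from the repository.
# Arguments: candidate IPs. Prints the chosen IP on stdout.
# The menu and prompts go to stderr so that $(choose_ip ...) captures only the IP.
choose_ip() {
  if [ "$#" -eq 0 ]; then
    echo "no IP addresses found" >&2
    return 1
  fi
  if [ "$#" -eq 1 ]; then
    # Single NIC: nothing to ask.
    printf '%s\n' "$1"
    return 0
  fi
  # Multiple NICs: print a numbered list.
  i=1
  for ip in "$@"; do
    printf '%d) %s\n' "$i" "$ip" >&2
    i=$((i + 1))
  done
  while :; do
    printf 'Select the IP for orc8r [1-%d]: ' "$#" >&2
    read -r choice || return 1        # give up on end of input
    case "$choice" in
      ''|0*|*[!0-9]*) echo "invalid choice" >&2; continue ;;
    esac
    if [ "$choice" -le "$#" ]; then
      shift $((choice - 1))           # move the chosen IP into $1
      printf '%s\n' "$1"
      return 0
    fi
    echo "invalid choice" >&2
  done
}

# Usage (word splitting of the hostname -I output is intentional):
#   ORC8R_IP=$(choose_ip $(hostname -I))
```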

That is certainly an option. Jan Harkes suggested a couple of ways to detect a NIC that has a public IP:

ip -o route get 8.8.8.8 | awk '{print $7}' or ip -j route get 8.8.8.8 | yq e '.[0]["prefsrc"]' -

These might create issues if no public IP is assigned or if more than one NIC has a public IP.

Hi @jblakley, to fix this issue, if we use ip -o route get 8.8.8.8 | awk '{print $7}' instead of hostname -I | awk '{print $1}', will that solve it for us?
We do not depend on a public IP; it will also work with a local IP. The only thing we need is internet access. Multiple NICs are also not an issue, as we can use the default route's interface to select the IP. What do you suggest?
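
One caveat with awk '{print $7}': the field position of the source address in ip -o route get output depends on whether the route goes through a gateway (the "via <gateway>" part is absent for on-link destinations). A hedged sketch that looks for the src keyword instead of a fixed column; route_src is a hypothetical helper name, not something from the repository.

```shell
# route_src: hypothetical helper, not taken from the repository.
# Reads `ip -o route get <target>` output on stdin and prints the address
# that follows the "src" keyword, wherever it appears on the line.
route_src() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "src") { print $(i + 1); exit } }'
}

# Usage (requires iproute2 and a route to the target):
#   ORC8R_IP=$(ip -o route get 8.8.8.8 | route_src)
```

If no src field is present (for example, when there is no route), the function prints nothing, so the caller can check for an empty result before continuing.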

@ShubhamTatvamasi
It works in our case, but only because a single interface has internet access, and that's the one we want.

In general, with multiple interfaces, I think you're trying to identify the one where the orc8r services (e.g., nms) will ultimately be listening. In our case, that's the one with the public IP. However, where there are multiple NICs, all with private IPs, and those services will be listening on one of the private IPs, I'm not sure how you'd do it without requiring the person deploying to pick one.

All I can think of is that the interface should be reachable 1) by all the AGWs, to access the controller, bootstrapper, etc., and 2) by all clients who want to access nms. I can't think of a good way to automate that.

FYI, in our environment, hostname -I returns 4 IP addresses. Only the second is public and reachable by all clients and AGWs. The default route is to the IP of a gateway, not the orc8r.

If DNS is already configured, you could use nslookup to resolve the FQDN to its IP. I got this to work with:
nslookup <FQDN>|grep Address|grep -v 127.0.0.53|awk '{print $2}'

But that's pretty kludgy.

We can always deploy it manually by specifying the IP in our hosts.yml file.

That's probably the most straightforward approach. Your original proposal would also work.

Are you talking about #7 (comment), where the user can select the IP from the list?

Yes.

Hi @jblakley, I have updated the code to use the IP of the default gateway interface, and the user can also edit it before the deployment starts. Please see if that resolves this issue. Thanks
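
For readers following along, the "detected default that the user can edit" pattern might look roughly like the sketch below. This is an illustration under assumptions, not the code that was committed: pick_ip is a hypothetical name, the detected address is passed in as $1, and the user's answer is read from stdin.

```shell
# pick_ip: hypothetical helper, not the repository's actual implementation.
# $1 is the auto-detected default; an empty answer (or no input) keeps it.
pick_ip() {
  default_ip=$1
  printf 'orc8r IP [%s]: ' "$default_ip" >&2   # prompt on stderr
  read -r answer || answer=""
  printf '%s\n' "${answer:-$default_ip}"
}

# Usage, with the default taken from the default-route lookup discussed above:
#   ORC8R_IP=$(pick_ip "$(ip -o route get 8.8.8.8 | awk '{print $7}')")
```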