antrea-io/antrea

Decouple kubelet node-ip from the Antrea default Gateway uplink

jayunit100 opened this issue · 20 comments

Describe the bug

TL;DR: the Windows node logic for selecting the default gateway seems a little too tightly specified: it assumes the default gateway is on the same network as the kubelet node IP. Can we put some kind of knob in here?

Concretely: I want our Windows development environments to work on a private network, where I'm not guaranteed a default route attached to the node-ip (Vagrant doesn't make it easy to do this).

However, antrea-agent mandates that a default route exists on the kubelet's node-ip.

  • In Antrea, the prepareHostNetwork function is called before setting up an OVS bridge. The idea here is that an uplink needs to be set up before we can make an OVS bridge.
  • To get that uplink, Antrea needs to introspect the system for a default gateway device.
  • This is done by calling GetIPNetDeviceFromIP (in pkg/agent/util/net.go).
  • Antrea runs a fancy PowerShell CIM query, looking for a default gateway on the node IP.
    • This fails in cases where you DON'T have a default gateway on the kubelet's private IP (e.g. in VirtualBox, where you need a private_ip for kube automation scripts, but Windows uses another vbox NAT'd (or bridged?) network by default for other traffic).
  • antrea-agent fails at startup and restarts indefinitely, meaning that you can't create pods because any call to the CNI will fall flat.
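
The lookup described in the bullets above can be sketched roughly as follows. This is a minimal in-memory model, not Antrea's code: the `Adapter` and `Route` types and both helper functions are hypothetical, and the interface indices are assumptions inferred from the ipconfig output later in this issue.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// Adapter and Route are hypothetical, simplified stand-ins for what the
// agent reads from the Windows host. They are not Antrea types.
type Adapter struct {
	Index int
	Name  string
	IP    string
}

type Route struct {
	IfIndex int
	Dest    string // destination prefix, e.g. "0.0.0.0/0"
	NextHop string
}

var ErrNoDefaultRoute = errors.New("no 0.0.0.0/0 route on interface")

// adapterByIP models step one: pick the adapter that owns the node IP.
func adapterByIP(adapters []Adapter, nodeIP string) (Adapter, error) {
	want := net.ParseIP(nodeIP)
	if want == nil {
		return Adapter{}, fmt.Errorf("invalid node IP %q", nodeIP)
	}
	for _, a := range adapters {
		if want.Equal(net.ParseIP(a.IP)) {
			return a, nil
		}
	}
	return Adapter{}, fmt.Errorf("no adapter has IP %s", nodeIP)
}

// defaultGatewayByIndex models step two: require a default route on that
// same adapter, which is the assumption this issue asks to relax.
func defaultGatewayByIndex(routes []Route, ifIndex int) (string, error) {
	for _, r := range routes {
		if r.IfIndex == ifIndex && r.Dest == "0.0.0.0/0" {
			return r.NextHop, nil
		}
	}
	return "", fmt.Errorf("interface index %d: %w", ifIndex, ErrNoDefaultRoute)
}

func main() {
	adapters := []Adapter{
		{Index: 7, Name: "Ethernet", IP: "10.0.2.15"},
		{Index: 6, Name: "Ethernet 2", IP: "10.20.30.11"},
	}
	routes := []Route{
		{IfIndex: 7, Dest: "0.0.0.0/0", NextHop: "10.0.2.2"},
		{IfIndex: 6, Dest: "10.20.30.0/24", NextHop: "0.0.0.0"},
	}
	a, err := adapterByIP(adapters, "10.20.30.11")
	if err != nil {
		fmt.Println(err)
		return
	}
	// The node IP lives on "Ethernet 2", which has no default route,
	// so the second step fails even though the host has a gateway.
	_, err = defaultGatewayByIndex(routes, a.Index)
	fmt.Println(err)
}
```

The host as a whole does have a default gateway (on the NAT'd adapter), but because the second step only looks at the adapter chosen in the first step, the lookup still fails.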

ultimately the failure happens at this choke point:

   defaultGW, err := util.GetDefaultGatewayByInterfaceIndex(adapter.Index)

Can we make it so that Antrea is able to just use ANY default gateway we provide as input, rather than relying on the guess provided by the _, adapter, err := util.GetIPNetDeviceFromIP(i.nodeConfig.NodeIPAddr.IP)?

Looking at https://docs.projectcalico.org/reference/felix/configuration, there's a DeviceRouteSourceAddress which you can select. But in the code for Antrea, we seem to guess the route source address based on the node-ip of the kubelet.

For details, this is what you ultimately get in antrea-agent if the node-ip for the Windows kubelet has no default gateway interface...

F0702 22:02:30.875042   84440 main.go:58] Error running agent: error initializing agent: stderr non empty for command '$(Get-NetRoute -InterfaceIndex 26 -DestinationPrefix 0.0.0.0/0 ).NextHop': Get-NetRoute
...
No matching MSFT_NetRoute objects found by CIM query for instances of the
ROOT/StandardCimv2/MSFT_NetRoute class on the  CIM server: SELECT * FROM MSFT_NetRoute  WHERE ((DestinationPrefix LIKE
'0.0.0.0/0')) AND ((InterfaceIndex = 26)). Verify query parameters and retry.

To Reproduce
install virtualbox and then try the windows development recipes... git clone https://github.com/kubernetes-sigs/sig-windows-dev-tools/ ; make all

Expected
Antrea agent would give a clear indication that we're missing an interface as opposed to just dumping the query info.

Actual behavior
Antrea agent dumps the Get-NetRoute query and just dies.
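
As a rough illustration of the "Expected" behavior, the raw query failure could be wrapped in an error that names the actual problem. This is only a sketch: `ErrNoDefaultGateway`, `lookupGateway`, `gatewayForNodeIP`, and `isNoGateway` are hypothetical names, the stub merely imitates the failure mode from the log above, and the IP and interface index are illustrative values.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoDefaultGateway is a hypothetical sentinel error; Antrea does not
// necessarily define anything like it.
var ErrNoDefaultGateway = errors.New("no default gateway on the interface holding the node IP")

// lookupGateway stands in for the PowerShell-backed query. In this sketch
// it always fails, imitating the failure reported in this issue.
func lookupGateway(ifIndex int) (string, error) {
	return "", fmt.Errorf("Get-NetRoute -InterfaceIndex %d -DestinationPrefix 0.0.0.0/0: no matching route", ifIndex)
}

// gatewayForNodeIP adds the context that the raw query error lacks.
func gatewayForNodeIP(nodeIP string, ifIndex int) (string, error) {
	gw, err := lookupGateway(ifIndex)
	if err != nil {
		return "", fmt.Errorf("node IP %s is on interface %d: %w (query error: %v)",
			nodeIP, ifIndex, ErrNoDefaultGateway, err)
	}
	return gw, nil
}

// isNoGateway lets callers branch on the cause instead of parsing text.
func isNoGateway(err error) bool {
	return errors.Is(err, ErrNoDefaultGateway)
}

func main() {
	_, err := gatewayForNodeIP("10.20.30.11", 6)
	fmt.Println(err)
	fmt.Println("missing gateway:", isNoGateway(err))
}
```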

Versions:
Please provide the following information:

  • Antrea version (Docker image tag). 0.13.2

After thinking more about this, I suspect it may be related to the default Ethernet device being wrong?

this is the vbox interface....

Ethernet adapter Ethernet:

   Connection-specific DNS Suffix  . :
   Link-local IPv6 Address . . . . . : fe80::1d82:8754:1a6b:d40b%7
   IPv4 Address. . . . . . . . . . . : 10.0.2.15
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Default Gateway . . . . . . . . . : 10.0.2.2

but the network for my kubelets is :

Ethernet adapter Ethernet 2:

   Connection-specific DNS Suffix  . :
   Link-local IPv6 Address . . . . . : fe80::1820:ee2a:d2fc:8afd%6
   IPv4 Address. . . . . . . . . . . : 10.20.30.11
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Default Gateway . . . . . . . . . :

Using the 0.0.0.0/0 route to pick the interface... is that required? Or should that be configurable?

I've updated my virtual machines to make sure they use the right node-ip, but I still get this error, because I think ifindex 6 has no route targeted at 0.0.0.0/0...

@antoninbas I see that you wrote the docs with the spec that the interface is picked by looking up the 0.0.0.0/0 interface: https://github.com/antrea-io/antrea/blob/main/docs/design/windows-design.md

Is that 0.0.0.0/0 interface destination prefix explicitly required for the uplink? Or is that just a heuristic?

➜  sig-windows-dev-tools git:(antrea-node-ip-hardcoding) ✗ vagrant winrm winw1 --shell=powershell --command="Get-NetRoute -InterfaceIndex 6"

ifIndex DestinationPrefix                              NextHop                                  RouteMetric ifMetric PolicyStore
------- -----------------                              -------                                  ----------- -------- -----------
6       255.255.255.255/32                             0.0.0.0                                          256 25       ActiveStore
6       224.0.0.0/4                                    0.0.0.0                                          256 25       ActiveStore
6       10.20.30.255/32                                0.0.0.0                                          256 25       ActiveStore
6       10.20.30.11/32                                 0.0.0.0                                          256 25       ActiveStore
6       10.20.30.0/24                                  0.0.0.0                                          256 25       ActiveStore
6       ff00::/8                                       ::                                               256 25       ActiveStore
6       fe80::69a1:886:b45:fd6f/128                    ::                                               256 25       ActiveStore
6       fe80::/64                                      ::                                               256 25       ActiveStore

(Note: I've modified this question since my original post, as I've played more with it to figure some things out.)

@lzhecheng any thoughts on this Windows IP / gateway configuration issue?

btw, from a working cluster,

PS C:\Users\capv> Get-NetRoute                                                                                                                                                                                                                                                                                              
                                                                                                                                                                                                                                                                                                                            
ifIndex DestinationPrefix                              NextHop                                  RouteMetric ifMetric PolicyStore                                                                                                                                                                                            
------- -----------------                              -------                                  ----------- -------- -----------                                                                                                                                                                                            
25      255.255.255.255/32                             0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
13      255.255.255.255/32                             0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
4       255.255.255.255/32                             0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
8       255.255.255.255/32                             0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
1       255.255.255.255/32                             0.0.0.0                                          256 75       ActiveStore                                                                                                                                                                                            
25      224.0.0.0/4                                    0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
13      224.0.0.0/4                                    0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
4       224.0.0.0/4                                    0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
8       224.0.0.0/4                                    0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
1       224.0.0.0/4                                    0.0.0.0                                          256 75       ActiveStore                                                                                                                                                                                            
8       169.254.255.255/32                             0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
8       169.254.253.54/32                              0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
8       169.254.0.0/16                                 0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
1       127.255.255.255/32                             0.0.0.0                                          256 75       ActiveStore                                                                                                                                                                                            
1       127.0.0.1/32                                   0.0.0.0                                          256 75       ActiveStore                                                                                                                                                                                            
1       127.0.0.0/8                                    0.0.0.0                                          256 75       ActiveStore                                                                                                                                                                                            
4       100.255.255.255/32                             0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
25      100.96.4.0/24                                  100.96.4.1                                       256 15       ActiveStore                                                                                                                                                                                            
25      100.96.3.255/32                                0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
25      100.96.3.1/32                                  0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
25      100.96.3.0/24                                  0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
25      100.96.2.0/24                                  100.96.2.1                                       256 15       ActiveStore                                                                                                                                                                                            
25      100.96.1.0/24                                  100.96.1.1                                       256 15       ActiveStore                                                                                                                                                                                            
25      100.96.0.0/24                                  100.96.0.1                                       256 15       ActiveStore                                                                                                                                                                                            
4       100.70.143.158/32                              0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
4       100.68.198.134/32                              0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
4       100.68.62.30/32                                0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
4       100.66.250.99/32                               0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
4       100.64.0.10/32                                 0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
4       100.64.0.1/32                                  0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
4       100.0.0.0/8                                    0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
13      10.161.191.255/32                              0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
13      10.161.170.22/32                               0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
13      10.161.160.0/19                                0.0.0.0                                          256 15       ActiveStore                                                                                                                                                                                            
13      0.0.0.0/0                                      10.161.191.253                                     0 15       ActiveStore                                                                                                                                                                                            
25      ff00::/8                                       ::                                               256 15       ActiveStore                                                                                                                                                                                            
13      ff00::/8                                       ::                                               256 15       ActiveStore                                                                                                                                                                                            
4       ff00::/8                                       ::                                               256 15       ActiveStore                                                                                                                                                                                            
8       ff00::/8                                       ::                                               256 15       ActiveStore                                                                                                                                                                                            
1       ff00::/8                                       ::                                               256 75       ActiveStore                                                                                                                                                                                            
8       fe80::7542:3120:8fbd:fd36/128                  ::                                               256 15       ActiveStore                                                                                                                                                                                            
25      fe80::61de:7d5a:1724:fee3/128                  ::                                               256 15       ActiveStore                                                                                                                                                                                            
4       fe80::3d68:4a12:a302:4084/128                  ::                                               256 15       ActiveStore                                                                                                                                                                                            
13      fe80::2c28:2de8:3280:b7a5/128                  ::                                               256 15       ActiveStore                                                                                                                                                                                            
25      fe80::/64                                      ::                                               256 15       ActiveStore                                                                                                                                                                                            
13      fe80::/64                                      ::                                               256 15       ActiveStore                                                                                                                                                                                            
4       fe80::/64                                      ::                                               256 15       ActiveStore                                                                                                                                                                                            
8       fe80::/64                                      ::                                               256 15       ActiveStore                                                                                                                                                                                            
25      fd01:0:101:2613:61de:7d5a:1724:fee3/128        ::                                               256 15       ActiveStore                                                                                                                                                                                            
13      fd01:0:101:2613:2c28:2de8:3280:b7a5/128        ::                                               256 15       ActiveStore                                                                                                                                                                                            
13      fd01:0:101:2613:0:a:0:a2b/128                  ::                                               256 15       ActiveStore                                                                                                                                                                                            
25      fd01:0:101:2613::/64                           ::                                               256 15       ActiveStore                                                                                                                                                                                            
13      fd01:0:101:2613::/64                           ::                                               256 15       ActiveStore                                                                                                                                                                                            
1       ::1/128                                        ::                                               256 75       ActiveStore                                                                                                                                                                                            
13      ::/0                                           fe80::2613:191:253                               256 15       ActiveStore 

just as a reference

So my suggestion here is: we should support a way to override Antrea automatically trying to find the default gateway :)

What if we check whether some sort of configuration exists (an env var, as suggested by @jayunit100, or whatever) that says "hey, this is your public interface, which is different from your Kubernetes node IP address, because you have two network interfaces: one for management/control-plane comms and the other for public traffic", and, if it's defined, lets Antrea find the right interface/gateway with it?

Example:

  • mgmt network - 192.168.0.0/24 // kubernetes node ip: 192.168.0.20/24
  • public network - 10.0.10.0/24 // node address (not reported to kubernetes): 10.0.10.20/24

If no override option exists, the IP 192.168.0.20 would be picked, which is wrong in the sense that the interface with this IP address does not have a default gateway.

But if we allow that to be overridden, and say "PUBLICNET=10.0.10.0/24", it may have internal detection logic that says "ok, so my default gateway is 10.0.10.1, which is the gateway of my interface 10.0.10.20/24, which is inside my CIDR of 10.0.10.0/24".
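
A minimal sketch of that detection logic, assuming the override is passed in as a CIDR string. The `HostAddr` type and `uplinkForPublicNet` are hypothetical, and finding the gateway itself (10.0.10.1 in the example) would still need a route lookup on the chosen interface, which is not shown here.

```go
package main

import (
	"fmt"
	"net"
)

// HostAddr is a hypothetical record of one interface address on the node.
type HostAddr struct {
	IfName string
	CIDR   string // address with prefix length, e.g. "10.0.10.20/24"
}

// uplinkForPublicNet returns the first interface whose address lies inside
// publicNet. publicNet plays the role of the proposed PUBLICNET setting.
func uplinkForPublicNet(addrs []HostAddr, publicNet string) (string, net.IP, error) {
	_, want, err := net.ParseCIDR(publicNet)
	if err != nil {
		return "", nil, fmt.Errorf("invalid public network %q: %w", publicNet, err)
	}
	for _, a := range addrs {
		ip, _, err := net.ParseCIDR(a.CIDR)
		if err != nil {
			continue // skip malformed entries in this sketch
		}
		if want.Contains(ip) {
			return a.IfName, ip, nil
		}
	}
	return "", nil, fmt.Errorf("no interface address inside %s", publicNet)
}

func main() {
	// The two networks from the example above; interface names are made up.
	addrs := []HostAddr{
		{IfName: "mgmt", CIDR: "192.168.0.20/24"},
		{IfName: "public", CIDR: "10.0.10.20/24"},
	}
	name, ip, err := uplinkForPublicNet(addrs, "10.0.10.0/24")
	fmt.Println(name, ip, err)
}
```

With no override set, the caller would fall back to the current node-IP-based lookup.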

@jayunit100 wdyt?

yeah, i agree, being able to override the default gateway is the thing that matters.

@jayunit100 @rikatz Thanks for catching the issue and giving suggestions.

For Antrea Windows, we always assume there is only one interface on the Node, which is configured with the NodeIP (used by kubelet) to communicate within the cluster and to access external addresses. We then add that interface to the OVS bridge to work as the uplink, and we expect Pod-to-external traffic to leave the host from that interface. Recording the default gateway avoids losing that configuration when the interface is migrated to the OVS bridge.

If the default gateway is configured on a different interface from the OVS uplink, I don't think Antrea can satisfy the requirement currently. First, the original routing entry for the default gateway would still exist on the Windows host after the OVS uplink interface is configured, because the configuration on that interface is never changed. If Antrea added its own routing configuration, it would break the host networking configuration (management traffic might be forwarded out of the wrong interface). Second, Antrea would use the $NODE_IP (configured on kubelet) to perform SNAT on the OVS bridge for Pod-to-external traffic; after the packet leaves OVS, it would finally leave the Windows host from the "management" interface according to the routing configuration. But the reply packet, which enters the Windows host from the management interface, might not get back to OVS for de-SNAT without additional configuration. As a result, the client Pod would not receive the reply.

We might need more time to design support for the scenario where multiple interfaces exist on the Windows host and the interfaces for Pod-to-Pod and Pod-to-external traffic are different.

Thanks @wenyingd! So... can you clarify what you mean by "losing configurations while migrating the interface"?

To support Windows containers, OVS works as an extension of the HNS Network. Antrea uses a "Transparent" HNS Network and adds the physical interface to the OVS bridge as the uplink. During this operation, we remove the L3 configuration from the physical interface and add it to the OVS bridge (br-int). So there is a short time during which the existing L3 configuration is "lost" on the host, and it comes back once it is re-configured on the OVS bridge.

Ah, ok! So you are saying it's totally possible to add this functionality as long as we allow for the L3 configuration to be down for a few moments? That is an acceptable compromise for our purposes....
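
The migration described above can be modeled in a few lines. This is a toy simulation only: the `Iface` type and `moveL3Config` are hypothetical, nothing here touches HNS or OVS, and the addresses are borrowed from the ipconfig output earlier in this issue.

```go
package main

import "fmt"

// Iface is a hypothetical model of a host interface and its L3 settings.
type Iface struct {
	Name    string
	IPs     []string
	Gateway string
}

// moveL3Config strips the addresses and gateway from the uplink and then
// applies them to the bridge. Between the two steps neither interface
// carries the configuration, which is the brief "lost" window.
// observe is called after each step so a caller can inspect the state.
func moveL3Config(uplink, bridge *Iface, observe func(step string)) {
	ips, gw := uplink.IPs, uplink.Gateway
	uplink.IPs, uplink.Gateway = nil, ""
	observe("removed from uplink")
	bridge.IPs, bridge.Gateway = ips, gw
	observe("added to bridge")
}

func main() {
	uplink := &Iface{Name: "Ethernet", IPs: []string{"10.0.2.15/24"}, Gateway: "10.0.2.2"}
	bridge := &Iface{Name: "br-int"}
	moveL3Config(uplink, bridge, func(step string) {
		fmt.Printf("%-20s uplink=%v bridge=%v\n", step, uplink.IPs, bridge.IPs)
	})
}
```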

if so, can we make a branch of antrea that exhibits this behaviour ? kubernetes-sigs/sig-windows-dev-tools#46 depends on this, and it will allow us to continue using antrea as the default CNI on windows development tooling for upstream Kubernetes :)

tnqn commented

I think @jayunit100 @rikatz 's requirement makes sense.
@wenyingd @lzhecheng if we don't assume the default route must exist on the interface that has nodeIP configured, can everything just work?
For example, in the above case:

mgmt network - 192.168.0.0/24 // kubernetes node ip: 192.168.0.20/24
public network - 10.0.10.0/24 // node address (not reported to kubernetes): 10.0.10.20/24
  1. Can we attach the public network device to OVS bridge and use it as uplink and do SNAT using its IP?
  2. Is there any difference between moving IPs/Routes from mgmt network device to OVS management device and from public network device to OVS management device?
  3. What's more, is it possible to not assume there is a default route? Can we just move all existing routes to the OVS management device? If there's no default route, don't create one.
  4. Maybe we can discover the uplink device in this order: user specified in configmap -> the device that has default route -> the device that has node IP.

  1. Can we attach the public network device to OVS bridge and use it as uplink and do SNAT using its IP?

I don't think we can support using the management interface as the 'only' uplink, because we need to use the Node IP (used by kubelet) to set up the tunnel in Encap mode. But what about two uplink interfaces? I am not sure about that; let me give it a try.

  2. Is there any difference between moving IPs/Routes from mgmt network device to OVS management device and from public network device to OVS management device?

For OVS operations, they should be the same. The key point is how to forward the "public traffic" from the OVS management device to the public network device, which is the one really connected to the external world. If there are two uplink interfaces, it should be easy to implement. So the precondition is still whether or not we can support multiple uplink interfaces on OVS.

  3. What's more, is it possible to not assume there is a default route? Can we just move all existing routes to the OVS management device? If there's no default route, don't create one.

Yes, we could stop assuming that a default route must exist. Antrea would then need to implement the defined routing entries in OVS using OpenFlow.
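
The discovery order proposed earlier in this thread (user-specified in the configmap, then the device with a default route, then the device with the node IP) could look roughly like this. It is only a sketch of the ordering: the `Device` type, `pickUplink`, and the `configured` parameter are hypothetical and do not correspond to an existing Antrea option.

```go
package main

import (
	"errors"
	"fmt"
)

// Device is a hypothetical summary of one host network device.
type Device struct {
	Name            string
	HasDefaultRoute bool
	HasNodeIP       bool
}

// pickUplink applies the proposed order: explicit configuration first, then
// the device with a default route, then the device holding the node IP.
// An empty configured value means "not set".
func pickUplink(devices []Device, configured string) (Device, error) {
	if configured != "" {
		for _, d := range devices {
			if d.Name == configured {
				return d, nil
			}
		}
		// An explicit choice that cannot be honored is an error, not a
		// reason to silently fall back.
		return Device{}, fmt.Errorf("configured uplink %q not found", configured)
	}
	for _, d := range devices {
		if d.HasDefaultRoute {
			return d, nil
		}
	}
	for _, d := range devices {
		if d.HasNodeIP {
			return d, nil
		}
	}
	return Device{}, errors.New("no uplink candidate found")
}

func main() {
	devices := []Device{
		{Name: "mgmt", HasNodeIP: true},
		{Name: "public", HasDefaultRoute: true},
	}
	auto, _ := pickUplink(devices, "")
	forced, _ := pickUplink(devices, "mgmt")
	fmt.Println("auto:", auto.Name, "configured:", forced.Name)
}
```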

Great thank you. I will test the patch if available !

  • is there a short term workaround I can do ? I'm ok with pods not being able to access the internet .

After an offline discussion with @tnqn: it is valuable to support multiple interfaces on Windows and let the user decide which interface is used as the OVS uplink. In this case, the uplink interface is allowed to have no default gateway configured. One precondition is that the chosen interface should be used for Pod traffic to reach other Pods in the cluster and to reach external addresses, and Antrea would use it to set up the tunnel in Encap mode. Besides, if the chosen interface is different from the one that kubelet is using, one risk is a ClusterIP/NodePort Service whose backend endpoint is a hostNetwork Pod, because the IP of that Pod is not managed by OVS.

I have run some experiments: Pod access to the NodeIP should work if Antrea also performs SNAT on the Windows host (not only in OVS). As for NodePort Services, Antrea relies on kube-proxy to provide them, so they should work. As for AntreaProxy, additional work is needed for Antrea to redirect NodePort Service packets to OVS.

A code change is needed to let the user choose which interface on the Windows host is used as the uplink.

ok, to clarify:

  • One precondition: When you say should, do you mean must ?
  • Encap mode: Don't we already encap all traffic w/ antrea by default ?
  • What if the interface for the node is on a private network which doesn't have a default gateway ? how does OVS find the right interface for pod vs external traffic ? that is the case in private_ip with vagrant, which is what we see in the https://github.com/kubernetes-sigs/sig-windows-dev-tools/ project

sorry if my questions are redundant :) not an OVS expert :)

Thank you so much for looking at this so quickly!

  • One precondition: When you say should, do you mean must ?

Correct, I mean 'must'.

  • Encap mode: Don't we already encap all traffic w/ antrea by default ?

In Encap mode, for Pod-to-Pod traffic (whether the target Pod is the direct destination or is behind a Service), we encap the packet if the destination Pod is on a different Node. But we don't encap Pod-to-external traffic.
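
That rule can be written down as a small classifier. This is a sketch under stated assumptions: `classify` is a hypothetical helper, the CIDRs are invented example values, and the destination is taken to be the Pod IP after any Service has already been resolved to an endpoint.

```go
package main

import (
	"fmt"
	"net"
)

// classify returns "local" for a Pod on this Node, "encap" for a Pod on
// another Node, and "no-encap" for anything outside the cluster Pod CIDR.
func classify(dst, localPodCIDR, clusterPodCIDR string) (string, error) {
	ip := net.ParseIP(dst)
	if ip == nil {
		return "", fmt.Errorf("invalid destination %q", dst)
	}
	_, local, err := net.ParseCIDR(localPodCIDR)
	if err != nil {
		return "", fmt.Errorf("invalid local Pod CIDR: %w", err)
	}
	_, cluster, err := net.ParseCIDR(clusterPodCIDR)
	if err != nil {
		return "", fmt.Errorf("invalid cluster Pod CIDR: %w", err)
	}
	switch {
	case local.Contains(ip):
		return "local", nil // same Node: no tunnel needed
	case cluster.Contains(ip):
		return "encap", nil // Pod on a different Node: tunnel it
	default:
		return "no-encap", nil // external destination: leaves via the uplink
	}
}

func main() {
	// Assumed example values: this Node owns 10.10.1.0/24 out of 10.10.0.0/16.
	for _, dst := range []string{"10.10.1.5", "10.10.2.5", "8.8.8.8"} {
		action, err := classify(dst, "10.10.1.0/24", "10.10.0.0/16")
		fmt.Println(dst, action, err)
	}
}
```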

  • What if the interface for the node is on a private network which doesn't have a default gateway ?

If there is no default gateway on the chosen interface and the packet destination is not known (which should mean external traffic), we plan to let the Windows host perform SNAT. Some additional configuration on the Windows host is needed, and we will add it.

! ok thanks !
one more question, just curious... by "additional configuration on Windows host", does that mean

  • adding OVS rules or
  • adding special Powershell / netsh commands ?

I mean adding special PowerShell commands.

Hm, I tried compiling the binary from your branch (here: https://storage.googleapis.com/jayunit100/antrea-agent-if-ps-fixes.exe) but it doesn't seem to run for me. Maybe a golang Windows cross-compilation issue on OS X?

Do you have an agent.exe I can run?