jpetazzo/pipework

Issues with assignment to physical interface

I am having issues with pipework when connecting containers to a physical interface.

In this example case, I have a VMware VM with two eth interfaces (eth0, eth1).

eth0      Link encap:Ethernet  HWaddr 00:0c:29:ec:d4:2b  
          inet addr:172.16.35.133  Bcast:172.16.35.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:feec:d42b/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2081 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1617 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:171216 (171.2 KB)  TX bytes:263210 (263.2 KB)

eth1      Link encap:Ethernet  HWaddr 00:0c:29:ec:d4:35  
          inet addr:192.168.102.212  Bcast:192.168.102.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:feec:d435/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:182600 errors:0 dropped:0 overruns:0 frame:0
          TX packets:71522 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:273477200 (273.4 MB)  TX bytes:4328088 (4.3 MB)

If, for example, I do:

pipework eth1 94e1fe92ddac 192.168.102.50/24

The container gets the 192.168.102.50 IP and can ping itself at that address, but the host cannot ping the container at that address. The actual intended use fails as well: the container cannot be pinged from another host on 192.168.102.0/24.

I can provide more details or do further tests if this is inconclusive, but if there's already something obvious in what I have described above, I'd love to hear it.

Just to clarify one point from the example above:

If I have two containers, say:

pipework eth1 94e1fe92ddac 192.168.102.50/24
pipework eth1 2235bc9fb953 192.168.102.51/24

The two containers CAN ping each other on these addresses, but again, the host cannot ping either, nor can an external host ping either via eth1.

Another test:

Two VMs. Each one has one eth interface (eth0).

VM#1 (docker host) sits on eth0 172.19.2.117/24
VM#2 (test vm) sits on eth0 172.19.2.20/24

These VMs can ping each other.

On VM#1, I move the IP to a bridge:

brctl addbr br1
ifconfig eth0 0.0.0.0 promisc up
brctl addif br1 eth0
ifconfig br1 up
ifconfig br1 172.19.2.117/24
route add default gw 172.19.2.1

The two VMs can still ping each other across the bridge.
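As a rough sketch, the bridge setup above can be wrapped in a small dry-run helper that prints the commands instead of running them. The function name `bridge_cmds` and its argument order are hypothetical; the commands themselves are the bridge-utils and net-tools calls from the post. It assumes the NIC that gives up its address is the same one attached to the bridge, which is what the `brctl show` output further down reports.

```shell
# Print (do not run) the commands that move an address from a NIC to a bridge.
# Hypothetical helper. Usage: bridge_cmds BRIDGE NIC ADDR/PREFIX GATEWAY
bridge_cmds() {
    br=$1 nic=$2 addr=$3 gw=$4
    cat <<EOF
brctl addbr $br
ifconfig $nic 0.0.0.0 promisc up
brctl addif $br $nic
ifconfig $br up
ifconfig $br $addr
route add default gw $gw
EOF
}
```

For the setup in this test, `bridge_cmds br1 eth0 172.19.2.117/24 172.19.2.1` prints the six commands for review before anything is changed on the host.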

Now we run a docker container on VM#1.

Then we give it IP 172.19.2.181 on its eth1 interface using pipework:

pipework br1 8c2991eb8bd3 172.19.2.181/24

The container can ping VM#1 at 172.19.2.117.
VM#1 can ping the container at 172.19.2.181.

But the container can NOT ping VM#2 at 172.19.2.20.
And VM#2 can NOT ping the container at 172.19.2.181.

If I look at tcpdump -i br1 on VM#1 while pings from the container to VM#2 are failing, I see ARP requests going out but no ARP replies coming back:

00:17:25.902297 ARP, Request who-has 172.19.2.20 tell 172.19.2.181, length 28
00:17:26.900932 ARP, Request who-has 172.19.2.20 tell 172.19.2.181, length 28
00:17:27.900920 ARP, Request who-has 172.19.2.20 tell 172.19.2.181, length 28
00:17:28.918453 ARP, Request who-has 172.19.2.20 tell 172.19.2.181, length 28
00:17:29.916919 ARP, Request who-has 172.19.2.20 tell 172.19.2.181, length 28
00:17:30.916932 ARP, Request who-has 172.19.2.20 tell 172.19.2.181, length 28
00:17:31.934800 ARP, Request who-has 172.19.2.20 tell 172.19.2.181, length 28
00:17:32.932924 ARP, Request who-has 172.19.2.20 tell 172.19.2.181, length 28
00:17:33.932922 ARP, Request who-has 172.19.2.20 tell 172.19.2.181, length 28
00:17:34.950279 ARP, Request who-has 172.19.2.20 tell 172.19.2.181, length 28

But if I ping VM#2 (172.19.2.20) from VM#1 itself (172.19.2.117), these pings succeed as I said before, and tcpdump does show ARP replies for this (obviously, since it works):

00:12:57.316707 ARP, Request who-has 172.19.2.20 tell 172.19.2.117, length 28
00:12:57.316879 ARP, Reply 172.19.2.20 is-at 02:21:5a:f3:c1:7e (oui Unknown), length 28
00:12:57.316907 IP 172.19.2.117 > 172.19.2.20: ICMP echo request, id 2630, seq 1, length 64
00:12:57.317412 IP 172.19.2.20 > 172.19.2.117: ICMP echo reply, id 2630, seq 1, length 64
00:12:58.316961 IP 172.19.2.117 > 172.19.2.20: ICMP echo request, id 2630, seq 2, length 64
00:12:58.317304 IP 172.19.2.20 > 172.19.2.117: ICMP echo reply, id 2630, seq 2, length 64
00:12:59.316977 IP 172.19.2.117 > 172.19.2.20: ICMP echo request, id 2630, seq 3, length 64
00:12:59.317342 IP 172.19.2.20 > 172.19.2.117: ICMP echo reply, id 2630, seq 3, length 64
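The comparison between the two captures can be scripted. The following is a minimal sketch, assuming tcpdump's text output in the format shown above; the function name `arp_summary` is hypothetical. It counts ARP requests and replies per target IP, so a target with requests but zero replies stands out.

```shell
# Summarise ARP traffic from tcpdump text output read on stdin.
# Hypothetical helper: prints one line per requested IP with request/reply counts.
arp_summary() {
    awk '
        # e.g. "... ARP, Request who-has 172.19.2.20 tell 172.19.2.181, length 28"
        /ARP, Request who-has/ { req[$5]++ }
        # e.g. "... ARP, Reply 172.19.2.20 is-at 02:21:5a:f3:c1:7e ..."
        /ARP, Reply/           { rep[$4]++ }
        END {
            for (ip in req)
                printf "%s requests=%d replies=%d\n", ip, req[ip], rep[ip]
        }
    '
}
```

It could be fed from a bounded capture, for example `tcpdump -n -c 20 -i br1 arp | arp_summary`, run once while pinging from the container and once while pinging from VM#1.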

Thoughts on what's missing here?

What's the configuration of the host? How are eth0 and eth1 configured in VMware?
Can you dump the output of ip link, ip addr, and brctl show in each case? (On the host as well, if it's a Linux machine.)

I posted about two different tests. The first was in VMware, and the second was in AWS.

Here is some output from VM#1 in the second test (172.19.2.117 in AWS):

ip link:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br1 state UP qlen 1000
    link/ether 02:89:38:20:b3:b0 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 02:2a:26:be:d2:01 brd ff:ff:ff:ff:ff:ff
4: lxcbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether 42:54:48:7d:1b:b0 brd ff:ff:ff:ff:ff:ff
5: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP 
    link/ether 96:e1:11:c2:85:d0 brd ff:ff:ff:ff:ff:ff
8: br1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP 
    link/ether 02:89:38:20:b3:b0 brd ff:ff:ff:ff:ff:ff
12: vethPR0Ljj: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master docker0 state UP qlen 1000
    link/ether 96:e1:11:c2:85:d0 brd ff:ff:ff:ff:ff:ff
14: vethl2308: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br1 state UP qlen 1000
    link/ether 52:ad:cb:c5:dc:07 brd ff:ff:ff:ff:ff:ff

ip addr:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br1 state UP qlen 1000
    link/ether 02:89:38:20:b3:b0 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::89:38ff:fe20:b3b0/64 scope link 
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 02:2a:26:be:d2:01 brd ff:ff:ff:ff:ff:ff
4: lxcbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether 42:54:48:7d:1b:b0 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.1/24 brd 10.0.3.255 scope global lxcbr0
    inet6 fe80::4054:48ff:fe7d:1bb0/64 scope link 
       valid_lft forever preferred_lft forever
5: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP 
    link/ether 96:e1:11:c2:85:d0 brd ff:ff:ff:ff:ff:ff
    inet 172.17.42.1/16 scope global docker0
    inet6 fe80::146b:2fff:feac:ffc0/64 scope link 
       valid_lft forever preferred_lft forever
8: br1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP 
    link/ether 02:89:38:20:b3:b0 brd ff:ff:ff:ff:ff:ff
    inet 172.19.2.117/24 brd 172.19.2.255 scope global br1
    inet6 fe80::8821:fcff:fe5b:b1bc/64 scope link 
       valid_lft forever preferred_lft forever
12: vethPR0Ljj: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master docker0 state UP qlen 1000
    link/ether 96:e1:11:c2:85:d0 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::94e1:11ff:fec2:85d0/64 scope link 
       valid_lft forever preferred_lft forever
14: vethl2308: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br1 state UP qlen 1000
    link/ether 52:ad:cb:c5:dc:07 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::50ad:cbff:fec5:dc07/64 scope link 
       valid_lft forever preferred_lft forever

brctl show:

bridge name     bridge id               STP enabled     interfaces
br1             8000.02893820b3b0       no              eth0
                                                        vethl2308
docker0         8000.96e111c285d0       no              vethPR0Ljj
lxcbr0          8000.000000000000       no

Here is output from the container in the second test (172.19.2.181):

ip link:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
11: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether ba:96:ea:3e:9d:74 brd ff:ff:ff:ff:ff:ff
13: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether e6:9f:d8:63:0f:6d brd ff:ff:ff:ff:ff:ff

ip addr:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
11: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether ba:96:ea:3e:9d:74 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.3/16 brd 172.17.255.255 scope global eth0
    inet6 fe80::b896:eaff:fe3e:9d74/64 scope link 
       valid_lft forever preferred_lft forever
13: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether e6:9f:d8:63:0f:6d brd ff:ff:ff:ff:ff:ff
    inet 172.19.2.181/24 scope global eth1
    inet6 fe80::e49f:d8ff:fe63:f6d/64 scope link 
       valid_lft forever preferred_lft forever

Thanks for any insight...

If I understand correctly, you have bridged your containers with the host's eth0 interface (through br1).
This means that the containers are talking directly on the AWS VPC private network.
However, I don't know if AWS will forward Ethernet frames with unknown MAC addresses.
Is there any element in the AWS documentation which suggests that it should work?
I remember that VPC has an option to allow foreign IP addresses (on a per-VM basis), but I don't remember anything about foreign MAC addresses...
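To make the MAC question concrete, a small helper can list the MAC address of each interface from `ip link` output. This is only a sketch; the name `link_macs` is hypothetical. In the dumps above, the container's eth1 uses e6:9f:d8:63:0f:6d while the instance's own eth0 uses 02:89:38:20:b3:b0, so frames leaving the container carry a source MAC that is not the instance's own, which would fit the hypothesis if AWS does drop such frames.

```shell
# Print "interface mac" pairs from `ip link` output read on stdin.
# Hypothetical helper; interfaces without a link/ether line (e.g. lo) are skipped.
link_macs() {
    awk '
        # Header line, e.g. "2: eth0: <BROADCAST,...>" or "13: eth1@if14: <...>"
        /^[0-9]+: /        { name = $2; sub(/:$/, "", name); sub(/@.*/, "", name) }
        # Address line, e.g. "    link/ether 02:89:38:20:b3:b0 brd ff:ff:ff:ff:ff:ff"
        $1 == "link/ether" { print name, $2 }
    '
}
```

Running `ip link | link_macs` on the host and again inside the container gives two short lists that can be compared side by side.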

I believe you are correct. There is a feature in AWS called "disable source/destination checking", but that is indeed Layer 3, and it is probably true that the AWS "ethernet" doesn't like frames that it doesn't know anything about in regards to MAC address.

Now, that said, I am also having issues with this in VMware, which should have worked. Let me get back to you on this with more specifics, in case it is interesting and/or helpful.

Okay!

So, to recap:

  • bridging containers directly on AWS didn't work, but that's expected: it is because of a deficiency (or missing feature) in AWS
  • bridging containers directly on VMware should have worked, though.

Is it OK if I close this issue, and if you have more details on the VMware issue, we can open a new (more specific) one?

Correct on both points. Let's do close this, and I will retest and open a new, more specific issue.

Okay!