vi/dive

[Feature Request] Option to establish an interval in seconds between each --setns operation

Closed this issue · 14 comments

@vi

See this example:

# /dev/shm/dive-master/dived -J --setns /var/run/netns/foo --setns /proc/1/ns/net -- nc -v -l 0.0.0.0 1234


# netstat -nlp | grep 1234
tcp        0      0 0.0.0.0:1234            0.0.0.0:*               LISTEN      404351/nc         
# ip netns exec foo netstat -nlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
Active UNIX domain sockets (only servers)
Proto RefCnt Flags       Type       State         I-Node   PID/Program name     Path

This places the application in netns "foo" and then moves it from "foo" to the main host netns (that of PID 1).

But the --setns operations are performed immediately, one after another. It would be reasonable to have an option that specifies an interval between each --setns.

There is a practical use for this idea: make a program listen inside one netns and send its outgoing connections through another, since sockets that are already open aren't affected by a setns system call.

The program would be placed inside the first netns by the first --setns invocation and would create a listening socket there. The second --setns would then call setns() and move the program to the second netns. The program would accept connections on the listening socket inside the first netns and make outgoing connections inside the second one.
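The kernel behaviour this relies on can be sketched inside a single process. This is only a minimal Python sketch, not how dived works: `listen_then_switch` is a hypothetical helper, the namespace path is whatever you pass in, and the `os.setns` call assumes Python 3.12+ and CAP_SYS_ADMIN (typically root).

```python
import os
import socket


def listen_then_switch(listen_addr, target_ns_path=None):
    """Open a listening socket, then optionally move this process to another netns.

    Hypothetical helper. The listening socket stays in the namespace where it
    was created; only sockets created after setns() belong to the new one.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(listen_addr)
    srv.listen()

    if target_ns_path is not None:
        # Assumes Python 3.12+ and CAP_SYS_ADMIN.
        ns_fd = os.open(target_ns_path, os.O_RDONLY)
        try:
            os.setns(ns_fd, os.CLONE_NEWNET)
        finally:
            os.close(ns_fd)
    return srv
```

Started inside "foo", something like `listen_then_switch(("0.0.0.0", 1234), "/proc/1/ns/net")` would keep the listener in "foo" while sockets created afterwards use the host netns. Here the switch happens right after listen() in the same process, so no interval is involved, but it only applies to a program you can modify.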

Very useful for setting up transparent proxies inside network namespaces, LXC or Docker. With Docker and LXC, all that must be done is to make dived's Unix socket accessible in some way inside the container.

Any thoughts?

vi commented
  1. Namespaces are typically configured prior to starting a program. Forcing a namespace change on an already started program can probably be done with debug tools, but I expect it not to be a good idea.
  2. Relying on timing (assuming that the listening socket gets opened before the interval expires) is a race condition and may be unreliable, especially when the system is slow or out of memory.

If the program supports inheriting a listening socket (e.g. for systemd service integration), it may be better to use that: open the socket in one namespace, but use it in another namespace. This way the program would accept connections from a foreign namespace.
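For reference, the receiving side of that inheritance could look roughly like this. It is a minimal sketch of the systemd-style convention (LISTEN_PID and LISTEN_FDS environment variables, descriptors starting at 3); `inherited_listen_fds` and `listener_or_bind` are hypothetical names, not part of any library.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first inherited descriptor in the systemd convention


def inherited_listen_fds(environ=None, pid=None):
    """Return the descriptor numbers handed to this process, or [] if none."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    # The variables only count if they were addressed to this very process.
    if environ.get("LISTEN_PID") != str(pid):
        return []
    count = int(environ.get("LISTEN_FDS", "0"))
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


def listener_or_bind(addr):
    """Use an inherited listening socket if present, else bind a new one."""
    fds = inherited_listen_fds()
    if fds:
        # The socket may have been created in another namespace;
        # accept() works on it regardless of where this process runs.
        return socket.socket(fileno=fds[0])
    return socket.create_server(addr)
```

A program written this way does not care which namespace the socket was bound in, which is the point of the suggestion above.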

Sockets can also be transferred between namespaces explicitly. Example: https://github.com/vi/netns_tcp_bridge/ .

OK, I got it.

Looking at netns_tcp_bridge, an application wrapper could be created: it would create a child process in a netns and, by intercepting bind()/connect(), hand those calls over to the child process. But to work with Go executables or static ones, it would require something like ptrace() to intercept the syscalls.

vi commented

I expect there to be an easier way, especially if the program can be modified/recompiled.

If yes, something like "systemd socket activation" can be implemented/integrated. Instead of systemd, you can supply the socket from your own program (which retrieves pre-bound pre-listened socket from another namespace).

If not, I expect the Linux network routing system to be flexible enough to achieve what is needed without interfering in userspace. You may look up the IP_TRANSPARENT socket option. Something like netns_tcp_bridge, but with additions for transparent proxying, could probably be created.

> If yes, something like "systemd socket activation" can be implemented/integrated. Instead of systemd, you can supply the socket from your own program (which retrieves pre-bound pre-listened socket from another namespace).

I found this project: https://github.com/mitsuhiko/systemfd

But I don't know if it can be used for the purpose I want.

vi commented

Looks like systemd-socket-activate with more features.

You probably need something that knows about network namespaces. This can be a modified systemfd that, for example, uses setns after setting up the listening socket and before spawning processes.

Obviously, the program you use must support the socket passing protocol used by systemd / systemfd / systemd-socket-activate.
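A launcher along those lines might be sketched as follows. This is not systemfd's code: `exec_with_listener` is a hypothetical helper, the namespace path is an assumption supplied by the caller, and `os.setns` assumes Python 3.12+ plus CAP_SYS_ADMIN.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # descriptor number expected by the spawned program


def exec_with_listener(sock, argv, ns_path=None):
    """Replace this process with argv, handing it sock as descriptor 3.

    Hypothetical helper: sock is already listening in the current netns;
    the optional setns() moves us elsewhere before the program starts.
    """
    if ns_path is not None:
        ns_fd = os.open(ns_path, os.O_RDONLY)
        try:
            os.setns(ns_fd, os.CLONE_NEWNET)  # Python 3.12+, needs privileges
        finally:
            os.close(ns_fd)

    if sock.fileno() == SD_LISTEN_FDS_START:
        # dup2 onto itself would leave close-on-exec set, so clear it directly.
        os.set_inheritable(SD_LISTEN_FDS_START, True)
    else:
        os.dup2(sock.fileno(), SD_LISTEN_FDS_START, inheritable=True)

    env = dict(os.environ)
    env["LISTEN_FDS"] = "1"
    env["LISTEN_PID"] = str(os.getpid())  # exec keeps the PID
    os.execvpe(argv[0], argv, env)
```

Because the socket is bound and listening before setns() and exec, there is no timing window to guess: the spawned program starts with the listener already in place.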

Just out of curiosity, does the --setns option move the program to the specified netns before or after setting the user of the wrapped program with the --user option?

vi commented

dived does not set namespace or user for a wrapped program. It sets namespace or user for itself, then substitutes itself with the program you have chosen.

Namespaces are set first and the user is set later, as we may lose the ability to adjust namespaces after dropping root.

Got it.
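The ordering can be illustrated with a short sketch. This is not dived's source, only the same sequence written in Python under assumptions: `enter_namespaces_then_exec` is a hypothetical name, and `os.setns` needs Python 3.12+ and root.

```python
import os


def enter_namespaces_then_exec(ns_paths, argv, uid=None, gid=None):
    """Join namespaces, then drop privileges, then become the target program."""
    # 1. Namespaces first: setns() needs privileges we are about to give up.
    for path in ns_paths:
        fd = os.open(path, os.O_RDONLY)
        try:
            os.setns(fd)  # nstype 0: accept whatever kind of namespace fd is
        finally:
            os.close(fd)

    # 2. Drop privileges, group before user (setgid would fail after setuid).
    if gid is not None:
        os.setgroups([])
        os.setgid(gid)
    if uid is not None:
        os.setuid(uid)

    # 3. Substitute this process with the chosen program.
    os.execvp(argv[0], argv)
```

Swapping steps 1 and 2 would make setns() fail with EPERM for an unprivileged user, which is the reason given above for the order.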

I asked that because I thought it would be possible to create a "socketat" LD_PRELOAD library for moving processes across network namespaces when socket syscalls are made: https://lore.kernel.org/lkml/m1bp7oq1u8.fsf@fess.ebiederm.org/
It is possible to do that, but the preloaded app has to run as root, which sucks.

vi commented

The LD_PRELOAD approach may work. I'm not sure which syscall you need to hook. Maybe it's not creation of the socket, but bind or listen.

It is a somewhat typical feature of programs that listen on sockets to drop privileges after opening the listening socket.

This library already exists; I've had to make some modifications to it: https://pastebin.com/5B5Cmrmw

It intercepts bind(*) and moves to a netns, but it works only for programs running as root, which is why I don't use it.

But a friend of mine suggested an approach that would work:

An LD_PRELOAD library intercepts socket(*) and passes the request via a Unix domain socket to a daemon that handles the socket creation. The daemon receives all the information about the socket, creates it, and passes it back to the process using SCM_RIGHTS. The daemon could run in any netns or, even better, it could run as root and select the desired netns automatically by calling setns(*) for each socket.
The only part I don't understand is the original file descriptor handling, but I will talk to him again to get the details of that idea.
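To make the moving parts concrete, here is a rough sketch of the daemon/client exchange in Python (the real hook would be C inside the LD_PRELOAD library). The three-integer wire format and the names `serve_one_request` and `request_socket` are made up for illustration; `socket.send_fds` and `socket.recv_fds` (Python 3.9+) wrap SCM_RIGHTS.

```python
import socket
import struct

# Hypothetical wire format: family, type, protocol as three native ints.
REQUEST = struct.Struct("iii")


def serve_one_request(chan):
    """Daemon side: create the socket the client asked for and pass it back."""
    family, kind, proto = REQUEST.unpack(chan.recv(REQUEST.size))
    # A privileged daemon could call setns() here to pick the target netns.
    new_sock = socket.socket(family, kind, proto)
    try:
        socket.send_fds(chan, [b"k"], [new_sock.fileno()])  # SCM_RIGHTS
    finally:
        new_sock.close()  # the receiver now holds its own descriptor


def request_socket(chan, family, kind, proto=0):
    """Client side: what an intercepted socket() call would do instead."""
    chan.sendall(REQUEST.pack(family, kind, proto))
    _data, fds, _flags, _addr = socket.recv_fds(chan, 1, 1)
    return socket.socket(fileno=fds[0])
```

In this variant the hooked socket() simply returns the received descriptor, so the unprivileged program never calls setns() itself; only the daemon would need root.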

vi commented

Yes, LD_PRELOAD + SCM_RIGHTS should work, but it is a complex solution with multiple moving parts. It will be like my netns_tcp_bridge, but embedded into the program using LD_PRELOAD.

> I don't understand is the original file descriptor handling

What do you mean by "original file descriptor"?

> What do you mean by "original file descriptor"?

I was told that it would be necessary to duplicate the socket over the file descriptor when passing it over a Unix socket, but I didn't understand it very well. With pure socketat it is perfectly possible to move sockets across netns, but I don't like it one bit because root is mandatory.

vi commented

> duplicate the socket over the file descriptor when passing over Unix Socket

Typically you just close the extra file descriptor after sending a copy of it over the unix socket.
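One possible reading of "duplicate the socket over the file descriptor" is the case where the hook intercepts bind() rather than socket(): the application already holds a descriptor number, so the received socket is dup2()'d onto that number and the extra copy is closed. A small sketch under that assumption (`replace_fd` and `pass_and_replace` are hypothetical names):

```python
import os
import socket


def replace_fd(original_fd, received_fd):
    """Make original_fd refer to the received socket, then drop the extra copy."""
    # inheritable=False keeps Python's default close-on-exec behaviour.
    os.dup2(received_fd, original_fd, inheritable=False)
    os.close(received_fd)


def pass_and_replace(original):
    """Round-trip demo: send a fresh socket over a socketpair and install it."""
    left, right = socket.socketpair()
    with left, right:
        fresh = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        socket.send_fds(left, [b"x"], [fresh.fileno()])
        fresh.close()  # sender closes its copy after sending
        _data, fds, _flags, _addr = socket.recv_fds(right, 1, 1)
    replace_fd(original.fileno(), fds[0])
    return original
```

After `pass_and_replace(sock)` the application keeps using the same descriptor number, but it now refers to the socket that came over the Unix socket.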

> Typically you just close the extra file descriptor after sending a copy of it over the unix socket.

Yep, I understand perfectly what you said, but I'm still studying the mechanism of passing FDs.
Thanks,