google/nftables

`Conn` isn't a connection, but a network namespace reference; could it please getting refactored into connection with Open/Dial and Close?

thediveo opened this issue · 6 comments

At this time, Conn is something resembling more of an address, or network namespace address (reference) to be more precise. Additionally, it has an exported(!) mutex factored in which is kind of strange it it should be unexported?

type Conn struct {
	TestDial nltest.Func // for testing only; passed to nltest.Dial
	NetNS    int         // Network namespace netlink will interact with.
	sync.Mutex
	// contains filtered or unexported fields
}

This current design assumedly comes from its current most prominent use case: set up nftable changes and then apply (flush) them. In consequence, the connection-disguised netns address is only opened temporarily for each individual operation on nftables, such as a flush of pending changing.

In consequence, discovering the current state of nftables is highly inefficient, as it requires a large amount of NETLINK socket opening and closing in rapid repeated sequence. To properly discover the current nftables setup, such as for running diagnosis tools on it, the following steps are required, resulting in constant socket opening and closing for each individual (repeated) step:

  1. conn.ListTables() -- in order to get table properties not returned implicitly by conn.ListChains()
  2. conn.ListChains()
  3. for each chain:
    1. conn.GetRule() -- misnomer, it actually works like GetRules()

My humble suggestion would be:

  • refactor Conn by adding Open and Close() receivers to allow for improved behavior. At least for some time these can be co-exist with the current implicit behavior when calling nftables operations like ListTables() without Open first.
    • Open(opts) could make use of the option (function) pattern to allow to custom dialing or even passing in a suitable fd opened by some other means, this would care for issue #132 (which I would like to see covered not least as I have my own namespace discovery engine module lxkns also covering switching namespaces and covering for many more cases than the well-known netns package).
    • Close() would be allowed to be called multiple times, but subject to the now-internal connection mutex as to not close the NETLINK socket while in flight.
  • make sync.Mutex non-exported -- unless I missed a reason for exporting it?
  • add GetRules() receiver and deprecate GetRule() for slow migration.

I'm willing to work on a PR given the signal that there is actual interest in guiding me through my foolish mistakes and finally integrating it.

Any things you can suggest already upfront?

refactor Conn by adding Open and Close() receivers to allow for improved behavior. At least for some time these can be co-exist with the current implicit behavior when calling nftables operations like ListTables() without Open first.

I don’t mind adding Open and Close methods (probably Open should just return a close func instead of a separate method), but existing operations need to keep working without an explicit Open, indefinitely (not just during migration).

  • make sync.Mutex non-exported -- unless I missed a reason for exporting it?

Sounds reasonable. I didn’t pay attention when it was added to the fact that it would be exported. There’s no need for the mutex to be exported, though.

  • add GetRules() receiver and deprecate GetRule() for slow migration.

Sounds reasonable, but seems completely unrelated to the other change :)

@stapelberg thank you very much for your feedback!

I don’t mind adding Open and Close methods (probably Open should just return a close func instead of a separate method), but existing operations need to keep working without an explicit Open, indefinitely (not just during migration).

Yes, this was exactly also my assumption and I have already code that keeps the existing behavior while taking advantage of an already existing "long-running" connection created by Open, if available.

I though a while about your idea of returning a closer function instead ... but I don't think it is convenient to pass around two values (object pointer and closer function) all the time. We don't do it for files and sockets neither. Fun fact: I had exactly the closer function pattern in mind and use it internally to allow the existing netlink operations such as GetTables to work either with a temporary/transient netlink connection as before, or otherwise with an already existing "long-running" connection opened with Open.

make sync.Mutex non-exported -- unless I missed a reason for exporting it?

Sounds reasonable. I didn’t pay attention when it was added to the fact that it would be exported. There’s no need for the mutex to be exported, though.

Good, done. And in fact, it is used only internally.

  • add GetRules() receiver and deprecate GetRule() for slow migration.

Sounds reasonable, but seems completely unrelated to the other change :)

Ah, you got me here! Yes, true 😁

I've drafted an additional test that checks that bothe the existing behavior is still correct, as well as when using Open the connection is properly used by the existing methods.

I still want to ponder a short time about allowing to use an existing netlink file descriptor but, 1st this is probably not really a use case, and 2nd the netlink library used here doesn't support such a usecase. But I think that options to Open for specifying the network namespace reference fd and maybe the test function will be of use to more people than just me...

I though a while about your idea of returning a closer function instead ... but I don't think it is convenient to pass around two values (object pointer and closer function) all the time.

Another alternative would be to have Open() return a different handle, say PermanentConn, which has the same methods as Conn plus an additional Close() method? On a technical level, it could be:

type PermanentConn struct {
  Conn
  closeOnce sync.Once
}

That way, we can have Close be idempotent, i.e. do nothing after the first successful Close. That way, you can never have bugs related to calling Close too often, which could break other usages of the nftables package within the same program.

Maybe this is better to discuss over an actual pull request, though. If you have a clean way to implement this without accidental Close calls causing havoc, that’s all I’m looking for :)

But I think that options to Open for specifying the network namespace reference fd and maybe the test function will be of use to more people than just me...

We should probably use an option struct for the Open call. The number of options we might want to add could grow in the future.

Oh, might we have different ideas of open as a method (receiver) versus as a constructor? I had somehow assumed Open to be a method (receiver) but do you would like it to be a constructor in the same way as os.Open returns a *File?

At the moment I don't see how the separation of PermanentConn versus Conn could work: the particular nftable operation methods need an nftable connection, either one created on the fly, or the permanent one from Open. But since Go doesn't feature structural inheritance and the individual nftable operation methods encapsulate getting a connection and this getter is sealed to belong to Conn, it cannot be overriden in PermanentConn. I'm at a loss here design-wise, except for baking all into Conn. 2hat am I overlooking?

As for the options I had rather option functions in mind instead of an ever growing exported option struct.

I will rework my draft code accordingly so you can better discuss it.

Here’s what I had in mind: https://go.dev/play/p/a7iD9tGpCl5

Your approach of Open() as a constructor works too; left a comment on the PR regarding the details

This is done 😀