[new API] two-phased qb_ipcs_create required

Question

jnpkrn opened this issue 6 years ago · 8 comments

Signatures

/* Creates a listening UNIX socket, akin to qb_ipcs_create */
qb_ipcs_listener_t* qb_ipcs_listener(const char *name);

/* Gets socket descriptor from listener;  to prevent further
   use (the same information is now at two places at minimum),
   deallocates l, so _it must not be reused elsewhere from that
   point on_ */
int qb_ipcs_listener_fd(const qb_ipcs_listener_t *l);

/* Wraps a listening UNIX socket, presumably created with
   qb_ipcs_listener and subsequently extracted with qb_ipcs_listener_fd
   (e.g. the name gets validated), extracting "name" using
   getsockname(2);  _fd must not be reused elsewhere from that point on_ */
qb_ipcs_listener_t* qb_ipcs_listener_from_fd(int fd);

/* Reuses existing UNIX socket wrapped in a listener object;
   qb_ipcs_listener + qb_ipcs_create_from_listener equals qb_ipcs_create;
   to prevent further use, deallocates l, so _it must not be reused
   elsewhere from that point on_ */
qb_ipcs_service_t* qb_ipcs_create_from_listener(qb_ipcs_listener_t *l,
                                                int32_t service_id,
                                                enum qb_ipc_type type,
                                                struct qb_ipcs_service_handlers *handlers);

These functions should allow safe nesting with respect to NULL checking, e.g.,

qb_ipcs_create_from_listener(qb_ipcs_listener_from_fd(qb_ipcs_listener_fd(qb_ipcs_listener("foo"))),
                             42, QB_IPC_SHM, handlers);

The qb_ipcs_listener_t abstraction is meant to allow for (and enforce)
extra checking on the libqb side (so that the same internal coherency as
with the monolithic qb_ipcs_create is achieved), as that type is only used
as an opaque pointer in the client program.
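
As a rough illustration of the name recovery that the comment on qb_ipcs_listener_from_fd mentions, here is a minimal sketch in plain POSIX C. It is not libqb code: example_listen_unix and example_name_from_fd are hypothetical helpers invented for this illustration, and the sketch assumes a filesystem (non-abstract) socket path.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: create a listening UNIX stream socket bound to
   a filesystem path.  Returns the descriptor, or -1 on failure. */
int example_listen_unix(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    memcpy(addr.sun_path, path, strlen(path) + 1);

    if (bind(fd, (const struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    if (listen(fd, SOMAXCONN) < 0) {
        close(fd);
        unlink(path);  /* we created this path just above */
        return -1;
    }
    return fd;
}

/* Hypothetical helper: recover the bound path from a descriptor with
   getsockname(2).  Returns 0 on success, -1 if fd is not a UNIX socket
   bound to a filesystem path or if buf is too small. */
int example_name_from_fd(int fd, char *buf, size_t buflen)
{
    struct sockaddr_un addr;
    socklen_t len = sizeof(addr);
    const char *end;
    size_t n;

    memset(&addr, 0, sizeof(addr));
    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    if (addr.sun_family != AF_UNIX)
        return -1;

    /* sun_path is not guaranteed to be NUL-terminated when full */
    end = memchr(addr.sun_path, '\0', sizeof(addr.sun_path));
    n = end ? (size_t)(end - addr.sun_path) : sizeof(addr.sun_path);
    if (n == 0 || n >= buflen)  /* empty means unbound or abstract */
        return -1;

    memcpy(buf, addr.sun_path, n);
    buf[n] = '\0';
    return 0;
}
```

A wrapper along the lines of the proposed qb_ipcs_listener_from_fd could run a check like example_name_from_fd before trusting a descriptor handed in by the client program.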

Answer 1 · 2018-09-26T19:31:10.000Z

Note that this would also (finally, one wants to say) allow client
programs to unset the close-on-exec flag in cases where it would be
counterproductive. What if a pre-fork parent establishes the listening
socket only for it to be picked up, with qb_ipcs_listener_from_fd, by an
after-fork child with a brand new process image that was just exec'd?
Btw., this looks exactly like a sequence every forking daemon
utilizing libqb's IPC subsystem should exercise unless parent-child
synchronization is in place -- it's highly undesirable for a daemon
launcher process to indicate successful startup when the forked-off
executive part fails on an already occupied IPC channel a moment later!
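
The failure mode described above (the executive part finding the IPC endpoint already occupied) can be reproduced with plain POSIX calls. This is only a sketch under stated assumptions: example_claim_endpoint is a hypothetical helper, not a libqb function, and it uses a filesystem socket path. A launcher that claims the endpoint itself learns about the conflict before it reports success.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: try to claim a UNIX socket path by binding to it.
   Returns the bound descriptor on success, or a negated errno value
   (e.g. -EADDRINUSE when the path is already taken). */
int example_claim_endpoint(const char *path)
{
    struct sockaddr_un addr;
    int fd, saved;

    if (strlen(path) >= sizeof(addr.sun_path))
        return -ENAMETOOLONG;

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -errno;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    memcpy(addr.sun_path, path, strlen(path) + 1);

    if (bind(fd, (const struct sockaddr *)&addr, sizeof(addr)) < 0) {
        saved = errno;  /* close() may overwrite errno */
        close(fd);
        return -saved;
    }
    return fd;
}
```

If the launcher calls this before forking, a second instance fails right there with -EADDRINUSE instead of failing later in the child.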

I'm currently split on whether it should be a QB-branded flag that can
be passed to qb_ipcs_listener() for convenience, or whether authors
of client programs should be left on their own with this.

Nonetheless, it must be documented that normally, this socket is
preconfigured with the O_NONBLOCK and FD_CLOEXEC flags (set with the
respective fcntl commands, F_SETFL and F_SETFD). And that only FD_CLOEXEC
can reasonably be tampered with; changing anything else risks conflicting
with the intended state of affairs on the libqb side (but then,
qb_ipcs_listener_from_fd() can check or reset some vital flags on its own).
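
For reference, a minimal sketch of how these two flags are inspected and changed with fcntl(2). The example_* helpers are hypothetical names used only for illustration; whether and how libqb would perform such checks inside qb_ipcs_listener_from_fd() is an open question here, not something this sketch asserts.

```c
#include <fcntl.h>

/* Returns 1 if FD_CLOEXEC is set on fd, 0 if not, -1 on error.
   FD_CLOEXEC is a descriptor flag: F_GETFD / F_SETFD. */
int example_has_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags < 0)
        return -1;
    return (flags & FD_CLOEXEC) != 0;
}

/* Sets (enable != 0) or clears (enable == 0) FD_CLOEXEC on fd.
   Returns 0 on success, -1 on error. */
int example_set_cloexec(int fd, int enable)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags < 0)
        return -1;
    if (enable)
        flags |= FD_CLOEXEC;
    else
        flags &= ~FD_CLOEXEC;
    return fcntl(fd, F_SETFD, flags) < 0 ? -1 : 0;
}

/* Returns 1 if O_NONBLOCK is set on fd, 0 if not, -1 on error.
   O_NONBLOCK is a file status flag: F_GETFL / F_SETFL. */
int example_is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;
    return (flags & O_NONBLOCK) != 0;
}

/* Re-asserts O_NONBLOCK, as a wrapper taking over a foreign descriptor
   might want to do.  Returns 0 on success, -1 on error. */
int example_force_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}
```

Note the asymmetry: FD_CLOEXEC belongs to the descriptor itself, whereas O_NONBLOCK belongs to the open file description and is therefore shared by every descriptor duplicated or inherited from it.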

Answer 2 · 2018-09-27T07:18:36.000Z

API looks sane, but what is the use case?

Also, I was unable to understand the following part:

> btw. this looks exactly like a sequence every forking daemon
> utilizing libqb's IPC subsystem should exercise unless parent-child
> synchronization is in place -- it's highly undesirable for a daemon
> launcher process to indicate successful startup when the forked-off
> executive part fails on an already occupied IPC channel a moment later!

If it is about the daemonization process, then why should the daemon do the exec after fork?

I can imagine a situation with child processes, but sharing the fd is just asking for big trouble.

Answer 3 · 2018-09-27T12:08:50.000Z

On 27/09/18 00:18 -0700, Jan Friesse wrote:
> If it is about the daemonization process, then why should the daemon do the exec after fork?

In case it's a hierarchical daemon of daemons (e.g. pacemaker).

> I can imagine a situation with child processes, but sharing the fd is just asking for big trouble.

Why? This is the same mechanism that, e.g., systemd uses to share the notification socket with services accommodating that signaling scheme.

…

-- Jan (Poki)

Answer 4 · 2018-10-01T06:23:41.000Z

> On 27/09/18 00:18 -0700, Jan Friesse wrote:
>> If it is about the daemonization process, then why should the daemon do the exec after fork?
> In case it's a hierarchical daemon of daemons (e.g. pacemaker).

So it's not about daemonization.

>> I can imagine a situation with child processes, but sharing the fd is just asking for big trouble.
> Why? This is the same mechanism that, e.g., systemd uses to share the notification socket with services accommodating that signaling scheme.

Because the current libqb IPC is not prepared for concurrent access.

Could you please explain the intended use case?

…
-- Jan (Poki)

Answer 5 · 2018-10-03T10:06:40.000Z

On 30/09/18 23:23 -0700, Jan Friesse wrote:
>> On 27/09/18 00:18 -0700, Jan Friesse wrote:
>>> If it is about the daemonization process, then why should the daemon do the exec after fork?
>> In case it's a hierarchical daemon of daemons (e.g. pacemaker).
> So it's not about daemonization.

In part, it is. A well-behaved daemonizing daemon will either synchronize with the effective service, the child process, so as to truthfully report any initialization failures back to the launching environment early, or, less optimally perhaps, will perform some of these initializations on its own, only to pass the successfully opened IO resources to the child. For IPC listening parts based on libqb, the latter wasn't possible thus far, but the proposed extension will enable it. It may also come in handy when a superdaemon spawns sub-daemons: with this extension, the superdaemon can do some substantial initialization on its own, only to share this prepared environment with the spawned sub-daemons, partially eliminating the risk of their later failure (i.e., the fail-early principle).
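
A minimal sketch of such a hand-off in plain POSIX C, to make the sequence concrete. Everything named example_* or EXAMPLE_* is hypothetical and chosen for this illustration only: how the descriptor number reaches the exec'd program (an environment variable here) is an assumption, not something the proposal prescribes, and the sketch waits for the child purely so that the outcome can be observed.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical variable telling the exec'd program which descriptor
   it inherited. */
#define EXAMPLE_FD_ENV "EXAMPLE_LISTEN_FD"

/* Hypothetical helper: fork, let the child inherit fd across exec, and
   run argv there.  Returns the child's exit status, or -1 on error.
   A real daemon launcher would not block in waitpid() like this. */
int example_spawn_with_fd(int fd, char *const argv[])
{
    pid_t pid;
    int status;
    int flags = fcntl(fd, F_GETFD);

    if (flags < 0)
        return -1;  /* not a valid descriptor */

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        char num[16];

        /* Descriptor flags are per process after fork, so clearing
           close-on-exec here leaves the parent's copy untouched. */
        if (fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC) < 0)
            _exit(126);
        snprintf(num, sizeof(num), "%d", fd);
        if (setenv(EXAMPLE_FD_ENV, num, 1) < 0)
            _exit(126);
        execvp(argv[0], argv);
        _exit(127);  /* exec failed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In the scheme proposed in this issue, the exec'd program would presumably read the descriptor number and pass it to qb_ipcs_listener_from_fd(), while the parent closes its own copy and never touches it again.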

>>> I can imagine a situation with child processes, but sharing the fd is just asking for big trouble.
>> Why? This is the same mechanism that, e.g., systemd uses to share the notification socket with services accommodating that signaling scheme.
> Because the current libqb IPC is not prepared for concurrent access.

Well, how does that differ from the treatment of plain descriptors wrt. read/write syscalls, for instance? It still remains the responsibility of the user of these interfaces to preserve sanity, e.g. to prevent multiple readers that would mutually corrupt their view of the IO stream. System-wide, there's only the CLOEXEC flag that's meant to prevent misuse in one direction, but here we mean to propagate the descriptor in the opposite direction, and there's nothing to prevent possible misuse (the sharing side reusing the descriptor even if it's meant to be fully acquired by its child) in a portable way -- it's up to the interface user to behave wisely, as usual in the Unix tradition, so I fail to see any regression in this respect.

…

-- Jan (Poki)

Answer 6 · 2018-10-03T11:26:22.000Z

OK, now it makes a little more sense. I have doubts that the solution really solves any real problem (especially when the daemon is systemd-enabled and notifies its state), but as long as the old API is kept, then why not.

Answer 7 · 2018-10-03T12:19:53.000Z

Yes, it indeed does help with real use cases, details forthcoming.
This is not a hypothetical exercise if you think it is.

Answer 8 · 2019-06-18T17:07:10.000Z

I have discovered that the ability to force non-abstract sockets where
abstract ones would otherwise get used (#248) -- likely together with
pre-existing cases where that had already been the case -- will break
client programs that run as non-root (making me wonder if that change
was ever tested in the cluster stack context).

The main problem is that /var/run is intentionally not world-writable,
meaning that bind(2) invoked by an unprivileged process so as to place
the socket there will fail with EACCES or similar -- that holds across
the board of OSes; an all-permissive arrangement is yet to be spotted.

For pacemaker in particular, the fix is as "simple" as adopting
the solution per this request, as envisioned for other reasons already.
The master daemon of pacemaker (running as root, naturally) can then
get such sockets set up, especially for its non-root children, meaning
this particular problem would get solved along the way.

Any timeframe we can expect this?