rockstor/rockstor-core

[Rock-ons] Host networking is not accounted for

Opened this issue ยท 6 comments

We have several rock-ons that require host networking for some of their features. Unfortunately, we currently do not account for the particularities that host networking entails, which can lead to some awkwardness in user experience. A practical example can be found in the following pull request in our rockon-registry repository:
rockstor/rockon-registry#227 (comment)

The main issue is that docker (logically) ignores ports publishing when a container uses host networking, while Rockstor does not, therefore creating the potential for inaccurate port mapping information in the database. It is therefore proposed to follow docker's behavior and ignore port mapping when a rock-on uses host networking. An indirect consequence, however, is that without port mapping, no webUI link button would be generated for such rock-on. It would thus be necessary to detect the webUI port in such instance(s) (if any) and generate the webUI link button accordingly.

Note that substantial work on docker networking is pending (#2207), which includes some ground work for detection of host networking on running containers. While this won't unfortunately help at rock-on install time, it will provide some foundation for the resolution of this issue. It might therefore be preferable to wait for #2207 availability before working on the current issue.

This issue is very similar to #1208. Both issues should thus be fixed by the same pull-request.

@phillxnet, I currently see a couple of ways to tackle this:

  1. keep all db structures the same, and add an additional check for host-networking when fetching the UI port
  2. alter db structure and write new JSON parsing logic to detect the presence of the --net=host option.

Option 1

Advantages

  • quick and easy implementation: a couple of lines in the RockOn model:
   @property
    def ui_port(self):
        if not self.ui:
            return None
        for co in self.dcontainer_set.all():
            for po in co.dport_set.all():
                if po.uiport:
## BEGIN new section
                    if not self.host_network:
                        return po.hostp
                    else:
                        logger.debug("Host network detected as True")
                        return po.containerp
## END new section
        return None
  • no changes in current database structure, so no need for migration

Inconveniences

  • we don't have a way to detect whether the rock-on uses host-networking during the rock-on install wizard, so the user is still offered with the PortChoice dialog. We simply ignore whatever the user is entering and build the "webUI" link from the database info.
  • it adds another function call in the RockOn model, with thus a slight performance cost (I don't know how significant it is).

Option 2

Advantages

  1. We could simplify our current rockon networking back-end as we don't need to trigger calls to docker ps (probe_containers()). A portion of our recent additions would thus be redundant and unnecessary (so could be simplified).
  2. We could detect whether a rock-on uses host-networking during the rock-on install wizard, which means we can skip the PortChoice dialog.

Inconveniences

  • A lot more refactoring would be needed throughout the rock-on backend and frontend.
  • Database migrations needed
  • Part of this refactoring would include the addition of logic to search for the --net=host option and fill/fetch the database accordingly, which would itself have a cost. The latter might be occurring only rock-ons are updated, however, so not occurring very often.

I'm thus unsure what the best approach would be in terms of performances... I'm becoming increasingly conscientious of our ever-growing list of rock-ons so a simple addition could have substantial performance costs when it needs to be repeated for each and every container.

Of course, we can always implement option 1 first and then work on option 2 in the near future if that actually helps with performances and robustness.

What would you think?

@FroggyFlox

What would you think?

I'm favouring option 2 actually. Our rock-ons capabilities have grown significantly of late, as you well know given you where the one doing this overhal. And your first options looks to leave us a little blind with regard to installs and having to probe/call docker on every model instantiation, in addition to what we already do seems like a move in the wrong direction. Especially given the docker-file indicates this up-front and for a one of cost of parsing this during rock-on db setup seems like a far more advantageous approach that will hopefully not be too costly on the refactoring you mention would be required.

And given we have pans to further extend the rock-ons model to accomodate for architecture awareness I think we shouldn't shy away from additional db migrations. Especially given their cost of lookup is, I'm assuming, far cheaper than standing up a shell and running a docker call within that shell and processing/parsing the output, and for that to be done on every model instantiation may well be over-the-top in an effort to avoid lower level changes (within the db and management code) that more directly address our required parameter 'knowledge' that is explicitly defined up-front in the Rock-on definition for the exact purpose of preparing the system in how to handle it. That seems more like a bottom up approach rather than reverse engineering the end result from the docker probe post install.

However you are far more familiar with this code than myself and so it's your call. And if you think the associated rework needed for 2 is not appropriate for the milestone this issue is under then we re-evaluate that rather than half doing it the less efficient way only to then have to revisit and then deal with potentially even more back compatibility.

Hope that makes some sense and I'm not just talking gibberish here :). I have in the past been overly averse to db migrations only to have it bite me back with additional baggage created in trying to deal with a problem at too higher level. I know we have to read system state as canonical but in the case of a Rock-on, it's definition shouldn't change from under us. And so reading it's requirements into the db and using those as canonical looks to be the way to go. Especially if it's more efficient in the long run. And we do have more to add just yet such as the mentioned Arch awareness so we might as well get adding what we need db wise as we go.

In short option 2 looks to be more bottom up rather than top down and given we are defining the bottom we might as well take advantage of it rather than adding more reverse engineering. But again you are far more familiar with this code, and it may well be that the exact nature of the problem will only become apparent once the coding work is underway.

2. We could detect whether a rock-on uses host-networking during the rock-on install wizard

This seems like a key facility actually. And one that could be extended to the Arch awareness. I.e. we can presumably flag all Rock-ons pre-install that represent a reduced security standing, or are incompatible with the running architecture.

Apologies if I have this all about face.But the more we can establish up-front within the db the better I say, as long as it doesn't incure major management overheads. Such as system state that can easily fall out of sync. But in this case a Rock-on definition is up-front and known; and relatively static. And so should be a good candidate to be mirrored in the db and only updated upon re-install for example.

Thanks a lot for all the feedback, @phillxnet. I agree with you that option 2 seems to be the wiser choice. First things first, though is the following:

And if you think the associated rework needed for 2 is not appropriate for the milestone this issue is under then we re-evaluate that rather than half doing it the less efficient way only to then have to revisit and then deal with potentially even more back compatibility.

I feel we might want to remove this issue and #1208 for the next milestone (this issue and #1208 would be fixed by the same PR if we go with option 2 rather than option 1), indeed. I see it that way for the following reasons:

  • the amount of work needed for option 2 might actually take a little more time than we'd like so I feel it shouldn't delay the next ISO release.
  • while we wanted to get the rocknet work in now rather than letting it wait for a bit more, attaching this issue and #1208 to the same "boat" could be considered a scope creep as it's not entirely related. We've indeed always had this issue's and #1208's problems in the code, long before the introduction of rocknets.

If I can manage to get that fix sooner than I expect, then great, but I'm not sure this issue and #1208 should be considered as blocking. Of course it's all up to you. I already have an idea on how to tackle this, but this post of mine is deliberately overly cautious in case there's a bigger problem I'm currently missing that would make the work needed greater.

And your first options looks to leave us a little blind with regard to installs and having to probe/call docker on every model instantiation, in addition to what we already do seems like a move in the wrong direction. Especially given the docker-file indicates this up-front and for a one of cost of parsing this during rock-on db setup seems like a far more advantageous approach (...)

The more I think about it, the more I agree with this. I feel I chose the more complicated approach in trying to get all db info from the system state in my rocknet work...

But the more we can establish up-front within the db the better I say, as long as it doesn't incure major management overheads. Such as system state that can easily fall out of sync. But in this case a Rock-on definition is up-front and known; and relatively static. And so should be a good candidate to be mirrored in the db and only updated upon re-install for example.

It sounds like a good compromise between the two approaches, indeed: establishing the database as ground truth, but implement mechanism to keep the db in sync with the system's real state that would be triggered only when absolutely required/triggered by the user. This way we can aim for high robustness at moderate cost.

Thanks a lot for all your time spent in this feedback, @phillxnet, that was immensely helpful!

@FroggyFlox Re:

I feel we might want to remove this issue and #1208 for the next milestone (this issue and #1208 would be fixed by the same PR if we go with option 2 rather than option 1)

Agreed. I've now removed this issue from the next Milestone as per:

  • attaching this issue and #1208 to the same "boat" could be considered a scope creep as it's not entirely related. We've indeed always had this issue's and #1208's problems in the code, long before the introduction of rocknets.

Ok, yes; very good point. I see that now and this makes sense. I'd initially added this issue to the milestone thinking were were almost there but I agree with the scope creep element here if we go with option 2 which seems to be the consensus now.

establishing the database as ground truth, but implement mechanism to keep the db in sync with the system's real state that would be triggered only when absolutely required

Yes, this is a major challenge through this project I'm finding. We need to be able to both instantiate a config restore state, while maintaining as much flexibility / robustness as possible. Ultimately we must reflect system state but we are not just monitoring the system but must also dictate a users expressed configuration via the Web-UI (db) settings. This compromised stance is most stretched when folks require out-of-spec configurations that our db representation just can't understand. This is likely always going to be a struggle. Anyway, lets see how we get on when we (likely you actually) take a closer look at this. We may end up having to take yet another tack. As long as we understand and can communicate the capabilities and limitations of whatever approach we take. And clearly identify the source of truth for each element, i.e. db dictates, or reflects.